Should we provide deeper (and distributed) tracing facilities? #65

Open
hyperthunk opened this issue Nov 7, 2012 · 8 comments
@hyperthunk
Member

Erlang has tracing built in to the runtime system, which is very lightweight and has little runtime performance impact on the traced process. Traces can be set up to match processes (all, [pids...], named/registered, etc) and flags turned on to trace calls to specific modules/functions/etc. Traces are sent to one or more tracer processes, and these typically either throw the trace data straight on to a socket (to reduce impact on the traced system) or print to a file descriptor.

I'm not sure how much of this makes sense for Cloud Haskell, but it would be good to see if we can come up with some analogous mechanism that allows us to trace processes simply and efficiently. I don't think the typical traceEvent style would be useful here, but if the message queue for a process could be transparently used to forward messages to an additional tracer process (or process group), then that would be useful!
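
To make the forwarding idea concrete, here is a minimal sketch that uses only `base`. It is not the distributed-process API: plain threads and `Chan` stand in for processes and mailboxes, and the names `TraceEvent`, `Mailbox`, and `deliver` are hypothetical, introduced purely for illustration.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM)

-- Hypothetical trace event: mailbox name and the message it received.
data TraceEvent = Received String String
  deriving (Eq, Show)

-- Hypothetical mailbox: a queue plus zero or more tracer channels.
data Mailbox = Mailbox
  { mboxName    :: String
  , mboxQueue   :: Chan String
  , mboxTracers :: [Chan TraceEvent]
  }

-- Deliver a message, then copy a trace event to every attached tracer.
-- Chan is unbounded, so the sender never blocks on a slow tracer.
deliver :: Mailbox -> String -> IO ()
deliver mb msg = do
  writeChan (mboxQueue mb) msg
  forM_ (mboxTracers mb) $ \t -> writeChan t (Received (mboxName mb) msg)

main :: IO ()
main = do
  tracer <- newChan
  queue  <- newChan
  let mb = Mailbox "worker" queue [tracer]
  done <- newEmptyMVar
  -- The tracer runs in its own thread and collects two events.
  _ <- forkIO $ replicateM 2 (readChan tracer) >>= putMVar done
  deliver mb "ping"
  deliver mb "pong"
  events <- takeMVar done
  mapM_ print events
```

With an empty tracer list, `deliver` reduces to a plain enqueue, so an untraced mailbox pays only for traversing an empty list.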

@edsko
Member

edsko commented Nov 7, 2012

Yes, debugging support is definitely something that would be worthwhile to add. It would also be useful to add metrics such as message queue size (possibly which types).

@gbaz

gbaz commented Nov 7, 2012

Erlang uses Lamport clocks internally to give an ordering on the traces. As far as I can tell, the trick is that every message contains a (Maybe (Clock,Destination)) and the primitives will manage/pass that along if it exists. It would be relatively simple to extend the process internal state with an optional Lamport clock, and expose it directly. There's of course a small overhead even when the clock isn't in use, but I imagine it could be useful in a number of circumstances. My suggestion would be to have the state contain both a Maybe Clock and a [Trace Destinations], so that we decouple the ordering functionality given by the clock (which is useful even without a trace) from the trace functionality (which is a bit useful even without a clock).
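
A rough sketch of the suggested state, assuming a pure model rather than the real process internals. The names `ProcState`, `Envelope`, `send`, and `receive` are hypothetical; only the clock rule itself (increment on send, take max-plus-one on receive) is the standard Lamport scheme.

```haskell
-- A Lamport clock is a monotonically increasing counter.
newtype Clock = Clock Int deriving (Eq, Ord, Show)

-- Hypothetical per-process state: the clock and the trace
-- destinations are independent, so either can be used alone.
data ProcState = ProcState
  { psClock   :: Maybe Clock
  , psTracers :: [String]      -- hypothetical tracer names
  } deriving (Eq, Show)

-- Hypothetical message envelope carrying an optional timestamp.
data Envelope a = Envelope
  { envStamp   :: Maybe Clock
  , envPayload :: a
  } deriving (Eq, Show)

tick :: Clock -> Clock
tick (Clock n) = Clock (n + 1)

-- Sending: advance the local clock (if enabled) and stamp the message.
send :: a -> ProcState -> (ProcState, Envelope a)
send x st =
  let c' = tick <$> psClock st
  in (st { psClock = c' }, Envelope c' x)

-- Receiving: the local clock becomes max(local, received) + 1.
-- A process without a clock ignores any incoming stamp.
receive :: Envelope a -> ProcState -> (ProcState, a)
receive env st =
  let merged = case (psClock st, envStamp env) of
        (Just l, Just r)  -> Just (tick (max l r))
        (Just l, Nothing) -> Just (tick l)
        (Nothing, _)      -> Nothing
  in (st { psClock = merged }, envPayload env)

main :: IO ()
main = do
  let a0      = ProcState (Just (Clock 0)) []
      b0      = ProcState (Just (Clock 5)) ["tracer"]
      (a1, m) = send "hello" a0
      (b1, _) = receive m b0
  print (psClock a1)  -- Just (Clock 1)
  print (psClock b1)  -- Just (Clock 6)
```

When `psClock` is `Nothing`, `send` emits an unstamped envelope, which is where the "small overhead even when the clock isn't in use" shows up: one extra `Maybe` field per message.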

@edsko
Member

edsko commented Nov 7, 2012

I think we should separate concerns about distributed ordering (like Lamport clocks), which can be implemented on top of the core infrastructure, from hooks into the guts of the system that allow us to extract the relevant information. I'm not convinced that the core libraries need to do the former, but obviously they do need to do the latter.

@gbaz

gbaz commented Nov 7, 2012

Sure -- my concern is just that traces are less useful if you don't have some ordering on causality. A trace mechanism that let userland Lamport clocks be hooked in (i.e. some customizable action to generate the traces) would indeed probably be cleaner and more elegant.
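
One way to picture such a customizable action, again as a sketch using only `base`: the core knows nothing except how to call a hook, and the ordering logic lives entirely in user code. `Event`, `TraceHook`, and `stampingHook` are hypothetical names. The counter here is only a local logical counter; a real Lamport clock would also merge timestamps carried on incoming messages.

```haskell
import Data.IORef (IORef, atomicModifyIORef', modifyIORef', newIORef, readIORef)

-- Hypothetical events that the runtime might expose to a hook.
data Event = Sent String | Recv String
  deriving (Eq, Show)

-- The hook is any user-supplied action over events.
newtype TraceHook = TraceHook { runHook :: Event -> IO () }

-- Default hook: tracing disabled.
noTrace :: TraceHook
noTrace = TraceHook (\_ -> return ())

-- Userland hook: stamp each event with a counter kept outside the core
-- and append it to an in-memory log (newest first).
stampingHook :: IORef Int -> IORef [(Int, Event)] -> TraceHook
stampingHook clock logRef = TraceHook $ \ev -> do
  t <- atomicModifyIORef' clock (\n -> (n + 1, n + 1))
  modifyIORef' logRef ((t, ev) :)

main :: IO ()
main = do
  clock  <- newIORef 0
  logRef <- newIORef []
  let hook = stampingHook clock logRef
  runHook hook (Sent "ping")
  runHook hook (Recv "ping")
  -- Reverse so the oldest event is printed first.
  readIORef logRef >>= mapM_ print . reverse
```

Swapping `stampingHook` for `noTrace` changes nothing at the call sites, which is the decoupling argued for above.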

@edsko
Member

edsko commented Nov 7, 2012

This causality is also the main difficulty in implementing generic distributed logging. Perhaps that's the core concept that should be implemented (as a separate package, distributed-process-logging perhaps).

@hyperthunk
Member Author

Erlang supports both kinds of tracing. The one @gbaz mentioned based on Lamport clocks is http://www.erlang.org/doc/man/seq_trace.html, whereas the dynamic process tracing I mentioned is a separate, complementary feature built into the runtime. A vclock-based tracing feature would be nice, but should be a separate package IMO.

For an example of simple tracing facilities, see http://www.erlang.org/doc/man/dbg.html.

@hyperthunk
Member Author

> It would also be useful to add metrics such as message queue size (possibly which types).

Yes, that would be nice. I'm going to split it out into a separate issue, however, as it seems distinct from tracing/debugging.

@hyperthunk
Member Author

There are some good ideas in this thread, so I'm going to change the title and move it to => Question, viz distributed tracing and hooks into the runtime.

@hyperthunk hyperthunk changed the title Tracing support Should we provide deeper (and distributed) tracing facilities? Feb 6, 2017
4 participants