Should we provide deeper (and distributed) tracing facilities? #65

hyperthunk · 2012-11-07T11:18:53Z

Erlang has tracing built into the runtime system, which is very lightweight and has little runtime performance impact on the traced process. Traces can be set up to match processes (all, [pids...], named/registered, etc.) and flags turned on to trace calls to specific modules, functions and so on. Traces are sent to one or more tracer processes, and these typically either throw the trace data straight onto a socket (to reduce impact on the traced system) or print to a file descriptor.

I'm not sure how much of this makes sense for Cloud Haskell, but it would be good to see if we can come up with some analogous mechanism that allows us to trace processes simply and efficiently. I don't think the typical traceEvent style would be useful here, but if the message queue for a process could be transparently used to forward messages to an additional tracer process (or process group) then that would be useful!
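To make the forwarding idea concrete, here is a minimal sketch using only `base`. It is not the distributed-process API: `Mailbox`, `TraceEvent`, `newMailbox`, `setTracer`, `deliver` and `receive` are all hypothetical names, and a plain `Chan` stands in for a process message queue. The point is only that delivery can copy each message to an optional tracer channel without the receiving process being aware of it.

```haskell
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical trace record: which mailbox received which message.
data TraceEvent msg = Delivered String msg
  deriving (Eq, Show)

-- Hypothetical stand-in for a process and its message queue.
data Mailbox msg = Mailbox
  { mbName   :: String
  , mbQueue  :: Chan msg
  , mbTracer :: IORef (Maybe (Chan (TraceEvent msg)))
  }

-- Create a mailbox with tracing switched off.
newMailbox :: String -> IO (Mailbox msg)
newMailbox name = Mailbox name <$> newChan <*> newIORef Nothing

-- Switch tracing on (Just tracer) or off (Nothing) at runtime.
setTracer :: Mailbox msg -> Maybe (Chan (TraceEvent msg)) -> IO ()
setTracer mb = writeIORef (mbTracer mb)

-- Enqueue the message, then copy it to the tracer if one is set.
deliver :: Mailbox msg -> msg -> IO ()
deliver mb msg = do
  writeChan (mbQueue mb) msg
  tracer <- readIORef (mbTracer mb)
  case tracer of
    Nothing -> pure ()
    Just ch -> writeChan ch (Delivered (mbName mb) msg)

-- The traced process reads its queue as usual and never sees the tracer.
receive :: Mailbox msg -> IO msg
receive = readChan . mbQueue
```

A tracer would be an ordinary thread looping on `readChan` and writing events to a socket or file. In a real node the copy would presumably have to be a remote send, which this sketch does not attempt.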

edsko · 2012-11-07T12:56:31Z

Yes, debugging support is definitely something that would be worthwhile to add. It would also be useful to add metrics such as message queue size (possibly which types).

gbaz · 2012-11-07T16:16:49Z

Erlang uses Lamport clocks internally to give an ordering on the traces. As far as I can tell, the trick is that every message contains a (Maybe (Clock, Destination)) and the primitives manage and pass that along if it exists. It would be relatively simple to extend the process internal state with an optional Lamport clock, and expose it directly. There's of course a small overhead even when the clock isn't in use, but I imagine it could be useful in a number of circumstances. My suggestion would be to have the state contain both a Maybe Clock and a [Trace Destinations], so that we decouple the ordering functionality given by the clock (which is useful even without a trace) from the trace functionality (which is a bit useful even without a clock).
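A minimal, pure sketch of that state layout, assuming a simplified model: the message carries only an optional clock stamp (not the `(Clock, Destination)` pair described for Erlang), and destinations are plain strings. `ProcState`, `Message`, `send`, `receive` and `traceRecords` are hypothetical names, not existing library functions. The update rule is the standard Lamport one: increment on send, take `max local remote + 1` on receive.

```haskell
-- Logical clock value.
newtype Clock = Clock Int
  deriving (Eq, Ord, Show)

-- Assumption: a trace destination is just a name here.
type Destination = String

-- A message with an optional clock stamp piggybacked on it.
data Message a = Message
  { msgStamp :: Maybe Clock
  , msgBody  :: a
  } deriving (Eq, Show)

-- Per-process state: the clock and the trace destinations are independent.
data ProcState = ProcState
  { psClock      :: Maybe Clock
  , psTraceDests :: [Destination]
  } deriving (Eq, Show)

tick :: Clock -> Clock
tick (Clock n) = Clock (n + 1)

-- Sending ticks the clock (if enabled) and stamps the message with it.
send :: ProcState -> a -> (ProcState, Message a)
send st body = case psClock st of
  Nothing -> (st, Message Nothing body)
  Just c  -> let c' = tick c
             in (st { psClock = Just c' }, Message (Just c') body)

-- Receiving merges the remote stamp into the local clock (if enabled).
receive :: ProcState -> Message a -> (ProcState, a)
receive st (Message stamp body) = case (psClock st, stamp) of
  (Just (Clock l), Just (Clock r)) ->
    (st { psClock = Just (Clock (max l r + 1)) }, body)
  (Just c, Nothing) -> (st { psClock = Just (tick c) }, body)
  (Nothing, _)      -> (st, body)

-- One trace record per destination; the clock field is Nothing when
-- the process traces without a clock.
traceRecords :: ProcState -> String -> [(Destination, Maybe Clock, String)]
traceRecords st label = [ (d, psClock st, label) | d <- psTraceDests st ]
```

With `psClock = Nothing` the functions leave the state untouched, which is where the "small overhead even when the clock isn't in use" would come from: one pattern match per send and receive.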

edsko · 2012-11-07T16:24:30Z

I think we should separate concerns about distributed ordering (like Lamport clocks), which can be implemented on top of the core infrastructure, from the hooks into the guts of the system that allow us to extract the relevant information. I'm not convinced that the core libraries need to do the former, but obviously they do need to provide the latter.

gbaz · 2012-11-07T16:27:17Z

Sure -- my concern is just that traces are less useful if you don't have some causal ordering on them. A trace mechanism that let userland Lamport clocks be hooked in (i.e. some customizable action to generate the traces) would indeed probably be cleaner and more elegant.
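One way to picture such a customizable action, as a rough sketch only: the core emits raw events, and a user-supplied hook with its own state decides what trace record each event becomes. `Event`, `TraceHook`, `runHook`, `plainHook` and `counterHook` are hypothetical names, and the counter below is a purely local sequence number, not a full Lamport clock (that would also need the stamps carried on messages).

```haskell
-- Raw events the core would emit (hypothetical, heavily simplified).
data Event = Sent String | Received String
  deriving (Eq, Show)

-- A hook owns some state s and turns each raw event into a record r.
data TraceHook s r = TraceHook
  { hookInit :: s
  , hookStep :: s -> Event -> (s, r)
  }

-- Thread the hook's state through a sequence of events.
runHook :: TraceHook s r -> [Event] -> [r]
runHook (TraceHook s0 step) = go s0
  where
    go _ []       = []
    go s (e : es) = let (s', r) = step s e in r : go s' es

-- No ordering information: records are the raw events.
plainHook :: TraceHook () Event
plainHook = TraceHook () (\s e -> (s, e))

-- Userland ordering: stamp each event with a local sequence number.
counterHook :: TraceHook Int (Int, Event)
counterHook = TraceHook 0 (\n e -> (n + 1, (n + 1, e)))
```

For example, `runHook counterHook [Sent "a", Received "b"]` evaluates to `[(1, Sent "a"), (2, Received "b")]`, while `runHook plainHook` returns the events unchanged. The core only needs to know about `TraceHook`; the ordering scheme lives entirely in the hook.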

edsko · 2012-11-07T16:28:19Z

This causality is also the main difficulty in implementing generic distributed logging. Perhaps that's the core concept that should be implemented (as a separate package, distributed-process-logging perhaps).

hyperthunk · 2012-11-07T16:30:05Z

Erlang supports both kinds of tracing. The one @gbaz mentioned, based on Lamport clocks, is http://www.erlang.org/doc/man/seq_trace.html, whereas the dynamic process tracing I mentioned is a separate, complementary feature built into the runtime. A vclock-based tracing feature would be nice, but should be a separate package IMO.

For an example of simple tracing facilities, see http://www.erlang.org/doc/man/dbg.html.

hyperthunk · 2012-12-14T02:59:19Z

> It would also be useful to add metrics such as message queue size (possibly which types).

Yes that would be nice. I'm going to split it out into a separate issue however, as it seems distinct from tracing/debugging.

hyperthunk · 2017-02-06T14:56:56Z

There are some good ideas in this thread, so I'm going to change the title and move it to => Question, viz distributed tracing and hooks into the runtime.

qnikst removed the distributed-process label Jun 18, 2015

hyperthunk added Question and removed Feature Request labels Feb 6, 2017

hyperthunk changed the title ~~Tracing support~~ Should we provide deeper (and distributed) tracing facilities? Feb 6, 2017
