[erlang-questions] Trace-Driven Development
Michael Turner
michael.eugene.turner@REDACTED
Tue Jun 5 12:42:27 CEST 2012
Ulf, when I write "seq_trace implements Lamport clocks", please try to
read it as you would "This ANSI Standard C compiler implements IEEE
arithmetic." That doesn't mean "this ANSI Standard C compiler *is*
IEEE arithmetic."
If people want to *add* new API elements to seq_trace, fine. I could
go for a different way of calling stuff, and of representing the
relevant data structures. Not important, but nice.
If people want to add new features to seq_trace, fine. Maybe it could use some.
Riak uses vector clocks, which are based on scalar (Lamport) clocks.
They can be problematic, since vector clocks involve n-length vectors,
where n is the number of processes concerned. Still, if somebody
wanted to add *that* to seq_trace, I'd be cool with it.
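To make the size concern concrete, here is a minimal vector-clock sketch. It is in Python rather than Erlang, purely for illustration; it is not Riak's implementation, and the function names (vc_new, vc_tick, vc_merge, happened_before, concurrent) are hypothetical. It assumes a fixed set of n processes indexed 0..n-1, so every timestamp carries n counters.

```python
# Illustrative vector clocks for a fixed set of n processes.
# All names here are made up for this sketch; nothing is taken from Riak.

def vc_new(n):
    """A fresh clock: one counter per process."""
    return [0] * n

def vc_tick(clock, i):
    """Local event (or send) on process i: bump only our own slot."""
    out = list(clock)
    out[i] += 1
    return out

def vc_merge(local, received, i):
    """Receive on process i: element-wise max, then bump our own slot."""
    merged = [max(x, y) for x, y in zip(local, received)]
    merged[i] += 1
    return merged

def happened_before(a, b):
    """True if a is causally before b (a <= b element-wise, and a != b)."""
    return a != b and all(x <= y for x, y in zip(a, b))

def concurrent(a, b):
    """True if neither event is causally before the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)

# Three processes: p0 sends to p1, while p2 does something unrelated.
a = vc_tick(vc_new(3), 0)        # p0 sends
b = vc_merge(vc_new(3), a, 1)    # p1 receives p0's message
c = vc_tick(vc_new(3), 2)        # p2 has an independent local event

print(a, b, c)
print(happened_before(a, b), concurrent(a, c))
```

The vector lets you detect that two events are concurrent, which a single scalar counter cannot tell you, but the price is a timestamp whose length grows with the number of processes.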
But changing "seq_trace" to "lamport" is
(a) semantically wrong, since seq_trace *implements* Lamport clocks
but is not *simply* Lamport clocks,
and
(b) pragmatically wrong, since it breaks any existing code that
depends on seq_trace, and also breaks anything out there that has
implemented a module called "lamport" independently.
By the way, I can't figure out why it's called "sequential tracing".
If somebody told me, "it has to be called 'X tracing', solve for X,"
I'd say, "X = 'parallel'", not "X = 'sequential'." Does the
"sequential" refer to the fact that the (single) tracer process
receives a stream of events? OK, but ... that's not the important
thing -- precisely because those events didn't necessarily happen in
the order received, in *real* time. (Whatever that is. Lamport points
out that an understanding of the theory of relativity gave him some
insights into this problem.) That's the point of having logical clocks
like Lamport's: to help sort out chronology and, you hope, causality,
to the extent possible when you can't rely 100% on
real-time clocks.
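As a rough illustration of that point, here is a scalar (Lamport) clock sketch in Python. It is not how seq_trace is implemented; the LamportClock class and the event tuples are assumptions made up for this example. Two processes exchange a ping and a pong, the single tracer receives the four events in a scrambled order, and sorting by logical timestamp recovers the causal order.

```python
# Illustrative Lamport (scalar) clock. Not seq_trace's implementation;
# the class name and event format are hypothetical.

class LamportClock:
    def __init__(self):
        self.time = 0

    def send(self):
        """Tick before sending; the new value travels with the message."""
        self.time += 1
        return self.time

    def receive(self, msg_time):
        """On receipt, jump past both our own time and the sender's."""
        self.time = max(self.time, msg_time) + 1
        return self.time

proc_a, proc_b = LamportClock(), LamportClock()

# Events in the order they really happened: (logical time, process, what).
t1 = proc_a.send()
t2 = proc_b.receive(t1)
t3 = proc_b.send()
t4 = proc_a.receive(t3)
events = [
    (t1, "A", "send ping"),
    (t2, "B", "recv ping"),
    (t3, "B", "send pong"),
    (t4, "A", "recv pong"),
]

# The tracer may see them in some other order entirely.
arrival = [events[2], events[0], events[3], events[1]]

# Sorting on the logical timestamp restores the causal order.
ordered = sorted(arrival)
for event in ordered:
    print(event)
```

In this tiny run every timestamp is distinct, so the sort gives a total order. In general, scalar clocks only guarantee that a cause gets a smaller timestamp than its effect; equal or unrelated timestamps say nothing about which event came first.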
Which brings up another point, already raised by Scott Fritchie, here:
http://erlang.org/pipermail/erlang-questions/2007-May/026822.html
and not adequately addressed in Kenneth Lundin's reply, here:
http://erlang.org/pipermail/erlang-questions/2007-May/026827.html
Scott writes of seq_trace's real-time
(seconds/milliseconds/microseconds) timestamp:
"Inviso uses the (optional) timestamp, but that's the erlang:now()
value, and even an NTP time sync may not be good enough for a busy
system to avoid bogus event ordering. I have enough problems with NTP
on our lab machines -- it shouldn't be hard, but apparently it is,
because their NTP daemons aren't running 1 time in 8.
"Don't get me started about time drift in Linux virtual machines,
VMware and Xen both. {sigh}"
Yes. And don't get ME started about a terrestrial node drifting out of
sync with one that's orbiting at velocities where relativistic effects
start to add up.
Scott makes good points, but the documentation for seq_trace carries
no cautionary notices about relying on the real-time timestamps it
reports. This seems an odd omission to me, since seq_trace would seem
to be especially useful in cases where real-time clocks are
unreliable. You could even dispense with real-time timestamps in
seq_trace, and what's left would still have a substantial raison
d'etre.
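A small simulation of the failure Scott describes, again in Python and again only a sketch: the skew and latency figures are invented, and TraceEvent is a hypothetical record, not the shape of a real seq_trace message. Node B's wall clock runs behind node A's by more than the message latency, so ordering by wall-clock timestamp puts the receive before the send, while ordering by the logical serial does not.

```python
from dataclasses import dataclass

# Hypothetical trace record: a local wall-clock reading plus a logical
# serial number that travelled with the message.
@dataclass(frozen=True)
class TraceEvent:
    kind: str
    node: str
    wall_us: int   # what the node's own clock said, in microseconds
    serial: int    # logical counter, incremented along the message path

SKEW_B_US = -5000    # assumed: node B's clock is 5 ms behind node A's
LATENCY_US = 1000    # assumed: the message takes 1 ms to arrive

true_send_us = 1_000_000
send = TraceEvent("send", "A", true_send_us, serial=1)
recv = TraceEvent("receive", "B", true_send_us + LATENCY_US + SKEW_B_US, serial=2)

by_wall = sorted([send, recv], key=lambda e: e.wall_us)
by_serial = sorted([send, recv], key=lambda e: e.serial)

print([e.kind for e in by_wall])     # order according to wall clocks
print([e.kind for e in by_serial])   # order according to logical serials
```

With these numbers the wall-clock ordering reports the message as received 4 ms before it was sent, which is exactly the kind of bogus ordering a cautionary note in the documentation could warn about.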
-michael turner
On Tue, Jun 5, 2012 at 5:19 PM, Ulf Wiger <ulf@REDACTED> wrote:
>
> On 5 Jun 2012, at 09:47, Michael Turner wrote:
>
>> All I know is: if that's what you're doing, that's
>> what you should call it.
>
> An alternative, as I expanded on in my last email, which I sent just
> before reading this one, is that perhaps they should be doing something
> else instead. Seq_trace is not well understood for the purpose it was
> intended for. It should perhaps be reworked entirely.
>
> If so, it does seem like a good idea to change seq_trace to 'lamport',
> make it clearly a generic implementation of Lamport clocks, for
> whatever purpose.
>
> This could be done today. As it affects the VM, it should be an EEP, I think.
> The initial implementation of 'lamport' could be completely based on
> seq_trace, but renaming functions and changing the documentation so
> that it clearly references relevant papers and illustrates how it could be
> used. It ought to be perfect for e.g. "model tracing", similar to what 'et'
> does (another API that is woefully under-used since the documentation
> turns people away). Code could be inserted as "executable comments"
> basically indicating "we are now in <this state> in the model". With such
> code in place, one could do quite sophisticated visualizations of a
> running system
>
> It doesn't seem like such a module ought to have a system_tracer()
> function. Rather, tracing on Lamport clock events should then be
> more intuitively integrated into the tracing BIFs (halfway there already).
>
> Actually, 'et' handles seq_trace events and processes them for use in
> the visualization. However, the documentation doesn't make the
> connection. The seq_trace events are included in the type signatures,
> but never mentioned elsewhere.
>
> This is interesting. It seems as if 'et' could rely entirely on seq_trace.
> Instead, it more or less mandates global tracing. Why?
>
>> And speaking of what to call things: I don't think you should still be
>> calling seq_trace "beta", if (as Ulf says) it originated ca 1997. I'd
>> do the interface differently, but the more important thing to me now
>> is stability and correctness.
>
> I agree this is a problem, like with parameterized modules. You shouldn't
> have beta or unsupported features lingering for years. Either make them
> supported or remove them and possibly provide something better.
>
> BR,
> Ulf W
>
> Ulf Wiger, Co-founder & Developer Advocate, Feuerlabs Inc.
> http://feuerlabs.com
>
>
>