Line numbers in stack traces
Richard Cameron
camster@REDACTED
Wed Aug 10 12:33:27 CEST 2005
On 10 Aug 2005, at 06:16, Richard A. O'Keefe wrote:
> You are confusing a *source position* (which makes sense even if,
> as in
> Interlisp-D and Smalltalk, the code is technically held in a data
> base)
> with *line numbers*.
OK. This is perhaps straying away from the point though. Whether I
get a line number, or some other way of working out where my program
died, that's got to be good news. I've just spent a good five minutes
tracking down a badmatch inside a gen_server handle_call/3 function.
There's a clause for each message the server handles, so saying the
error is in handle_call/3 is effectively sending me off to look for a
needle in a haystack - especially when the virtual machine probably
knows (or could know) more information about where it went wrong and
is only giving me cryptic clues so I can work it out for myself.
> Note that if a programming language is intended for use with tail
> recursion optimisation, as Prolog and Scheme and Erlang are, then
> a debugger *must* discard most dynamic line number information or
> else choke on its own stack. (Prolog debuggers commonly choose the
> choke-on-your-own-stack option.)
Fine. I admit that optimising away tail-recursive function calls
complicates matters somewhat, but I don't think it makes it
impossible. "Programs are data" as you pointed out before, and Lisp is
an example of a language with far more complex macro substitution
than Erlang. If you imagine the compiler repeatedly transforming the
S-expressions of your raw source code down into simpler and simpler
forms which eventually converge into something which can be
translated down into bytecode or machine instructions, then there
*is* an inherent concept of source position which can be inferred
from the address of the crash - you simply unravel the compilation
process back to find out which expression in the user's code caused
the problem. If you partially unravel the compilation process you get
to see intermediate results of macro substitution.
Whether this gives a line number directly is irrelevant. In your Lisp
environment where all code is pretty printed you can simply ask the
pretty printer which line it's chosen to print that expression on
today. Of course, if I can get a location of the error more detailed than
just a line number, I'd take that too.
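The "unravel the compilation process" idea can be sketched as follows. This is a minimal illustration in Python rather than anything an Erlang or Lisp compiler is known to do. It assumes each rewritten form keeps a pointer to the form it was derived from, and the names Node, rewrite, unravel and source_position are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """One form at some stage of compilation (hypothetical structure)."""
    form: str                              # printable form of the expression
    pos: Optional[Tuple[int, int]] = None  # (line, column), set only on user-written source
    origin: Optional["Node"] = None        # the form this one was rewritten from


def rewrite(node: Node, new_form: str) -> Node:
    """Record one transformation step, remembering where the result came from."""
    return Node(form=new_form, origin=node)


def unravel(node: Node) -> List[Node]:
    """Walk back from a compiled form to the user's original expression."""
    chain = [node]
    while node.origin is not None:
        node = node.origin
        chain.append(node)
    return chain


def source_position(node: Node) -> Optional[Tuple[int, int]]:
    """Position of the user-written expression that a compiled form descends from."""
    return unravel(node)[-1].pos


# A made-up macro being expanded in two steps.
src = Node("(unless ok (fail))", pos=(12, 4))
step1 = rewrite(src, "(if (not ok) (fail))")
step2 = rewrite(step1, "(cond ((not ok) (fail)))")
```

Stopping part-way along the chain returned by unravel gives the intermediate results of macro substitution, and the last element carries the position in the user's own code.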
> to line numbers. Basically, Smalltalk debuggers rely on finding
> 'call'
> instructions in the byte code and matching them up to message sends in
> the source. (In Smalltalk-80, Squeak, and Ambrai, stack traces do NOT
> include line numbers.)
So we're talking about an implementation issue here? Instead of, say,
bloating up your bytecode by adding a (filename, linenumber) tuple to
each instruction, can't we build up a mapping table (the one I talk
about above) at compile time which maps the bytecode address back up
into the original source code? From what I recall, this is how the
debug information is stored in a C executable.
Does this address your concerns about bloating the size of the
bytecode in a world where the VM can handle debugging information,
but the user has chosen to disable it? In that case, we'd just
eliminate the source<->bytecode mapping structure and the overhead of
"potentially" having debug information would be pretty much zero?
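A minimal sketch of such a side table, again in Python purely for illustration. It assumes the compiler emits one (bytecode offset, source position) entry each time the source position changes, and that a crash address is resolved by finding the last entry at or before it. DebugMap, SourcePos and the file name are hypothetical and do not describe any real VM's format.

```python
import bisect
from typing import List, NamedTuple, Optional


class SourcePos(NamedTuple):
    filename: str
    line: int


class DebugMap:
    """Optional side table from bytecode offsets to source positions."""

    def __init__(self) -> None:
        self._offsets: List[int] = []          # start offsets, kept sorted
        self._positions: List[SourcePos] = []  # position for each start offset

    def add(self, offset: int, pos: SourcePos) -> None:
        # The compiler emits code in increasing offset order, so appending keeps the table sorted.
        if self._offsets and offset <= self._offsets[-1]:
            raise ValueError("offsets must be strictly increasing")
        self._offsets.append(offset)
        self._positions.append(pos)

    def lookup(self, offset: int) -> Optional[SourcePos]:
        # Find the last entry whose start offset is <= the crash offset.
        i = bisect.bisect_right(self._offsets, offset) - 1
        return self._positions[i] if i >= 0 else None


def build_example_map() -> DebugMap:
    table = DebugMap()
    table.add(0, SourcePos("server.erl", 10))
    table.add(12, SourcePos("server.erl", 14))
    table.add(30, SourcePos("server.erl", 19))
    return table
```

The bytecode itself is unchanged whether or not the table is present. A build with debugging disabled simply ships an empty table, and lookup then returns None.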
> If the compiler recorded the source position of the first token
> of each clause in each function, then knowing which clause of which
> function you were in would get you very close in terms of line
> numbers.
Yes. I think that's what I'm suggesting. Actually, if you're going
for broke, why not record the position of each expression in the
function in our separate (optional) mapping table.
> Like well-written Smalltalk, well-written Erlang is supposed to have
> lots of *small* clauses, no?
handle_call/3 is probably the worst case I've seen at the moment.
Lots of little function clauses which make up one big function.
Unfortunately the stack trace doesn't tell you which *clause* it
failed in... only the function name and arity. That's decidedly
annoying.
> It's sufficiently difficult that people have earned PhDs for doing it.
> I think there are much better things to spend the time and money on.
Well... if the alternative is spending my time and money trying to
interpret somewhat vague stack traces, then I can see the
attraction of investing some of my time in sorting this out.
> In any case, if the particular line that is reported contains a
> macro application, then it is obvious that the problem might be
> with the macro definition. I don't see why that would be
> surprising or difficult to deal with.
>
> Because it isn't "obvious" *where* the problem is within the
> cascade of
> macro applications.
But this problem has been satisfactorily dealt with in Lisp, which
has far more powerful and confusing macro support than Erlang.
> There is one important difference between Erlang and Java:
> the level of support.
> There is so much money and manpower behind Java that Sun can afford
> to do things (and so much muscle behind C# that Sun cannot afford NOT
> to do things) that are not necessarily a good use of resources for
> enhancing Erlang.
Well, I really don't think that expecting the stack trace to tell you
precisely where the error occurred is something that Sun/Microsoft
implement in their VMs purely for beauty parade purposes so that IT
consultants can tick the appropriate boxes. It's highly
useful, and it's really winding me up that I can't see this
information at the moment.
My initial impressions of Erlang are that I'm far more efficient
writing "control" applications in it than I am in Python/Ruby/Tcl
etc, and that it seems to encourage me to think in a way where I
produce fewer bugs. I'm generally much more confident that, once the
code compiles, it'll work first time than I am in pretty much any
other language. However, the really horrible part comes when I really
do have to start interpreting stack traces. I feel like I'm playing a
guessing game, and I'm just wasting my time.
It's possible I'm just not yet experienced enough in Erlang to think
in ways which don't require me to know where the code has crashed, or
to perhaps just grumpily accept the fact that the runtime doesn't
tell me that. But, for now, I'd have to rank it as the most
significant disadvantage of the language.
> So in order to support line number information (should that prove
> to be
> usefully more precise than {module,function,arity,clause}) in a
> debugging
> mode, it is NOT necessary to have any support for line numbers in
> the VM.
> (As noted above, QP was able to provide source positions in
> debugging mode
> without having any support for it in the WAM.)
I think this is exactly what I'm talking about. I just can't see why
you don't want it.
> It is large, because the information required basically amounts to
> undoing
> the transformations. Inlining is just the beginning.
Doesn't this just result in a more complex source<->bytecode mapping
table?
> My earlier mention of TRO doesn't seem to have sunk in.
> Iterative code in Erlang relies on turning the dynamically last
> call in a function into a jump.
>
> When an error (badmatch, badarith, badarg, &c) is reported, the line
> number of the actual error report doesn't tell you very much. Quite
> often it's inside some system function. *Your* function call which
> contains the error has very often disappeared completely from the
> stack.
So your function call disappears from the reported stack trace too. I
don't see how that makes the availability of more detailed code
position information relatively less useful.
I can (just about) live with functions disappearing from the stack
like that... and I really don't want to complicate things by throwing
yet another idea in, but how about creating a ring-buffer in the
virtual machine to keep track of the top n items of the "logical"
stack. So, instead of simply producing a jump instruction at a tail-
recursive call, why don't we write the stack frame which would have
been generated if we didn't have this optimisation into the buffer?
That might go some way towards solving the other (but less annoying)
Erlang stack trace problem of it omitting function calls which have
been optimised away - you'd generate the top n levels of the stack
trace from the ring buffer, and then glue on whatever else is sitting
below on the real stack?
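The ring-buffer idea can be sketched like this, in Python for illustration only. It assumes a tail call overwrites the caller's frame in place and first copies that frame into a bounded buffer. Tagging each buffered frame with its stack depth is an addition of this sketch, not part of the proposal above, so that stale entries can be dropped on return. LogicalStack and the frame names are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


class LogicalStack:
    """Real call stack plus a ring buffer of frames elided by tail calls."""

    def __init__(self, capacity: int) -> None:
        self.real: List[str] = []
        # (depth, frame) pairs; the oldest entries fall off once capacity is reached.
        self.elided: Deque[Tuple[int, str]] = deque(maxlen=capacity)

    def call(self, frame: str) -> None:
        self.real.append(frame)

    def tail_call(self, frame: str) -> None:
        # The optimisation reuses the caller's slot; remember the caller before overwriting it.
        depth = len(self.real) - 1
        self.elided.append((depth, self.real[depth]))
        self.real[depth] = frame

    def ret(self) -> None:
        depth = len(self.real) - 1
        self.real.pop()
        # Frames elided at this depth have logically returned as well.
        while self.elided and self.elided[-1][0] >= depth:
            self.elided.pop()

    def trace(self) -> List[str]:
        """Innermost frame first, with remembered tail callers glued in at their depth."""
        out: List[str] = []
        elided = list(self.elided)
        for depth in range(len(self.real) - 1, -1, -1):
            out.append(self.real[depth])
            while elided and elided[-1][0] == depth:
                out.append(elided.pop()[1] + " (tail call elided)")
        return out


stack = LogicalStack(capacity=2)
stack.call("main/0")
stack.call("loop/1")
stack.tail_call("handle/2")
stack.tail_call("parse/1")
stack.tail_call("decode/1")
trace_at_crash = stack.trace()
```

With a capacity of 2, the oldest elided caller (loop/1) has already been overwritten, so the trace shows decode/1, then parse/1 and handle/2 from the buffer, then main/0 from the real stack. Memory stays bounded regardless of how long the tail-recursive loop runs.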
> The important question is not "is adding line numbers a good way to
> improve Erlang stack traces" but "what is the best use of Erlang
> development resources to help people get their Erlang programs right?"
>
> As I think I've said, I'd rather have QuickCheck.
Oh, I don't think so. Programs crash, and all I want to know is where
they've crashed. This information is (almost) in the virtual machine
already, and it really annoys me to have to play a guessing game
every time.
Richard.