Line numbers in stack traces

Wed Aug 10 04:47:47 CEST 2005

Richard A. O'Keefe wrote:
> Informed that
>     The [Erlang] virtual machine has no information about line numbers.
>
> David Hopwood <david.nospam.hopwood@REDACTED> asked
>
>     Isn't that a design limitation of the virtual machine code format?
>     I think language VMs should be designed to allow this at least in a
>     debugging mode.
>
> There are a number of assumptions hidden here:
> (1) all VM code comes from source code
> (2) all source code is hand-written

I wasn't assuming either of these. I said that the VM format should be
*designed to allow this*. That is, where source line numbers exist (not
necessarily in Erlang source; the concept of line numbers applies to any
textual source language), the VM format should provide a way of encoding
them, at least in a debugging mode. And if the source is machine-generated,
line numbers are still useful to the writer of the generator program.
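As a rough illustration of what "a way of encoding them" could look like,
here is a minimal Python sketch of a per-function line table, in the spirit
of the JVM's LineNumberTable attribute: a sorted list of (start offset,
source line) pairs, looked up by binary search. The names LineTable and
line_for and the sample offsets are invented for the example; this is not
the BEAM format or any existing Erlang facility.

```python
from bisect import bisect_right

class LineTable:
    """Maps code offsets to source lines (hypothetical layout)."""

    def __init__(self, entries):
        # entries: (start_offset, source_line) pairs; each entry covers
        # the offsets from its start up to the next entry's start.
        self.entries = sorted(entries)
        self.starts = [start for start, _ in self.entries]

    def line_for(self, offset):
        # Find the last entry whose start offset is <= offset.
        i = bisect_right(self.starts, offset) - 1
        if i < 0:
            return None  # offset precedes every entry: no line known
        return self.entries[i][1]

# Made-up table for one function: offsets 0-7 come from line 12, and so on.
table = LineTable([(0, 12), (8, 13), (20, 15)])
print(table.line_for(9))  # offset 9 falls in the entry that starts at 8
```

A stack trace printer would call line_for with the offset at which the
exception was raised; when no table is present (a non-debugging build), it
would fall back to the current module/function output.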

> (3) therefore all VM code *has* line numbers
>     which it is simply incompetent of the VM not to retain.

I didn't say that it was an incompetent decision, either. It is a
design limitation, and one that many other VM designs (e.g. typical
Java and Smalltalk VMs) do not share.

> (4) the relationship between source lines and VM code is easy to maintain.

It's not trivial, but it's not rocket science either.

> Since Erlang/OTP provides several ways to generate and manipulate source
> code, (1) and (2) are false.  Even in C, macros place serious limits on
> the utility of line numbers:  a single "function call" in a line may
> expand to hundreds of tokens which cannot be stepped into,

True, but there is only so much code on a line, even if it expands to
"hundreds of tokens". The size of the expansion isn't actually very important;
what matters is whether a line number identifies the point at which the
exception occurred in a way that is usefully more precise than the current
stack traces. Stepping into the code isn't what we were talking about; line
numbers in stack traces do not replace a debugger.

> and the line where the error *is* may be in a completely different file
> from the reported line.  Erlang macros are regrettably similar to C macros.

This is true, but it doesn't seriously interfere with the utility of line
numbers, especially for programmers who make little or no use of macros.
In any case, if the particular line that is reported contains a macro
application, then it is obvious that the problem might be with the macro
definition. I don't see why that would be surprising or difficult to deal
with.

> Then we mustn't forget things like YECC and code generators like ASN.1
> (see 'asn1') and CORBA interfaces (see 'ic').

JVM classfiles have a way of specifying line numbers in multiple levels
of source language, e.g. both the input and the output of a code generator.
<http://java.sun.com/j2se/1.4.2/docs/guide/jpda/enhancements.html#debugotherlanguages>
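The idea of reporting more than one level of source can be sketched as
follows. This is a minimal Python illustration only: it is loosely modelled
on the idea of mapping generator output lines back to generator input lines,
not on the actual classfile encoding, and the function names, file names and
line ranges are all made up.

```python
def original_position(generated_line, ranges):
    # ranges: (first_generated_line, last_generated_line, file, line)
    # tuples, each saying which input line produced a run of output lines.
    for first, last, src_file, src_line in ranges:
        if first <= generated_line <= last:
            return (src_file, src_line)
    return None  # this part of the output has no recorded origin

def describe(generated_file, generated_line, ranges):
    # Report the generated position and, when known, the original one.
    pos = original_position(generated_line, ranges)
    if pos is None:
        return f"{generated_file}:{generated_line}"
    return f"{generated_file}:{generated_line} (from {pos[0]}:{pos[1]})"

# Hypothetical data: a parser generator expanded two grammar rules.
ranges = [
    (40, 57, "grammar.yrl", 12),
    (58, 90, "grammar.yrl", 19),
]
print(describe("grammar.erl", 60, ranges))
```

With both positions in the trace, the user of the generator sees the line in
their own input, and the writer of the generator sees the line in its output.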

> While this is not yet a major consideration for Erlang, consider the
> fact that compilers for functional programming languages tend to do
> *major* restructuring, starting with in-line expansion, simplification,
> reordering, and progressing to things like deforestation (which is
> certainly applicable to Erlang).  The practical consequence of this is
> that a few bytes of code (BEAM or other VM or HiPE-generated native
> code) may contain bits of many different functions, so that line number
> information would overwhelm actual useful code.

That's one reason why you might want to support this only in a debugging
mode. I am skeptical, though, that the size of this information would be
larger than the actual code except in contrived pathological cases.
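One way to check this for a given module is to measure it. The following is
a minimal Python sketch, assuming a delta-encoded table in which each entry
stores the change in code offset and the change in line number as
variable-length integers. The encoding and the sample entries are
assumptions made for the example, not a description of any real VM format.

```python
def varint_size(n):
    # Bytes needed to store a non-negative integer, 7 bits per byte.
    size = 1
    while n >= 0x80:
        n >>= 7
        size += 1
    return size

def zigzag(n):
    # Map signed to unsigned so that small negative deltas stay small.
    return 2 * n if n >= 0 else -2 * n - 1

def table_size(entries):
    # entries: (code_offset, source_line) pairs sorted by code_offset.
    total = 0
    prev_off, prev_line = 0, 0
    for off, line in entries:
        total += varint_size(off - prev_off)
        total += varint_size(zigzag(line - prev_line))
        prev_off, prev_line = off, line
    return total

# Made-up entries for a function whose code is about 80 bytes long.
entries = [(0, 12), (8, 13), (20, 15), (64, 14)]
print(table_size(entries))
```

Under these assumptions an entry costs about two bytes, so the table would
outgrow the code only if the source line changed roughly every two bytes of
code throughout the module.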

-- 
David Hopwood <david.nospam.hopwood@REDACTED>