[erlang-questions] idea: function meta data

Fri Nov 16 17:27:03 CET 2007

IMHO, there seem to be two different things being described, and  
hence they may have different requirements and solutions.

When I think of introspection, I think of mechanisms which enable  
code to manipulate other code at run time.

I want introspection to assist with developing tooling. For example  
test tools that will automatically find a bunch of functions and test  
them, or a tool that will autogenerate web service interfaces to a  
bunch of functions, etc. I'd like these tools to work even when  
source code is unavailable.

I would expect to have functions to get a list of function names +  
type signatures from a module at run time.
I would expect to be able to call any of the functions identified  
using that list. I would expect to be able to construct a (type)  
valid parameter list for any of the functions from that same function  
meta data. I do not expect the source code to be available.
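
As a rough sketch of the part of this that can already be done from a
compiled module alone: module_info(exports) returns the exported function
names and arities, and apply/3 can call them. The module name
introspect_sketch and both helper functions are made up for illustration.
Note that this yields names and arities only, not type signatures, which
is the gap I am describing.

```erlang
-module(introspect_sketch).
-export([exported/1, call_all_nullary/1]).

%% Return the {Name, Arity} pairs exported by Mod, leaving out the
%% compiler-generated module_info/0 and module_info/1.
%% Needs only the compiled module, not its source.
exported(Mod) ->
    [FA || {F, _} = FA <- Mod:module_info(exports), F =/= module_info].

%% Call every exported zero-arity function of Mod and pair each
%% function name with its result. Functions that take arguments are
%% skipped, because nothing here says what a valid argument would be.
call_all_nullary(Mod) ->
    [{F, apply(Mod, F, [])} || {F, 0} <- exported(Mod)].
```

For example, introspect_sketch:exported(lists) includes {reverse,1}.
Building a valid argument list for anything beyond arity 0 is exactly
where the missing type information would be needed.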

In order for these types of tools to be reasonably robust, the  
information they use needs to conform to a very well specified  
syntax, or type.
I would like those guarantees to be provided by the compiler, as much  
as practical. I would like the information to *always* be parsed and  
checked, and the compilation to fail if they do not conform to  
specification.
I would like the information embedded into the compiled file  
(e.g. .beam), so that I can be confident that it is in sync with the  
source at the time of compilation, and to reduce the hassles of  
packaging and distributing code.

I am willing to have a small number of conventions layered on top,  
but only a few, please. I would like the information in code modules  
to be well defined and stable for the next 5+ years.

On the other hand, adding stuff for human beings, like helpful  
comments, seems to me to be different.

I would like human-readable information to be spell checked, but I  
wouldn't want compilation (by default) to fail if there were a  
spelling mistake, or a word that is simply not in the dictionary,  
e.g. autogenerate. I expect human-readable content to have the same  
quality as documentation.

I also don't feel that I need human-readable content in the compiled  
code because I am happy having it in the documentation! I think that  
it makes sense for the documentation to follow a different process to  
the compiled code, especially if there are multiple Human languages  
supported. I can see there is a lot of value in having some helpful  
comment conventions, but I don't see that these need to be as  
strictly managed as language syntax because they are intended to be  
consumed by humans, and not programs.

Put another way, I am happy for documentation to be extracted from  
source code and injected into the documentation process, but I don't  
feel the documentation process *needs* to depend on compiled code.  
Nor do I expect all documentation to come from source code files.

So, IMHO, there seem to be two different sets of requirements, and  
they don't need to have a single solution.

In my mind, machine-readable information being extracted from parsed  
source and being embedded in the code is tidy and logical.  
Human-readable documentation being extracted from comments is fine, but  
embedding it in compiled code seems to be optional. I don't feel it  
should be necessary to issue a new version of code modules just  
because there is documentation available in a new Human language.

I think that the function name (and type signature) tie the  
machine-readable and Human-readable information together, and ideally  
should not be repeated, and should be parsed at the highest level of  
quality, i.e. by the compiler.

So, putting issues of how to implement aside for a second, I would  
like the function type-signatures to become part of the source code  
(I'm okay using pre-processor syntax to delimit it), hence the type  
signature must follow Erlang syntax (or an extended Erlang syntax, I  
suppose); the type signature should not be hidden in a text string or  
comment. The type signature should be embedded in the compiled code.
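
To illustrate the kind of thing I mean, here is a sketch written in the
-spec attribute notation that current Erlang/OTP releases accept. The
module and function names are hypothetical, and this is only one possible
shape for such a signature, not a proposal for the exact syntax.

```erlang
-module(temperature).
-export([to_fahrenheit/1]).

%% The signature is a module attribute written in Erlang syntax, so the
%% compiler parses it instead of skipping it as a comment or a string.
-spec to_fahrenheit(number()) -> float().
to_fahrenheit(Celsius) ->
    %% The / operator always yields a float, so the result is a float
    %% even when Celsius is an integer.
    Celsius * 9 / 5 + 32.
```

The function name appears once in the signature and once in the
definition, with nothing about it repeated in a comment or string.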

I am happy if human-readable stuff stays in comments with some  
helpful conventions, and it may or may not end up in the compiled  
code, and I would like to choose.

Just my $0.02 worth
G Bulmer