[erlang-questions] [proposal] Declarative syntax for metadata (long!)

Thu Mar 18 14:06:48 CET 2010

I agree, this is one of the things that is missing from Erlang.

I think the problem is one of introspection.

Basically every "thing" has three representations:

    (1) The external print representation of the thing
        (ie the thing you type into a module, or in the shell)
    (2) The parse tree representing (1)
    (3) The internal compiled form of the thing

It should be possible to convert at run-time between any of these
three representations.
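
For plain expressions, part of this round trip can already be sketched
with the stdlib modules erl_scan, erl_parse, erl_pp and erl_eval. The
module name repr_demo and its function names are made up for
illustration. Note that erl_eval interprets the parse tree, so (3) here
is an evaluated value rather than truly compiled code:

```erlang
-module(repr_demo).
-export([text_to_abs/1, abs_to_text/1, abs_to_value/1]).

%% (1) -> (2): turn the printed form of an expression sequence into
%% its parse tree. The text must end with a full stop, e.g. "1 + 2.".
text_to_abs(Text) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Text),
    {ok, Exprs} = erl_parse:parse_exprs(Tokens),
    Exprs.

%% (2) -> (1): pretty-print a parse tree back into a flat string.
abs_to_text(Exprs) ->
    lists:flatten(erl_pp:exprs(Exprs)).

%% (2) -> (3), approximately: evaluate the parse tree with the
%% interpreter to obtain a run-time value (for example a fun).
abs_to_value(Exprs) ->
    {value, Value, _Bindings} = erl_eval:exprs(Exprs, erl_eval:new_bindings()),
    Value.
```

With this, repr_demo:abs_to_value(repr_demo:text_to_abs("fun(X) -> X + 1 end."))
gives back a callable fun. What is missing is the reverse direction,
from (3) back to (2) or (1).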

We should be able to introspect a module at run-time and recover a
list of funs, we should be able to take a fun and recover its syntax
tree or an ascii representation of the fun. The same should be true
for types, specs, edoc strings and so on.

All things should be invertible.

Simple things like atoms, floats, etc. can be converted to and from lists.
But funs and modules cannot.

We need modules, funs, attributes, types, specs, edoc strings, etc. all to
be introspectable, and we need to be able to convert them to and from lists
at run time.
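
To make the gap concrete, here is a small sketch of what the existing
BIFs do offer. The module name introspect_demo and its function names
are made up; the calls themselves (atom_to_list/1, module_info/1,
erlang:fun_info/2 and so on) are standard:

```erlang
-module(introspect_demo).
-export([roundtrip/0, exports/1, describe/1]).

%% Simple terms are invertible: converting to a list and back
%% returns the original term.
roundtrip() ->
    hello = list_to_atom(atom_to_list(hello)),
    42 = list_to_integer(integer_to_list(42)),
    3.14 = list_to_float(float_to_list(3.14)),
    ok.

%% For a module we can recover the export list, but not the source.
exports(Mod) ->
    Mod:module_info(exports).

%% For a fun we can recover where it comes from, but not its clauses.
describe(Fun) when is_function(Fun) ->
    {module, M} = erlang:fun_info(Fun, module),
    {name, N} = erlang:fun_info(Fun, name),
    {arity, A} = erlang:fun_info(Fun, arity),
    {M, N, A}.
```

So introspect_demo:describe(fun lists:member/2) tells us the fun is
lists:member/2, but nothing here hands back its body.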

We should be able to say:

    mod2list(lists) and get back a list of forms

    (say)

    [{name,lists},
     {export,member,2},
     ...
     {function,member,2,F1},
     ...
    ]

Then we should be able to rip apart the funs

    fun2list(F1)

which might return

    "fun(H, [H|_]) -> true;..."

or

    fun2abs(F1)

which might return

    {function, member, 2,
     [{clause,...}]}
...

We would need to sort out the external printed representation of
a reference to a fun and so on ...
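
Something close to mod2list can be approximated today, but only by going
back to the .beam file on disk, and only if the module was compiled with
debug_info. The following is a rough sketch under those assumptions, not
the run-time facility proposed above. The module name mod2list_sketch
and the functions mod2abs/1 and mod2list/1 are hypothetical; beam_lib,
code:which/1 and erl_pp:form/1 are standard:

```erlang
-module(mod2list_sketch).
-export([mod2abs/1, mod2list/1]).

%% Hypothetical helper: fetch the abstract forms of Mod from its
%% .beam file. Fails if the file cannot be located or if it was
%% compiled without debug_info.
mod2abs(Mod) ->
    case code:which(Mod) of
        Path when is_list(Path) ->
            case beam_lib:chunks(Path, [abstract_code]) of
                {ok, {Mod, [{abstract_code, {raw_abstract_v1, Forms}}]}} ->
                    {ok, Forms};
                {ok, {Mod, [{abstract_code, no_abstract_code}]}} ->
                    {error, no_debug_info};
                {error, beam_lib, Reason} ->
                    {error, Reason}
            end;
        Other ->
            %% non_existing, preloaded or cover_compiled
            {error, {no_beam_file, Other}}
    end.

%% Hypothetical helper: pretty-print the forms back into source text.
%% The end-of-file marker is skipped since it carries no source.
mod2list(Mod) ->
    case mod2abs(Mod) of
        {ok, Forms} ->
            Text = [erl_pp:form(F) || F <- Forms, element(1, F) =/= eof],
            {ok, lists:flatten(Text)};
        Error ->
            Error
    end.
```

This reads the file rather than the loaded code, so it says nothing
about a fun value you are holding at run time, which is exactly the
part that still needs sorting out.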

On Thu, Mar 18, 2010 at 12:34 PM, Vlad Dumitrescu <vladdu55@REDACTED> wrote:
> Hi all,
>
> I feel I should follow up on my rant from last night (I hope the tone
> wasn't too harsh!). I have been thinking about these things for a long
> while, but I'm sure I still miss some points.
>
>
> Besides actual code, Erlang source code also contains meta-data and
> configuration data, disguised as Erlang terms, comments or strings.
> Some of this data has been integrated into the language (type
> specifications, for example), but not all of it. I would like to
> present arguments for and against integrating all such data.
>
> First a non-exhaustive list of data that IMHO would benefit from
> becoming a first-class citizen of the language:
> - behaviour callback specifications
> - supervisor child specifications
> - match specifications
> - edoc
>
> Today, there are three ways to encode this data: as Erlang terms, as
> strings and as comments. (I know strings are terms, but they are
> different)
>
> Comments are used to store structured documentation info (edoc) and
> are the weakest case for "uplift" to citizenship. The argument for it
> is that there is a mini-language involved anyway, so almost everything
> is already in place. I think that there is a very simple way to go all
> the way: introduce block comments. The main problem for me is those
> pesky "%%" when reformatting the documentation. It would be nice if
> there were special edoc comments, so that it's easy for a tool to tell
> them apart from regular comments.
>
> Erlang terms are a very flexible representation that works fine at the
> lowest level, but I argue that programmers shouldn't be forced to
> think at that level. Using Erlang terms is in fact forcing the
> programmer to do the work of a parser and convert a high-level
> declaration into a bunch of terms with complex structure (thus easy to
> get wrong). If integrated in the language, the parser and compiler
> would be able to detect errors and inconsistencies that otherwise
> would result in run-time bugs.
>
> Strings are a special case because they can contain the high-level
> declarations I mentioned above and thus are easier to reason about,
> but they are still not properly parsed and any error will reveal
> itself at run-time.
>
> Some of the data encoded as terms is already partially integrated:
> some match specifications can be written using the fun_ms parse
> transform. What I argue for here is going all the way and providing this
> kind of support for all of them and for the other data mentioned here,
> hopefully without having to clutter all files with parse_transform
> declarations.
>
> A closely related issue is that some of this data is declarative but
> returned by magic functions. I can't see any use case (except for
> match specifications) where these specifications need to be generated
> dynamically, so the data could just as well be provided by an
> attribute. The compiler can generate the magic functions if needed, or
> (better IMHO) we could provide a better API to module attributes. If
> using attributes, the advantage is that we can use specific
> mini-languages that fit the domain better, because we're not limited
> to Erlang expressions.
>
> <exploratory_mode on>
> By emphasizing declarative features in the language we can start
> considering other things that can be handled in a similar way, thereby
> moving to a higher abstraction level. As Mikael also mentioned,
> contract declarations (UBF or other kinds) come to mind easily. We can
> extend supervisor child specs to a description of all processes in an
> application or even the whole system.
>
> Of course, this can already be done today. The problem is that without
> a dedicated language, the declarations end up very difficult to read,
> to reason about and to debug. In some cases they can even be more
> verbose than the Erlang code that would achieve the same result.
> Working on a higher level gives better understanding.
>
> Now, going even further into the future: if this will turn out to work
> as well as I hope it will, making everybody twice as productive and
> twice as happy, it might happen that more and more applications and
> tools will see benefits from going the same way, but for non-OTP
> applications it won't be possible to integrate "their" declaration
> mini-language. I see two ways to handle this, both being definitely
> not something one could throw together over a weekend:
>  - allow application-defined parsers to be called on parts of the
> source code. This could provide even cooler functionality, but I'm not
> going to dig into that right now :)
>  - define a single declaration language that can be extended with
> user code and with extensible syntax. By that I mean something in the
> spirit of Ruby, where it is extremely easy to write domain-specific
> languages that are at the same time correct Ruby programs.
> </>
>
> == Conclusion ==
>
> The part in the beginning is something that I think is useful,
> relatively easy to specify and implement and without too many
> compatibility issues. Is there anybody sharing this opinion? Is it EEP
> material? Is even the exploratory_mode part worth detailing right now?
> (so that when 5 or 10 years from now someone wants to implement it,
> he/she doesn't discover it requires rewriting everything).
>
> Another way to put the question is: what parts (if any) should become
> part of the Erlang language, and what parts (if any) fit better in an
> Erlang-based language?
>
> I feel that it is important to sometimes raise one's eyes and try to
> get a glimpse of what the future holds and even to try to shape it.
>
> best regards,
> Vlad