[erlang-questions] Rhetorical structure of code: Anyone interested in collaborating?

Thu Apr 28 17:02:12 CEST 2016

On 04/28, Richard A. O'Keefe wrote:
>I've been thinking for some time of writing a paper with the
>title "Why can't I see the structure?" based on the idea that
>modules in every programming language I know look like blobs.
>I'm aware of visual notations like UML, BON, SDL, and what
>was it, Visual Erlang?  But for me, those are just spaghetti
>with meatballs; once you get beyond just a handful of boxes
>in your diagram, all diagrams look much the same.  In any
>case, I'm interested in the medium scale.
>
>Why can't I see the structure in a 3000-line module, or even
>a 1000-line module?  (I am not asserting that Erlang is
>particularly bad here.  It isn't.)
>
>The kind of structure I'm interested in can, I think, be
>described as *rhetorical* structure, like relationships
>between paragraphs.
>

For me, the conceptual leap seems to be related to the same kind of 
challenges that exist when trying to explain technical material.

The dependencies needed to understand a piece of content form a graph, 
where any one item may require understanding N other items at once 
before it makes sense.

However, we generally display, explain, and absorb material linearly: 
reading a text a paragraph at a time, giving lessons on concepts one by 
one, and so on.
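To make the graph-versus-line tension concrete, here is a minimal sketch 
(in Python rather than Erlang, purely for brevity) that flattens a 
dependency graph into one possible reading order. The concept names are 
hypothetical and only there for illustration; the point is that many 
valid linear orders exist and each one hides the graph it came from.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical concepts; each one maps to the concepts it depends on.
prerequisites = {
    "supervision trees": {"processes", "links"},
    "links": {"processes"},
    "gen_server": {"processes", "message passing"},
    "message passing": {"processes"},
    "processes": set(),
}


def reading_order(deps):
    """Return one linear order in which every item follows its prerequisites."""
    return list(TopologicalSorter(deps).static_order())


order = reading_order(prerequisites)
print(order)  # one of several valid orders
```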

In the case of code, every function or method or dependency I hit is a 
fractal of new information to explore. In every single module, I can at 
most display code sequentially, with a few variations:

- bottom-up
- top-to-bottom
- divisions into API or Public / Protected / Private
- divisions into proper module hierarchies (where a 'util' module 
  eventually develops and eats up all the organisation that once was)

There's ctags and its equivalents for jumping around code with ease, 
and then experiments like Code Bubbles[1], which attempted to break up 
modules and classes into independent units that could be displayed on 
their own. There are probably many more such tools that you know better 
than I do.

Mostly I've been sticking to vim for the last X years of my career, and 
the only sensible way I have found to traverse code has been to pick 
either a level-order, depth-first, or bottom-up traversal and stick with 
it.
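As a rough sketch of what those three reading orders look like, here 
they are applied to a small, made-up call graph (again in Python for 
brevity; the function names in the graph are hypothetical):

```python
from collections import deque

# Hypothetical call graph: each function maps to the functions it calls.
calls = {
    "start_link": ["init"],
    "init": ["load_config", "open_socket"],
    "load_config": ["parse"],
    "open_socket": [],
    "parse": [],
}


def level_order(graph, root):
    """Breadth-first: read everything at one call depth before going deeper."""
    seen, order, queue = {root}, [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for callee in graph.get(node, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return order


def depth_first(graph, root, seen=None):
    """Pre-order: follow each call all the way down before moving on."""
    seen = set() if seen is None else seen
    if root in seen:
        return []
    seen.add(root)
    order = [root]
    for callee in graph.get(root, []):
        order += depth_first(graph, callee, seen)
    return order


def bottom_up(graph, root, seen=None):
    """Post-order: read callees before the functions that call them."""
    seen = set() if seen is None else seen
    if root in seen:
        return []
    seen.add(root)
    order = []
    for callee in graph.get(root, []):
        order += bottom_up(graph, callee, seen)
    order.append(root)
    return order


print(level_order(calls, "start_link"))
print(depth_first(calls, "start_link"))
print(bottom_up(calls, "start_link"))
```

All three visit the same functions; they differ only in which context 
you have already seen by the time you reach a given one.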

In the worst cases, I've had to just whip out a piece of paper and draw 
bubbles and arrows of everything with at least a second dimension (I can 
use both vertical and horizontal space to convey things!)

Erlang has made a few things very interesting by forcing an additional 
structure through supervision trees and well-defined behaviours, which 
lets me scan a project's structure at a glance and infer what it does 
without actually needing to understand the code.

But when I get to the code, there's the well-defined API, then the 
behaviour callbacks (which tie the API to the private functions that do 
the actual modifications), and then the rest of the modules, which get 
used there as libraries.

>My *belief* is that if this structure were made explicit,
>perhaps by textual structure, perhaps by annotations, perhaps
>by some graphical form (but probably derived from annotations),
>it would be easier to understand medium-sized wodges of code.
>
>I'm aware of annotation support in languages like Java and C#
>and for that matter, Smalltalk, but with the exception of
>Smalltalk, nobody seems to be using annotations in this way
>(and that exception is me).
>
>I'd be very interested in hearing from anyone else who has been
>thinking in this area.
>

I've been trying to think about it for some time, as it also has an 
impact when writing documentation and tutorials: how much context is 
needed for this to make sense? Can the context be derived?

Another one for me has been trying to figure out how to enforce 
system-wide constraints within local pieces of code: this ID is assumed 
to be opaque and not something you can sort on, even if for now it is an 
integer. Breaking such an assumption can ruin systems over time, but 
there's no good way to communicate it outside of assiduous inline 
comment discipline.
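One partial approach is to wrap the value so that the constraint lives 
in the type rather than in a comment. The sketch below is in Python for 
brevity, and the OpaqueId name is hypothetical; Erlang's own -opaque 
type declarations (checked by Dialyzer) aim at a similar goal at 
analysis time. It only guards code that goes through the wrapper, so it 
is an illustration of the idea and not a full answer.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)  # eq=True and order=False are the defaults
class OpaqueId:
    """An identifier that can be compared for equality and hashed, not ordered."""

    # The integer is an implementation detail; it is kept out of the repr
    # so callers are not tempted to depend on it.
    _raw: int = field(repr=False)


a, b = OpaqueId(7), OpaqueId(3)

print(a == OpaqueId(7))  # equality is allowed
print(len({a, b, OpaqueId(7)}))  # hashing is allowed, so sets and dict keys work

try:
    sorted([a, b])  # ordering is not defined on the wrapper
    sortable = True
except TypeError:
    sortable = False
print(sortable)
```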

One other interesting thing is that, from past informal polls[2], the 
more experienced a developer is, the less they tend to want comments, 
while newer developers think the opposite.

[1]: http://cs.brown.edu/~spr/codebubbles/
[2]: http://ferd.ca/poll-results-erlang-maintenance.html