[erlang-questions] EEP 40 - A proposal for Unicode variable and atom names in Erlang.

Thu Nov 1 22:15:10 CET 2012

On 2/11/2012, at 1:36 AM, Dmitry Belyaev wrote:

> I've looked through the proposal and don't understand why there is no proposal to add localized keywords.

Because that's actually an orthogonal concern.

Suppose for example that you want

	essayez     mapped to   try
	    ...                     ...
	attrapez    mapped to   catch
	    ...                     ...
	fin         mapped to   end

This has nothing to do with the character set.

The classic way to handle keywords in a tokeniser is FIRST to
recognise them as identifiers (using an automatically generated
or hand-coded deterministic finite state machine) and LATER
to look them up in a table (possibly using perfect hashing) to
see if they are keywords.

There is no point in allowing people to plug Serbian keywords
into a table if they will never be recognised as identifiers to
start with.  We have to get that part right first.
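
A minimal sketch of that two-step scheme, in Erlang.  Everything
named here (the module kw_sketch, classify/1,2, the French table) is
made up for illustration and is not how erl_scan is written.  The
identifier test is deliberately ASCII-only, to show why the table on
its own does not help.

```erlang
%% Sketch only: kw_sketch and everything in it is hypothetical.
%% Assumes the source file is UTF-8 (the default since OTP 17).
-module(kw_sketch).
-export([classify/1, classify/2]).

%% Localised spelling -> English keyword, as strings (lists of code points).
french() ->
    #{"essayez" => 'try', "attrapez" => 'catch', "fin" => 'end'}.

classify(Word) ->
    classify(Word, french()).

%% Step 1: is Word shaped like an (unquoted atom) identifier at all?
%% Step 2: only if so, look it up in the keyword table.
classify([First | Rest] = Word, Table) ->
    case is_start(First) andalso lists:all(fun is_continue/1, Rest) of
        true ->
            case maps:find(Word, Table) of
                {ok, Keyword} -> {keyword, Keyword};
                error         -> {identifier, Word}
            end;
        false ->
            not_an_identifier
    end;
classify([], _Table) ->
    not_an_identifier.

%% Deliberately ASCII-only, like a scanner without Unicode identifiers.
is_start(C) -> C >= $a andalso C =< $z.

is_continue(C) ->
    is_start(C)
        orelse (C >= $A andalso C =< $Z)
        orelse (C >= $0 andalso C =< $9)
        orelse C =:= $_ orelse C =:= $@.
```

With this, kw_sketch:classify("essayez") gives {keyword,'try'} and
kw_sketch:classify("compteur") gives {identifier,"compteur"}, but
kw_sketch:classify("покушај", #{"покушај" => 'try'}) gives
not_an_identifier even though the word is in the table: step 1
rejects it before step 2 is ever reached.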

I have three observations on the general idea.
(1) I have seen Pascal localised in exactly this way.
    That was French, which is why I used French in my example.
(2) When I mentioned EEP 40 to a colleague, his immediate
    reaction was precisely the same: that *obviously* people
    should be able to plug their own keywords in too.
(3) Ada and Python have not done this.

Suppose we added a new directive:
-keywords(kw_set_id).
which looked in some path for a file containing
[{'essayez','try'},{'attrapez','catch'},{'fin','end'},...].
and used that to update a dictionary.
Then the lexical analyser could report the English keywords
to the parser.  We might want two lists: one for keywords
and one for directives (other than -encoding and -keywords).
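
As a rough sketch of what handling such a directive might involve
(this is not a design, just an illustration): the module name
kw_file_sketch, the ".kw" file suffix, and the idea of passing the
search path in as a list of directories are all assumptions.  The
only library calls used are file:path_consult/2 and the maps module.

```erlang
%% Sketch only: module name, ".kw" suffix and search path are assumptions.
-module(kw_file_sketch).
-export([load/3, translate/2]).

%% Find <KwSetId>.kw somewhere in Path (a list of directories), read the
%% single term in it (a list of {Localised, English} atom pairs), and
%% merge those pairs into the dictionary Dict (a map).
load(KwSetId, Path, Dict) when is_atom(KwSetId), is_map(Dict) ->
    FileName = atom_to_list(KwSetId) ++ ".kw",
    case file:path_consult(Path, FileName) of
        {ok, [Pairs], _FullName} when is_list(Pairs) ->
            %% maps:from_list/1 raises badarg if Pairs is malformed.
            {ok, maps:merge(Dict, maps:from_list(Pairs))};
        {ok, _Other, _FullName} ->
            {error, bad_keyword_file};
        {error, Reason} ->
            {error, Reason}
    end.

%% What the lexical analyser would report to the parser: the English
%% keyword if Atom is a localised keyword, otherwise Atom unchanged.
translate(Atom, Dict) when is_atom(Atom) ->
    maps:get(Atom, Dict, Atom).
```

Given a file kw_fr.kw in the current directory holding just the three
pairs shown above, {ok, D} = kw_file_sketch:load(kw_fr, ["."], #{})
followed by kw_file_sketch:translate(essayez, D) yields 'try', while
an ordinary atom such as foo comes back unchanged.  A second
dictionary for directives could be loaded the same way.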

This is NOT an EEP; it is not a draft of an EEP; and I have
no intention of producing an EEP on this topic at this time.
Someone else can write that one.

> Suppose I will be using atoms and variables that are easy to read in my own language. Then I'll definitely be frustrated if I have to write keywords in any other language. More than that, it will be very annoying to anyone who has to switch keyboard layout from English to native.

One of the reasons that I have no intention of writing an EEP about this
is that flicking between two keyboards is for me a single keystroke.
(On the iPad: tap the globe.  On the desktop Mac: Command-Space.)
Switching keyboard layouts is about as hard as switching from lower to
upper case and back.  It should also be possible to configure your
text editor, perhaps using abbreviation support, to turn
"@es" (or the equivalent in your language) into "try" and so on.

Until you've written your own wrappers around the library components
you use, you'll need to flick back into Latin script to call those
anyway.  Such wrappers _can_ be written, so the need to use some
Latin script in everyday work may not continue forever, but it
does mean there has to be a transition period in which people using
non-Latin keyboards have to learn to use Cmd-Space.