[erlang-questions] Erlang term to ASCII

Richard A. O'Keefe ok@REDACTED
Tue Feb 11 00:40:21 CET 2014


It seems to me that the basic issue is that
 - there is some subset of Erlang data
   ? Why do I say "subset"?
   ! Because there are things like funs and pids that do not
     have textual representations that read back
 - which the external system stores
 - and can convert to a string Erlang can read
 - and can hash (by first converting to a string)
 - but Erlang's built-in formatting functions do not produce
   the same string for that data as the external system does
 - so the two hashes differ.
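
To make the problem concrete, here is a small sketch in Python (standing
in for the external system; that choice of language is my assumption).
The two strings are both valid Erlang text for the same term: the first
is the kind of thing ~p prints, the second is what ~w prints.

```python
import hashlib

# Two textual renderings of the same Erlang term.
# Erlang reads both back as the identical tuple {foo,[97,98,99]}.
as_string_syntax = '{foo,"abc"}'
as_integer_list = "{foo,[97,98,99]}"


def md5_hex(text: str) -> str:
    """MD5 of the text's ASCII bytes, as lowercase hex."""
    return hashlib.md5(text.encode("ascii")).hexdigest()


digest_a = md5_hex(as_string_syntax)
digest_b = md5_hex(as_integer_list)

# Same data, different strings, therefore different hashes.
print(digest_a == digest_b)  # False
```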

1. Compute the hash directly from the data, not the string.
   Since MD5 is mentioned, this may not be practical.

2. Implement the same formatting algorithm in the CMS and Erlang.
   For example: quote all atoms, never generate "..." strings
   (write lists of integers instead), ..., add no white space.
   Beware also of issues in formatting floating
   point numbers.  Implementing your own algorithm also has the
   nice benefit of documenting precisely what subset of Erlang
   data is allowed.  Will maps be allowed?  What will you do to
   get a _canonical_ form for maps?

3. Use some other exchange format.
   Beware that the same issue can come up again.
   Two JSON writers might add different amounts or kinds of white
   space to the same term, or might sort slots of an object differently,
   or might display numbers differently (1000 vs 1.0e3, ...).
   Two XML writers might do different things, which is why
   Canonical XML exists.
   Whatever you choose, if there is any implementation freedom in
   the interface specification, you may need to make your own
   writer in Erlang or in the CMS or both.

4. Of course, Erlang being open source, you could always find the
   code that does ~p, clone it, and edit it.  (Which is one way to
   do alternative 2, but probably harder.)
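
A minimal sketch of what alternative 2 might look like on the CMS side,
assuming for illustration that the CMS is written in Python.  The Atom
class and the function names are hypothetical, and the allowed subset
(integers, atoms with printable ASCII names, tuples, proper lists,
strings) is just one possible choice.  Floats and everything else are
rejected rather than guessed at.  The Erlang side would need a writer
that follows exactly the same rules.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """Hypothetical stand-in for an Erlang atom on the CMS side."""
    name: str


def canonical(term) -> str:
    """Render a term in one fixed textual form, or raise TypeError."""
    if isinstance(term, bool):
        # bool is a subclass of int in Python; refuse rather than guess.
        raise TypeError("pass booleans as Atom('true') or Atom('false')")
    if isinstance(term, int):
        return str(term)
    if isinstance(term, Atom):
        if not all(32 <= ord(c) < 127 for c in term.name):
            raise TypeError("atom name must be printable ASCII")
        # Always quote; escape backslash first, then the quote character.
        body = term.name.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + body + "'"
    if isinstance(term, tuple):
        return "{" + ",".join(canonical(t) for t in term) + "}"
    if isinstance(term, list):
        return "[" + ",".join(canonical(t) for t in term) + "]"
    if isinstance(term, str):
        # Never emit "...": a string is written as a list of code points.
        return "[" + ",".join(str(ord(c)) for c in term) + "]"
    raise TypeError(f"not in the allowed subset: {term!r}")


def canonical_md5(term) -> str:
    """MD5 (lowercase hex) of the canonical text of a term."""
    return hashlib.md5(canonical(term).encode("ascii")).hexdigest()


example = canonical((Atom("foo"), "abc", [1, Atom("it's")]))
print(example)  # {'foo',[97,98,99],[1,'it\'s']}
```

Because every rule is spelled out in one short function, the function
itself documents which subset of Erlang data is allowed.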
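
For alternative 3, a sketch of pinning down one JSON writer, again
assuming Python on the CMS side.  The function name is hypothetical.
Sorting the keys and fixing the separators removes two of the freedoms
mentioned above, but it settles nothing about numbers, and the writer
on the Erlang side still has to produce the same bytes.

```python
import json


def canonical_json(value) -> str:
    """One fixed JSON rendering: sorted keys, no white space, ASCII only."""
    return json.dumps(
        value,
        sort_keys=True,          # fixed slot order within objects
        separators=(",", ":"),   # no white space after , or :
        ensure_ascii=True,       # non-ASCII characters become \uXXXX
        allow_nan=False,         # NaN and Infinity are not JSON
    )


a = canonical_json({"b": 1, "a": [1, 2]})
b = canonical_json({"a": [1, 2], "b": 1})
print(a)       # {"a":[1,2],"b":1}
print(a == b)  # True

# Number formatting is still a hazard: these are equal as numbers
# but are written differently.
print(canonical_json(1000), canonical_json(1000.0))  # 1000 1000.0
```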



