[erlang-questions] Did Erlang's grammar change in R16A?
Anthony Ramine
n.oxyde@REDACTED
Fri Feb 15 02:04:09 CET 2013
My (unfinished) implementation dates back to when Erlang didn't have UTF-8 atoms
and I didn't think they would be coming that fast.
So I didn't have to mess with the arity field and just used this structure:
typedef struct local_atom_ {
    Eterm header;
    Eterm equivrep;
    Uint32 hash;
    Uint32 len;
    Eterm name[1]; // by the way, can we use C99/C11 flexible array members in OTP?
} LocalAtom;
I didn't use the same structure as global atoms because they have a member specific
to their hash table structure.
When I make it handle UTF-8 atoms, I'll just split Uint32 len into two Uint16 bytes_len and
Uint16 char_len; 16 bits ought to be enough, right?
https://github.com/nox/otp/blob/bf3334c/erts/emulator/beam/erl_term.h#L561-567
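To make the planned bytes_len/char_len split concrete, here is a minimal sketch of how both lengths could be filled in from a UTF-8 name. The AtomLen type and the atom_len_from_utf8 function are hypothetical names used only for illustration; they are not taken from the branch linked above, and the input is assumed to be valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical replacement for the single Uint32 len field. */
typedef struct {
    uint16_t bytes_len; /* size of the UTF-8 encoding, in bytes */
    uint16_t char_len;  /* number of Unicode code points */
} AtomLen;

/*
 * Fills *out from a UTF-8 encoded name and returns 1.
 * Returns 0 when the byte length does not fit in 16 bits.
 * Assumption: name is already known to be valid UTF-8.
 */
static int atom_len_from_utf8(const unsigned char *name, size_t nbytes,
                              AtomLen *out)
{
    size_t chars = 0;

    assert(name != NULL && out != NULL);
    if (nbytes > UINT16_MAX)
        return 0;

    for (size_t i = 0; i < nbytes; i++) {
        /* Continuation bytes look like 10xxxxxx; any other byte
         * starts a new code point. */
        if ((name[i] & 0xC0) != 0x80)
            chars++;
    }

    out->bytes_len = (uint16_t)nbytes;
    out->char_len = (uint16_t)chars; /* chars <= nbytes, so this fits */
    return 1;
}
```

Storing both values keeps "number of bytes" and "number of characters" constant-time lookups, at the cost of one linear scan when the atom is created.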
LocalAtom has an overhead of 3 words on 64-bit, 2 words on 64-bit with the halfword emulator
and 4 words on 32-bit. Should we worry about 4 words when safety is concerned?
Should we worry about 4 words when the OTP XML parser cannot be used in production
with user input because it uses atoms for XML names? We shouldn't.
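The word counts can be checked with offsetof. This is only a sketch: the Eterm and Uint32 typedefs below are stand-ins (Eterm is assumed to be one machine word, so the halfword emulator case is not modelled), and local_atom_overhead_words is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedefs, assumed for illustration only. */
typedef uintptr_t Eterm;  /* one machine word per term */
typedef uint32_t Uint32;

typedef struct local_atom_ {
    Eterm header;
    Eterm equivrep;
    Uint32 hash;
    Uint32 len;
    Eterm name[1];
} LocalAtom;

/* Number of Eterm-sized words that precede the name itself. */
static size_t local_atom_overhead_words(void)
{
    size_t bytes = offsetof(LocalAtom, name);

    /* The fields before name are expected to fill whole words. */
    assert(bytes % sizeof(Eterm) == 0);
    return bytes / sizeof(Eterm);
}
```

With 8-byte words, hash and len share a single word, giving 3; with 4-byte words they take one word each, giving 4.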
--
Anthony Ramine
On 15 Feb 2013, at 01:48, Richard A. O'Keefe wrote:
>> .. first thought you were messing with the arity thing meaning .. perhaps i should sleep. putting more stuff in the header .. seems good
>
> EEP 20 was written with no knowledge of Erlang's low level implementation details.
> The background for it is a WAM-like architecture, with a 2-bit tag
> 00 immediate
> 01 box-of-bits
> 10 pointer to [_|_] (which has no header)
> 11 pointer to box-of-tagged-words
> and boxes have an "arity" field (the size of a tuple, the length of a binary)
> that includes a few "supertag" bits that say what kind of box it is.
> The "arity" field used in EEP 20 holds the bits that say "I am a local atom"
> and a length, encoded to make *both* "number of bytes" and "number of
> Unicode characters" constant-time operations.
>
> The equivrep field is what enables atoms that have been found to be equal
> to be chained together; if a 3-word header is too big (despite being the same
> size as or smaller than a binary's header), that word could be sacrificed.
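One possible reading of the equivrep chaining described in the quote, as a simplified sketch: once two local atoms have been compared and found equal, one is linked to the other, so later comparisons end in a pointer check. The Atom type and the atom_rep and atom_equal functions are hypothetical, and equivrep is modelled here as a plain pointer rather than an Eterm.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical local atom: not the real ERTS layout. */
typedef struct Atom {
    struct Atom *equivrep; /* NULL, or an atom already known to be equal */
    size_t len;
    const char *name;
} Atom;

/* Follow the equivrep chain to its last element (the representative). */
static Atom *atom_rep(Atom *a)
{
    assert(a != NULL);
    while (a->equivrep != NULL)
        a = a->equivrep;
    return a;
}

/* Compare two atoms; on equality, chain b's representative to a's. */
static int atom_equal(Atom *a, Atom *b)
{
    Atom *ra = atom_rep(a);
    Atom *rb = atom_rep(b);

    if (ra == rb)
        return 1; /* already known to be equal */
    if (ra->len != rb->len || memcmp(ra->name, rb->name, ra->len) != 0)
        return 0;

    rb->equivrep = ra; /* remember the result for later comparisons */
    return 1;
}
```

Dropping the equivrep word, as the quote suggests is possible, would save one word per atom but every comparison of distinct copies would then have to compare the names again.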