Strings (was: Re: are Mnesia tables immutable?)

Tony Finch <>
Wed Jun 28 17:30:54 CEST 2006


On Wed, 28 Jun 2006, Romain Lenglet wrote:
>
> Integer terms are encoded into different forms depending on their
> value. Since Unicode considers encoding every character into up
> to 32 bits, not more, let's consider the different encoding
> formats available for 32-bit integers:

Unicode code points only go up to 0x10FFFF (1114111), which is well inside
the INTEGER_EXT range, so you don't need to worry about the SMALL_BIG_EXT
binary format.

> (1) 0<=x<=255:
> byte 0: SMALL_INTEGER_EXT tag
> byte 1: x (8 bits)
> (2) -134217728<=x<=134217727
> byte 0: INTEGER_EXT tag
> bytes 1-4: x (32 bits)
> (3) all other cases:
> byte 0: SMALL_BIG_EXT
> byte 1: length (== 4)
> byte 2: sign (8 bits!)
> bytes 3-6: abs(x) (32 bits)
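
To make the three cases concrete, here is a rough Python sketch of the
selection logic as quoted above. It is an illustration only, not the
emulator's code: encode_int is a hypothetical helper, the tag values
(97, 98, 110) are the ones from the external term format, the leading
version byte (131) that term_to_binary emits is left out, and I'm assuming
the SMALL_BIG_EXT magnitude is stored least significant byte first.

```python
import struct

# Tag bytes from the Erlang external term format.
SMALL_INTEGER_EXT = 97
INTEGER_EXT = 98
SMALL_BIG_EXT = 110


def encode_int(x):
    """Encode x following the three quoted cases (hypothetical helper)."""
    # Case (1): one unsigned byte.
    if 0 <= x <= 255:
        return bytes([SMALL_INTEGER_EXT, x])
    # Case (2): the quoted range, stored as a 32-bit big-endian signed integer.
    if -134217728 <= x <= 134217727:
        return bytes([INTEGER_EXT]) + struct.pack(">i", x)
    # Case (3): length byte, sign byte, then the magnitude (assumed
    # little-endian). Only valid while the magnitude fits in 255 bytes.
    magnitude = abs(x)
    digits = magnitude.to_bytes((magnitude.bit_length() + 7) // 8, "little")
    return bytes([SMALL_BIG_EXT, len(digits), 1 if x < 0 else 0]) + digits


# The largest code point still lands in case (2): 1 tag byte + 4 value bytes.
largest_code_point = encode_int(0x10FFFF)
```

So with this scheme a code point costs 2 bytes if it is below 256 and 5 bytes
otherwise, and case (3) is never reached for valid Unicode.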

Tony.
-- 
f.a.n.finch  <>  http://dotat.at/
PORTPATRICK: STRONG SOUTH OR SOUTHEAST WINDS IN ROCKALL AND BAILEY SPREADING
TO MALIN, HEBRIDES, FAIR ISLE, FAEROES AND SOUTHEAST ICELAND DURING FRIDAY BUT
DECREASING BY SATURDAY, PERHAPS GALES AT FIRST IN WEST ROCKALL AND WEST
BAILEY.


