[erlang-questions] Vector instructions

Zvi <>
Mon Apr 7 04:05:24 CEST 2008

Richard A. O'Keefe wrote:
> On 7 Apr 2008, at 9:16 am, Zvi wrote:
> I cannot let this pass.  It must be understood clearly that
> for many purposes, Erlang's use of lists of integers for strings
> is an EXCEPTIONALLY GOOD design.
> The number of programming languages with a "real Unicode-aware immutable
> string datatype" can be counted on the fingers of one ear.  Unicode is
> far more complicated than most people realise.  The Unicode 5.0 book
> (which I am currently slogging through) is >1400 pages.  Admittedly,
> much of that is code charts, but an astonishing amount of it is not.
> I am not excepting Java from the list of "not real Unicode" languages.
> Unicode has about a hundred thousand characters.  Java characters are
> 16 bits.  Need I say more?
> One language that _doesn't_ have Java's central defect is Haskell,
> where the native representation for a string is, ahem, a list of
> character( code)s.


In my opinion, the reason Erlang needs specialized datatypes such as
binary, bitstring, string, matrix, etc. is that it is dynamically typed.
When the Erlang designers/implementers needed to process sequences of bytes,
they for some strange reason didn't use tuples or lists of integers, but
introduced a new datatype. The same goes for the recently introduced
bitstring datatype: why not use tuples or lists of true and false atoms? :-)
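
One practical answer to the question above is space: a list spends at
least one list cell per element, while a packed representation spends one
bit per flag. A minimal sketch of the packing step, in Haskell to match the
type sketches further down; packBits is a hypothetical helper name, not a
library function, and the bit order (most significant first, zero padding)
is an assumption:

```haskell
import Data.Bits (setBit)
import Data.List (foldl')
import Data.Word (Word8)

-- Hypothetical helper: pack a list of booleans into bytes, most
-- significant bit first, zero-padding the final byte.
packBits :: [Bool] -> [Word8]
packBits [] = []
packBits bs = byte : packBits rest
  where
    (chunk, rest) = splitAt 8 bs
    -- Pair each flag with its bit position (7 down to 0) and set the
    -- bits whose flag is True.
    byte = foldl' set 0 (zip [7, 6 .. 0] chunk)
    set acc (i, True)  = setBit acc i
    set acc (_, False) = acc
```

For example, packBits [True, False, True] gives [160], i.e. the bits
10100000 in a single byte.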

In fact I was wrong, and Erlang "strings" are not just syntactic sugar. I
think there is a lot of optimization behind the scenes, with the runtime
trying to guess for every list whether it is a string or not. If it is a
string, it may use a more compact representation. The question is: why
guess, if we have a dynamically typed language where each value is tagged
with its datatype anyway?

My point is that Erlang tuples and lists are polymorphic collections, while
sometimes you want homogeneous collections. All the datatypes mentioned above
must also provide indexing and subslicing, which is not very efficient
when the underlying representation is a linked list.
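
The cost difference can be sketched in Haskell as follows. This is only an
illustration, assuming the Data.Array module from the array package that
ships with GHC; the helper names (nthList, fromList, nthArray, sliceList)
are made up for this example:

```haskell
import Data.Array (Array, listArray, (!))

-- Indexing a linked list walks i cells: O(i).
nthList :: [a] -> Int -> a
nthList xs i = xs !! i

-- Hypothetical helper: copy a list into a zero-based immutable array.
fromList :: [a] -> Array Int a
fromList xs = listArray (0, length xs - 1) xs

-- Indexing an array is a constant-time offset computation.
nthArray :: Array Int a -> Int -> a
nthArray arr i = arr ! i

-- Subslice [from, from + len) of a list: the first `from` cells must
-- be skipped one by one before anything is returned.
sliceList :: Int -> Int -> [a] -> [a]
sliceList from len = take len . drop from
```

Both nthList "hello" 4 and nthArray (fromList "hello") 4 return 'o', but
only the list version has to follow four pointers to get there.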

About Unicode support in a string datatype: it must be practical, not 100%
right, so something like UTF-16 is good enough for me.

In Haskell you specify the type of the elements (I do not know this language
and am just guessing the syntax; I have no idea how to represent a vector in
Haskell, so everything is a list):

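import Data.Word (Word8)

-- String is already defined this way in the Prelude:
-- type String = [Char]
type Binary    = [Word8]               -- Word8 covers 0..255
type Bitstring = [Bool]
type Matrix    = (Int, Int, [Double])  -- two dimensions, then the elements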

BTW: I think Haskell's tuples are done right. I never understood why
Erlang's functions have an arity, when every function could take a single
argument that is a tuple. Haskell tuples must have at least 2 elements. So
is any single element a tuple of 1? It's the same as in Matlab, where a
scalar is a 1x1 matrix.
But in Erlang x and {x} are not the same. Probably that is because Erlang's
tuples are both tuples AND polymorphic vectors.
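
For comparison, here is a small sketch of the two calling styles in Haskell.
The function names are made up for this example; curry and uncurry are the
standard Prelude functions. Note that standard Haskell syntax has no
one-element tuple: (x) is just x in parentheses.

```haskell
-- A function of one argument that happens to be a pair.
addPair :: (Int, Int) -> Int
addPair (x, y) = x + y

-- The curried form, which is the more common Haskell style.
addCurried :: Int -> Int -> Int
addCurried x y = x + y

-- The Prelude converts between the two styles.
addPair' :: (Int, Int) -> Int
addPair' = uncurry addCurried

addCurried' :: Int -> Int -> Int
addCurried' = curry addPair
```

Here addPair (1, 2), addCurried 1 2, addPair' (1, 2) and addCurried' 1 2
all evaluate to 3.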

