are Mnesia tables immutable?

Wed Jun 21 07:36:04 CEST 2006

>
> Most list operations are not particulary costly.
>
> * traversing list e.g. "hd(List)" or  "[H | R]  =  List" is a O(1)
> operation.
> * building lists by adding to the front e.g. "NewList = [H | OldList]"
> is a O(1) operation.
> * reversing a list is a O(N) operation.
> * the various lists:map/foldl/foreach operations and list comprehension
> work over lists, in O(N) time.
>
> But you should carefull when using "NewList = List1 ++ List2"  (and
> other list append functions) especialy in loops where you build a lot of
> NewList entries, as it is a O(N) operation (N = length of List1).
>
> It is also generaly a bad idea to use nth(List, Nth) to do random list
> access, this is a O(N) operation as each nth(...) call traverses the
> list until it gets to the Nth list position.
>
>
> Tuples/Records (they are essentialy the same thing) have slightly
> different properties:
>
> * access is O(1)
> * updates are O(N), where N is the size of the tuple - but only a array
> of pointers (pointing to other erlang terms) is created when a new tuple
> is created, so if the tuple isn't very large (< 100 elements (?)) or the
> tuple isn't used often - there should be little performance cost.
> This means that tuples/records work well in the role of C structs or
> Java classes, but are a bad choice for large arrays that need random
> access updates (sequential data processing is best done with lists)
>
> If you needs random write access your basicly stuck with trees, ets
> tables or the even slower mnesia tables (slower due to their database
> transaction support).
>
> Note: that lists (sequential list traversal) and tuples/records are the
> fastest types for read operations, while adding elements to the front of
> a list - "NewList = [H | OldList]" is the fastest way to build a new
> data structure.
>
> >
> > It did occur to me that it would be useful to have a API for
> > efficiently handling large strings. It would have similar function
> > signatures to regular strings, but the implementation would involve
> > less copying. Does such an API exist?
>
> Erlang is somewhat weak in the area of strings (strings where not a main
> concern when erlang was begin designed to be used in Ericssons telecom
> products - switches ... etc):
> * erlang strings are implemented as lists of (8 bit) integers - while
> this makes it easy to use lists functions on strings this also means
> that strings take a lot of space, ~2 words (2 * 32 bit on a 32 bit CPU).
> * erlang strings and string operations assume that characters are 8 bit,
> latin-1 encodes characters.
>
>
> There are several ways to handle this if it becomes a performance/size
> issue:
> * chars can be stored more compactly - chars can for example be packed
> into binaries (allowing for a N byte sized char to fit in N bytes of
> memory).
> * string data could be preprocessed by other unix tools, better suited
> for fast text processing, before passing the data to erlang
> * lines-1.1 at http://erlang.org/user.html may be useful to handle large
> strings.
>
> You may also find stuff here:
> http://www.erlang-projects.org/
> http://planeterlang.org/
>

Thanks, Hokan. This is very useful. I did initially make the mistake
of abusing the ++ operator in iterative functions because I wanted to
end up with a single string, not a nested structure of lists. However,
I saw in the Erlang efficiency guide that Erlang knows how to
automatically serialize nested lists as a single character stream,
which made me realize that appending new lists to the beginning of an
accumulator is the right thing to do.

I looked at the lines module, and it does seem pretty close to what I
had in mind. I was thinking more along the lines of a generic string
container that allows efficient (less than O(n)) modification and
random access of strings, which would work by breaking the string up
into a tree of some sorts. Lines, if I understand correctly, is
similar, but it requires the user to do the chunking.

Are Erlang's drawbacks with regards to efficient string manipulation
shared among all functional languages? I was planning on writing a
Google killer in Erlang, and this new knowledge makes me rethink this
strategy :)

Best,
Yariv