[erlang-questions] 2D Map Arrays

Fri Sep 8 00:06:30 CEST 2006

Jeff Crane wrote:

>I am trying to implement a 2d map in erlang where
>dynamic elements will be traversing it via reference.
>I have no other way to get the "locations" into memory
>other than using a large array, explicitly for
>reading. 
>
>The bottom "terrain" layer is what the static array
>represents. Putting this in memory consists of reading
>from a text file in the example format:
>{ % map tuple start
>  {  % row start
>    {ID,{a1,a2}}, % column
>    ...
>  }, % row end
>  ...
>} % map tuple end
>
>Question: read it into memory as a large tuple or
>list?
>
>I have heard of a "binary" format in erlang that is
>used for this kind of read-without-copy on large
>amounts of data. I assume that I should read out a map
>export then convert it to binary, then read it
>cheaply. I haven't located a good usage tutorial for
>the binary format and could use a guide if this is the
>preferred method.
>  
>
There are a number of fairly obvious choices to implement a 2D array (to 
for example be able to do efficient random access):

* use a tuple of terms - this is probably the fastest way[1]  to do 
random access (based on a positional index) in a fixed size array.
O(1) read, O(N) update cost.

1: tuple reads have the same speed as reading from the head of a list.

* use a binary - similar to using a tuple, but may be somewhat more 
complicated to access, but has the benefit to be able to store small 
elements (< 1 word length) compactly.
Building new binaries (doing updates) may be costly.

* ets tables have reasonably fast reads and writes and may also be a 
good way to model spares arrays. The drawback (mainly for fast reads) is 
that ets reads are several times slower than tuple accesses.
Note: ets tables do destructive updates which introduce non-functional 
state concerns - the same ets table can't be passed to two functions, 
both which write in the ets tables without introducing execution 
dependencies between the two, they may for example break if run in parallel.
Note: ets tables are only visible on a single erlang node - this may 
introduce additional work if they need to be distribute on several 
machines. Some of the above issues can be fixed by using mnesia or 
another transactional database.

* various kinds of trees (e.g. binary ones) can be used - but read / 
write times for small trees are not noticeably better than ets tables 
and probably worse when the tree grows large  - due to the log(N) factor 
(ets is O(1)).
Note: trees suffer none of the issues mentioned in the notes for ets 
tables - this also makes it very simple to implement an undo function, 
simply keep a number of old trees around, only log(N) elements will 
differ between each write update, so storage will be = LevesOfUndo * log(N).

I would therefore recommend using tuples or binaries for fixed sized 
read only arrays, and trees or ets tables for more dynamic "arrays", 
depending on your needs and your likelihood of wanting to turn the ets 
table into a real database like mnesia.

>What do I do when I want to operate on a similarly
>sized dynamic layer that I want to read/write to?
>
>Tuple or List or Binary or something else?
>
>The idea of each "location" being a process might be a
>good idea for dynamic layers, but not for the base
>terrain layer, which has no behaviors IMO.
>
>
>
>
>
>__________________________________________________
>Do You Yahoo!?
>Tired of spam?  Yahoo! Mail has the best spam protection around 
>http://mail.yahoo.com 
>_______________________________________________
>erlang-questions mailing list
>erlang-questions@REDACTED
>http://www.erlang.org/mailman/listinfo/erlang-questions
>
>  
>