linked-in drivers once again

Thu Jun 28 11:26:31 CEST 2001

On Wed, 27 Jun 2001, Karel Van Oudheusden wrote:

>And now more general questions:
>
>- I am building a distributed (multi-tasking) application
>written mainly in Erlang. However, the application contains a
>lot of large lookup-tables that are consulted very often. These
>lookup-tables can efficiently be implemented in different ways
>in an imperative (e.g. C language) approach. Therefore, I want
>to use Erlang for the multi-tasking, fault-tolerance, and
>control intelligence of the language. On the other hand, I want
>to use C or C++ for the total memory management of the lookup
>tables.

You should take a look at ets (Erlang Term Storage). Ets
implements efficient storage structures through built-in
functions. Currently supported access structures are:

- set (linear hash table, unique keys)
- bag (linear hash table, multiple objects per key)
- duplicate_bag (like bag, but multiple instances of each object)
- ordered_set (balanced binary (AVL) tree, unique keys kept in term order)
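
As a rough illustration of how these table types differ, here is a minimal sketch. The module name, table names, and sample objects are made up for the example; only the ets calls themselves (new/2, insert/2, lookup/2, first/1, next/2, delete/1) come from the ets module.

```erlang
-module(ets_types_demo).
-export([per_type/0, ordered/0]).

%% Insert the same three objects, one at a time, into a table of
%% each hash-based type and return what lookup/2 finds under key k.
per_type() ->
    Objects = [{k, 1}, {k, 2}, {k, 2}],
    [begin
         Tab = ets:new(demo, [Type]),
         lists:foreach(fun(Obj) -> ets:insert(Tab, Obj) end, Objects),
         Found = ets:lookup(Tab, k),
         true = ets:delete(Tab),
         {Type, Found}
     end || Type <- [set, bag, duplicate_bag]].

%% An ordered_set can be traversed in key order with first/1 and next/2.
ordered() ->
    Tab = ets:new(demo_ordered, [ordered_set]),
    lists:foreach(fun(Obj) -> ets:insert(Tab, Obj) end,
                  [{c, 3}, {a, 1}, {b, 2}]),
    First = ets:first(Tab),
    Second = ets:next(Tab, First),
    Third = ets:next(Tab, Second),
    Last = ets:next(Tab, Third),   % '$end_of_table' once the keys run out
    true = ets:delete(Tab),
    [First, Second, Third, Last].
```

With a set, a later insert under the same key replaces the earlier object, so only one object is found. A bag keeps both distinct objects, and a duplicate_bag also keeps the repeated {k, 2}.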

Access times are usually in the order of 10-100 microseconds,
depending on the size of the object being read/written (data is
copied from/to the process heap). The ets tables are stored
outside the process heaps, and are not garbage collected.

Read more about it using "erl -man ets", or at 
http://www.erlang.org/doc/r7b/lib/stdlib-1.9.1/doc/html/ets.html

If these access types suit your needs, then you will not gain any
performance from writing a linked-in driver.

The mnesia database uses these built-in access structures, and
adds transaction semantics, replication, and lots of other stuff.
It is quite easy to start with ets tables and then transition to
mnesia, if your requirements change.
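
To give a feel for what that transition might look like, here is a minimal sketch of a RAM-only mnesia table used like a key/value lookup table. The module name, the table name lookup, and its attributes key and value are hypothetical, and the sketch assumes a single node with no existing disc schema.

```erlang
-module(mnesia_lookup_demo).
-export([demo/0]).

demo() ->
    %% Without a disc schema, mnesia starts with a RAM-only schema.
    ok = mnesia:start(),
    %% The default storage type is ram_copies on the local node.
    {atomic, ok} = mnesia:create_table(lookup,
                                       [{attributes, [key, value]}]),
    %% Records are tuples tagged with the table name.
    {atomic, ok} = mnesia:transaction(
                     fun() -> mnesia:write({lookup, 1, "one"}) end),
    {atomic, Found} = mnesia:transaction(
                        fun() -> mnesia:read(lookup, 1) end),
    stopped = mnesia:stop(),
    Found.
```

The reads and writes look much like their ets counterparts, but they run inside a transaction fun.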

>Note that I am also assuming that the application I am talking
>about is mainly a data-dominated problem. Therefore I consider
>Erlang inappropriate for the data management (of the lookup
>tables). I am however wondering whether at the end of the day
>the concurrency in Erlang will be the most critical factor (in
>terms of performance) or will it (still) be the data management
>(e.g. retrieving different memory often) assuming that I am
>using the above stated approach?

I don't know if this can be answered without looking very closely
at your particular application. Using ets, data access is really
quite fast. Concurrency is also very efficiently implemented in
Erlang. Since your application will also be distributed, I'd
expect that the strategy for task distribution, the capacity of
the communication channels, and the general nature of the tasks
being distributed could become at least as important for overall
performance.

You should plan to do some prototyping, and develop your
application incrementally. You will find out soon enough where
your real bottlenecks are. Even if you then decide that you will
have to leave Erlang for really good performance (normally, this
is not the case), your work in Erlang will still have been
extremely valuable for your understanding of the application.
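
One cheap way to start prototyping is simply to time a batch of lookups on your own data. The following is a minimal sketch, not a proper benchmark: the module name, the table contents (integer keys mapped to their squares), and the function measure/1 are assumptions made for the example, and the measured time also includes the overhead of building and folding over the key list.

```erlang
-module(ets_bench_demo).
-export([measure/1]).

%% Fill a set table with N objects, then time N lookups.
%% Returns {Microseconds, Hits}.
measure(N) when is_integer(N), N > 0 ->
    Tab = ets:new(bench, [set]),
    lists:foreach(fun(K) -> ets:insert(Tab, {K, K * K}) end,
                  lists:seq(1, N)),
    Lookups =
        fun() ->
            lists:foldl(fun(K, Hits) ->
                            case ets:lookup(Tab, K) of
                                [{K, _}] -> Hits + 1;
                                []       -> Hits
                            end
                        end, 0, lists:seq(1, N))
        end,
    %% timer:tc/1 returns {ElapsedMicroseconds, ResultOfFun}.
    {Micros, Hits} = timer:tc(Lookups),
    true = ets:delete(Tab),
    {Micros, Hits}.
```

Calling ets_bench_demo:measure(100000) and dividing the elapsed time by the number of lookups gives a rough per-lookup figure for that object size on your machine.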

/Uffe
-- 
Ulf Wiger                                    tfn: +46  8 719 81 95
Senior System Architect                      mob: +46 70 519 81 95
Strategic Product & System Management    ATM Multiservice Networks
Data Backbone & Optical Services Division      Ericsson Telecom AB