[erlang-questions] VM & BEAM Specs : idea for improving the lists support
Denys Rtveliashvili
rtvd@REDACTED
Sun Aug 5 13:09:25 CEST 2007
>> I join to this question.
>>
>> I was also looking for the specifications once as I had an idea how to
>> improve the performance of list operations by changing the underlying
>> implementation.
>>
>
> It would be interesting to share your ideas about improving the
> performance of list operations. What improvements did you have in
> mind? If you have some good ideas
> that we have not thought about we might implement them.
>
> I am also curious about which list operation you don't find effiecient
> as it is today.
>
> /Kenneth (Erlang/OTP team at Ericsson)
Hi Kenneth,
As far as I understand, the underlying implementation of lists which is
used in Erlang is based on linked lists. So, each list is a number of
objects with a value (the n'th element of the list) or a reference to it
plus a link to the next element (if it exists).
My understanding is that only a number of operations take O(1) time to
complete. These are:
* Adding an element to the beginning of the list
* Taking the first element of the list or its remainder.
However, the following operations take O(n) time:
* Finding out the length of the list
* Adding an element at the end of the list
* Taking the last element of the list or everything before it
As a result, one of the common patterns in Erlang is to construct a list
by appending the elements from the beginning and then reversing the list.
The problems I see with the current implementation are:
- The computational complexity required to do certain list operations is
O(n), not O(1)
- The linked list is not CPU cache - friendly. It can be dispersed in
the memory, reducing the cache - hits and thus killing the performance.
- Each of the list items is a quite big object. It takes 64bits if I am
not mistaken. So, if you'd like to store a list of, say, 16bit
characters / ordinary Unicode strings it takes 4 times more memory.
------------------
I believe, the current implementation can be (theoretically) improved.
Of course, you will never know the exact numbers until someone
implements the ideas. And the main factor which stops me is that (as far
as I understand) Erlang uses that 64bit objects for almost everything so
the change will be enormous.
--- Idea #1:
"Use unrolled linked lists to improve the performance"
The unrolled linked lists
(http://en.wikipedia.org/wiki/Unrolled_linked_list) should give a
slightly better cache-hit ratio and consume less memory. Also, they will
maintain strict O(1) time complexity for operations which are O(1) now.
--- Idea #2:
"Use resize-able vectors to store lists"
I guess this change would give even more benefits comparing to the Idea
#1. These are:
- The _amortized_ computational complexity is O(1) for
adding/removing/taking a remainder no matter if it is done from the
beginning or from the end of the list.
- It takes O(1) time to find out the size of the list.
- Cache-hits ratio is seriously improved as the list is located in a
single chunk of memory.
- Much less work for a GC. However, I am not sure if it is an issue.
The drawback is that the time complexity is amortized. Again, it is not
clear what is better: an amortized O(1) or strict O(1) and strict O(n).
I'd prefer the amortized O(1).
A few words regarding the implementation:
If you take a look at Java LinkedList and ArrayList, the ArrayList is
generally 3 times faster. It can be used as a reference to some extent.
However, it does not allow O(1) adding/removing at the beginning of the
list. So, instead of implementation of a list which can grow only in one
direction, it is necessary to implement the one which can grow any way.
I am sure that you can also benefit from pureness or the functional
approach in Erlang and from the fact that only one thread accesses the
list all the time as you will not need to care about synchronization much.
--- Idea #3:
"Typed lists"
Both kinds of list implementations can benefit from knowing what type of
data is going to be stored in them. It would lead to a decrease in the
consumed space and improved performance.
It is possible to implement a "general" kind of list which can hold
anything plus some specific implementations for storing 8bit and 16bit
values.
If a value is appended to the list and it can not be (potentially)
stored within it, the list changes its type. The operation will take
O(N) time. However, it will benefit from the flat memory structure. The
type checks are not necessary to implement in runtime. They can be made
by analyzing the possible scenarios during the compile-time or during
the loading of the module.
However, in most cases it will not be necessary to change the type of
the list and lists of 8bit / 16bit values will do their job.
Also, you will not need to change the syntax of the programs as
everything can be done on VM/OTP side.
The benefit is again in both performance and memory consumption.
Imagine that a program has consumes a large text document with English,
Arabic and Russian characters. The current implementation will consume 4
times more memory than an automatically-typed vector-based version. In
case the document contains only ASCII characters, the memory consumption
will drop 8 times.
---
With kind regards,
Denys Rtveliashvili
More information about the erlang-questions
mailing list