[erlang-patches] Improving ETS performances

Francesco Lattanzio francesco.lattanzio@REDACTED
Mon Dec 2 09:01:25 CET 2013


Thank you for the reply and the tips.

This was really a proof-of-concept patch, made just to hear from you if
you were interested in it -- so optimising things was not a pressing
issue.

The code in the "erl_db_hash.[ch]" is getting too convoluted for me to
tollerate and adding more checks to minimize memory consumption would
make things worse.
Thus I was thinking to split those files in two: one for the 'set' type
and the other for the 'bag' and 'duplicate_bag' types.
I suppose that increasing code size is not an issue as it is increasing
memory consuption.

At the moment I'm working on something else and don't know when I can
start working on this patch again -- maybe a few weeks, maybe a couple
of months.

Regards.

On Fri, Nov 29, 2013 at 04:22:30PM +0100, Sverker Eriksson wrote:
> Hi
> 
> This patch has now been discussed by OTP technical board.
> 
> It's a desirable feature but the patch is not acceptable as-is,
> mainly due to too large memory overhead per object.
> 
> We would need
> 
> * No extra overhead for table type 'set'.
> * Minimized overhead per object for 'bag' and 'duplicate_bag',
>   especially when the nested hash feature is not used.
> 
> Another weakness is the lack of yielding when long key chains are
> traversed. BIFs should not execute too long but rather yield by
> returning back to the emulator loop to be continued later by the
> scheduler (as is done by the ets:select operations).
> The patch removes this problem for unrelated keys but it still arise
> at lookup/delete on a lot of objects with same key.
> If an otherwise accepted patch is supplied we can undertake the job
> to fix the yielding. It can be good to have in mind though while doing
> the design.
> 
> 
> 
> Some tips and tricks to minimize memory overhead:
> (some of them are probably mutually exclusive)
> 
> * Use lower bits in pointers as flags.
> * Combine next and kprevious as only one of them is used.
> * Move nkitems and lht into a separate root struct that is allocated
>   when the first duplicate key is inserted.
> * Move knext and kprevious to NestedDbTerm to only allocate them when
> needed.
> * Use really dirty trick to allocate a mutilated struct with an unused
> prologue.
> 
> 
> typedef struct {
>     HashDbTerm* previous;  /* NOT ACCESSABLE FOR SET */
>     HashDbTerm* next;
>     HashValue  hvalue;
>     DbTerm dbterm;
> }HashDbTerm;
> 
> /* For bags */
> ptr = malloc(sizeof(HashDbTerm));
> 
> /* For sets */
> ptr = malloc(sizeof(HashDbTerm) - offsetof(HashDbTerm,next));
> ptr = (char*)ptr - offsetof(HashDbTerm,next);
> 
> ptr->next = ...;
> ptr->hvalue = ...;
> ptr->dbterm = ...;
> 
> ptr->previous = NULL; /* MEMORY CORRUPTION!!! */
> 
> free((char*)ptr + offsetof(HashDbTerm,next));
> 
> 
> Maybe there are more sensible ways to accomplish the same thing.

-- 
FRANCESCO LATTANZIO : SYSTEM & SOFTWARE
A-TONO TECHNOLOGY : VIA DEL CHIESINO, 10 - 56025 PONTEDERA (PI) : T +39 02 32069100 : SKYPE franz.lattanzio
a-tono.com : twitter.com/ATonoOfficial : facebook.com/ATonoOfficial

Information in this email is confidential and may be privileged. It is intended for the addresses only.
If you have received it in error, please notify the sender immediately and delete it from your system.
You should not otherwise copy it, retransmit it or use or disclose its content to anyone. Thank you for your co-operation.



More information about the erlang-patches mailing list