[erlang-questions] ets:safe_fixtable/2 & ets:tab2file/{2, 3} question

Felix Gallo felixgallo@REDACTED
Thu Dec 17 17:38:10 CET 2015


You can take advantage of Erlang's concurrency to get
arbitrarily-close-to-Redis semantics.

For example, Redis's BGSAVE could be achieved by writing to your ets table
as usual, but also sending a duplicate message to a gen_server whose job
is to keep a second, slave ets table up to date.  That gen_server would
be the one to provide dumps (via ets:to_dets/2 or whatever other facility).
If it has to pause while it dumps, its message queue grows for the
duration, but it eventually drains and the slave table catches back up.
Meanwhile the primary ets table continues to be usable.
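
A minimal sketch of that idea, under some loud assumptions: the module name
ets_replica and its start_link/1, insert/3 and dump/2 functions are
hypothetical names made up for illustration, only inserts are mirrored
(deletes and updates would need their own messages), and the dump uses
ets:tab2file/2 rather than ets:to_dets/2.  With several concurrent writers
to the same key, the replica may apply writes in a different order than the
primary did, so treat this as a starting point rather than a finished design.

```erlang
-module(ets_replica).
-behaviour(gen_server).

%% Hypothetical public API (not part of OTP).
-export([start_link/1, insert/3, dump/2]).
%% gen_server callbacks.
-export([init/1, handle_call/3, handle_cast/2]).

%% Start the process that owns the replica table.
%% Type is the table type used by the primary (set, ordered_set, ...).
start_link(Type) ->
    gen_server:start_link(?MODULE, Type, []).

%% Write to the primary table as usual, then forward the same
%% objects to the replica owner as an asynchronous message.
insert(Primary, Replica, Objects) ->
    true = ets:insert(Primary, Objects),
    gen_server:cast(Replica, {insert, Objects}).

%% Ask the replica owner to dump its table to File.  While the dump
%% runs, forwarded inserts simply wait in its message queue.
dump(Replica, File) ->
    gen_server:call(Replica, {dump, File}, infinity).

init(Type) ->
    %% The replica table is owned by this process; nobody else writes to it.
    {ok, ets:new(replica, [Type, protected])}.

handle_call({dump, File}, _From, Tab) ->
    %% No writer can touch Tab while we are in this callback, so the
    %% dump sees a stable view of everything applied so far.
    {reply, ets:tab2file(Tab, File), Tab}.

handle_cast({insert, Objects}, Tab) ->
    true = ets:insert(Tab, Objects),
    {noreply, Tab}.
```

Writers would call ets_replica:insert(Primary, Replica, {Key, Value}) in
place of ets:insert/2, and whoever wants a snapshot calls
ets_replica:dump(Replica, "table.dump").  Because the dump happens inside
the gen_server, the primary table is never blocked by it.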

It's not a silver bullet because, as with Redis, you would still have to worry
about the pathological conditions, like dumps taking so long that the slave
gen_server's queue gets out of control, or out-of-memory conditions, etc.
But if you feel like implementing Paxos or waiting about 3 months,
you could also generalize the gen_server so that a group of them formed a
distributed cluster.

F.


On Thu, Dec 17, 2015 at 8:16 AM, Benoit Chesneau <bchesneau@REDACTED>
wrote:

>
>
> On Thu, Dec 17, 2015 at 5:13 PM Benoit Chesneau <bchesneau@REDACTED>
> wrote:
>
>> On Thu, Dec 17, 2015 at 3:24 PM Fred Hebert <mononcqc@REDACTED> wrote:
>>
>>> On 12/17, Benoit Chesneau wrote:
>>> >But what happens when I use `ets:tab2file/2` while keys are continuously
>>> >added at the end? When does it stop?
>>> >
>>>
>>> I'm not sure what answer you expect to the question "how can I keep an
>>> infinitely growing table from taking an infinite amount of time to dump
>>> to disk" that doesn't require locking it to prevent the growth from
>>> showing up.
>>>
>>
>> well by keeping a version of the data at some point :) But that's not how
>> it works unfortunately.
>>
>>
>>>
>>> Do note that safe_fixtable/2 does *not* prevent new inserted elements
>>> from showing up in your table -- it only prevents objects from being
>>> taken out or being iterated over twice. While it's easier to create a
>>> pathological case with an ordered_set table (keep adding +1 as keys
>>> near the end), it is not beyond the realm of possibility to do so with
>>> other table types (probably with lots of insertions and playing with
>>> process priorities, or predictable hash sequences).
>>>
>>> I don't believe there's any way to lock a public table (other than
>>> implicit blocking in match and select functions). If I were to give a
>>> wild guess, I'd say to look at ets:info(Tab,size), and have your
>>> table-dumping process stop when it reaches the predetermined size or
>>> meets an earlier exit. This would let you bound the time it takes you to
>>> dump the table, at the cost of possibly neglecting to add information
>>> (which you would do anyway -- you would just favor older info before
>>> newer info).  This would however imply reimplementing your own tab2file
>>> functionality.
>>>
>>>
>> Good idea, I need to think a little more about it. I wish it were
>> possible to fork an ets table at some point and only use this snapshot in
>> memory, like Redis does literally by forking the process when dumping it.
>> That would be useful...
>>
>> Thanks for the answer!
>>
>>
> side note, but I am thinking that selecting keys in batches also limits the
> possible effects of the concurrent writes, since it can work faster that
> way. Writing to the file is slow, though.
>
> - benoit
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
>
>