[erlang-questions] ets:safe_fixtable/2 & ets:tab2file/{2, 3} question

Antonio SJ Musumeci trapexit@REDACTED
Thu Dec 17 19:43:55 CET 2015


What exactly is it that you need, behaviorally? You could also have a process
which continuously iterates over the table, placing the records into a
rotating `disk_log`. If you include a timestamp, or otherwise know
precisely which version of each record you have, you can replay the log and
recover the state you want. If you need straight-up snapshots then maybe a liberal
select would give you a dump. I don't recall, however, at what level
select/match locks, and if the table is large it would be expensive memory-wise.

It's hard to beat a copy-on-write (COW) setup if you need snapshots.
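A minimal sketch of the rotating `disk_log` idea, with illustrative choices (module name, log name, file, and wrap sizes are all assumptions, not anything from this thread):

```erlang
%% Sketch: walk an ETS table and append timestamped records to a
%% wrap disk_log, so the state can later be replayed/recovered.
%% All names and sizes here are illustrative assumptions.
-module(ets_logger).
-export([start/1, dump/2]).

start(Tab) ->
    {ok, Log} = disk_log:open([{name, ets_dump_log},
                               {file, "ets_dump.LOG"},
                               {type, wrap},
                               {size, {1024 * 1024, 8}}]),  % 8 files of 1 MB
    dump(Tab, Log).

dump(Tab, Log) ->
    Now = erlang:system_time(millisecond),
    %% Tag every record with a timestamp so a replay can pick the
    %% latest version of each key.
    ok = ets:foldl(fun(Obj, ok) ->
                           ok = disk_log:log(Log, {Now, Obj})
                   end, ok, Tab),
    ok = disk_log:sync(Log).
```

A supervised process could call `dump/2` on a timer; replaying is then a `disk_log:chunk/2` fold that keeps the newest `{Timestamp, Object}` per key.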

On Thu, Dec 17, 2015 at 11:38 AM, Felix Gallo <felixgallo@REDACTED> wrote:

> You can take advantage of erlang's concurrency to get
> arbitrarily-close-to-redis semantics.
>
> For example, redis's bgsave could be achieved by writing as usual to your
> ets table, but also sending a duplicate message to a gen_server whose job
> is to keep a second, slave ets table up to date.  That gen_server would
> be the one to provide dumps (via to_dets or whatever other facility).  Then,
> if it has to pause while it dumps, its message queue grows for the
> duration but eventually flushes out and brings the table back up to date.
> Meanwhile the primary ets replica continues to be usable.
>
> It's not a silver bullet because, like redis, you would still have to
> worry about the pathological conditions, like dumps taking so long that the
> slave gen_server's queue gets out of control, or out-of-memory conditions,
> etc.  But if you feel like implementing Paxos, or waiting about 3
> months, you could also generalize the gen_server so that a group of them
> forms a distributed cluster.
>
> F.
>
>
> On Thu, Dec 17, 2015 at 8:16 AM, Benoit Chesneau <bchesneau@REDACTED>
> wrote:
>
>>
>>
>> On Thu, Dec 17, 2015 at 5:13 PM Benoit Chesneau <bchesneau@REDACTED>
>> wrote:
>>
>>> On Thu, Dec 17, 2015 at 3:24 PM Fred Hebert <mononcqc@REDACTED> wrote:
>>>
>>>> On 12/17, Benoit Chesneau wrote:
>>>> >But what happens when I use `ets:tab2file/2` while keys are continuously
>>>> >added at the end? When does it stop?
>>>> >
>>>>
>>>> I'm not sure what answer you expect to the question "how can I keep an
>>>> infinitely growing table from taking an infinite amount of time to dump
>>>> to disk" that doesn't require locking it to prevent the growth from
>>>> showing up.
>>>>
>>>
>>> well, by keeping a version of the data at some point :) But that's not
>>> how it works, unfortunately.
>>>
>>>
>>>>
>>>> Do note that safe_fixtable/2 does *not* prevent newly inserted elements
>>>> from showing up in your table -- it only prevents objects from being
>>>> taken out or being iterated over twice. While it's easier to create a
>>>> pathological case with an ordered_set table (keep adding +1 to the key
>>>> near the end), it is not beyond the realm of possibility to do so with
>>>> other table types (probably with lots of insertions and playing with
>>>> process priorities, or predictable hash sequences).
>>>>
>>>> I don't believe there's any way to lock a public table (other than
>>>> the implicit blocking in match and select functions). If I were to give a
>>>> wild guess, I'd say to look at ets:info(Tab, size), and have your
>>>> table-dumping process stop when it reaches that predetermined size or
>>>> meets an earlier exit. This would let you bound the time it takes to
>>>> dump the table, at the cost of possibly neglecting to add information
>>>> (which you would do anyway -- you would just favor older info over
>>>> newer info).  This would however imply reimplementing your own tab2file
>>>> functionality.
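One hedged reading of the bounded-dump suggestion above: read `ets:info(Tab, size)` once, then walk at most that many keys with `first/next` under `safe_fixtable`, so a concurrently growing table cannot make the dump unbounded. The module name and `WriteFun` callback are illustrative assumptions:

```erlang
%% Sketch: bound a dump of a growing ETS table to the size observed
%% at the start. safe_fixtable guarantees each object present for
%% the whole walk is visited at most once; objects inserted during
%% the walk may or may not appear, which is the accepted trade-off.
-module(ets_bounded).
-export([bounded_dump/2]).

bounded_dump(Tab, WriteFun) ->
    Limit = ets:info(Tab, size),
    true = ets:safe_fixtable(Tab, true),
    try
        walk(Tab, ets:first(Tab), Limit, WriteFun)
    after
        ets:safe_fixtable(Tab, false)
    end.

walk(_Tab, '$end_of_table', _N, _WriteFun) -> ok;
walk(_Tab, _Key, 0, _WriteFun) -> ok;
walk(Tab, Key, N, WriteFun) ->
    [WriteFun(Obj) || Obj <- ets:lookup(Tab, Key)],
    walk(Tab, ets:next(Tab, Key), N - 1, WriteFun).
```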
>>>>
>>>>
>>> Good idea, I need to think a little more about it. I wish it were
>>> possible to fork an ETS table at some point and use only that snapshot in
>>> memory, the way Redis does it literally by forking the process when dumping.
>>> That would be useful...
>>>
>>> Thanks for the answer!
>>>
>>>
>> Side note, but I am thinking that selecting keys in batches also limits the
>> possible effects of concurrent writes, since it can work faster that
>> way. Writing to the file is slow, though.
>>
>> - benoit
>>
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
>>
>>
>