DETS table auto_save behaviour

Frank Muller frank.muller.erl@REDACTED
Thu May 27 18:39:42 CEST 2021


Awesome, thanks!
I thought the WAL was implemented in C.



> The logic is spread out, but a starting point is where the actual commit
> is logged.
>
>
> https://github.com/erlang/otp/blob/master/lib/mnesia/src/mnesia_tm.erl#L284-L291
>
> But there are several different places where stuff happens. Check also the
> mnesia_tm:do_commit() function:
> https://github.com/erlang/otp/blob/master/lib/mnesia/src/mnesia_tm.erl#L1781-L1797
>
> and the mnesia_dumper.erl module (which reads the commit log and disperses
> the data into the different tables, both at startup and periodically, to
> avoid having the commit log grow too large).
>
> BR,
> Ulf
>
> On Thu, May 27, 2021 at 4:31 PM Frank Muller <frank.muller.erl@REDACTED>
> wrote:
>
>> Thanks for the info Ulf.
>>
>> Could you please point me to the WAL source code?
>> Curious to know how it’s implemented.
>>
>>
>>> Mnesia has a WAL (Write-Ahead Log), in which it writes data safely. It
>>> then writes to dets (if that's the chosen table type).
>>>
>>> At startup, dets files are repaired if they don't appear to have been
>>> properly closed. Then the transaction log is applied, making sure that the
>>> database is consistent.
>>>
>>> Repairs of dets files have been known to take time in the past, but I
>>> think OTP has optimized it, Klarna optimized the mnesia end of it, and both
>>> computers and disks are insanely faster now.
>>>
>>> I'd say that the most glaring issue with disc_only_copies in mnesia is
>>> not even the 2 GB limit, but the fact that if you get there, dets will
>>> simply discard the update, and mnesia won't even notice. That is, your
>>> application must ensure that you never exceed the dets limit.
>>>
>>> Most people use disc_copies for persistence, since they have better
>>> performance and better reliability than disc_only_copies. The downside is
>>> that the table must also fit in RAM. A different approach would be to use a
>>> backend plugin. There are three alternatives to choose from, as far as I
>>> know: leveldb, leveled, and rocksdb. There may be issues building leveldb
>>> on newer OTP versions. Leveled is (almost) entirely Erlang-based, so it
>>> wins hands-down on build time. Rocksdb should be the fastest, although the
>>> difference isn't dramatic.
>>>
>>> BR,
>>> Ulf W
>>>
>>>
>>>
>>> On Thu, May 27, 2021 at 8:52 AM Frank Muller <frank.muller.erl@REDACTED>
>>> wrote:
>>>
>>>> How about Mnesia and persistence to disk?
>>>>
>>>>
>>>>> It's always tricky with open files during some abrupt crashes.
>>>>> OS-level file system caching means that not all written data may have been
>>>>> physically written to disk.
>>>>>
>>>>> To detect this, dets has a flag indicating whether the file was
>>>>> properly closed. As I understand it, 'auto_save' does the same thing as
>>>>> closing the file, except that the file stays open.
>>>>>
>>>>> BR,
>>>>> Ulf W
>>>>>
>>>>> On Wed, May 26, 2021 at 23:10 Mikael Pettersson <mikpelinux@REDACTED>
>>>>> wrote:
>>>>>
>>>>>> On Tue, May 25, 2021 at 8:43 AM Nicolas Martyanoff <khaelin@REDACTED>
>>>>>> wrote:
>>>>>> > I was hoping to use DETS as a local persistent buffer in case data
>>>>>> > cannot be written to a remote database, but it seems impossible to
>>>>>> > guarantee that every entry is being sync-ed to disk.
>>>>>>
>>>>>> I'm not too familiar with the internals of DETS, but basically data
>>>>>> goes straight to/from disk while meta-data about allocated and free
>>>>>> areas of the file are cached in memory. I don't know if writes are
>>>>>> sync or not. In our experience, DETS files are somewhat fragile, plus
>>>>>> they have a hard 2GB size limitation which made them extremely awkward
>>>>>> for our use case (large mnesia tables). That's part of the reason we
>>>>>> migrated most of our mnesia tables to eleveldb.
>>>>>>
>>>>>> If I had to have a standalone (not mnesia) local persistent store I'd
>>>>>> probably go with eleveldb (or one of its spinoffs) if I needed lookups
>>>>>> by key, or a disk_log if I just needed a FIFO buffer. disk_log allows
>>>>>> you to choose how sync or async your writes are. _I_ wouldn't use
>>>>>> DETS.
>>>>>>
>>>>>
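
For reference, a minimal Erlang sketch of the two options discussed in the
thread: forcing a DETS table to disk after each insert, and using disk_log
with an explicit sync. The module name, table/log names and file names are
hypothetical. dets:sync/1 and disk_log:sync/1 ask for the data to be written
out, but whether it reaches the physical medium still depends on the OS and
the hardware, so treat this as an illustration and not as a durability
guarantee.

```erlang
-module(sync_sketch).
-export([dets_put/2, log_put/2, demo/0]).

%% Insert an object into an open DETS table, then ask DETS to write all
%% pending updates to the file instead of waiting for the auto_save timer.
dets_put(Tab, Obj) ->
    ok = dets:insert(Tab, Obj),
    ok = dets:sync(Tab).

%% Append a term to an open disk_log synchronously (log/2 rather than the
%% asynchronous alog/2), then explicitly sync the log to disk.
log_put(Log, Term) ->
    ok = disk_log:log(Log, Term),
    ok = disk_log:sync(Log).

%% Hypothetical end-to-end run; file names are placeholders.
demo() ->
    %% auto_save is in milliseconds; 5000 is an arbitrary example value.
    {ok, Tab} = dets:open_file(sketch_tab,
                               [{file, "sketch_tab.dets"},
                                {auto_save, 5000}]),
    ok = dets_put(Tab, {key1, <<"payload">>}),
    [{key1, <<"payload">>}] = dets:lookup(Tab, key1),
    ok = dets:close(Tab),

    %% disk_log:open/1 may report that it repaired an unclosed log.
    Log = case disk_log:open([{name, sketch_log},
                              {file, "sketch_log.LOG"}]) of
              {ok, L} -> L;
              {repaired, L, _Recovered, _BadBytes} -> L
          end,
    ok = log_put(Log, {entry, 1, <<"payload">>}),
    ok = disk_log:close(Log).
```

Calling sync after every write trades throughput for a smaller window of
unsynced data; batching several writes per sync is the usual compromise.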