[erlang-questions] mnesia storage types

Noah Schwartz noah.schwartz1@REDACTED
Fri Mar 28 14:38:44 CET 2014


Christopher, awesome. I searched fairly aggressively for the answer to this
question. I am surprised I didn't come across that thread.

Anyway, this is great to know and answers all of my questions.

A few follow-ups:
- Can/should the mnesia documentation be updated? Am I perhaps just reading
it incorrectly? I got the distinct impression from the literature that I
read that the limit was 4 GB for disk tables of any sort and that
disc_copies tables were implemented using a combination of ets & dets, which
doesn't seem to be the case. It looks like it's ets plus a more
sophisticated data file with a private transaction log.

- Last week we ran into the 2 GB limit in a fragmented table. Inexplicably,
on one of our nodes the first fragment (the fragment that's just the table
name without _fragN) grew to 2 GB over an 8 hour period. It was just one
shard on one node with no unusual disk activity over that time period. Our
guess is that something happened to mnesia_gvar and resulted in the
mnesia_frag module defaulting to the plain table name. See:
https://github.com/simplegeo/erlang/blob/master/lib/mnesia/src/mnesia_frag.erl#L1213.
We didn't see any errors reported at any level until the shard actually
reached 2 GB. We stopped the node, removed it from our load balancer, and
then restarted it with the intention of debugging, but the restart seemed
to cause it to resync with the other nodes and put the shard into a normal
state.
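
For the size tracking from my original post, here is a rough sketch of what
we are thinking of trying, not a definitive approach. The module name
tab_size and its function names are made up for illustration. It assumes
mnesia is running, that the table has a local copy, and that the table files
are named after the table with the .DCD/.DCL/.DAT extensions mentioned in my
original post:

```erlang
-module(tab_size).
-export([words_to_bytes/1, memory_bytes/1, disk_bytes/1]).

%% Convert a word count (the unit ets reports) to bytes on this VM.
words_to_bytes(Words) when is_integer(Words), Words >= 0 ->
    Words * erlang:system_info(wordsize).

%% Size in bytes as reported by mnesia, normalising the unit by storage type.
%% Returns undefined when this node holds no copy of the table.
memory_bytes(Tab) ->
    case mnesia:table_info(Tab, storage_type) of
        unknown ->
            undefined;
        disc_only_copies ->
            %% dets-backed: the value is already in bytes
            mnesia:table_info(Tab, memory);
        _RamOrDiscCopies ->
            %% ets-backed: the value is in words
            words_to_bytes(mnesia:table_info(Tab, memory))
    end.

%% On-disk footprint: sum of whichever table files exist in the mnesia
%% directory. filelib:file_size/1 returns 0 for a missing file, so absent
%% extensions simply contribute nothing.
disk_bytes(Tab) ->
    Dir = mnesia:system_info(directory),
    Base = atom_to_list(Tab),
    lists:sum([filelib:file_size(filename:join(Dir, Base ++ Ext))
               || Ext <- [".DCD", ".DCL", ".DAT"]]).
```

We would not expect memory_bytes/1 and disk_bytes/1 to agree for a
disc_copies table, since one measures the ets representation in RAM and the
other the files on disk, which is consistent with the mismatch we saw.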


On Fri, Mar 28, 2014 at 9:06 AM, Phillips, Christopher <
Christopher.Phillips@REDACTED> wrote:

>   I'll defer to... well, any and everyone on the current state of this, but I
> know Ulf Wiger answered this a while back on this mailing list (I had the
> very same question and so did a fair bit of searching). His answer to
> whether 4 GB was the max size -
>
>
> If you keep data in ram_copies or disc_copies, and run a 64-bit VM, tables
> can be larger than that. How much data you can have in practice will e.g.
> depend on your tolerance for long startup times (disc_copies must be
> loaded into RAM at startup).
>
> Disc_only_copies are in fact limited to 2 GB, and you _really_ don't want
> to exceed that, as mnesia doesn't have a nice way of handling errors at
> that level - the result will likely be inconsistencies in the database.
> And even if you fragment your disc_only_copies tables, Dets repair can
> result in unacceptably long restart times. You need to measure and
> consider your requirements.
>
>
>
>
>
> On 3/28/14, 7:00 AM, "erlang-questions-request@REDACTED"
> <erlang-questions-request@REDACTED> wrote:
>
> >
> >Date: Thu, 27 Mar 2014 13:23:40 -0400
> >From: Noah Schwartz <noah.schwartz1@REDACTED>
> >To: erlang-questions@REDACTED
> >Subject: [erlang-questions] mnesia storage types
> >Message-ID:
> >       <
> CAMEFtU-u4OHQAR47XOhZWMJVny0ghHQ646D14YkpSROBbWAbyQ@REDACTED>
> >Content-Type: text/plain; charset="iso-8859-1"
> >
> >Hello,
> >
> >We use mnesia fairly extensively in our application and are interested in
> >keeping track of table growth. We have one very big table that's already
> >sharded and want to know well ahead of time if/when others need to be
> >sharded.
> >
> >We started with a fairly simple approach of calling mnesia:table_info(Tab,
> >memory) on each table on a regular interval. It seems that for
> >disc_only_copies tables mnesia defers to dets:info/2 and for disc_copies or
> >ram_copies mnesia defers to ets:info/2. dets returns the size in bytes and
> >ets returns the size in words. We noticed that the words returned by ets
> >didn't match up with the file sizes on disk when converted to bytes. This
> >got us wondering a little more about the different file types -- DAT for
> >dets and DCD/DCL for disc_copies. We also noticed that if you called
> >dets:info(Tab, memory) on a disc_copies table it returned undefined. I was
> >always under the impression that a disc_copies table used ets and dets
> >under the hood, but after looking at the mnesia code that doesn't seem
> >like the case.
> >
> >So my first question is, do tables with a storage type of disc_copies use
> >dets? If so, are they limited by the same 4GB limit that dets is? If so,
> >how can I properly measure the size on disk of a disc_copies table?
> >
> >Thanks in advance
> >
> >--
> >Noah
>
>


-- 
Noah