[erlang-questions] Mnesia questions

Igor Ribeiro Sucupira igorrs@REDACTED
Tue May 18 08:08:52 CEST 2010


On Mon, May 17, 2010 at 11:02 PM, Chris Hicks
<silent_vendetta@REDACTED> wrote:
> I'm not sure I understand exactly what you are describing for the table you
> recommend, I think the dual PersonKey is what is throwing me off. What you
> are recommending is a separate item table, fragmented, where the primary key
> is the name of that separate (from the person table) table and the specific
> item ID. Then putting in another field that contains the ID of the
> associated person record in the person table and indexing on that. This way
> each item can be looked up singularly or as the group that belongs to that
> single player. Correct?
> What is the advantage to having the items spread out over multiple fragments
> as opposed to having them all located in a single table in the case of a
> bag?

I was making some assumptions here (which may be wrong):
- That you are using fragmented tables anyway.
- That you are spreading your fragments across several nodes.
- That, if one of the nodes is temporarily unavailable, you would rather
serve partial data to many "person"s than serve nothing to some and
everything to others.
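
To make the scheme from my earlier message (quoted below) more concrete,
here is a minimal sketch of it. The module name, the record name "item",
the "data" field and the function names are all made up for illustration,
and it assumes Mnesia is already running with a disc schema on every node
in Nodes. Treat it as a starting point, not as tested production code.

```erlang
-module(item_store).
-export([create_table/2, add_item/3, get_item/2, items_of/1]).

%% Hypothetical record. The primary key is {PersonKey, ItemKey};
%% 'person' repeats PersonKey so that it can carry a secondary index.
-record(item, {key, person, data}).

%% Creates a fragmented table spread over Nodes.
%% NFragments and the choice of disc_only_copies (dets) are assumptions.
create_table(Nodes, NFragments) ->
    mnesia:create_table(item,
        [{attributes, record_info(fields, item)},
         {index, [person]},
         {frag_properties, [{node_pool, Nodes},
                            {n_fragments, NFragments},
                            {n_disc_only_copies, 1}]}]).

%% Fragmented tables must be accessed through the mnesia_frag
%% access module, so every operation goes through this helper.
%% mnesia:activity/4 returns the fun's result or exits on abort.
in_frag_txn(Fun) ->
    mnesia:activity(transaction, Fun, [], mnesia_frag).

%% Insert an item for a person.
add_item(PersonKey, ItemKey, Data) ->
    in_frag_txn(fun() ->
        mnesia:write(#item{key = {PersonKey, ItemKey},
                           person = PersonKey,
                           data = Data})
    end).

%% Retrieve a specific item of a person: a single key lookup that
%% touches only the fragment the key hashes to.
get_item(PersonKey, ItemKey) ->
    in_frag_txn(fun() -> mnesia:read(item, {PersonKey, ItemKey}) end).

%% Retrieve all the items of a person via the secondary index.
%% The items may live in several fragments, so this can touch many of them.
items_of(PersonKey) ->
    in_frag_txn(fun() ->
        mnesia:index_read(item, PersonKey, #item.person)
    end).
```

Because the fragment is chosen by hashing the whole {PersonKey, ItemKey}
key, the items of one person end up scattered over the fragments, which
is what gives the "partial data" behaviour when a node is down.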

As for the advantages of using fragmented tables, I can only think of two now:
- Being able to spread your data across several servers. One could also
argue that having the data spread across different files improves
the fault tolerance of your system.
- Overcoming the 2 GB limit of dets (or not allowing dets files to
grow too big, for performance reasons).

I should also point out that I have both variants in production (using
bags and using the scheme I described above), both holding large amounts
of data (I can't give numbers, but it's more than what you described)
and both using dets (since the data doesn't fit in memory, even with
several servers).
If I could go back in time, I wouldn't be using bags. Although they
give me a little bit more performance (but again: all of this depends
on the operations you need to do) and save some space, I strongly
prefer to not have users with all of their data unavailable because
one server is down (it hasn't happened so far, but I already hate bags
 ;-)).
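
For comparison, the bag variant would look roughly like the sketch below.
Again, the module, record and function names are invented for the example,
and the same assumptions apply (Mnesia running with a disc schema on all
of Nodes).

```erlang
-module(item_bag).
-export([create_table/2, add_item/3, get_item/2, items_of/1]).

%% Hypothetical record. The primary key is the person alone, so all
%% the items of a person hash to the same fragment.
-record(item_bag, {person, item_key, data}).

create_table(Nodes, NFragments) ->
    mnesia:create_table(item_bag,
        [{type, bag},
         {attributes, record_info(fields, item_bag)},
         {frag_properties, [{node_pool, Nodes},
                            {n_fragments, NFragments},
                            {n_disc_only_copies, 1}]}]).

%% Fragmented tables must be accessed through mnesia_frag.
in_frag_txn(Fun) ->
    mnesia:activity(transaction, Fun, [], mnesia_frag).

add_item(PersonKey, ItemKey, Data) ->
    in_frag_txn(fun() ->
        mnesia:write(#item_bag{person = PersonKey,
                               item_key = ItemKey,
                               data = Data})
    end).

%% One read returns every item of the person, from a single fragment.
items_of(PersonKey) ->
    in_frag_txn(fun() -> mnesia:read(item_bag, PersonKey) end).

%% A specific item has no key of its own: fetch the whole bag entry
%% and filter it, which is why this lookup is slower than with the
%% {PersonKey, ItemKey} scheme.
get_item(PersonKey, ItemKey) ->
    [I || #item_bag{item_key = K} = I <- items_of(PersonKey),
          K =:= ItemKey].
```

Here a single unavailable fragment takes away all the items of the
people that hash to it, which is the failure mode I would rather avoid.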

Best regards.
Igor.

>> From: igorrs@REDACTED
>> Date: Mon, 17 May 2010 21:35:35 -0300
>> To: silent_vendetta@REDACTED
>> CC: erlang-questions@REDACTED
>> Subject: Re: [erlang-questions] Mnesia questions
>>
>> To model your data, I think it's a good idea to first decide which
>> operations you're going to need.
>>
>> Also, I can't tell whether each item is specific to one person or
>> whether the same item can belong to more than one person.
>>
>> Supposing that each item belongs to only one person and that you need
>> to be able to:
>> - Retrieve all the items of a person.
>> - Retrieve a specific item of a person.
>> - Insert an item for a person.
>>
>> I think you would be fine with a table that has:
>> - {PersonKey, ItemKey} as the primary key.
>> - Another field for PersonKey.
>> - An index on the field above.
>>
>> This way, the items of the same user may be distributed among several
>> fragments, which is probably what you want.
>>
>> Using a bag with the person as the primary key, all the items for the
>> same user will be on the same fragment. Also, accessing a specific
>> item will be slower.
>>
>> Good luck.
>> Igor.
>>
>> On Mon, May 17, 2010 at 8:32 PM, Chris Hicks
>> <silent_vendetta@REDACTED> wrote:
>> >
>> > First off thank you to everyone who responded publicly and privately. I
>> > do have a couple more specific ones about Mnesia and fragmented tables (yes,
>> > more of THOSE questions).
>> > 1) Well my first question is actually about tables in general. When one
>> > inserts/deletes a record/row from a table, is the whole table locked (didn't
>> > see this anywhere but could have missed it)?
>> > 2) I understand the general concept behind fragmented tables, and the
>> > answer to the above question may render part of this moot, but what are the
>> > performance pros/cons for using fragmented tables? I don't mean disk space
>> > but read/writes which occur mostly in transactions. Is each one any slower?
>> > Is there any advantage in parallelization?
>> > 2a) If a whole table is locked for an update does that just mean
>> > (hopefully) that the specific fragment of the larger table would only be
>> > locked, meaning a 50 fragment table could have 50 different locks/writes
>> > going? Of course if a whole table is not locked for an update that doesn't
>> > matter.
>> > One of the data requirements for my project is that one "person" record
>> > will be associated with possibly as many as 1000 "item" records. Let's take a
>> > possible person population of 10,000, meaning there could be as many as 10M
>> > item records. I want to keep each record small as I will be accessing
>> > different parts of a whole "object" at different times and will only need
>> > one or two small pieces of data at a time. One of the ways, and some of this
>> > is going to rely on the answers above, I was thinking of doing this was
>> > setting up a bag type table and associating all of those items (because they
>> > will be different) with the same key, the id of the person record. The one
>> > thing I'm wondering about, however, is some of these items might be in table
>> > fragment 1 and others in fragments 13, 27 and 42 (as far as I understand it)
>> > and I was wondering how much of a problem this would create as far as
>> > performance.
>> > Having each item with a different primary key, with each key held in a
>> > field in the person record, would mean I have a ton of queries if I want to
>> > get detailed info about each item and that doesn't work for me. Is there
>> > another approach that would be better that I am not thinking of?



-- 
"The secret of joy in work is contained in one word - excellence. To
know how to do something well is to enjoy it." - Pearl S. Buck.