[erlang-patches] New index type in mnesia (new feature)

Aleksander Nycz Aleksander.Nycz@REDACTED
Fri Aug 23 10:22:10 CEST 2013


W dniu 2013-06-28 12:22, Fredrik pisze:
> On 06/19/2013 06:51 PM, Aleksander Nycz wrote:
>> Hello,
>>
>> Mnesia gives possibility to create table indexes, when
>> the user wants to frequently use some other field
>> than the key field to look up records.
>>
>> Current index solution in mnesia uses ets table (type bag or 
>> duplicated_bag) to maintain mapping:
>> Indexed field value -> Primary key value.
>>
>> Unfortunatelly current solution has very significant disadvantage:
>> operation performance (loading table, insert new records,
>> delete records, etc.) is very low when index is set on 
>> 'Low-cardinality column'
>>
>> http://en.wikipedia.org/wiki/Cardinality_%28SQL_statements%29
>>
>> In such case operation complexity is O(n) when n is number
>> of Primary Key Values. For small n performance can be acceptable for 
>> some application,
>> but when n is the hundreds, thousands or even more such index
>> are useless. New index type provides O(1) complexity.
>>
>> This patch introduces new index type in mnesia database.
>> Main concept is to maintain all Primary Key Values not direcly in
>> bag/duplicated_bag ets but in set of ets.
>> For each Indexed field value new ets is created
>> and Primary Key Values are strored in this ets.
>> For 'Low-cardinality column' there is only a few Indexed key value 
>> (eg. isActive (true/false), state (new/pending/suspended/active), ...)
>> so memory overhead for ets is not significant.
>>
>> Standard index:
>>     Indexed field value -> [Primary key value]
>>
>> New index based on ets:
>>     Indexed field value -> ets, that contains Primary key value
>>
>> Restrictions:
>>
>> 1. New index can be created on disc_copies or ram_copies tables only. 
>> Tables disc_only_copies are not supported.
>> 2. Index type can't be changed. The only way to change existing index 
>> idx_list to idx_ets and vice versa
>>      is to delete existing index and create new one by 
>> mnesia:add_table_index/3 (new function, see below)
>>
>>
>> New API:
>>
>> 1. Define index type when table is created:
>>
>> create_table(Name, TabDef) -> {atomic, ok} | {aborted, Reason}
>>
>> New TabDef value:
>> {index_type, [{atom() | int(), 'idx_std' | 'idx_ets'}]} - 'idx_std' 
>> is default when index is created
>>
>> Example:
>>
>> -type(poolId() :: integer()).
>> -type(bucketId() :: integer()).
>> -type(resourceState() :: free | reserved | gracePeriod).
>>
>> -record(rmResource, {id                                 :: {poolId(), 
>> any()}
>>                     ,state                              :: {poolId(), 
>> bucketId(), resourceState()}
>>                     ,availableFrom                      :: integer()
>>                     ,availableTo                        :: integer()
>>                     ,requestorId                        :: any()
>>                     ,reservedFrom                       :: integer()
>>                     ,reservedTo                         :: integer()
>>                     ,isDeleted      = false             :: boolean()
>>                     ,mTime                              :: integer()}).
>>
>>      {atomic,ok} = mnesia:create_table(tRMResources
>>                                       ,[
>>                                          {disc_copies, []}
>>                                         ,{ram_copies, [node()]}
>>                                         ,{type,set}
>> ,{attributes,record_info(fields, rmResource)}
>>                                         ,{record_name, rmResource}
>>                                         ,{index, [state, requestorId, 
>> mTime]}
>>                                         ,{index_type, [{state, 
>> idx_ets}, {requestorId, idx_std}]}
>>                                        ]),
>>
>> 2. Add new index to existing table:
>>
>> mnesia:add_table_index(Tab, AttrName, IndexOpts) -> {aborted, R} | 
>> {atomic, ok}
>>
>> This function creates a index on Mnesia table called Tab on AttrName
>> field according to the argument IndexOpts.
>> This list must be a list of {Item, Value} tuples, currently only one
>> option is allowed:
>>      {index_type, 'idx_std' | 'idx_ets'}
>>
>> Example:
>>
>> mnesia:add_table_index(tRMResources, isDeleted, [{index_type, 
>> 'idx_ets'}])
>>
>> 3. New match_object/4, dirty_match_object/3 functions:
>>
>> match_object(Tab, Pat, Limit, LockKind) -> [Record] | transaction abort.
>> dirty_match_object(Tab, Pat, Limit) -> [Record] | exit({aborted, 
>> Reason}).
>>
>> Similar to match_object/3 and dirty_match_object/2, but returns no 
>> more than Limit records.
>>
>>
>> 4. New index_match_object/5, dirty_index_match_object/4 functions:
>>
>> index_match_object(Tab, Pat, Attr, Limit, LockKind) -> [Record] | 
>> transaction abort.
>> dirty_index_match_object(Tab, Pat, Attr, Limit) -> [Record] | 
>> exit({aborted, Reason}).
>>
>> Similar to index_match_object/4, dirty_index_match_object/3 but 
>> returns no more than Limit records.
>>
>>
>> 5. New index_read/4, dirty_index_read/4 functions:
>>
>> index_read(Tab, Key, Attr, Limit) -> [Record] | transaction abort.
>> dirty_index_read(Tab, Key, Attr, Limit) -> [Record] | exit({aborted, 
>> Reason}).
>>
>> Similar to index_read/3, dirty_index_read/3 but returns no more than 
>> Limit records.
>>
>>
>> 6. New select_limit/3, select_limit/4, dirty_select/3 functions;
>>
>> select_limit(Tab, MatchSpec, NObjects [, Lock]) -> [Object] | 
>> transaction abort.
>>
>> Similar to select(Tab, MatchSpec [, Lock]) but returns maximum NObjects
>> records, of course empty list can also be returned.
>> Continuation (see select/4) is not possible. This function can also use
>> indexes to find matching records
>> as contrasted with select/4.
>>
>> dirty_select(Tab, Spec, Limit) -> [Object] | exit({aborted, Reason}.
>>
>> Similar to dirty_select/2 but returns no more than Limit records.
>>
>> And git links:
>>
>> git fetch git://github.com/nyczol/otp.git mnesia_new_index
>>
>> https://github.com/nyczol/otp/compare/erlang:master...mnesia_new_index
>> https://github.com/nyczol/otp/compare/erlang:master...mnesia_new_index.patch 
>>
>>
>> Regards,
>> Aleksander Nycz
>>
>>
>>
>> _______________________________________________
>> erlang-patches mailing list
>> erlang-patches@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-patches
> Hello,
> Please rebase this patch upon current maint. No mergeing ;)
> Thanks,
>
> -- 
>
> BR Fredrik Gustafsson
> Erlang OTP Team

Hello,

It lasts quite long but finally is done.
I move mnesia_new_index functionality to maint.

Below git links:

git fetch git://github.com/nyczol/otp.git mnesia_new_index2

https://github.com/nyczol/otp/compare/erlang:maint...mnesia_new_index2
https://github.com/nyczol/otp/compare/erlang:maint...mnesia_new_index2.patch

Best regards
Aleksadner Nycz

-- 
Aleksander Nycz
Senior Software Engineer
Telco_021 BSS R&D
Comarch SA
Phone:  +48 12 646 1216
Mobile: +48 691 464 275
website: www.comarch.pl

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-patches/attachments/20130823/c697a2ea/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2182 bytes
Desc: Kryptograficzna sygnatura S/MIME
URL: <http://erlang.org/pipermail/erlang-patches/attachments/20130823/c697a2ea/attachment.bin>


More information about the erlang-patches mailing list