ets

MODULE

ets

MODULE SUMMARY

Built-in Term Storage

DESCRIPTION

This module acts as an interface to the Erlang built-in term storage BIFs. The module provides the ability to store very large quantities of data in an Erlang runtime system, and to have constant access time to this data (or, in the case of the ordered_set data-type, access time proportional to the logarithm of the number of elements in the table). Data is organized as a set of dynamic tables. Each table is created by a process, which becomes its owner. When the owner process terminates, the table is automatically destroyed. A table can store tuples. Every table has access rights set at creation.

The number of tables stored on one Erlang node is limited. The current default limit is approximately 1400 tables. The upper limit can be increased by setting the environment variable ERL_MAX_ETS_TABLES before starting the Erlang runtime system (i.e. with the -env option to erl/werl). The actual limit may be slightly higher than the one specified, but never lower.

Tables are divided into four different types: set, ordered_set, bag and duplicate_bag. A set or ordered_set table can only have one tuple associated with each key. A bag table can have multiple tuples associated with a single key, but each of them must be distinct. A duplicate_bag table can also hold multiple identical objects under the same key.
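The difference between the table types can be tried out in the Erlang shell. The following is a minimal sketch; the table names (demo_set, demo_bag, demo_dup) and the variable names are arbitrary choices made for this illustration.

```erlang
%% set: a second insert with the same key replaces the first object.
Set = ets:new(demo_set, [set]),
true = ets:insert(Set, {k, 1}),
true = ets:insert(Set, {k, 2}),
[{k, 2}] = ets:lookup(Set, k),

%% bag: several objects per key, but an identical object is stored once.
Bag = ets:new(demo_bag, [bag]),
true = ets:insert(Bag, {k, 1}),
true = ets:insert(Bag, {k, 1}),
true = ets:insert(Bag, {k, 2}),
[{k, 1}, {k, 2}] = ets:lookup(Bag, k),

%% duplicate_bag: identical objects are kept as separate copies.
Dup = ets:new(demo_dup, [duplicate_bag]),
true = ets:insert(Dup, {k, 1}),
true = ets:insert(Dup, {k, 1}),
[{k, 1}, {k, 1}] = ets:lookup(Dup, k).
```

Each pattern match doubles as a check: if a table behaved differently, the shell would report a badmatch.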

In the current implementation, every object insert and look-up operation results in one copy of the object.

This module provides very limited support for concurrent updates. No locking is available, but the safe_fixtable/2 function can be used to guarantee that a sequence of first/1 and next/2 calls will traverse the table without errors even if another process (or the same process) simultaneously deletes or inserts elements in the table.

If desired, locking and transactions must be implemented on top of these functions. This is done by the mnesia database system.

There is no automatic garbage collection for tables. A table is not destroyed merely because no process holds a reference to it. It is destroyed only when its owner terminates, or when it is explicitly deleted at user level with delete/1.
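The link between a table and its owner can be observed with a short-lived owner process. This is only a sketch for the shell: the table name short_lived, the message shapes and the variable names are made up for the example, and it relies on info/1 returning undefined for a table that no longer exists.

```erlang
Parent = self(),
%% The spawned process creates (and therefore owns) the table.
{Owner, MRef} = spawn_monitor(fun() ->
    Tab = ets:new(short_lived, [public]),
    Parent ! {table, Tab},
    receive stop -> ok end
end),
Short = receive {table, Id} -> Id end,

%% The table is public, so this process may write to it.
true = ets:insert(Short, {k, v}),
[{k, v}] = ets:lookup(Short, k),

%% Let the owner terminate and wait until it is gone.
Owner ! stop,
receive {'DOWN', MRef, process, Owner, _Reason} -> ok end,

%% The table was destroyed together with its owner.
undefined = ets:info(Short).
```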

'$end_of_table' should not be used as a key since this atom is used to mark the end of the table when using first/1 and next/2.

In general, the functions will exit with reason badarg if any argument is of the wrong format, or if the table ID is invalid.

EXPORTS

new(Name, Type)

Creates a new table and returns a table identifier which can be used in subsequent operations. This table ID can also be sent to other processes so that a table can be shared between processes. It is completely location transparent and can be sent to processes at other nodes. Accordingly, the table identifier can be used as a location transparent store. Large amounts of data can be distributed to locations where it can be stored.

The parameter Type is a list which defaults to [set, protected] if [] is specified. The list may contain the following atoms:

set The table is a set table - one key, one object, no order among elements.

ordered_set The table is an ordered_set table - one key, one object, ordered in Erlang term order, which is the order implied by the < and > operators. Tables of this type behave slightly differently in some situations. Each API function of concern notes this different behaviour.

bag The table is a bag table which can have multiple objects per key.

duplicate_bag The table is a duplicate_bag table which can have multiple copies of the same object.

public The table is open to both read and write operations. Any process may read or write to the table. If this option is used, the ets table can be seen as a shared memory segment which is shared by all Erlang processes.

protected The owner can read and write to the table. Other processes can only read the table.

private Only the owner process can read or write to the table.

named_table If this option is present, the table can be accessed by name. With this option, it is possible to have globally accessible tables without passing the table identifier around.

{keypos, Pos} By default, the first element of each tuple inserted in a table is the key. However, this might not always be appropriate. In particular, we do not want the first element to be the key if we want to insert Erlang records in a table. When creating a table, it is possible to specify which tuple position is the key.

Warning!
Do not assume anything about the datatype of the table identifier.
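The options above can be combined. The sketch below creates a named, protected table keyed on the second tuple element, which is the layout one would use for records. The name people_demo, the tuple layout and the message shape are assumptions made for this example only.

```erlang
%% The return value of new/2 is deliberately not inspected; the table is
%% accessed by its name instead.
ets:new(people_demo, [set, protected, named_table, {keypos, 2}]),
true = ets:insert(people_demo, {person, "Ann", 30}),
[{person, "Ann", 30}] = ets:lookup(people_demo, "Ann"),

%% protected: another process may read the table but not write to it.
Caller = self(),
spawn(fun() ->
    Read = ets:lookup(people_demo, "Ann"),
    Write = (catch ets:insert(people_demo, {person, "Bob", 41})),
    Caller ! {other_process, Read, Write}
end),
receive
    {other_process, [{person, "Ann", 30}], {'EXIT', {badarg, _}}} -> ok
after 1000 ->
    timeout
end.
```

The final receive evaluates to ok when the read from the other process succeeded and its write was rejected with badarg.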

insert(Tab, Object)

Inserts Object into the table Tab. The object must be a tuple with a size equal to or greater than one. If the table was created with the keypos option, the tuple must be at least as large as the key position given there. By default, the first element of the object is the key of the object. Returns true.

lookup(Tab, Key)

Searches the table Tab for object(s) with the key Key and returns a list of the found object(s). Insert and look-up times in tables of type set, bag and duplicate_bag are constant, regardless of the size of the table. For the ordered_set data-type, the look-up time is proportional to the (binary) logarithm of the number of elements (it is implemented as a tree).

The following example illustrates:

1> T = ets:new(mytab, [bag, public]).
{6, <0.19.0>}
2> ets:insert(T, {a, 2, xx, yy}).           
true
3> ets:insert(T, {a, 2, {peter, pan}, 77}).
true
4> ets:lookup(T, a).
[{a, 2, xx, yy}, {a, 2, {peter, pan}, 77}]
5> ets:insert(T, {b, 123, {peter, pan}, 77}).
true
6> ets:lookup(T, b).                       
[{b, 123, {peter, pan}, 77}]

If the table is of type set or ordered_set, the function returns either [] or a list with exactly one element (there can only be one object with a given key in such a table).

If the table is of type bag or duplicate_bag, a look-up returns a list of arbitrary length. It is also worthwhile to note that bag tables have the following two properties:

The same object cannot occur twice in the same table (no duplicates).

The time order of object insertions is preserved. If object {x, X} is inserted before object {x, Y}, the call ets:lookup(T, x) is guaranteed to return the list [{x, X}, {x, Y}], as opposed to the list [{x, Y}, {x, X}].

lookup_element(Tab, Key, Pos)

This function looks up the Pos'th element of the object in table Tab, with key Key. If no such object exists, the function exits with reason badarg. If the table is of type bag or duplicate_bag, a list of the elements is returned.
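A short sketch of lookup_element/3 on a bag and on a set table follows. The table and variable names are arbitrary; catch is used so that the badarg exit for a missing key can be inspected instead of terminating the calling process.

```erlang
%% bag: one element per object stored under the key, as a list.
Elems = ets:new(demo_elems, [bag]),
true = ets:insert(Elems, {a, 1, x}),
true = ets:insert(Elems, {a, 2, y}),
[1, 2] = ets:lookup_element(Elems, a, 2),

%% A key that is not in the table gives a badarg exit.
{'EXIT', {badarg, _}} = (catch ets:lookup_element(Elems, missing, 2)),

%% set: the element itself is returned, not a list.
One = ets:new(demo_one, [set]),
true = ets:insert(One, {a, 1, x}),
x = ets:lookup_element(One, a, 3).
```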

delete(Tab, Key) -> true

Deletes object(s) with the key Key in the table Tab. Returns true, or exits with reason badarg if Tab is not a valid table.

delete(Tab)

Deletes the table Tab. Returns true, or exits with reason badarg if Tab is not a valid table.
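The two forms of delete can be contrasted as follows. This is a sketch with made-up table and variable names; the last line uses catch to show the badarg exit that results from using a table that has been deleted.

```erlang
Del = ets:new(demo_delete, [set]),
true = ets:insert(Del, {a, 1}),
true = ets:insert(Del, {b, 2}),

%% delete/2 removes the object(s) stored under one key.
true = ets:delete(Del, a),
[] = ets:lookup(Del, a),
[{b, 2}] = ets:lookup(Del, b),

%% delete/1 removes the whole table; its identifier is invalid afterwards.
true = ets:delete(Del),
{'EXIT', {badarg, _}} = (catch ets:lookup(Del, b)).
```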

update_counter(Tab, Key, Incr)

In a table of type set or ordered_set, an efficient way of managing counters is to use an object with one or more integers to associate one or more counters with Key. The function update_counter/3 destructively changes the object with key Key by adding the integer value Incr to the counter. The return value is the new value of the counter. Incr can be either:

An integer that is added to the (integer) element directly following the key in the tuple (i.e. at position <keypos> + 1)

A tuple {Pos, Increment} where Pos is the position of the counter element in the tuple and Increment is the integer value to be added to that element.

This function fails with badarg if:

no object with the right key exists

the object in the counter position is not an integer

the table is of type duplicate_bag or bag

the object in the table has the wrong arity.
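Both forms of Incr can be sketched on a single object that holds two counters. The table name, the key hits and the initial values are chosen only for this illustration.

```erlang
Counters = ets:new(demo_counters, [set]),
%% Key at position 1, one counter at position 2 and another at position 3.
true = ets:insert(Counters, {hits, 0, 10}),

%% A plain integer is added to the element directly after the key.
1 = ets:update_counter(Counters, hits, 1),

%% {Pos, Increment} selects the counter explicitly.
15 = ets:update_counter(Counters, hits, {3, 5}),

%% Negative increments are allowed as well.
0 = ets:update_counter(Counters, hits, -1),
[{hits, 0, 15}] = ets:lookup(Counters, hits),

%% No object with the given key: badarg.
{'EXIT', {badarg, _}} = (catch ets:update_counter(Counters, no_such_key, 1)).
```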

first(Tab)

Returns the 'first' Key in the table Tab. There is no apparent order among the objects in tables of other types than ordered_set, but there is always an internal order known only by the table itself. In the case of the ordered_set table type, the first key in Erlang term order is returned. Returns '$end_of_table' if there is no first key (the table is empty).

next(Tab, Key)

Returns the 'next' table key after Key. '$end_of_table' is returned if Key is the 'last' key in the table. As with first/1, the only table type where the order has a meaning is ordered_set. For the table types set, bag and duplicate_bag, the function fails with badarg if there is no object with the key Key, except when the object with that key has been deleted from a (still) fixed table (see safe_fixtable/2 below). If the table is of type ordered_set, the function returns the next key in Erlang term order, whether or not the key Key exists in the table.

last(Tab)

Works exactly as first/1 but returns the last key in Erlang term order for the ordered_set table type. For all other table types, first/1 and last/1 are synonyms.

prev(Tab, Key)

Returns the previous table key, which only has meaning for the ordered_set table type. For all other table types, next/2 and prev/2 are synonyms; one cannot back up to an 'object passed earlier' in a table of any type other than ordered_set.
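The traversal functions first/1, next/2, last/1 and prev/2 can be sketched on a small ordered_set table. The table name and the helper fun Collect are made up for this example, and the named fun syntax assumes a reasonably recent Erlang/OTP release.

```erlang
Ord = ets:new(demo_ordered, [ordered_set]),
lists:foreach(fun(Obj) -> ets:insert(Ord, Obj) end,
              [{3, c}, {1, a}, {2, b}]),

%% Walk the keys with first/1 and next/2 until '$end_of_table'.
Collect = fun Loop('$end_of_table', Acc) -> lists:reverse(Acc);
              Loop(Key, Acc) -> Loop(ets:next(Ord, Key), [Key | Acc])
          end,
[1, 2, 3] = Collect(ets:first(Ord), []),

%% last/1 and prev/2 walk in the opposite direction.
3 = ets:last(Ord),
2 = ets:prev(Ord, 3),
'$end_of_table' = ets:next(Ord, 3),

%% On an ordered_set, the key given to next/2 need not exist.
1 = ets:next(Ord, 0).
```

For the other table types the same loop works, but the order of the keys is internal to the table and the loop should be wrapped in safe_fixtable/2 if the table can be modified during the traversal.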

slot(Tab, I)

This is another way of traversing a table. The first slot of a table is 0 and the table can be traversed with consecutive calls to slot/2. Each call returns a list of objects. '$end_of_table' is returned when the end of the table is reached. This function fails with badarg if the I argument is out of range.

While consecutive calls to slot/2 may look like a safe way to traverse a table even if it is concurrently updated by another process, they are not. A sequence of calls to slot/2 may result in unexpected badarg exits if the table is internally resized as an effect of deletes made from another process (or from the traversing process itself). By using safe_fixtable/2, the table will not resize, but a sequence of first/1 and next/2 can also be used safely on a fixed table, so slot/2 is not safer than first/1 and next/2.

For the ordered_set data-type, this function has even more limited usage. It will return a list containing the I'th element in the table (in Erlang term order). Concurrent updates can make a traversal of an ordered_set using slot/2 behave very unexpectedly. Calls to slot/2 on ordered_set tables with the index I equal to the number of objects in the table will return the atom '$end_of_table'. Calls with indexes larger than the number of elements will result in a badarg exit.

Do not use this function. It may be removed in a future release.

fixtable(Tab, true|false)

This function toggles the table's ability to "rehash" itself. It is primarily used by the Mnesia DBMS to implement functions which allow write operations in a table while the table is in the process of being copied to disk or to another node.

The function keeps no track of when and how tables are fixed. It is best regarded as an internal interface used by the safe_fixtable/2 function. It is retained only for backward compatibility; use safe_fixtable/2 instead.

safe_fixtable(Tab, true|false)

This function implements limited concurrency support for tables of the set, bag and duplicate_bag table types. When a process 'fixes' a table, it remains fixed until that process either 'releases' the table or the process dies. If several processes 'fix' a table, the table will be released when the last process releases it (or exits). A reference counter is also kept on a per process basis, so N consecutive 'fixes' of a table require N 'releases' to actually release the table.

When a table is 'fixed', a sequence of first/1 and next/2 calls is guaranteed to succeed, that is, without generating exits due to deleted keys used in the next/2 call. An example follows:

    clean_all_with_value(Tab, X) ->
        ets:safe_fixtable(Tab, true),   % Make sure the table is
                                        % not rehashed.
        clean_all_with_value(Tab, X, ets:first(Tab)),
        ets:safe_fixtable(Tab, false).

    clean_all_with_value(_Tab, _X, '$end_of_table') ->
        true;
    clean_all_with_value(Tab, X, Key) ->
        case ets:lookup(Tab, Key) of
            [{Key, X}] ->
                ets:delete(Tab, Key);
            _ ->  % This may be either [{Key,_}] or [] due to
                  % concurrent updates
                true
        end,
        clean_all_with_value(Tab, X, ets:next(Tab, Key)).

The above example would have generated a badarg exit if the table had not been 'fixed' before the loop clean_all_with_value/3.

Note that a table which is 'fixed' does not actually remove the elements deleted until it is 'released' by all processes that have 'fixed' it. If a process 'fixes' the table and never releases it, the memory used by the deleted objects will never be freed. The performance of operations on the table will also degrade significantly.

By using calls to info/2, one can inspect which processes are 'fixing' the table and when it was first 'fixed'. A system where a lot of processes are 'fixing' tables may need a process that monitors those tables and sends alarms when tables have been 'fixed' for too long.

For tables of the ordered_set type, 'fixing' serves no purpose. Consecutive calls to first/1 and next/2 will always succeed, whether or not the table is 'fixed'.

all()

Returns a list of all tables on this node.

match(Tab, Pattern)

Tries to match the object(s) in table Tab with the pattern Pattern. Pattern may contain '_', which matches any object, bound parts, and variables. Pattern variables have the form of atoms beginning with a '$' sign and followed by a number, e.g. '$0' or '$31'. If successful, the result of the call is a list of variable bindings. The reason for providing a matching function is to scan large portions of a table, searching for a particular object without having to copy the entire table from the table space to the user space.

The following interaction with the Erlang shell illustrates how to use the match/2 function:

7> ets:match(T, {a, 2, '$1', '$2'}).
 [[{peter, pan}, 77], [xx, yy]]

The call to match/2 returned a list with one list of variable bindings for each matching object. The first object that matched the pattern bound the variable '$1' to {peter, pan} and the variable '$2' to 77. The second object that matched the pattern bound the variable '$1' to xx and the variable '$2' to yy. The pattern '_' can be used as a wild-card. It matches everything, but it does not bind any variables.

8> ets:match(T, {a, 2, '$1', '_'}). 
 [[{peter, pan}], [xx]]

[] is returned if no match is found.

The key element of each object (by default the first) is used as the key in the table, and a match request in which the key part of the pattern is bound (that is, neither a variable nor an underscore) is very efficient. However, if the key part of the pattern is a variable or an underscore, the entire table must be searched. The search time can be substantial if the table is very large.

The special case where the pattern is a single variable will collect the entire table.

9> ets:match(T, '$1').             
 [[{a, 2, {peter, pan}, 77}], [{a, 2, xx, yy}], 
  [{b, 123, {peter, pan}, 77}]]

On tables of the ordered_set data-type, the result is in the same order as in a first/1, next/2 sequence.

match_object(Tab, Pattern)

Tries to match the object(s) in table Tab with the pattern Pattern. Pattern may contain '_', which matches any object, bound parts, and variables. Pattern variables have the form of atoms beginning with a '$' sign and followed by a number, e.g. '$0' or '$31'. The result is a list of matching objects (i.e. complete table objects). This function differs from match/2 in that it returns complete objects and does not return any variable bindings. It is thus not very meaningful to use pattern variables; a variable has exactly the same effect as '_'.

The following interaction with the Erlang shell illustrates how to use the match_object/2 function, here on a table T that holds the objects {a, 2, peter, pan} and {a, 2, captain, hook}:

7> ets:match_object(T, {a, 2, '_', '_'}).
 [{a, 2, peter, pan}, {a, 2, captain, hook}]

The call to match_object/2 returned a list of the objects that matched the pattern.

[] is returned if no match is found.

The special case where the pattern is a single variable or '_' will collect the entire table.

On tables of the ordered_set data-type, the result is in the same order as in a first/1, next/2 sequence.
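The difference between match/2 and match_object/2 can be sketched on one table. The table name and its contents are invented for this example. Since the table is of type set, the results are sorted before they are compared, so that the example does not depend on the internal table order.

```erlang
Roles = ets:new(demo_roles, [set]),
lists:foreach(fun(Obj) -> ets:insert(Roles, Obj) end,
              [{alice, admin}, {bob, user}, {carol, admin}]),

%% match/2 returns only the variable bindings ...
[[alice], [carol]] = lists:sort(ets:match(Roles, {'$1', admin})),

%% ... whereas match_object/2 returns the complete objects.
[{alice, admin}, {carol, admin}] =
    lists:sort(ets:match_object(Roles, {'_', admin})),

%% With the key part bound, no full table scan is needed.
[[user]] = ets:match(Roles, {bob, '$1'}),

%% No matching object: the empty list.
[] = ets:match_object(Roles, {'_', guest}).
```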

match_delete(Tab, Pattern)

Deletes object(s) which match Pattern in the table Tab. This can be especially useful in combination with bag type tables. If the first element of Pattern is a variable, the entire table must be searched. Returns true.
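As a sketch of match_delete/2 on a bag table, the following removes some of the objects stored under a key while keeping the others. The table name and its contents are hypothetical.

```erlang
Sessions = ets:new(demo_sessions, [bag]),
lists:foreach(fun(Obj) -> ets:insert(Sessions, Obj) end,
              [{user1, stale}, {user1, fresh}, {user2, stale}]),

%% The key part is '_', so the whole table is searched.
true = ets:match_delete(Sessions, {'_', stale}),
[{user1, fresh}] = ets:tab2list(Sessions),

%% With a bound key only the objects under that key are considered.
true = ets:match_delete(Sessions, {user1, '_'}),
[] = ets:tab2list(Sessions).
```

Note that delete/2 would have removed every object with the key user1 in one call; match_delete/2 allows a finer selection.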

rename(Tab,NewName)

Renames a (preferably) named table to the name NewName. NewName has to be an atom. Renaming a table that is not named will succeed, but is of course quite useless. The old name of a named table can no longer be used to access it after it is renamed.
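A minimal sketch of renaming a named table follows. The names old_demo_name and new_demo_name are arbitrary, and the return value of rename/2 is deliberately not inspected, since it is not specified above.

```erlang
ets:new(old_demo_name, [set, named_table]),
true = ets:insert(old_demo_name, {k, v}),

ets:rename(old_demo_name, new_demo_name),

%% The contents are reachable under the new name only.
[{k, v}] = ets:lookup(new_demo_name, k),
{'EXIT', {badarg, _}} = (catch ets:lookup(old_demo_name, k)).
```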

info(Tab)

Returns a tagged structure which describes the table with the following tags:

memory The number of words allocated to the table.

owner The Pid of the owner of the table.

size The number of objects inserted in the table.

type The table type: set, ordered_set, bag or duplicate_bag.

protection public, protected, or private.

node The name of the node where Tab is actually stored.

name The name of the table, as given to new/2.

named_table true or false.

keypos The position of the key element in the tuples stored in the table. The default is 1.

info/1 returns undefined if the table does not exist.

info(Tab, Item)

Same as above, but only for the information that is associated with Item.

In addition to the items mentioned above, these two items can be specified in calls to info/2:

fixed Returns true if the table is fixed by any process, otherwise false. If the table identifier is no longer valid (deleted) the atom undefined is returned.

safe_fixed If the table is 'fixed' using the safe_fixtable interface, the call returns a tuple: {FixedNowTime,[{Pid,RefCount}]}, where FixedNowTime is the time when the table was fixed by the first process (which may not be one of the processes fixing it now), Pid is a process 'fixing' the table right now and RefCount is the reference counter for 'fixes' done by that process. There may be any number of processes in the list.
In all other cases, the atom false is returned.
One can use this to write a monitor for 'fixed' tables if desired.
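The items can be queried one at a time with info/2. The sketch below uses a made-up table name and also shows the per-process reference counter reported by safe_fixed; the format of the time stamp in that tuple is not relied upon.

```erlang
Info = ets:new(demo_info, [bag, public]),
true = ets:insert(Info, {a, 1}),
true = ets:insert(Info, {a, 2}),

2 = ets:info(Info, size),
bag = ets:info(Info, type),
public = ets:info(Info, protection),
1 = ets:info(Info, keypos),
Me = self(),
Me = ets:info(Info, owner),

%% Not fixed yet.
false = ets:info(Info, safe_fixed),

%% Fix the table twice from this process: the reference counter is 2.
true = ets:safe_fixtable(Info, true),
true = ets:safe_fixtable(Info, true),
true = ets:info(Info, fixed),
{_FixedAt, [{Me, 2}]} = ets:info(Info, safe_fixed),

%% Two releases are needed before the table is no longer fixed.
true = ets:safe_fixtable(Info, false),
true = ets:safe_fixtable(Info, false),
false = ets:info(Info, safe_fixed).
```

A monitoring process of the kind mentioned above could poll info(Tab, safe_fixed) periodically and compare the time stamp with the current time.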

tab2file(Tab, Filename)

Dumps a table in the Erlang external term format to the file called Filename. Returns ok, or {error, Reason}. The function may crash if bad arguments are specified. The implementation of this function is not efficient.

file2tab(Filename)

Reads a file produced by the tab2file/2 function and returns {ok, Tab} if the operation is successful, or {error, Reason} if it fails.

The error {error, nofile} is returned whenever the file cannot be read. This will be changed in future releases so that {error, nofile} is only returned when the file really does not exist; otherwise another error code will be returned. Applications that need to differentiate between errors should, until this interface is changed, use the routines in the file module to detect whether the file is nonexistent or inaccessible.
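A round trip through tab2file/2 and file2tab/1 can be sketched as follows. The file name ets_demo_dump.tab and the table name are assumptions made for the example; the file is written to the current working directory and removed again at the end.

```erlang
File = "ets_demo_dump.tab",
Src = ets:new(demo_dump, [set]),
true = ets:insert(Src, {a, 1}),
true = ets:insert(Src, {b, 2}),

%% Dump the table to disk and read it back into a new table.
ok = ets:tab2file(Src, File),
{ok, Copy} = ets:file2tab(File),

%% The new table holds the same objects as the original one.
[{a, 1}, {b, 2}] = lists:sort(ets:tab2list(Copy)),

%% Clean up the dump file.
ok = file:delete(File).
```

The source table here is not a named table, so the table created by file2tab/1 can coexist with it.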

tab2list(Tab)

Returns a list of all objects in the table.

i()

Displays a list of all local ets tables on the tty.

i(Item)

Browses an ets table on the tty. The Item argument is the identifier displayed in the leftmost field by the i() function.
