dets
is a disk based version of the module ets
. New users should read the documentation for the ets
module before reading this description. Where no description is given for the behavior of a function in this module, the function behaves exactly as the corresponding function in the ets
module.
This module provides a term (tuple) storage on file. It is possible to insert, delete, and search for specific terms in a file. The implementation is based on linear hashing. This module is used as the underlying file storage mechanism of the Mnesia DBMS. The module is provided as is, and without Mnesia, for users who are interested in an efficient storage of Erlang terms on disk only. Many applications only need to store some terms in a file. Mnesia adds transactions, queries, and distribution.
A file must be opened and closed. If a file has not been properly closed, the dets
module will automatically repair the file. This might take some time if the file is very large. By default, files are closed if the process which opened the file terminates. If several Erlang processes open the same dets
file, they will all share the file.
The file is properly closed when all users have either terminated or closed the file.
dets
files are not properly closed if the Erlang runtime system is terminated abnormally.
A ^C command abnormally terminates an Erlang runtime system in a Unix environment with a break-handler.
Since all operations in this module are disk operations, it is important to realize that a single look-up operation might involve a series of disk seek and read operations. For this reason, the operations in this module are much slower than the corresponding operations in ets
, although this module exports a similar interface.
All functions in this module fail and return {error, Reason}
if an error occurs.
The size of an empty dets
file is approximately 34 kilobytes. This may seem large, but this is the price paid for searching for an object in an arbitrarily large file with almost constant search time.
The implementation of dets
is based on the
principle of the ets
module. Data is organized
as a linear hash list and the hash list grows gracefully the
more data is inserted into the file. Space management on the
file is performed by what is called a buddy system.
It is worth noting that the ordered_set data-type present in
ets
tables is not yet implemented in dets
, nor is
the limited support for concurrent updates which makes a
first
/next
sequence safe to use on 'fixed' ets
tables. Both these features will be implemented for dets
in a
future release of the Erlang/OTP system. Until then, the
Mnesia
DBMS (or some user implemented method for locking) has
to be used to implement safe concurrency. No supplied library in
Erlang/OTP currently has support for ordered disk based term
storage.
open_file(Name, Args) -> {ok, Name} | {error, Reason}
This function opens a dets
file. An empty dets
file is created if no file exists.
The Name
argument is the name of the table.
The table name must be provided in all subsequent
operations on the file. This means that dets
files
have atomic names. The name can be used by other processes as well,
and several processes can share one dets
file.
This behavior is similar to the named_table
option in ets
.
If two processes open the same file, give the file the same name
and provide the same arguments, then the file will have two
users. If one user closes the file,
it still remains open until the second user closes the file.
The Args
argument is a list of {Key, Val}
tuples, where the following values are allowed:
{type, Type}
, where Type
must be either
of the atoms set
, bag
or duplicate_bag
.
If a file
is of type set
, it means that each key uniquely
identifies either one or zero objects. Thus, if a second
object is inserted with a key that is already
present in the file, then the first object will be overwritten.
On the contrary, a file of type bag
can have multiple
objects with same key. However, identical instances of the same
object cannot occur in the same file.
If the type is set to duplicate_bag
multiple identical
objects may occur in the file.
The default value is set
.
{file, Filename}
is the name of the file
to be opened. The default value is the name of
the table.
{keypos, Pos}
. Only tuples can be inserted in
a dets
file. This attribute specifies which position
in each tuple to use as the key field. The default
value is 1
. The ability to change the key position
is most convenient when we want to
store Erlang records in which the first position of the
tuple/record is the name of the record type.
{repair, Value}
Value
can be either a boolean (true
or false
), or the atom force
.
The flag specifies if the dets
server invokes
the automatic file repair algorithm.
The default is true
.
If false
is specified, there is no attempt to
repair the file and the error
{error, need_repair}
is returned.
force
means that repair should be
done even if it is not needed. This can be used to convert
dets files from an older version of stdlib. An example is
files hashed with the deprecated erlang:hash/2
BIF. Files
created with dets from stdlib version 1.8.2 or
later use the newer erlang:phash/2 function, which may be
preferred. An older dets file can only be converted by
repairing it, which is why a forced repair can be useful.
{cache_size, Integer}
The dets process can keep a cache of elements read (or
written) to the file. The cache is "write-through",
i.e., the data is always saved to disk when inserting.
bag
or
duplicate bag
). The atom infinity
can be
supplied as cache_size
, which indicates that the
cache can grow infinitely (and be as large as the disk based
table itself). An infinite cache may be an alternative to
manually (or via Mnesia) shadowing a dets table in an
ets table.
{auto_save, Time}
If auto_save
is specified, the dets table is
flushed to disk whenever it is not accessed for Time
milliseconds. A dets table that is flushed will require
no repair when reopened after an uncontrolled emulator
halt. A Time
value of infinity
disables
auto save.
{ram_file, Bool}
The dets
file is kept in
RAM if this flag is set. This may sound like an anomaly, but
this flag can enhance the performance of applications which
open a dets
file, insert a set of objects, and then close
the file. When the dets
file is closed, its contents
are written to the real disk file.
The default value is false
.
{estimated_no_object, Int}
Application performance can be enhanced with this flag by specifying, when the file is created, the estimated number of objects that will occupy the dets
file. The default value as well as the minimum value is 256
.
{access, Access}
. It is possible to open existing dets
files
in read-only mode. The value of the parameter Access
is either read
or read_write
. The default value is read_write
. A dets
file which is opened in read-only mode is not marked as opened, and consequently it is not subjected to the automatic repair process if it is later opened.
The dets
server keeps track of the number of users of
each file. If a file is opened twice, it must be closed twice.
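The options above can be combined as in the following minimal sketch. The module name dets_open_example, the table name people, the person record, and the file name are all hypothetical and chosen only for illustration. The sketch assumes a set table keyed on the second tuple position, so that records (whose first element is the record name) can be stored directly.

```erlang
-module(dets_open_example).
-export([run/0]).

%% Hypothetical record used only for this illustration.
-record(person, {name, age}).

%% Open (or create) a set table keyed on the record's name field,
%% insert two objects with the same key, and read the key back.
run() ->
    File = "people_example.dets",          % hypothetical file name
    _ = file:delete(File),                 % start from an empty file
    {ok, people} = dets:open_file(people, [{type, set},
                                           {file, File},
                                           {keypos, #person.name}]),
    ok = dets:insert(people, #person{name = ann, age = 30}),
    %% Same key in a set: the earlier object is overwritten.
    ok = dets:insert(people, #person{name = ann, age = 31}),
    Result = dets:lookup(people, ann),
    ok = dets:close(people),               % one open, so one close
    _ = file:delete(File),
    Result.
```

Because the table is of type set, the lookup is expected to return only the most recently inserted object for the key ann.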
open_file(Filename) -> ok | {error, Reason}
This function opens an existing dets
file.
If the file is not properly closed, it fails
with {error, need_repair}
. This function is
most useful for debugging purposes.
close(Name) -> ok | {error, Reason}
This function closes a file.
Only the owner of a dets
file (i.e., the process which
opened it) is allowed to close it.
All open files must be closed before the system is stopped. If
we attempt to open a file which has not been properly closed, the
dets
module tries to automatically repair the file.
insert(Name, Object) -> ok | {error, Reason}
This function inserts an Object
in table Name
.
lookup(Name, Key) -> ObjectList | {error, Reason}
This function searches the table Name
for object(s) with the key
Key
and returns a list of the found object(s).
Insert and look-up times in tables are constant.
For example:
2> dets:open_file(abc, [{type, bag}]).
{ok,abc}
3> dets:insert(abc, {1,2,3}).
ok
4> dets:insert(abc, {1,3,4}).
ok
5> dets:lookup(abc, 1).
[{1,2,3},{1,3,4}]
If the table is of type set
, the function returns
either [], or a list with a maximum length of one
(there can only be one object with a given key
in a set). If the table is of type bag
, a look-up returns
a list of arbitrary length.
This function makes it possible to traverse a whole dets
file and perform some operation on all or some objects in the file. Different actions
are taken depending on the return value of Fun
. The following Fun
return values are allowed:
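continue
Continue the traversal. For example, the following function prints every object in the file:
fun(X) -> io:format("~p~n", [X]), continue end.
{continue, Val}
Continue the traversal and accumulate Val. The following function collects all objects in a file into a list:
fun(X) -> {continue, X} end.
{done, Value}
Terminate the traversal and return [Value | Previously_accumulated].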
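As an illustration of the accumulating return value, here is a minimal sketch that collects every object in a table with dets:traverse/2. The module name, table name, and file name are hypothetical. The result is sorted because the traversal order follows the internal layout of the file and should not be relied on.

```erlang
-module(dets_traverse_example).
-export([run/0]).

%% Collect all objects of a small table into a list.
run() ->
    File = "traverse_example.dets",        % hypothetical file name
    _ = file:delete(File),
    {ok, T} = dets:open_file(traverse_example, [{file, File}]),
    ok = dets:insert(T, {a, 1}),
    ok = dets:insert(T, {b, 2}),
    ok = dets:insert(T, {c, 3}),
    %% Returning {continue, X} accumulates X and keeps going.
    All = dets:traverse(T, fun(X) -> {continue, X} end),
    ok = dets:close(T),
    _ = file:delete(File),
    lists:sort(All).
```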
This function deletes all objects with a specific key from a table.
delete_object(Name, Object) -> ok
This function deletes a specific object from a table.
If a table is of type bag
, the delete/2
function
cannot be used to delete only some of the objects with a specific key.
This function makes this possible.
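The difference between the two delete functions can be sketched as follows, assuming a table of type bag. The module name, table name, and file name are hypothetical.

```erlang
-module(dets_delete_example).
-export([run/0]).

%% Contrast delete_object/2 (one object) with delete/2 (a whole key).
run() ->
    File = "delete_example.dets",          % hypothetical file name
    _ = file:delete(File),
    {ok, T} = dets:open_file(delete_example, [{type, bag}, {file, File}]),
    ok = dets:insert(T, {k, 1}),
    ok = dets:insert(T, {k, 2}),
    ok = dets:insert(T, {j, 3}),
    %% Remove only the object {k, 1}; {k, 2} keeps the key alive.
    ok = dets:delete_object(T, {k, 1}),
    AfterDeleteObject = dets:lookup(T, k),
    %% Remove every remaining object stored under key k.
    ok = dets:delete(T, k),
    AfterDelete = dets:lookup(T, k),
    Untouched = dets:lookup(T, j),
    ok = dets:close(T),
    _ = file:delete(File),
    {AfterDeleteObject, AfterDelete, Untouched}.
```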
first(Name) -> Key | '$end_of_table'
This function returns the 'first' key in a table.
next(Name, Key) -> Key | '$end_of_table'
This function returns the next key in a table.
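A first/next sequence can be sketched as a simple recursive loop. The module name, table name, and file name below are hypothetical, and the sketch assumes that no other process modifies the table during the walk, since concurrent updates are not safe here. The keys are sorted at the end because the internal order of the table is unspecified.

```erlang
-module(dets_keys_example).
-export([run/0, keys/1]).

%% Walk all keys of a table with first/1 and next/2.
keys(T) ->
    keys(T, dets:first(T), []).

keys(_T, '$end_of_table', Acc) ->
    lists:sort(Acc);
keys(T, Key, Acc) ->
    keys(T, dets:next(T, Key), [Key | Acc]).

run() ->
    File = "keys_example.dets",            % hypothetical file name
    _ = file:delete(File),
    {ok, T} = dets:open_file(keys_example, [{type, set}, {file, File}]),
    ok = dets:insert(T, {a, 1}),
    ok = dets:insert(T, {b, 2}),
    ok = dets:insert(T, {c, 3}),
    Keys = keys(T),
    ok = dets:close(T),
    _ = file:delete(File),
    Keys.
```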
slot(Name, I) -> '$end_of_table' | ObjList
This function returns the list of
objects associated with slot I
.
This function returns a list of all open files on this node.
This function ensures that all data written to Name
is written
to disk. This also applies to files which have been opened
with the ram_file
flag set to true
. In this case, the
contents of the RAM file are flushed to disk.
match_object(Name, Pattern) -> ObjectList
This function matches objects and returns a list of all objects which match
Pattern
. If the keypos'th element of Pattern
is
unbound, a full search of the file is performed. On
the contrary, if the keypos'th element is not a variable, this
function only searches among the objects with the right key.
match(Name, Pattern) -> BindingsList
This function matches objects and returns a list of all bindings which match
Pattern
. If the keypos'th element of Pattern
is
unbound, a full search over the whole file is performed. On
the contrary, if the keypos'th element is not a variable, this
function only searches among the objects with the right key.
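The following minimal sketch contrasts match_object/2, which returns whole objects, with match/2, which returns only the bindings of the pattern variables. The module name, table name, file name, and sample data are hypothetical. Results are sorted because the order in which objects are found is not specified.

```erlang
-module(dets_match_example).
-export([run/0]).

run() ->
    File = "match_example.dets",           % hypothetical file name
    _ = file:delete(File),
    {ok, T} = dets:open_file(match_example, [{file, File}]),
    ok = dets:insert(T, {ann, admin, 30}),
    ok = dets:insert(T, {bob, user, 25}),
    ok = dets:insert(T, {cid, user, 41}),
    %% Key position is '_' (unbound): the whole file is searched.
    Objects = lists:sort(dets:match_object(T, {'_', user, '_'})),
    %% '$1' is a pattern variable; each match yields a list of bindings.
    Bindings = lists:sort(dets:match(T, {'$1', user, '_'})),
    %% Key position is bound: only objects with key ann are examined.
    ByKey = dets:match_object(T, {ann, '_', '_'}),
    ok = dets:close(T),
    _ = file:delete(File),
    {Objects, Bindings, ByKey}.
```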
match_delete(Name, Pattern) -> ok
Deletes all objects which match
Pattern
from Name
.
This function returns a list of {Tag, Value}
pairs describing the file.
The following list of items is returned.
{type, Type}
, where Type
is either of the
atoms set
or bag
.
{keypos, Pos}
.
{size, Size}
, where Size
is the number
of objects which reside in the file.
{file_size, Fz}
, where Fz
is the size of
the file in bytes.
{users, U}
, where U
is a list of the pids
which currently use the file.
{filename, F}
, where F
is the
name of the actual file being used.
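Since the result is a list of tagged tuples, individual items can be picked out with lists:keyfind/3, as in this minimal sketch. The module name, table name, and file name are hypothetical. The sketch reads only the type, keypos, and size items, and calls sync/1 first as a precaution so that the inserted objects are on disk before the table is inspected.

```erlang
-module(dets_info_example).
-export([run/0]).

run() ->
    File = "info_example.dets",            % hypothetical file name
    _ = file:delete(File),
    {ok, T} = dets:open_file(info_example, [{type, set}, {file, File}]),
    ok = dets:insert(T, {a, 1}),
    ok = dets:insert(T, {b, 2}),
    ok = dets:sync(T),
    Info = dets:info(T),
    %% Each item is a {Tag, Value} pair, so keyfind on position 1 works.
    {type, Type} = lists:keyfind(type, 1, Info),
    {keypos, Pos} = lists:keyfind(keypos, 1, Info),
    {size, Size} = lists:keyfind(size, 1, Info),
    ok = dets:close(T),
    _ = file:delete(File),
    {Type, Pos, Size}.
```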
safe_fixtable(Name, true|false)
This function works as the corresponding function in
ets
, except that it does not guarantee that
first
/next
sequences during concurrent deletes
work as expected. The limited support for concurrency
implemented in ets
tables is not yet implemented in
dets
. This interface currently only disables resizing
of the hash area in a table. Until concurrent
deletes are supported, the interface is of
limited use to users other than the Mnesia
DBMS. It is
documented here for completeness.
Returns one of the possible information fields which
are available by means of info/1
.
Additionally, the following Key
s can be
specified:
fixed
. Returns true
if rehashing is
disabled either by the Mnesia
internal
fixtable/2
interface
or by the safe_fixtable/2
interface. This Key
is special in that
it returns the atom
undefined
if Name
is not an open
table. Other Key
s will generate an exit signal
(badarg
) in the same situation, which is not
compatible with ets
and may be subject to change
in future releases.
safe_fixed
.
If the table is 'fixed' using safe_fixtable/2
,
the call returns
a tuple: {FixedNowTime,[{Pid,RefCount}]}
, where
FixedNowTime
is the time when the table was
fixed by the first process (which may not be one of
the processes fixing it now), Pid
is a process
'fixing' the table right now and RefCount
is
the reference counter for 'fixes' done by that
process. There may be any number of processes in the
list. In all other cases, the atom
false
is returned.
hash
.
Determines which BIF is used to calculate the hashes in
the dets table. Possible return values are hash
,
which means the erlang:hash/2
BIF, or
phash
, which means the erlang:phash/2
BIF. Files
created with this version of dets always use
erlang:phash/2
. Older dets files may need
conversion, which is done by using the {repair,
force}
argument when opening.
ets(3), mnesia(3)