3 Building A Mnesia Database

This chapter details the basic steps involved when designing a Mnesia database and the programming constructs which make different solutions available to the programmer. The chapter includes the following sections: Defining a Schema, The Data Model, Starting Mnesia, and Creating New Tables.

3.1 Defining a Schema

The configuration of a Mnesia system is described in the schema. The schema is a special table which contains information such as the table names and each table's storage type (that is, whether a table is stored in RAM, on disc, or possibly on both) as well as its location.

Unlike data tables, the information contained in the schema table can only be accessed and modified by using the schema-related functions described in this section.

Mnesia has various functions for defining the database schema. It is possible to move tables, delete tables, or reconfigure the layout of tables.

An important aspect of these functions is that the system can access a table while it is being reconfigured. For example, it is possible to move a table and simultaneously perform write operations to the same table. This feature is essential for applications that require continuous service.

The following section describes the functions available for schema management, all of which return a tuple.

3.1.1 Schema Functions
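
As a hedged illustration of the tuple return convention, the sketch below creates a table, tries to create it a second time, and then deletes it. The module name schema_sketch and the table name scratch_tab are made up for this example. It assumes a single node on which Mnesia can be started with a RAM-only schema.

```erlang
-module(schema_sketch).
-export([demo/0]).

%% Hypothetical demo: create, re-create, and delete a table named
%% scratch_tab, and return the three results for inspection.
demo() ->
    %% Starting Mnesia without a disc schema gives a RAM-only schema,
    %% which is enough for this illustration.
    ok = mnesia:start(),
    First  = mnesia:create_table(scratch_tab, []),
    %% The second attempt aborts because the table already exists.
    Second = mnesia:create_table(scratch_tab, []),
    Third  = mnesia:delete_table(scratch_tab),
    {First, Second, Third}.
```

On success the calls return {atomic, ok}; a failed call returns {aborted, Reason} instead of raising an exception, so callers are expected to match on both shapes.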

3.2 The Data Model

The data model employed by Mnesia is an extended relational data model. Data is organized as a set of tables and relations between different data records can be modeled as additional tables describing the actual relationships. Each table contains instances of Erlang records and records are represented as Erlang tuples.

Object identifiers, also known as oids, are made up of a table name and a key. For example, an employee record represented by the tuple {employee, 104732, klacke, 7, male, 98108, {221, 015}} has the object identifier (Oid) {employee, 104732}.

Thus, each table is made up of records, where the first element is a record name and the second element is a key which identifies the particular record in that table. The combination of the table name and a key is a tuple of arity two, {Tab, Key}, called the Oid. See Chapter 4: Record Names Versus Table Names, for more information regarding the relationship between the record name and the table name.

What makes the Mnesia data model an extended relational model is the ability to store arbitrary Erlang terms in the attribute fields. One attribute value could for example be a whole tree of oids leading to other terms in other tables. This type of record is hard to model in traditional relational DBMSs.
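
The relationship between a record tuple and its Oid can be sketched in a few lines. The module name oid_sketch and the helper oid/1 are hypothetical, and the sketch assumes that the table name equals the record name, which is the default but not a requirement.

```erlang
-module(oid_sketch).
-export([oid/1, example/0]).

%% Derive {Tab, Key} from a record tuple.
%% Assumption: the table is named after the record (first element);
%% the key is always the second element.
oid(Record) when is_tuple(Record), tuple_size(Record) >= 2 ->
    {element(1, Record), element(2, Record)}.

%% The employee tuple used in the text above.
example() ->
    Employee = {employee, 104732, klacke, 7, male, 98108, {221, 015}},
    oid(Employee).
```

A tuple of this {Tab, Key} form is what mnesia:read/1 takes as its argument inside a transaction.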

3.3 Starting Mnesia

Before we can start Mnesia, we must initialize an empty schema on all the participating nodes.

When running a distributed system with two or more participating nodes, the function mnesia:start() must be executed on each participating node. Typically this would be part of the boot script in an embedded environment. In a test environment or an interactive environment, mnesia:start() can also be called from the Erlang shell or from another program.

3.3.1 Initializing a Schema and Starting Mnesia

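To use a known example, we illustrate how to run the Company database described in Chapter 2 on two separate nodes, which we call a@gin and b@skeppet. Each of these nodes must have a Mnesia directory as well as an initialized schema before Mnesia can be started. There are two ways to specify the Mnesia directory to be used: pass it on the erl command line with the -mnesia dir application parameter, as in the commands below, or omit the parameter and rely on the default directory, which is named Mnesia.Node (where Node is the node name) and located in the current working directory.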

To start our Company database and get it running on the two specified nodes, we enter the following commands:

  1. On the node called gin:
    gin % erl -sname a -mnesia dir '"/ldisc/scratch/Mnesia.company"'

  2. On the node called skeppet:
    skeppet % erl -sname b -mnesia dir '"/ldisc/scratch/Mnesia.company"'

  3. On one of the two nodes:
    (a@gin)1> mnesia:create_schema([a@gin, b@skeppet]).

  4. Call mnesia:start() on both nodes.
  5. To initialize the database, execute the following code on one of the two nodes.
    dist_init() ->
        mnesia:create_table(employee,
                             [{ram_copies, [a@gin, b@skeppet]},
                              {attributes, record_info(fields,
                                                       employee)}]),
        mnesia:create_table(dept,
                             [{ram_copies, [a@gin, b@skeppet]},
                              {attributes, record_info(fields, dept)}]),
        mnesia:create_table(project,
                             [{ram_copies, [a@gin, b@skeppet]},
                              {attributes, record_info(fields, project)}]),
        mnesia:create_table(manager, [{type, bag}, 
                                      {ram_copies, [a@gin, b@skeppet]},
                                      {attributes, record_info(fields,
                                                               manager)}]),
        mnesia:create_table(at_dep,
                             [{ram_copies, [a@gin, b@skeppet]},
                              {attributes, record_info(fields, at_dep)}]),
        mnesia:create_table(in_proj,
                            [{type, bag}, 
                             {ram_copies, [a@gin, b@skeppet]},
                             {attributes, record_info(fields, in_proj)}]).
    

As illustrated above, the two nodes use the same directory path but two different directories, because /ldisc/scratch (the "local" disc) is a separate disc on each node.

By executing these commands we have configured two Erlang nodes to run the Company database and initialized the database. This is required only once, when setting up. The next time the system is started, mnesia:start() is called on both nodes to initialize the system from disc.

In a system of Mnesia nodes, every node is aware of the current location of all tables. In this example, data is replicated on both nodes and functions which manipulate the data in our tables can be executed on either of the two nodes. Code which manipulates Mnesia data behaves identically regardless of where the data resides.

The function mnesia:stop() stops Mnesia on the node where the function is executed. Both the start/0 and the stop/0 functions work on the "local" Mnesia system, and there are no functions which start or stop a set of nodes.
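
Because start/0 and stop/0 act only on the local node, an application that wants to start or stop several nodes has to make the remote calls itself. One possible approach, shown here as a sketch and not as part of Mnesia, uses rpc:multicall/4 from the standard library. The module name cluster_sketch and the functions start_all/1 and stop_all/1 are hypothetical.

```erlang
-module(cluster_sketch).
-export([start_all/1, stop_all/1]).

%% Call mnesia:start/0 on every node in Nodes.
%% Returns {Results, BadNodes} exactly as produced by rpc:multicall/4:
%% Results holds one return value per node that answered, and
%% BadNodes lists the nodes that could not be reached.
start_all(Nodes) ->
    rpc:multicall(Nodes, mnesia, start, []).

%% Call mnesia:stop/0 on every node in Nodes, with the same
%% {Results, BadNodes} return shape.
stop_all(Nodes) ->
    rpc:multicall(Nodes, mnesia, stop, []).
```

For the example system, cluster_sketch:start_all([a@gin, b@skeppet]) could be called from either node once the schema has been created. The caller still has to inspect both elements of the result, since an unreachable node is reported in BadNodes and does not cause an exception.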

3.3.2 The Start-Up Procedure

Mnesia is started by calling the following function:

          mnesia:start().
        

This function initiates the DBMS locally.

The choice of configuration will alter the location and load order of the tables. The alternatives are listed below:

  1. Tables that are stored locally only, are initialized from the local Mnesia directory.
  2. Replicated tables that reside locally as well as somewhere else are either initiated from disc or by copying the entire table from the other node depending on which of the different replicas is the most recent. Mnesia determines which of the tables is the most recent.
  3. Tables that reside on remote nodes are available to other nodes as soon as they are loaded.

Table initialization is asynchronous: the call mnesia:start() returns the atom ok and then starts to initialize the different tables. Depending on the size of the database, this may take some time, and the application programmer must wait for the tables that the application needs before they can be used. This is achieved by using the function mnesia:wait_for_tables(TabList, Timeout).

This function suspends the caller until all tables specified in TabList are properly initiated, or until Timeout (given in milliseconds, or as the atom infinity) expires.

A problem can arise if a replicated table on one node is initiated, but Mnesia deduces that another (remote) replica is more recent than the replica existing on the local node. In that case the initialization procedure will not proceed. In this situation, a call to mnesia:wait_for_tables/2 suspends the caller until the remote node has initiated the table from its local disc and the node has copied the table over the network to the local node.

This procedure can be time consuming. The shortcut function mnesia:force_load_table(Tab) loads the table from the local disc immediately, without waiting for the other replicas.

Thus, we can assume that if an application wishes to use tables a and b, then the application must perform some action similar to the code below before it can utilize the tables.

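          case mnesia:wait_for_tables([a, b], 20000) of
            {timeout, RemainingTabs} ->
              %% Some tables were not loaded within 20 seconds.
              %% panic/1 stands for application-specific error handling.
              panic(RemainingTabs);
            {error, Reason} ->
              %% wait_for_tables/2 can also return an error tuple.
              panic(Reason);
            ok ->
              synced
          end.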
        

Warning!

When tables are forcefully loaded from the local disc, all operations that were performed on the replicated table while the local node was down, and the remote replica was alive, are lost. This can cause the database to become inconsistent.
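
The trade-off in the warning above can be made explicit in code. The following sketch waits for the tables first and force-loads only the ones that are still missing after the timeout. The module name load_sketch, the function ensure_loaded/2 and the {forced, Results} return value are hypothetical, and whether forcing a load is acceptable at all depends on the application.

```erlang
-module(load_sketch).
-export([ensure_loaded/2]).

%% Wait up to Timeout milliseconds for Tabs. Tables that are still
%% missing are force-loaded from the local disc.
ensure_loaded(Tabs, Timeout) ->
    case mnesia:wait_for_tables(Tabs, Timeout) of
        ok ->
            ok;
        {timeout, Remaining} ->
            %% Caution: force loading can discard updates that were
            %% made on a remote replica while this node was down.
            Results = [{T, mnesia:force_load_table(T)} || T <- Remaining],
            {forced, Results};
        {error, Reason} ->
            {error, Reason}
    end.
```

Returning the per-table results, and not a bare ok, lets the caller log which tables were forced, since those are the tables whose contents may now be inconsistent with other replicas.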

If the start-up procedure fails, the mnesia:start() function returns the cryptic tuple {error,{shutdown, {mnesia_sup,start,[normal,[]]}}}. Pass the command line arguments -boot start_sasl to the erl script in order to get more information about the start failure.

3.4 Creating New Tables

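Mnesia provides one function to create new tables: mnesia:create_table(Name, ArgList).

The function returns {atomic, ok} if it succeeds, or {aborted, Reason} if it fails.

The function arguments are Name, an atom naming the table, and ArgList, a list of {Key, Value} tuples that set table properties such as the storage type and replica nodes (for example ram_copies or disc_copies), index, type, and attributes.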

As an example, assume we have the record definition:

      -record(funky, {x, y}).
    

The following call creates a table which is replicated on two nodes, has an additional index on the y attribute, and is of type bag.

      mnesia:create_table(funky, [{disc_copies, [N1, N2]},
                                  {index, [y]},
                                  {type, bag},
                                  {attributes, record_info(fields, funky)}]).
    

By contrast, a call that relies on the default values:

      mnesia:create_table(stuff, [])

creates a table with a RAM copy on the local node, no additional indexes, and the attributes defaulted to the list [key,val].
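
The defaults can be checked at run time with mnesia:table_info/2. The sketch below creates the stuff table with an empty argument list and reads back some of its properties. The module name defaults_sketch is hypothetical, and the sketch assumes that no table named stuff exists yet.

```erlang
-module(defaults_sketch).
-export([defaults/0]).

%% Create the table 'stuff' with default options and report the
%% properties that Mnesia filled in. Intended to be called once:
%% a second call fails the {atomic, ok} match because the table
%% already exists.
defaults() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(stuff, []),
    [{attributes, mnesia:table_info(stuff, attributes)},
     {type,       mnesia:table_info(stuff, type)},
     {ram_copies, mnesia:table_info(stuff, ram_copies)},
     {index,      mnesia:table_info(stuff, index)}].
```

The returned list should show the attribute list [key,val] and the local node as the only RAM copy, matching the description above.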


Copyright © 1991-1999 Ericsson Utvecklings AB