Mnesia not suitable for time series storage?

Joel Reymont <>
Wed Jul 13 13:10:37 CEST 2005


Folks,

I'm trying to create a tick database on top of Mnesia. Ticks are  
price quotes (trades in a stock) sent by the exchange a few times a  
second for active securities.

My two requirements for tick storage are very high-speed writes and  
high-speed sequential retrieval.

I basically issue one type of query and that is to give me all the  
ticks in a particular security that happened between this and that time.

An issue that presents a problem is that exchanges might send many  
quotes per second but they do not give time in milliseconds, so I  
need to keep track of the order in which the quotes came in.

My unoptimized tick record looks like this:

-record(tick, {
       symbol,
       datetime,
       timestamp,
       price,
       size
      }).

Symbol is a string (list) of 4 characters, datetime is a bignum  
formed from a date and time 20040901,09:39:38 by stripping the comma  
and the colons. Timestamp is now(), price and size are float and  
integer respectively.

The timestamp field is what keeps track of the order in which price  
quotes come in. I do not know yet how to tell Mnesia to return the  
records sorted by timestamp, at least not how to do it efficiently.

Table is created like this:

mnesia:create_table(tick,
                  [
                   {disc_copies, Nodes},
                   {index, [datetime]},
                   {type, bag},
                   {attributes, record_info(fields, tick)}])

The table is a bag because ticks are unique only on the combination  
of symbol, datetime AND timestamp. I tried making the table a set but  
only got one record in it.

Now, here's the kicker... I parse through a file of historical quotes  
and insert ticks into Mnesia one by one and time the insert. Inserts  
start at around 1ms and quickly grow to 5, 7, 8, etc. I'm at 24ms per  
insert right now and I killed my previous attempts to load about  
120,000 after an hour or two.

I would like to be able to store at least 500 ticks per second in  
Mnesia and have good lookup speed as well. Is this possible?

     Thanks, Joel

--
http://wagerlabs.com/uptick






More information about the erlang-questions mailing list