Storing time series data
Joel Reymont
joelr1@REDACTED
Fri Jul 8 20:51:31 CEST 2005
Here's how I think time series can be stored in Mnesia or on disk...
A tick is a tuple of fixed size, indexed by time stamp. I think
that rather than storing a regular list of ticks in Mnesia I should
store compressed term binaries. A single binary could hold a list or
tuple of one week's worth of tick data, maybe a month's worth.
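
A minimal sketch of the chunk encoding described above, using the standard term_to_binary/2 with the compressed option. The module name tick_chunk and the {Timestamp, Price, Volume} tick layout are assumptions made for illustration; the post does not specify them.

```erlang
-module(tick_chunk).
-export([pack/1, unpack/1]).

%% A tick is assumed to be {Timestamp, Price, Volume}.
%% This layout is hypothetical and not taken from the post.

%% Encode a list of ticks as one compressed external-term binary.
pack(Ticks) when is_list(Ticks) ->
    term_to_binary(Ticks, [compressed]).

%% Decode a chunk back into the original list of ticks.
%% Decoding builds fresh terms, so the whole chunk is materialized.
unpack(Chunk) when is_binary(Chunk) ->
    binary_to_term(Chunk).
```

A week or a month of ticks would be passed to pack/1 as one list, and the resulting binary stored as a single Mnesia record field.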
A separate index will need to be kept to map a time stamp to its
binary chunk and to an offset within that chunk.
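
One way such an index might look, sketched with an ETS ordered_set keyed on the first time stamp of each chunk. The module name tick_index, the chunk identifiers, and the {Timestamp, Price, Volume} tick layout are all assumptions. The "offset" here is a position in the decoded tick list, since a compressed term binary cannot be addressed by byte offset without decoding it first.

```erlang
-module(tick_index).
-export([new/0, add_chunk/3, locate/2, offset/2]).

%% Index entries are {FirstTimestamp, ChunkId} (hypothetical layout).
new() ->
    ets:new(tick_index, [ordered_set]).

%% Register a chunk under the time stamp of its first tick.
add_chunk(Index, FirstTs, ChunkId) ->
    true = ets:insert(Index, {FirstTs, ChunkId}),
    ok.

%% Find the chunk whose first time stamp is the largest one =< Ts.
locate(Index, Ts) ->
    Key = case ets:member(Index, Ts) of
              true  -> Ts;
              false -> ets:prev(Index, Ts)
          end,
    case Key of
        '$end_of_table' ->
            not_found;
        _ ->
            [{FirstTs, ChunkId}] = ets:lookup(Index, Key),
            {ok, FirstTs, ChunkId}
    end.

%% Position (0-based) of the first tick with time stamp >= Ts
%% in an already decoded, time-ordered chunk.
offset(Ticks, Ts) ->
    length(lists:takewhile(fun({T, _, _}) -> T < Ts end, Ticks)).
```

With weekly chunks the index stays small, so the same lookup could probably live in an ordered_set Mnesia table instead of plain ETS.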
The storage has to be optimized for fast retrieval, and ticks are
normally appended rather than inserted. Inserting a new tick
somewhere in the middle would require rehashing the whole database,
but I think this will be quite rare. Normally, ticks will be appended
to binary chunks and chunks appended to the database.
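
A rough sketch of adding a tick to one compressed chunk, assuming ticks are {Timestamp, Price, Volume} tuples kept in time order (the module name tick_append is made up for this example). Under that assumption both an append and an out-of-order insert decode and re-encode a single chunk; any index entry for the chunk would only need updating if its first time stamp changed.

```erlang
-module(tick_append).
-export([add/2]).

%% Add one tick to a compressed chunk and return the new chunk.
%% The chunk is decoded, the tick is merged in by time stamp
%% (element 1), and the result is compressed again.
add(Chunk, Tick) when is_binary(Chunk), tuple_size(Tick) =:= 3 ->
    Ticks = binary_to_term(Chunk),
    %% keymerge keeps the list sorted; a tick newer than all
    %% existing ones simply lands at the end (the append case).
    Merged = lists:keymerge(1, Ticks, [Tick]),
    term_to_binary(Merged, [compressed]).
```

Re-encoding a whole week for every tick is likely too expensive in practice, so the newest chunk might be kept as a plain list and compressed only once it is complete.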
A trading strategy will need a sliding window of N ticks: at each step
the oldest tick is dropped and the window advances by one tick.
I think it will be quite easy to slide through binary chunks and, if I
understand it correctly, binaries will not be copied; references to
the original data will be created instead. Unless, of course, the
sliding window spans more than one binary chunk.
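
A sketch of the sliding window over a single chunk. It assumes a raw fixed-size layout of three 64-bit integers per tick (an assumption, as is the module name tick_window), because sub-binary references only help when the chunk can be addressed by byte offset. A compressed term binary would have to be decoded with binary_to_term/1 first, which builds new terms.

```erlang
-module(tick_window).
-export([encode/1, window/3, decode/1]).

%% Assumed record layout: Timestamp, Price, Volume as 64-bit
%% integers, 24 bytes per tick. Not taken from the post.
-define(TICK_BYTES, 24).

%% Pack a list of {Timestamp, Price, Volume} into one flat binary.
encode(Ticks) ->
    << <<Ts:64, Price:64, Vol:64>> || {Ts, Price, Vol} <- Ticks >>.

%% Window of N ticks starting at tick index Start (0-based).
%% binary:part/3 returns a slice of Chunk; for large binaries this
%% is normally a sub-binary that refers to the original data.
window(Chunk, Start, N) ->
    binary:part(Chunk, Start * ?TICK_BYTES, N * ?TICK_BYTES).

%% Turn a window back into a list of tick tuples.
decode(Bin) ->
    [{Ts, Price, Vol} || <<Ts:64, Price:64, Vol:64>> <= Bin].
```

Advancing the window by one tick is then window(Chunk, Start + 1, N). Whether a given slice shares memory with the chunk can be checked with binary:referenced_byte_size/1. A window that crosses a chunk boundary would need the two slices to be handled separately or concatenated, and concatenation does copy.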
What do you think?
Thanks, Joel
--
http://wagerlabs.com/uptick