New trading systems platform

Vlad Balin vlad@REDACTED
Fri Jul 8 14:10:45 CEST 2005


> HP Wei wrote:
> >   I do not use Erlang often enough to make comments on
> >   data storage using Mnesia.
> >   I just want to point out that to store tick data
> >   of many years needs a LOT of disk space plus
> >   the capability of compression (when writing)
> >   (and decompression when reading)  in your database design.
>
> Some form of archiving mechanism seems to be needed.
> Re. compression, there is of course a very convenient
> form available to erlang programmers:
>
>    Bin = term_to_binary(Data, [compressed]).
>
> Decompression is transparently done by binary_to_term(Bin).
>
> This uses zlib (deflate) compression on the binary. Perhaps this is
> not good enough?
It depends. If you represent the tick sequence as a list of relative price
differences, it will _probably_ work. I would say this is the place to
start. But a custom Huffman-like compression scheme is usually more
effective in such applications in terms of the combination of compression
speed and ratio.
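
A minimal sketch of the relative-difference idea, assuming prices are
integers (e.g. counted in minimal price increments). The module name
tick_delta and its functions are made up for illustration; only
term_to_binary/2 and binary_to_term/1 are standard:

```erlang
-module(tick_delta).
-export([encode/1, decode/1, pack/1, unpack/1]).

%% Turn a list of integer prices into the first price plus the
%% differences between consecutive prices.
%% encode([100,101,101,99]) -> {100, [1,0,-2]}
encode([]) -> empty;
encode([First | Rest]) -> {First, deltas(First, Rest)}.

deltas(_Prev, []) -> [];
deltas(Prev, [P | Rest]) -> [P - Prev | deltas(P, Rest)].

%% Inverse of encode/1: rebuild the prices by summing the differences.
decode(empty) -> [];
decode({First, Deltas}) -> [First | undo(First, Deltas)].

undo(_Prev, []) -> [];
undo(Prev, [D | Rest]) ->
    P = Prev + D,
    [P | undo(P, Rest)].

%% Delta-encode, then let term_to_binary/2 compress the result.
pack(Prices) -> term_to_binary(encode(Prices), [compressed]).

%% binary_to_term/1 decompresses transparently.
unpack(Bin) -> decode(binary_to_term(Bin)).
```

The differences are mostly small numbers that repeat a lot, which is
what a general-purpose compressor handles well. Whether the ratio is
good enough has to be measured on real tick data.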

> Perhaps one could come up with a caching scheme in Mnesia,
> where table segments can be converted from ram_copies
> to disc_only_copies after a certain time? One can use a
> custom fragmentation scheme that divides objects into time
> periods. How many fragments could one have in one table?
In the case of an ATS, we need disk tick storage for backtesting only.
That implies a very simple access scenario (rare look-ups, sequential
[forward in time] reads of large data volumes, "joins" of several tables
by time field). It makes sense to develop file-based tick storage (one
file per instrument, data stored sequentially in chunks of compressed
ticks) and avoid Mnesia.
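
Roughly what I have in mind, as a sketch only. The module name
tick_file, the 32-bit length prefix, and the use of term_to_binary/2
for the chunk payload are all assumptions made for illustration, not a
finished format:

```erlang
-module(tick_file).
-export([append_chunk/2, fold_chunks/3]).

%% Append one chunk of ticks to the instrument's file.
%% Each chunk is stored as a 32-bit size followed by the compressed term.
append_chunk(Path, Ticks) ->
    Bin = term_to_binary(Ticks, [compressed]),
    Size = byte_size(Bin),
    {ok, Fd} = file:open(Path, [append, raw, binary]),
    ok = file:write(Fd, <<Size:32, Bin/binary>>),
    ok = file:close(Fd).

%% Read the chunks in the order they were written (forward in time)
%% and fold Fun(Ticks, Acc) over them.
fold_chunks(Path, Fun, Acc0) ->
    {ok, Fd} = file:open(Path, [read, raw, binary, read_ahead]),
    try loop(Fd, Fun, Acc0)
    after file:close(Fd)
    end.

loop(Fd, Fun, Acc) ->
    case file:read(Fd, 4) of
        eof ->
            Acc;
        {ok, <<Size:32>>} ->
            %% A truncated file makes this match fail; a real
            %% implementation would handle that case explicitly.
            {ok, Bin} = file:read(Fd, Size),
            loop(Fd, Fun, Fun(binary_to_term(Bin), Acc))
    end.
```

A backtest would call fold_chunks/3 once per instrument file and never
seek. Rare look-ups by time would need a small index of chunk offsets
on top, which is not shown here.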

Vlad



