[erlang-questions] High volume CDR analysis

Wed Jan 16 10:57:29 CET 2008

I'm not sure if Erlang is a good fit for CDR analysis.  It depends on
what you're trying to achieve.  For accepting CDR records and storing
them, I wouldn't think mnesia would be that good a fit, due to the
sheer volume.  But if you were to design a more suitable data
structure for CDR record storage, then that might be really quite
interesting.  You may not need random access or complex queries;
something that only allows sequential access (maybe with the
ability to filter) and is designed for quick roll-ups might be more
suitable.  Column-based table representations come to mind.
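
To make the column-based idea a bit more concrete, here is a minimal
sketch in Python (chosen only for brevity; nothing here is Erlang or
mnesia specific).  The class name ColumnStore, the field names and the
rollup signature are all made up for illustration.  It only supports
appending and one sequential pass with an optional filter:

```python
from collections import defaultdict


class ColumnStore:
    """Append-only, column-oriented table: one list per field (hypothetical)."""

    def __init__(self, fields):
        self.fields = tuple(fields)
        self.columns = {name: [] for name in self.fields}
        self.length = 0

    def append(self, record):
        # Pull out every field first so a bad record cannot leave
        # the columns with different lengths.
        values = [record[name] for name in self.fields]
        for name, value in zip(self.fields, values):
            self.columns[name].append(value)
        self.length += 1

    def rollup(self, key, value, where=None):
        """Sum column `value` grouped by column `key` in one sequential pass.

        `where` is an optional (field, predicate) pair used as a filter.
        """
        keys = self.columns[key]
        values = self.columns[value]
        if where is None:
            selected = range(self.length)
        else:
            field, predicate = where
            column = self.columns[field]
            selected = (i for i in range(self.length) if predicate(column[i]))
        totals = defaultdict(int)
        for i in selected:
            totals[keys[i]] += values[i]
        return dict(totals)


# Illustrative records only; real CDRs carry many more fields.
store = ColumnStore(["caller", "callee", "duration_s"])
store.append({"caller": "A", "callee": "B", "duration_s": 60})
store.append({"caller": "A", "callee": "C", "duration_s": 30})
store.append({"caller": "B", "callee": "A", "duration_s": 5})
```

A roll-up such as store.rollup("caller", "duration_s") only touches the
two columns it needs, which is the main attraction of the layout when
records are wide and queries are narrow.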

If you want to actually analyse (i.e., summarise in some way) in real
time, without storing the raw incoming records in between, then Erlang
could be a good platform.  But even then, you would probably want to
invest in your own data structures for the summary tables, since
the incoming data rate would probably mean way too many transactions on
your summary data to be really comfortable for mnesia.
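
As a rough illustration of summarising on arrival without keeping the
raw records, the Python sketch below updates per-caller, per-hour
counters in memory as each record comes in.  The names RollingSummary
and observe, the record fields and the one-hour bucket size are
assumptions for the example, not anything taken from a real CDR format:

```python
from collections import defaultdict

SECONDS_PER_HOUR = 3600


class RollingSummary:
    """In-memory roll-up updated once per incoming record (hypothetical)."""

    def __init__(self):
        self.calls = defaultdict(int)    # (caller, hour) -> number of calls
        self.seconds = defaultdict(int)  # (caller, hour) -> total duration

    def observe(self, caller, start_ts, duration_s):
        # The raw record is folded into the counters and then dropped.
        bucket = (caller, start_ts // SECONDS_PER_HOUR)
        self.calls[bucket] += 1
        self.seconds[bucket] += duration_s

    def totals(self, caller, hour):
        bucket = (caller, hour)
        # .get avoids creating empty buckets for keys never seen.
        return self.calls.get(bucket, 0), self.seconds.get(bucket, 0)


# Illustrative stream: (caller, start timestamp in seconds, duration).
summary = RollingSummary()
for caller, start_ts, duration_s in [("A", 0, 60), ("A", 1800, 30), ("A", 3600, 10)]:
    summary.observe(caller, start_ts, duration_s)
```

Each record costs two counter updates and no transaction, which is the
sort of thing I have in mind when I say the summary tables probably
want their own data structure.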

As far as I know, stuff like this is the forte of systems like K, J
and APL.  These usually come with nice examples about how they can
process enormous amounts of stock market ticker information in real
time.  That would be closer to the kind of data you get from CDRs, I
would guess.

Robby

PS. By "real time" in the above, I just mean that roll up of data is
done as it appears, not in a batch way later on; it has nothing to do
with real time programming.