[erlang-questions] HOW TO IMPROVE THE PERFORMANCE OF DETS

Christian S chsu79@REDACTED
Thu Dec 21 12:51:35 CET 2006


On 12/21/06, chamila piyasena <tchamila@REDACTED> wrote:
> Thank you again Christian,
>
> Actually it only stores the CDR records of the SMS (from number, to number,
> delivered or not, date, etc.), not the SMS itself, and a separate dets
> file is created every hour.
>
> If someone wants to see the records related to a particular phone number in a
> given period, say over a month, there will be many rows in the
> corresponding files.
>
> Going through all these records to select what we want is inefficient and
> very slow.
> That's why I thought about indexing (but there is no support for that in
> dets).

Are you calling select once per file for every phone number you want a
report on? If this is for billing, don't you typically want a report
for all phone numbers over the requested period?

No, there is no support for indexing, but that can be worked around.
First, though, assuming you still want to scan each hourly dets file:

I suggest you begin using dets:foldl or dets:foldr [other readers,
what is the difference, given that the order isn't guaranteed anyway?].
With fold you don't have to care about the optimal N (the chunk size
you pass to select). Fold processes record after record and lets you
accumulate information or perform side effects, such as building up a
report file.
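
A minimal sketch of such a fold, to make the idea concrete. The record
layout {Id, From, To, Delivered, Date} and the module and function
names are my assumptions, not something from your description, so
adjust them to whatever you actually store:

```erlang
-module(cdr_scan).
-export([records_for/2]).

%% Assumed (hypothetical) record layout: {Id, From, To, Delivered, Date}.
%% Returns every record in the dets file File that involves Number.
records_for(File, Number) ->
    {ok, Tab} = dets:open_file(File, [{file, File}, {access, read}]),
    Keep = fun({_Id, From, To, _Delivered, _Date} = Rec, Acc)
                 when From =:= Number; To =:= Number ->
                   [Rec | Acc];
              (_Other, Acc) ->
                   %% Not a matching CDR tuple: leave the accumulator alone.
                   Acc
           end,
    Result = dets:foldl(Keep, [], Tab),
    ok = dets:close(Tab),
    Result.
```

Instead of accumulating a list you could just as well write each
matching record to a report file from inside the fun.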

As for working around the lack of indexing: you can compute summary
information about the messages stored in an hourly dets file and write
it into the file before you close it and replace it with a new one.
Just pick a unique dets key to store that information under. Imagine
storing the phone numbers seen during the hour, and using this
information to decide whether it is worth scanning the dets file for a
given phone number at all. Or imagine using fold as above to build a
report for each hourly log that maps each phone number to the list of
records seen for that number. You can even store that in the same dets
file if you make sure the keys do not collide.
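
Roughly like this, as a sketch only. The reserved key '$numbers_seen',
the record layout {Id, From, To, Delivered, Date}, and the module name
are all assumptions of mine, and I assume the table is of type set:

```erlang
-module(cdr_summary).
-export([write_summary/1, may_contain/2]).

%% Hypothetical reserved key; it must never collide with a real record key.
-define(SUMMARY_KEY, '$numbers_seen').

%% Call on an open, writable hourly table just before closing it.
%% Stores the set of all phone numbers seen under the reserved key.
write_summary(Tab) ->
    Collect = fun({_Id, From, To, _Delivered, _Date}, Acc) ->
                      ordsets:add_element(To, ordsets:add_element(From, Acc));
                 (_Other, Acc) ->
                      %% Skips anything that is not a CDR tuple, including
                      %% a summary written by an earlier call.
                      Acc
              end,
    Numbers = dets:foldl(Collect, ordsets:new(), Tab),
    ok = dets:insert(Tab, {?SUMMARY_KEY, Numbers}).

%% Cheap check: is it worth scanning this table for Number at all?
may_contain(Tab, Number) ->
    case dets:lookup(Tab, ?SUMMARY_KEY) of
        [{?SUMMARY_KEY, Numbers}] -> ordsets:is_element(Number, Numbers);
        [] -> true   % no summary stored, so we have to scan to find out
    end.
```

One lookup per file then tells you whether the full scan can be
skipped.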

Similarly, you can keep a map up to date that maps phone numbers to
the dets files that contain messages involving them. Instead of
scanning 30*24 log files and touching ALL data, you look up a phone
number to find which dets files it appears in, and then look up the
phone number in those dets files to find the matching records. That
way you touch only a fraction of the log data.
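
One possible shape for that map, again only a sketch: a separate dets
table of type bag holding {Number, File} pairs. The module name, the
function names, and the record layout {Id, From, To, Delivered, Date}
are assumptions on my part:

```erlang
-module(cdr_index).
-export([index_file/2, files_for/2]).

%% Index is an already open dets table of type bag, for example opened with
%% dets:open_file(cdr_index, [{file, "cdr_index.dets"}, {type, bag}]).
%% HourFile is the path of a finished hourly dets file.
index_file(Index, HourFile) ->
    {ok, Tab} = dets:open_file(HourFile, [{file, HourFile}, {access, read}]),
    Add = fun({_Id, From, To, _Delivered, _Date}, ok) ->
                  %% A bag keeps a single copy of identical objects, so a
                  %% number seen many times in one file is stored once.
                  ok = dets:insert(Index, [{From, HourFile}, {To, HourFile}]);
             (_Other, ok) ->
                  ok
          end,
    ok = dets:foldl(Add, ok, Tab),
    dets:close(Tab).

%% All hourly files known to contain records involving Number.
files_for(Index, Number) ->
    [File || {_Number, File} <- dets:lookup(Index, Number)].
```

You would call index_file once for each hourly file when it is rotated
out, and files_for when a report for one number is requested.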


