[erlang-questions] pre-load large data files when the application start

Garrett Smith g@REDACTED
Fri Mar 25 19:05:55 CET 2016


On Fri, Mar 25, 2016 at 12:09 PM Benoit Chesneau <bchesneau@REDACTED>
wrote:

> Hi all,
>
> I have a large data file provided as comma-separated values (Unicode data)
> that I need to load and parse ASAP, since it will be used by all the
> functions.
>

What's the interface?


> The current implementation consists of parsing the file and generating
> either a source file or an include file that is then compiled. My
> issue with it for now is that the compilation uses more than 1GB and
> then crashes on small machines or containers.
>
> Other solutions I tried:
>
> - use merl + `-onload` to build a module on the first call to the module
> (too slow the first time)
> - store an ets file and load it later, which can be an issue if you need
> to create an escript with all modules later
> - load and parse in a gen_server (same result as using merl)
>
> Things I have in mind:
>
> - generate a DETS file or small binary tree on disk and cache the content
> on demand
> - generate a beam and ship it
>
> Is there anything else I can do?  I am curious how others are doing in
> that case.
>

I think this depends entirely on your interface :)

Do you have to scan the entire table? If so, why? If not, why not treat this
as an indexing problem and start from disk, assuming you can defer loading
of any data until it's read?
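
A minimal sketch of that indexing idea, under some loud assumptions: the
module name lazy_csv and both function names are hypothetical, the first
field of each row is a unique lookup key, lines end in a plain LF (no CRLF),
and no field contains a quoted comma. It scans the file once to record byte
offsets, then reads single rows from disk on demand instead of loading
everything up front.

```erlang
-module(lazy_csv).
-export([build_index/1, lookup/3]).

%% Scan the file once and map the first field of every line to the
%% {Offset, Length} of that line. Only this index is kept in memory.
%% Assumes LF line endings, so byte_size(Line) matches the bytes on disk.
build_index(Path) ->
    {ok, Fd} = file:open(Path, [read, raw, binary, read_ahead]),
    try scan(Fd, 0, #{}) after file:close(Fd) end.

scan(Fd, Offset, Index) ->
    case file:read_line(Fd) of
        {ok, Line} ->
            Len = byte_size(Line),
            %% Key is everything before the first comma or newline.
            [Key | _] = binary:split(Line, [<<",">>, <<"\n">>]),
            scan(Fd, Offset + Len, Index#{Key => {Offset, Len}});
        eof ->
            Index
    end.

%% Read one row from disk by key. Fd is a file opened by the caller with
%% [read, raw, binary]. The split on commas is naive (no CSV quoting).
lookup(Fd, Index, Key) ->
    case maps:find(Key, Index) of
        {ok, {Offset, Len}} ->
            {ok, Line} = file:pread(Fd, Offset, Len),
            [Row | _] = binary:split(Line, <<"\n">>),
            {ok, binary:split(Row, <<",">>, [global])};
        error ->
            error
    end.
```

The index could itself be written to disk at build time (for example with
term_to_binary/1) so that startup only loads the offsets. Whether that is
small enough depends on the number of keys, which the thread does not state.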