[erlang-questions] pre-load large data files when the application start

Fri Mar 25 19:11:20 CET 2016

On Fri, Mar 25, 2016 at 7:06 PM Garrett Smith <g@REDACTED> wrote:

> On Fri, Mar 25, 2016 at 12:09 PM Benoit Chesneau <bchesneau@REDACTED>
> wrote:
>
>> Hi all,
>>
>> I have a large data file provided as comma-separated values (Unicode
>> data). I need to load and parse it ASAP since it will be used by all the
>> functions.
>>
>
> What's the interface?
>
>
>> The current implementation consists of parsing the file and generating
>> either a source file or an include file that is then compiled. My
>> issue with it for now is that the compilation uses more than 1GB and
>> then crashes on small machines or containers.
>>
>> Other solutions I tried:
>>
>> - use merl + `-on_load` to build a module on first call of the module (too
>> slow the first time)
>> - store an ets file and load it later, which can be an issue if you need
>> to create an escript with all modules later
>> - load and parse in a gen_server (same result as using merl)
>>
>> Things I have in mind:
>>
>> - generate a DETS file or small binary tree on disk and cache the content
>> on demand
>> - generate a beam and ship it
>>
>> Is there anything else I can do? I am curious how others handle
>> that case.
>>
>
> I think this depends entirely on your interface :)
>
> Do you have to scan the entire table? If so why? If not, why not treat
> this as an indexing problem and start from disk, assuming you can defer
> loading of any data until it's read?
>

Sorry, I should have just posted the code I was working on (the advantage of
working on open source stuff).

The code I'm referring to is here: https://github.com/benoitc/erlang-idna
and the recent change I describe is here:
https://github.com/benoitc/erlang-idna/tree/precompile

The table really needs to be in memory somehow, or at least be very fast
to read, since it will be used to encode any domain name used in a
request (XMPP, HTTP, ...).

It basically checks the code of each character in a string and tries to
compose/decompose it.
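
To make the access pattern concrete, here is a rough sketch of that kind
of per-character lookup. It is not the actual erlang-idna code. It assumes
a hypothetical input file with one `HEXCODEPOINT,VALUE` pair per line
(newline-terminated), and the module, table, and function names are made
up for illustration. The table is loaded once into ETS and then read for
each code point.

```erlang
-module(idna_table_sketch).
-export([load/1, lookup/1, map_string/1]).

%% Hypothetical table name, used only for this sketch.
-define(TAB, idna_table_sketch).

%% Read the whole file and insert one {CodePoint, Value} row per line.
%% The ETS table is owned by the calling process, so call this from a
%% long-lived process (for example during application start).
load(Path) ->
    ?TAB = ets:new(?TAB, [named_table, set, protected,
                          {read_concurrency, true}]),
    {ok, Bin} = file:read_file(Path),
    Lines = binary:split(Bin, <<"\n">>, [global, trim_all]),
    true = ets:insert(?TAB, [parse_line(L) || L <- Lines]),
    ok.

%% Assumed line format: <<"00C5,some value">> (hex code point, comma, value).
parse_line(Line) ->
    [Hex, Value] = binary:split(Line, <<",">>),
    {binary_to_integer(Hex, 16), Value}.

%% Look up a single code point.
lookup(CodePoint) ->
    case ets:lookup(?TAB, CodePoint) of
        [{_, Value}] -> {ok, Value};
        []           -> error
    end.

%% Look up every character of a string (a list of code points).
map_string(String) ->
    [{C, lookup(C)} || C <- String].
```

With something like this the parsing cost is still paid once at startup,
which is the part I am trying to avoid or at least make cheap.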

- benoît