<div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr">On Fri, Mar 25, 2016 at 11:19 PM Michael Truog <<a href="mailto:mjtruog@gmail.com">mjtruog@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    <div>On 03/25/2016 02:33 PM, Benoit Chesneau

      wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr"><br>

        <br>

        On Friday, March 25, 2016, Michael Truog <<a href="mailto:mjtruog@gmail.com" target="_blank">mjtruog@gmail.com</a>> wrote:<br>

        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

          <div bgcolor="#FFFFFF" text="#000000">

            <div><br>

            </div>

            <tt>Having the build process generate the module file and

              the beam file seems decent.  There isn't a need to build

              the module dynamically (during runtime, upon startup) or

              store the unicode data in global storage due to the

              unicode changes being infrequent.   Then, if you do need

              to update due to unicode changes, you can always hot-load

              a new version of the module, during runtime and the usage

              of the module shouldn't have problems with that, if it is

              kept as a simple utility/library module.  This problem

              reminds me of the code at </tt><a href="https://github.com/rambocoder/unistring" target="_blank">https://github.com/rambocoder/unistring</a>

            and there might be overlap in the goals of these two

            repositories.<br>

          </div>

        </blockquote>

        <div><br>

        </div>

        <div><br>

        </div>

        <div>this is what the current release (1.2) does. But it doesn't

          compile in containers or on machines with 1GB of RAM or less. The build crashes.

          This is why I'm looking at shipping a pre-compiled beam, or

          maybe including the data in a db. But for now my tests with a db

          file (ets) show it's much slower: 30-40ms vs 6ms using maps and

          a pre-compiled beam. Also, maps use less storage compared to

          simply using function pattern matching in the beam.</div>

        <div><br>

        </div>

        <div>- benoît</div>

        <div><br>

        </div>

      </div>

    </blockquote>

    </div><div bgcolor="#FFFFFF" text="#000000"><tt>I think you need to switch to using function pattern matching,

      when keeping it in a module to keep memory usage down.  Storing

      everything in a map has to deal with a big chunk of map data, but

      storing everything in the module as function pattern matching

      cases is just part of the module data (should be better for GC due

      to less heap usage and should be more efficient).</tt>  You

    probably want to try and keep all the function pattern matching

    cases in-order, though it isn't mentioned as helpful at

    <a href="http://erlang.org/doc/efficiency_guide/functions.html#id67975" target="_blank">http://erlang.org/doc/efficiency_guide/functions.html#id67975</a> (might

    affect the compiler execution, if not the efficiency of the pattern

    matching).  If you use more formal processing of the unicode CSV

    data it will be easier, perhaps with a python script (instead of

    awk/shell-utilities, also portability is better as a single script),

    to create the Erlang module.  If necessary, you could use more than

    a single Erlang module to deal with separate functions, but a single

    function should require a single module to keep its update atomic

    (not trying to split a function into multiple modules based on the

    input).<br></div></blockquote><div><br></div><div>I agree pattern matching should probably be better than maps for GC (they are only 1ms faster on lookup). But generating the module is not really the problem:</div><div><a href="https://github.com/benoitc/erlang-idna/blob/v1.x/src/idna_unicode_data1.erl">https://github.com/benoitc/erlang-idna/blob/v1.x/src/idna_unicode_data1.erl</a><br></div><div><br></div><div>The current issue with the above is the amount of RAM needed to compile the beam. If the application is built on a machine with 1GB of RAM or less, the build will fail. I guess I could just generate the beam with pattern matching and ship it like I do in the "precompiled" branch. Unless someone comes up with a better idea, I think I will go with it. What do you think? The annoying thing is having to do the `-on_load` hack (just because I'm lazy). Using rebar or <a href="http://erlang.mk">erlang.mk</a> I would just generate it and ship it in the ebin dir. But rebar3 doesn't copy any content from it to its _build directory :| </div><div><br></div><div><br></div><div>- benoît</div><div> </div></div></div>
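<div>For reference, the idea discussed above (a single Python script that turns the Unicode data into an Erlang module made of ordered function pattern-matching clauses) could look roughly like the sketch below. It is only an illustration under stated assumptions: the module name unicode_data_sketch, the lowercase/1 function and the three sample rows are hypothetical and are not taken from erlang-idna; the only thing relied on is the semicolon-separated UnicodeData.txt layout, where field 0 is the code point and field 13 is the simple lowercase mapping.</div>
<pre>
```python
# Sketch: turn UnicodeData.txt-style rows into an Erlang module whose lookup
# function is a sorted list of pattern-matching clauses.
# The module name and the lowercase/1 function are hypothetical.

def parse_lowercase_pairs(lines):
    """Return sorted (code point, simple lowercase mapping) integer pairs."""
    pairs = []
    for line in lines:
        fields = line.strip().split(";")
        # UnicodeData.txt rows have 15 fields; field 13 is the simple
        # lowercase mapping and is empty for most code points.
        if len(fields) != 15 or not fields[13]:
            continue
        pairs.append((int(fields[0], 16), int(fields[13], 16)))
    return sorted(pairs)


def render_module(module_name, pairs):
    """Render Erlang source with one clause per pair, in code point order."""
    out = [
        f"-module({module_name}).",
        "-export([lowercase/1]).",
        "",
    ]
    for cp, lower in pairs:
        out.append(f"lowercase(16#{cp:04X}) -> 16#{lower:04X};")
    # Catch-all clause closes the function for unmapped code points.
    out.append("lowercase(_) -> undefined.")
    return "\n".join(out) + "\n"


# Hypothetical sample input, deliberately out of order.
SAMPLE = [
    "0042;LATIN CAPITAL LETTER B;Lu;0;L;;;;;N;;;;0062;",
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041",
]

if __name__ == "__main__":
    print(render_module("unicode_data_sketch", parse_lowercase_pairs(SAMPLE)))
```
</pre>
<div>Sorting the pairs before rendering keeps the clauses in code point order, as suggested in the quoted message. The generated source still has to be compiled somewhere with enough RAM, so this only covers the generation step, not the memory problem at build time.</div>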