That works fine in the simple case, but I'm contemplating repeatedly
adjusting weights deep within a nested data structure. Your approach
would result in creating an altered copy of the entire structure for
each recursion. This is probably only about 1KB or so of information,
so doing this a few times isn't a problem, but doing it millions of
times quickly becomes a problem.

This can be addressed by either ETS or the process dictionary, and
those allow the internal structure to be safely modified. With the
process dictionary it's safe because the information is never
exported from the process (except for I/O, which must be
special-cased). Similarly, a private ETS table can handle it without
problems. And so can a global ETS table, as then a unique
process-specific id (NOT the pid, as this needs to survive restarts)
can be used as part of the key. So those three methods would work.
The question in my mind is how to predict the tradeoffs as it scales
up. I suspect that the process dictionary would use the least memory,
though possibly the global ETS table would. A private ETS table seems
the most natural approach, but it looks, to my naive eyes, as if it
would scale poorly with respect to memory use.
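
For concreteness, here is roughly the global-ETS variant I have in
mind. A minimal sketch (the table name, the {UnitId, NodePath} key
shape, and the function names are placeholders of mine, nothing
settled):

-module(wstore).
-export([init/0, put_weight/3, get_weight/2]).

%% One global, named, public table; each logical unit keys its
%% entries by a unit id that survives restarts (so NOT its pid).
init() ->
    ets:new(weights, [named_table, public, set]),
    ok.

%% Overwrites one entry in place, rather than rebuilding the
%% enclosing structure.
put_weight(UnitId, NodePath, W) ->
    true = ets:insert(weights, {{UnitId, NodePath}, W}),
    ok.

get_weight(UnitId, NodePath) ->
    case ets:lookup(weights, {UnitId, NodePath}) of
        [{_Key, W}] -> W;
        []          -> undefined
    end.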

What I'd really like is to use a Mnesia system which kept a cache of
active entries but didn't require everything to be rolled in from
disk. AFAICT, however, my choices with a Mnesia table are to keep
everything in memory or to keep everything rolled out to disk.
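
As I understand it, these are the only storage choices Mnesia offers;
a sketch of the one that comes closest (the record shape is just
illustrative):

-module(mstore).
-export([create/1]).

-record(entry, {key, value}).

create(Nodes) ->
    %% ram_copies: whole table in RAM only.
    %% disc_copies: whole table in RAM *and* logged to disk.
    %% disc_only_copies: table kept on disk, entries read in on
    %% access; the closest built-in thing to "roll in only what's
    %% active".
    mnesia:create_table(entry,
                        [{attributes, record_info(fields, entry)},
                         {disc_only_copies, Nodes}]).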

I also haven't been able to determine whether processes that are
waiting to receive a message can be rolled out to inactive memory.
There are some indications ("use enough processes, but not too many")
that they can't. This means that I need to adapt my memory use rather
carefully to the systems being run on. If background processes keep
activating every live process to check its status, I could easily end
up with severe thrashing. And *THAT* will affect the design. If I
need to hand-manage the caching, then I lose a lot of the benefits
that I'm hoping to get from Erlang.
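
One thing that looks relevant is erlang:hibernate/3, which at least
shrinks a waiting process's footprint until its next message arrives.
A minimal sketch (the module name and message shapes are mine, just
for illustration):

-module(dormant).
-export([start/1, loop/1]).

start(State) ->
    spawn(fun() -> loop(State) end).

loop(State) ->
    receive
        {update, F} ->
            loop(F(State))
    after 60000 ->
        %% Discards the call stack and garbage-collects the heap,
        %% then waits; the next incoming message restarts us in
        %% dormant:loop(State).
        erlang:hibernate(?MODULE, loop, [State])
    end.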

The basic design calls for a huge number of "processes" to be doing
n x m communication, and the simple design calls for each "process"
to be able to send messages to every other "process", though only a
subset of the messages would actually be sent. My first sketch of a
design called for each "process" to be mapped to a separate Erlang
process, but this doesn't work, because Erlang doesn't like to have
that many processes. Even this simple design, however, required
figuring on allowing 1000 inputs and 1000 outputs to each "process",
and probably well over 100,000 "processes". Most of them would be
idle most of the time, but all would need to be "activatable" when
messaged, and all would need to become dormant when just waiting for
a message. The idea is not a neural net, but it has certain
similarities.
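
What I'm considering instead is multiplexing: one Erlang process per
shard of logical "processes", with each unit's state held in a map
keyed by unit id. A rough sketch (all names and the routing message
shape are only illustrative):

-module(shard).
-export([start/0, loop/1]).

start() ->
    spawn(fun() -> loop(#{}) end).

loop(Units) ->
    receive
        {to, UnitId, Msg} ->
            UState0 = maps:get(UnitId, Units, undefined),
            UState1 = handle(UnitId, Msg, UState0),
            %% Only the touched map entry is rebuilt, not the whole
            %% structure.
            loop(Units#{UnitId => UState1});
        stop ->
            ok
    end.

%% Placeholder for per-unit logic; a real version would dispatch on
%% Msg and might send messages onward to other units' shards.
handle(_UnitId, _Msg, UState) ->
    UState.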

Now if I could actually have one process per "process", then your
proposal, which I recognize as the normal Erlang approach, would make
sense, but that isn't going to work. In that case it could be done by
having lots of variables, so that there wouldn't be any need to
modify deeply nested items, and thus not much would need to be
copied.
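
By "lots of variables" I mean keeping state as several shallow
pieces, along these lines (a sketch; the field and function names are
just illustrative):

-module(shallow).
-export([new/0, add_in_weight/3]).

-record(unit, {in_weights = #{}, out_weights = #{}, bias = 0.0}).

new() -> #unit{}.

%% Updating one weight rebuilds only the small in_weights map and the
%% record's top tuple; the rest of the state is shared, not copied.
add_in_weight(Key, Delta, U = #unit{in_weights = W}) ->
    U#unit{in_weights = maps:update_with(Key,
                                         fun(V) -> V + Delta end,
                                         Delta, W)}.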

As for KISS, that's a great approach, but it doesn't reveal scaling
problems. When one is implementing an approach one should always
KISS, but when deciding which approach to try, it's important to pick
one that will still work when the system approaches its initial
design goal.

On 02/07/2018 03:45 PM, zxq9@zxq9.com wrote:
> On Wednesday, 7 February 2018 at 8:56:01 JST Charles Hixson wrote:
>> ...so passing the state as function parameters would entail huge
>> amounts of copying. (Essentially I'd be modifying nodes deep within
>> trees.)
>>
>> Mutable state would allow me to avoid the copying, and the state is
>> not exported from the process...
>
> You seem to be confused a bit about the nature of mutability. If I
> set a variable X and in my service loop alter X, the next time the
> service loop recurses (loops) X will be a different value -- it will
> have mutated -- but within the context of a single call of the
> service loop function, the thing labelled X at the time of the
> function call will be immutable.
>
> -module(simple).
> -export([start/1]).
>
> start(X) ->
>     spawn(fun() -> loop(X) end).
>
> loop(X) ->
>     ok = io:format("X is ~p~n", [X]),
>     receive
>         {add, Y} ->
>             NewX = X + Y,
>             loop(NewX);
>         {sub, Y} ->
>             NewX = X - Y,
>             loop(NewX);
>         stop ->
>             ok = io:format("Bye!~n"),
>             exit(normal);
>         Unexpected ->
>             ok = io:format("I don't understand ~tp~n", [Unexpected]),
>             loop(X)
>     end.
>
> 1> c(simple).
> {ok,simple}
> 2> P = simple:start(10).
> X is 10
> <0.72.0>
> 3> P ! {add, 15}.
> X is 25
> {add,15}
> 4> P ! {sub, 100}.
> X is -75
> {sub,100}
>
> That is all there is to state maintenance, and this is how
> gen_servers work. This is also the form that has the least
> mysterious memory management model in the normal case, and the form
> that gives you all that nifty memory isolation and fault tolerance
> Erlang is famous for. Note that X is *not* copied every time we
> enter loop/1. If we send a message containing X to another process,
> though, *then* X is copied into the context of the process receiving
> that message.
>
> It doesn't matter at all what sort of a structure X is. Here it is a
> number, but it could be anything. Gigantic tuples chock full of maps
> and gb_trees and other process references and lists of things and
> queues and whatnot are the norm -- and none of this causes trouble
> in the normal case.
>
> As for mucking around in deep tree structures, altering nodes in
> trees does not necessarily entail making a copy of the whole tree.
> To you as a programmer there are two versions of the data which are
> effectively distinct, but that does not necessarily mean that they
> are two complete versions of the data in memory. The nature of
> copying (or whether copying happens at all under the hood) and how
> fast things can be garbage collected has to do with the nature of
> the task and what kind of data structures you are using. Because of
> immutability you *actually* get to share more data in the underlying
> implementation than otherwise.
>
> Fred provided a great explanation a while back here:
> http://erlang.org/pipermail/erlang-questions/2015-December/087040.html
>
> The general approach to performance issues -- whether memory, I/O
> bottlenecks, messaging bottlenecks, or raw thunk time -- is to start
> out writing your processes in the vanilla way, using state variables
> in a loop, and only stepping away from that when some extreme
> deficiency is demonstrated. If you are going to be spawning a ton of
> processes at once to do things, then you've really got no way of
> knowing what is going to break first until you actually have some
> working code and can see it break for yourself. People get
> themselves into trouble with the process dictionary, ETS, NIFs, etc.
> all the time because the use cases often do not warrant the use of
> these techniques.
>
> So keep it simple. Write an example of what you want to do. Try it
> out. You might wind up just saturating your processor or memory bus
> way before you hit an actual space problem. If something breaks, try
> to measure why -- but right now, without telling anyone the kind of
> data you're dealing with, or what kinds of operations you're doing,
> or any example code that is known to break in a certain way at a
> certain scale, we can't really give you much helpful advice.
>
> -Craig