<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On 27 Aug 2020, at 06:51, Oliver Korpilla &lt;<a href="mailto:oliver.korpilla@gmx.de" class="">oliver.korpilla@gmx.de</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="">Hello,<br class=""><br class="">I have some data that's between 100K and 1M in size, depending if I use<br class="">the whole data set or just a part. Access is read-only.<br class=""><br class="">So far we kept a copy in each process for latency reasons and<br class="">performance has been quite good. There can be quite a lot of processes<br class="">(1,000s) but we have some big machines to run them on...<br class=""><br class="">So far we haven't considered ETS or Mnesia because all these processes<br class="">would have to go through a single bottleneck in rather short order (they<br class="">are truly parallel and independent from each other and we have lots of<br class="">cores to schedule them on) - or am I wrong? How well does it scale?<br class=""></div></div></blockquote><div>ETS lookups would rarely be a bottleneck in my experience (and you can</div><div>speed up reads using the `read_concurrency` option if the data is mostly static).</div><div>*But* if the data you are trying to retrieve from ETS is large, that *could* be a bottleneck, since</div><div>matching data needs to be copied from ETS to the process heap on each lookup.</div><div><br class=""></div><blockquote type="cite" class=""><div class=""><div class=""><br class="">That said, we had good experiences with moving some static configuration<br class="">information to code. 
Performance is really good, but that data was<br class="">roughly of the format of a keyed map or smaller lists.<br class=""></div></div></blockquote><div><br class=""></div><div>This might be of interest to you. We use it for exactly this case: a large, complex</div><div>data structure that needs to be consulted on every request.</div><div><a href="https://erlang.org/doc/man/persistent_term.html" class="">https://erlang.org/doc/man/persistent_term.html</a></div><div><br class=""></div><div>However, if your question is just about function return values, there is nothing to worry about;</div><div>see below.</div><br class=""><blockquote type="cite" class=""><div class=""><div class=""><br class="">The data we're looking at are big lists (potentially 1,000s of entries)<br class="">of medium-sized maps, or maybe a map serving as index into these other maps.<br class=""><br class="">My question is - how do I efficiently return a big static value (a list<br class="">of maps with no parameters to change their construction) from a<br class="">function? Does BEAM optimize this? Or is the value constructed when the<br class="">function is called? And is there anything I can do to improve it?<br class=""><br class=""></div></div></blockquote>Data returned from a function call is already on the process heap (or, for a constant term, in the module's literal pool), so no copy</div><div>is needed; you basically return a pointer.  
The same is true for the parameters you pass</div><div>to the function.</div><div>The size of the structure matters when you need to pass it from one process to another, because it is then copied to the receiving process.</div><div><br class=""></div><div><div><br class=""></div><br class=""><blockquote type="cite" class=""><div class=""><div class="">Thank you!<br class="">Oliver<br class=""></div></div></blockquote></div><br class=""></body></html>