Efficiency of big return functions?

Thu Aug 27 16:05:00 CEST 2020

> On 27 Aug 2020, at 06:51, Oliver Korpilla <oliver.korpilla@REDACTED> wrote:
> 
> Hello,
> 
> I have some data that's between 100K and 1M in size, depending if I use
> the whole data set or just a part. Access is read-only.
> 
> So far we kept a copy in each process for latency reasons and
> performance has been quite good. There can be quite a lot of processes
> (1,000s) but we have some big machines to run them on...
> 
> So far we haven't considered ETS or Mnesia because all these processes
> would have to go through a single bottleneck in rather short order (they
> are truly parallel and independent from each other and we have lots of
> cores to schedule them on) - or am I wrong? How well does it scale?
ETS lookups are rarely a bottleneck in my experience (and you can
speed up reads with the `read_concurrency` option if the data is mostly static).
*But* if the data you retrieve from ETS is large, that *could* be a bottleneck, since
the matching data has to be copied from ETS to the process heap on each lookup.
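
As a rough illustration of the point above, here is a minimal sketch of a read-mostly ETS table. The module name, table name, and sample entries are made up for the example; only `ets:new/2`, `ets:insert/2`, and `ets:lookup/2` are real API calls.

```erlang
-module(ets_lookup_sketch).
-export([init/0, get/1]).

%% Hypothetical table name, for illustration only.
-define(TABLE, big_data_sketch).

%% Create a named, public table tuned for many concurrent readers and
%% load a couple of sample entries. The table belongs to the calling
%% process and disappears when that process exits.
init() ->
    ?TABLE = ets:new(?TABLE, [named_table, set, public,
                              {read_concurrency, true}]),
    true = ets:insert(?TABLE, [{1, #{name => <<"a">>}},
                               {2, #{name => <<"b">>}}]),
    ok.

%% Each lookup copies the matching tuple onto the caller's heap, so the
%% cost grows with the size of the stored value, not with the table size.
get(Key) ->
    case ets:lookup(?TABLE, Key) of
        [{Key, Value}] -> {ok, Value};
        [] -> error
    end.
```

Storing one small entry per key, rather than the whole data set under a single key, keeps the per-lookup copy small.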

> 
> That said, we had good experiences with moving some static configuration
> information to code. Performance is really good, but that data was
> roughly of the format of a keyed map or smaller lists.

This might be of interest to you. We use it for exactly this case: a large, complex
data structure that needs to be consulted on every request.
https://erlang.org/doc/man/persistent_term.html
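
A minimal sketch of how that could look, assuming the data is built once at startup. The module name, key, and sample data are invented for the example; `persistent_term:put/2` and `persistent_term:get/1` are the real calls (available from OTP 21.2).

```erlang
-module(pt_sketch).
-export([load/0, entries/0]).

%% Hypothetical key; a {Module, Name} tuple helps avoid clashes.
-define(KEY, {?MODULE, entries}).

%% Build the data once and store it. Replacing or erasing a persistent
%% term later is expensive (it forces a global garbage collection), so
%% this suits data that practically never changes.
load() ->
    Entries = [#{id => N, label => integer_to_binary(N)}
               || N <- lists:seq(1, 1000)],
    persistent_term:put(?KEY, Entries).

%% Reading does not copy the term onto the caller's heap.
entries() ->
    persistent_term:get(?KEY).
```

Any process can then call `pt_sketch:entries()` without going through a shared owner process.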

However, if your doubt is only about function returns, there is nothing to worry
about; see below.

> 
> The data we're looking at are big lists (potentially 1,000s of entries)
> of medium-sized maps, or maybe a map serving as index into these other maps.
> 
> My question is - how do I efficiently return a big static value (a list
> of maps with no parameters to change their construction) from a
> function? Does BEAM optimize this? Or is the value constructed when the
> function is called? And is there anything I can do to improve it?
> 
Data returned from a function call is already on the process’ heap (or, for a
compile-time constant, in the module's literal pool), so no copy is needed: you
basically return a pointer. The same is true for the parameters you pass
to the function.
The size of the structure matters when you need to pass it from one process to another.
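
For the "static value in code" case from the question, a sketch could look like the module below. The module name and data are hypothetical. As far as I know, the compiler stores fully constant terms like these as literals of the loaded module, so calling the function does not rebuild the list or map.

```erlang
-module(static_data_sketch).
-export([entries/0, index/0, lookup/1]).

%% A constant list of maps: nothing here depends on arguments or
%% runtime values, so it can be stored as a literal at compile time.
entries() ->
    [#{id => 1, name => <<"alpha">>},
     #{id => 2, name => <<"beta">>}].

%% A constant map used as an index into the same entries.
index() ->
    #{1 => #{id => 1, name => <<"alpha">>},
      2 => #{id => 2, name => <<"beta">>}}.

%% Look up one entry; returns {ok, Entry} or error.
lookup(Id) ->
    maps:find(Id, index()).
```

For larger data sets such a module would normally be generated from the source data rather than written by hand.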

> Thank you!
> Oliver
> 
> 
