[erlang-questions] refactoring a very large record

Thu Oct 20 16:14:19 CEST 2011

On Thu, Oct 20, 2011 at 14:37, Joel Reymont <joelr1@REDACTED> wrote:
> I have a record with about 80 fields.

I always hesitate when I hear about records of this size. If they are
only read, or mostly read, they tend to be fast. But they don't
support updates very well, since every update to a single field builds
a new 80-element tuple. So I often break such big records into smaller
pieces. I take a small performance hit when accessing the smaller
pieces, but updates get faster because much less has to be copied.
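
A minimal sketch of what I mean by breaking the record up. The module
name, record names and fields are made up for illustration, since the
original 80-field record is not shown in this thread:

```erlang
-module(player_split).
-export([new/0, set_score/2, score/1]).

%% Hypothetical grouping of fields; the real record would be split
%% along whatever lines fit the update pattern of the application.
-record(stats,   {score = 0, games = 0}).
-record(profile, {name = <<>>, email = <<>>}).
-record(player,  {stats = #stats{}, profile = #profile{}}).

new() -> #player{}.

%% An update rebuilds the small #stats{} tuple and the small outer
%% #player{} tuple, instead of one 80-element tuple.
set_score(#player{stats = S} = P, Score) ->
    P#player{stats = S#stats{score = Score}}.

%% A read pays for one extra level of indirection.
score(#player{stats = #stats{score = Score}}) -> Score.
```

Whether the trade is worth it depends on the read/update ratio, so it
is worth measuring before committing to a particular split.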

> I would like to make this record opaque and completely encapsulate it in a single file.
> Should I export getters and setters for each field?

I tend to have modules that operate on records. That is, the module
contains functions that operate on the record and other modules then
call into this module to carry out work. I rarely access the record
directly, but I export "views" of the data in the record which I can
pattern match on outside.

The reason I prefer this solution is that it keeps records somewhat
local to modules. It makes it way easier to change internal record
representations later when you decide to move stuff to/from ETS,
introduce a process as the role of the data and so on.
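
Roughly, and only as a sketch: the module name big_rec, the three
fields and the shape of the {summary, ...} view are assumptions made
for the example, not anything taken from Joel's code.

```erlang
-module(big_rec).
-export([new/1, rename/2, view/1]).
-export_type([t/0]).

%% Stand-in for the ~80 fields. The record is defined in this file
%% only, so no other module can depend on its layout.
-record(big, {id, name = <<"anon">>, balance = 0}).
-opaque t() :: #big{}.

-spec new(term()) -> t().
new(Id) -> #big{id = Id}.

%% Operations on the record live here, not in the callers.
-spec rename(t(), binary()) -> t().
rename(#big{} = R, Name) -> R#big{name = Name}.

%% A "view": a small, stable term that callers can pattern match on
%% without knowing the internal representation.
-spec view(t()) -> {summary, term(), binary()}.
view(#big{id = Id, name = Name}) -> {summary, Id, Name}.
```

A caller would then write something like
{summary, Id, Name} = big_rec:view(R) and never mention #big{}.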

> Would these be inlined by the compiler?

No. Erlang in general does not inline across module boundaries. It is
possible to do, but it requires you to have a proper deoptimization
pass which can replace the stack properly when you load new code.
V8/Crankshaft does this for JavaScript with success but it is not
there for Erlang (yet).

> Is ETS a suitable replacement for a record of this size?

In some cases yes. You can store the 80-element tuple in ETS, which is
expensive because the tuple is copied into the ETS store. But then
ets:lookup_element/3 gives you fast (parallel/concurrent) access to
the data. In the case of updating single elements, you have
ets:update_element/3 which is much faster at mutating a single element
since you don't have to alter the rest of the record. ETS is really a
mutation store with no persistence.
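
A small sketch of the single-element access. The table name, record
and field names are assumptions for the example. Note that a record
carries its tag in position 1, so the table needs an explicit keypos:

```erlang
-module(big_ets).
-export([demo/0]).

%% Stand-in for the 80-field record.
-record(big, {id, name, balance = 0}).

demo() ->
    %% The key is the id field, not the record tag.
    Tab = ets:new(big_tab, [set, {keypos, #big.id}]),
    %% The whole tuple is copied into the table once, here.
    true = ets:insert(Tab, #big{id = 1, name = <<"joel">>}),
    %% Read one field without copying the rest of the record out.
    0 = ets:lookup_element(Tab, 1, #big.balance),
    %% Overwrite one field in place.
    true = ets:update_element(Tab, 1, {#big.balance, 42}),
    Balance = ets:lookup_element(Tab, 1, #big.balance),
    true = ets:delete(Tab),
    Balance.
```

The #big.balance syntax expands to the tuple position of the field at
compile time, so the calls stay readable if fields are reordered.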

Another viable option is to make the 80-element tuple into a process.
Then one can move some of the work to the tuple itself rather than
querying for it and then acting upon it locally in processes. The
pattern to spot is the "queue processor" pattern, where a process
dequeues the data and then impersonates the data as a process. In that
case, you can make the data into a process itself and get much simpler
code.
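
As a sketch of the idea, here is the record living inside a
gen_server. The module name and the deposit/balance operations are
invented for the example; the point is only that callers send the
work to the data instead of fetching the record and updating it
themselves:

```erlang
-module(big_proc).
-behaviour(gen_server).

-export([start_link/1, deposit/2, balance/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Stand-in for the 80-field record; it is the process state.
-record(big, {id, balance = 0}).

start_link(Id) -> gen_server:start_link(?MODULE, Id, []).

%% Callers ask the process to do the work.
deposit(Pid, Amount) -> gen_server:call(Pid, {deposit, Amount}).
balance(Pid) -> gen_server:call(Pid, balance).

init(Id) -> {ok, #big{id = Id}}.

handle_call({deposit, Amount}, _From, #big{balance = B} = S) ->
    New = B + Amount,
    {reply, New, S#big{balance = New}};
handle_call(balance, _From, #big{balance = B} = S) ->
    {reply, B, S}.

handle_cast(_Msg, S) -> {noreply, S}.
```

Usage would be along the lines of {ok, Pid} = big_proc:start_link(1)
followed by big_proc:deposit(Pid, 5).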

-- 
J.