[erlang-questions] Caching server

Tue May 17 18:04:26 CEST 2011

On 2011-02-19 03:19, Nicholas Wieland wrote:
> Hi *, I'm reading the Manning book. My question is very simple: a
> full chapter of the book is devoted to the implementation of a simple
> caching server. The author, if I got it correctly, at one point
> states that in Erlang, thanks to its lightweight processes, it's ok
> to have a caching server that spawns a process for every key/value
> pair. Of course I don't expect that the example in the book is a
> production ready implementation, but I would like to ask if it would
> be possible for an architecture like this to be production ready
> (say, something like Redis), or if I should take it with a pinch of
> salt, only as a demonstration. This thing made me curious because
> there's no language or technology that would permit something like
> this, hence my question :)
>
> TIA,

This was also asked (and answered) in a thread on the Manning forums: 
http://www.manning-sandbox.com/thread.jspa?threadID=40749&tstart=0

As Martin said there, "this overall design has been used in production 
but the one used in production is a bit more complex. The current design 
[in the book] was specifically formulated to illustrate a number of 
design principles and Erlang features. It is really more for 
instructional purposes as opposed to direct production use." (If you 
read the thread, it might look as if Eric is contradicting Martin by 
saying "the cache has not been used or tested in production", but then 
Eric is referring to the implementation as it stands in the book.)

In summary: yes, a similar design has actually been used in production. 
It's certainly an _easy_ way of implementing a cache, because it solves 
the storage and eviction issues by simply leveraging Erlang's processes 
and timeouts. However, processes are a finite resource: you'll need to 
raise the upper limit with "erl +P ..." if you want to be able to have more 
than the default maximum of 32768 processes. Whether this implementation 
is efficient enough for you depends on your use case. Measure and see.
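
To make the "processes and timeouts" point concrete, here is a minimal 
sketch of one cache element as a plain process. It is not the code from 
the book: the module name kv_proc, the function names and the message 
shapes are all made up for illustration. As a simplification, the lease 
restarts on every message. A real cache would also need a table that 
maps keys to pids, which is left out here.

```erlang
-module(kv_proc).
-export([start/2, fetch/1, replace/2, delete/1]).

%% Hypothetical sketch: one process holds one cached value.
%% LeaseMs is how long the value survives without receiving a message.
start(Value, LeaseMs) ->
    spawn(fun() -> loop(Value, LeaseMs) end).

%% Ask the process for its value. The monitor lets us notice that the
%% process has already expired instead of waiting forever for a reply.
fetch(Pid) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! {fetch, self(), Ref},
    receive
        {Ref, Value} ->
            erlang:demonitor(Ref, [flush]),
            {ok, Value};
        {'DOWN', Ref, process, Pid, _Reason} ->
            {error, not_found}
    end.

replace(Pid, Value) ->
    Pid ! {replace, Value},
    ok.

delete(Pid) ->
    Pid ! delete,
    ok.

loop(Value, LeaseMs) ->
    receive
        {fetch, From, Ref} ->
            From ! {Ref, Value},
            loop(Value, LeaseMs);
        {replace, NewValue} ->
            loop(NewValue, LeaseMs);
        delete ->
            ok
    after LeaseMs ->
        %% No message within the lease: the process simply ends, and
        %% the value is evicted along with it.
        ok
    end.
```

Storage is the process state and eviction is the "after" clause, so 
there is no separate sweeper to write. To see how close you are to the 
process limit, compare erlang:system_info(process_count) with 
erlang:system_info(process_limit).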

Processes don't magically solve all problems, even if it sometimes might 
look like it - there are always tradeoffs, in this case between ease of 
implementation on one hand and speed of access and memory overhead on 
the other. As a cache that avoids much more expensive network 
communication and stores reasonably large chunks of data (like a web 
page) per key, its performance should be quite good. As a cache that 
avoids relatively quick disk operations with small data items for each 
key, it's not so good (but perhaps better than you might expect).
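
One rough way to put a number on the memory side of that tradeoff is to 
spawn a process that does nothing but hold a value and ask the VM how 
much memory it occupies. The module and function names below are 
invented for this sketch. Note that large binaries are stored outside 
the process heap, so this figure is mostly meaningful for small terms.

```erlang
-module(proc_overhead).
-export([bytes_for/1]).

%% Hypothetical helper: spawn a process that only holds Value, then
%% report the memory the VM attributes to that process (heap, stack
%% and internal structures), in bytes.
bytes_for(Value) ->
    Pid = spawn(fun() -> hold(Value) end),
    {memory, Bytes} = erlang:process_info(Pid, memory),
    exit(Pid, kill),
    Bytes.

%% Block until told to stop, keeping Value alive in the meantime.
hold(Value) ->
    receive
        stop -> Value
    end.
```

Calling proc_overhead:bytes_for({some_key, 42}) gives the per-process 
cost for a tiny item. If that cost dwarfs the data itself, you are in 
the "small data items" case above. For access speed, timer:tc/1 around 
a lookup is the simplest starting point.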

    /Richard