[erlang-questions] Caching server

Tue May 17 19:10:16 CEST 2011

On Tue, May 17, 2011 at 12:04 PM, Richard Carlsson
<carlsson.richard@REDACTED> wrote:
>
> Processes don't magically solve all problems, even if it sometimes might
> look like it - there are always tradeoffs, in this case between ease of
> implementation on one hand and speed of access and memory overhead on the
> other. As a cache to avoid much more expensive network communication, and
> storing reasonably large chunks of data (like a web page) per key, its
> performance should be quite good. As a cache to avoid relatively quick disk
> operations with small data items for each key, it's not so good (but perhaps
> better than you might expect).

I can speak from experience that a process per cache entry works well
if your cache hits are reasonably balanced across all entries. But for
very popular cache entries, such as for popular web pages or web
content, using a process per cache entry can negatively affect
performance under load. Suppose the web server can handle thousands
of concurrent client requests. If most of those clients request the
same page or a small set of pages, the process per cache entry
approach becomes a bottleneck, because the requests from all those
concurrent connections are serialized into the message queues of just
one or a few cache processes. I wrote about this issue recently in
Internet Computing:

http://steve.vinoski.net/pdf/IC-Erlang_Web_Process_Bottlenecks.pdf
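
To make the serialization concrete, here is a minimal sketch of a
process per cache entry. The module and function names (entry_cache,
start/1, fetch/1) and the 5 second timeout are assumptions made for
illustration, not code from the article. Every reader of an entry
sends a message to the same pid, so all reads of a popular entry
queue up in that one mailbox and are answered one at a time.

```erlang
-module(entry_cache).
-export([start/1, fetch/1]).

%% Hypothetical sketch: one process holds the value for one cache entry.
start(Value) ->
    spawn(fun() -> loop(Value) end).

%% The entry process handles one request at a time, in mailbox order.
loop(Value) ->
    receive
        {get, From, Ref} ->
            From ! {Ref, Value},
            loop(Value);
        stop ->
            ok
    end.

%% Called by each client (e.g. a connection handler). All callers for
%% the same entry send to the same Pid, which is the serialization point.
fetch(Pid) ->
    Ref = make_ref(),
    Pid ! {get, self(), Ref},
    receive
        {Ref, Value} -> {ok, Value}
    after 5000 ->
        timeout
    end.
```

With thousands of connection processes calling fetch/1 on the same
pid, the length of that one message queue, not the number of cores,
bounds the read throughput for that entry.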

Depending on what you're caching, one alternative is to just use an
ets table instead, as its read concurrency is quite good and eviction
is simple. Or take a look at Redis or a similar external store.
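
As a rough sketch of the ets approach, assuming a simple key/value
cache: the module name ets_cache and the functions new/0, store/2,
fetch/1 and evict/1 are made up for this example. Only the ets calls
themselves (ets:new/2, ets:insert/2, ets:lookup/2, ets:delete/2) and
the read_concurrency option are standard. Readers call ets:lookup/2
directly from their own processes, so no single process sits between
them and the data.

```erlang
-module(ets_cache).
-export([new/0, store/2, fetch/1, evict/1]).

%% Hypothetical sketch: a public named table tuned for many readers.
%% The table is owned by the calling process and is deleted when that
%% process exits, so call new/0 from a long-lived process.
new() ->
    ets:new(?MODULE, [set, public, named_table, {read_concurrency, true}]).

%% Insert or overwrite the value stored under Key.
store(Key, Value) ->
    true = ets:insert(?MODULE, {Key, Value}),
    ok.

%% Look up Key directly in the caller's process; no message passing.
fetch(Key) ->
    case ets:lookup(?MODULE, Key) of
        [{Key, Value}] -> {ok, Value};
        [] -> miss
    end.

%% Evicting an entry is a single delete.
evict(Key) ->
    true = ets:delete(?MODULE, Key),
    ok.
```

For example, after ets_cache:new() and ets_cache:store("/index.html",
Page), any number of connection processes can call
ets_cache:fetch("/index.html") concurrently. Expiry policy (TTL, LRU,
size limits) is left out here and would need to be added on top.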

--steve