[erlang-questions] high performance cache

Sat Sep 4 14:24:51 CEST 2010

On Sat, Sep 4, 2010 at 12:07 AM, Senthilkumar Peelikkampatti
<senthilkumar.peelikkampatti@REDACTED> wrote:

> I am looking for LRU caching framework in Erlang with thresholds to control
> the amount of data written to it. I explored memcached Erlang derivatives
> and ETS based solutions but not sure about which way to go. Please share
> your experience using caching framework. I am not looking for distributed
> cache.

Etorrent has a very, very simple LRU janitor for file descriptors.
The principle is this: every FD is governed by a process. These
processes enter themselves into a janitor ETS table. Upon reaching a
high watermark, we do a full scan of the ETS table, find the least
recently accessed beasts and kill them. It is viable because the
number of open descriptors tends to be fairly small, so we are not
paying that much for a full table scan. Code:

http://github.com/jlouis/etorrent/blob/master/src/etorrent_fs_janitor.erl
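
As a rough illustration of the principle (not the actual etorrent
code), here is a minimal sketch. The module name janitor_sketch, the
function names and the High/Low parameters are all made up for this
example. It assumes a release that has erlang:unique_integer/1 and
uses it as a logical clock instead of wall-clock time.

```erlang
-module(janitor_sketch).
-export([new/0, touch/2, collect/3]).

%% Hypothetical janitor table: one {Pid, Stamp} row per governed resource.
new() ->
    ets:new(janitor_sketch, [set, public]).

%% Record that Pid was just used. Inserting on an existing key
%% overwrites the old row, so the stamp is refreshed.
touch(Tab, Pid) ->
    ets:insert(Tab, {Pid, stamp()}).

%% Once the table holds at least High rows, do a full scan, sort by
%% stamp and evict the oldest rows until only Low remain.
%% Returns the evicted pids, oldest first.
collect(Tab, High, Low) when Low < High ->
    case ets:info(Tab, size) >= High of
        true ->
            Sorted = lists:keysort(2, ets:tab2list(Tab)),
            Excess = max(0, length(Sorted) - Low),
            {Victims, _Keep} = lists:split(Excess, Sorted),
            [evict(Tab, Pid) || {Pid, _Stamp} <- Victims];
        false ->
            []
    end.

%% Assumption: killing the governing process is enough to release
%% the resource it holds.
evict(Tab, Pid) ->
    ets:delete(Tab, Pid),
    exit(Pid, shutdown),
    Pid.

%% Strictly increasing integer on this node, used as a logical clock.
stamp() ->
    erlang:unique_integer([monotonic]).
```

Each governing process would call touch/2 whenever it is used, and
something like collect(Tab, 128, 100) would run when a new descriptor
is about to be opened. The numbers are arbitrary.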

If the table ends up being very big, I guess one should clean it up
based on a timer and a heuristic on how much stale data is in the
table. Also, beware of storing large terms in ETS: every lookup
copies the stored term into the heap of the calling process. The same
caveats apply as when you send a 32 MB binary tree from one process
to another (in the default GC/allocator).

My guess is you can take the code above and use it for inspiration to
build an LRU ETS cache. It shouldn't take more than a couple of days
max to get it into shape.
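
To give an idea of what such a cache could look like, here is a small
sketch under my own assumptions: the module lru_sketch and the
functions new/2, store/3 and fetch/2 are invented for illustration,
the threshold is a plain entry count, and eviction is a full scan
like in the janitor. It is a starting point, not a finished design.

```erlang
-module(lru_sketch).
-export([new/2, store/3, fetch/2]).

%% Hypothetical cache handle: {Tab, High, Low}. When the table grows
%% past High entries, the oldest are evicted until Low remain.
new(High, Low) when is_integer(Low), Low > 0, Low < High ->
    Tab = ets:new(lru_sketch, [set, public]),
    {Tab, High, Low}.

%% Insert or overwrite Key, then evict if over the high watermark.
store({Tab, High, Low}, Key, Value) ->
    ets:insert(Tab, {Key, Value, stamp()}),
    case ets:info(Tab, size) > High of
        true  -> evict(Tab, Low);
        false -> ok
    end.

%% Look up Key and refresh its stamp on a hit. Note that the lookup
%% copies Value into the caller's heap.
fetch({Tab, _High, _Low}, Key) ->
    case ets:lookup(Tab, Key) of
        [{Key, Value, _Stamp}] ->
            ets:update_element(Tab, Key, {3, stamp()}),
            {ok, Value};
        [] ->
            not_found
    end.

%% Full scan that selects only {Stamp, Key} pairs, so the cached
%% values themselves are not copied out of the table.
evict(Tab, Low) ->
    Pairs = ets:select(Tab, [{{'$1', '_', '$2'}, [], [{{'$2', '$1'}}]}]),
    Excess = max(0, length(Pairs) - Low),
    {Oldest, _Keep} = lists:split(Excess, lists:sort(Pairs)),
    lists:foreach(fun({_Stamp, Key}) -> ets:delete(Tab, Key) end, Oldest).

%% Strictly increasing integer on this node, used as a logical clock.
stamp() ->
    erlang:unique_integer([monotonic]).
```

Evicting down to a low watermark rather than one entry at a time
keeps the full scan from running on every insert once the cache is
full.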

-- 
J.