LRU queue or heap data structure implementation?
Richard Cameron
camster@REDACTED
Wed Nov 23 18:29:54 CET 2005
On 23 Nov 2005, at 15:56, Jay Nelson wrote:
> I have wanted to write a similar thing for a while now. I would do
> it with processes:
>
> 1) Create a server to manage the ets table (eventually you would
> want this distributed), which handles the following messages:
> - New: create a new process and insert in ets with key
> - Locate: see if a key leads to an existing process
> - Reap: remove key and process from ets
>
> 2) Each process has a receive which does the following:
> - GetData: returns the state and loops
> - after LRU_Timeout: calls ets manager with Reap(SelfKey) then dies
... so this sounds like a purely in-memory database where each "row"
of data is actually a process, with a mind of its own. I'm sure if
you took that to its logical conclusions you could build a
particularly wacky variant of a relational database (where foreign
keys between "rows" are paths you can send messages over). There's
probably a weird variant of database theory in which you model select
and update operations by sending a request to an object in the
system, recursively having it create a cascade of messages out to
every "row" the query needs to consider, and waiting for asynchronous
responses to come back. Sounds quite insane for someone who's grown
up with SQL, but I wonder if it's a more natural way of thinking
about data. Who knows?
Keeping my feet on the floor for a bit though, all I'm trying to do
as extend an ets table to store data and keep track of infrequently
accessed items. Potentially we could be talking about 10^7 or 10^8
entries in the cache, so I'm not sure how happy the current Erlang
runtime would be about spawning that number of processes. My guess is
that it wouldn't be very amused at all. Either way, I'd just be
pushing out the problem of trying to come up with sensible data
structure to whichever part of the emulator is responsible for
maintaining the schedule of which process is due to receive "after"
timeouts and when.
Francesco suggested that it might be possible just to use an ets
table which maps access times to keys, updating the key with
erlang:now() on every read operation. I suspect there's going to be a
hell of a lot of binary-tree rebalancing going on, but I'll prototype
that up and see how it performs. Maybe I don't even need to worry
about optimisation and everything will magically run fast.
... that said, I'll probably actually go away, make a cup of tea, and
think about this intriguing idea of building a database where queries
are evaluated by pinging lots of messages around lots of little data-
as-persistent-process things. It might even be a refreshing way of
thinking about OLTP systems (and probably a nightmare for doing DSS
queries).
Richard.
More information about the erlang-questions
mailing list