[erlang-questions] "actor database" - architectural strategy question

Mon Feb 17 23:03:09 CET 2014

“How to handle older largely inactive processes”
What we do (did?) was basically flush/hibernate the process. To generalize: if the process hasn’t “done” anything in a while, save its state somewhere, and then hibernate it (which has the added benefit of dealing with GC issues :-)
Mind you, there is a wealth of info buried in the phrase “save its state somewhere”. This really, really depends on how big, how scalable, how fault-tolerant, how geographically spread out, how... you intend on getting. In short, it can range from “write out a text file” to “pay Riak gobs-o-money and have nodes worldwide”. YMMV
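
To make the flush/hibernate idea concrete, here is a minimal sketch, not the code we actually ran. It assumes the "write out a text file" end of the spectrum: the module name doc_actor, the {Path, Doc} state shape, the set/get API and the IdleMs argument are all made up for illustration. Only the gen_server timeout and hibernate return values, file:read_file/1, file:write_file/2 and term_to_binary/1 are standard OTP.

```erlang
-module(doc_actor).
-behaviour(gen_server).

%% Hypothetical API for this sketch.
-export([start_link/2, set/2, get/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Path: file that holds the saved document (assumption: local disk).
%% IdleMs: how long without messages counts as "hasn't done anything".
start_link(Path, IdleMs) ->
    gen_server:start_link(?MODULE, {Path, IdleMs}, []).

set(Pid, Doc) -> gen_server:call(Pid, {set, Doc}).
get(Pid)      -> gen_server:call(Pid, get).

init({Path, IdleMs}) ->
    %% Reload the last flushed state if there is one.
    Doc = case file:read_file(Path) of
              {ok, Bin}       -> binary_to_term(Bin);
              {error, enoent} -> []
          end,
    {ok, {Path, IdleMs, Doc}, IdleMs}.

%% Every reply re-arms the idle timeout.
handle_call(get, _From, {_Path, IdleMs, Doc} = State) ->
    {reply, Doc, State, IdleMs};
handle_call({set, Doc}, _From, {Path, IdleMs, _Old}) ->
    {reply, ok, {Path, IdleMs, Doc}, IdleMs}.

handle_cast(_Msg, {_Path, IdleMs, _Doc} = State) ->
    {noreply, State, IdleMs}.

%% No message arrived within IdleMs: flush state, then hibernate.
%% The process stays alive and wakes up on the next message.
handle_info(timeout, {Path, _IdleMs, Doc} = State) ->
    ok = file:write_file(Path, term_to_binary(Doc)),
    {noreply, State, hibernate};
handle_info(_Other, {_Path, IdleMs, _Doc} = State) ->
    {noreply, State, IdleMs}.
```

Usage would be along the lines of {ok, Pid} = doc_actor:start_link("doc42.bin", 60000), then doc_actor:set(Pid, Doc) and doc_actor:get(Pid). A hibernated process still occupies a slot in the process table and still answers the infrequent message. Actually stopping the process and restarting it on demand is a different design that needs some registry in front of it, which this sketch does not attempt.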

“Heroku gang” <- My apologies, I meant “those fine fine folks at Heroku who happen to be doing Erlang”

cheers

Mahesh Paolini-Subramanya
That tall bald Indian guy..  

On February 17, 2014 at 4:40:22 PM, Miles Fidelman (mfidelman@REDACTED) wrote:

Mahesh,

Mahesh Paolini-Subramanya wrote:
> “Large number of processes with very long persistence”
>
> You *will* run into GC issues here, and of all kinds
> - design artifacts (“hmm, the number of lists that I manipulate  
> increases relentlessly…”)
> - misunderstanding (“But I passed the binary on, without  
> manipulating it at all!”)
> - Bugs (Fred has a great writeup on this somewhere)

Very good points - though to a degree they sound more like dependency  
hell than traditional garbage collection to reclaim memory.

Given the document-oriented view, I'm viewing garbage collection more in  
the sense of filing and archiving - the same way that paper documents  
migrate to filerooms then to archives; or email and computer files  
simply get buried deeper and deeper in one's file system; sometimes you  
buy a larger drive; sometimes stuff migrates to off-site backup - but  
you generally don't throw stuff away (though when working on  
multi-author documents, one always comes back to how many intermediate  
copies to retain "for the record" after the final version goes to print).

In one sense, this ends up looking a lot like managing a git repository  
- more and more versions and branches accumulate, and so forth. And  
one starts thinking about storing only change logs.

This is also what motivates my question about how to handle older,  
largely inactive processes. It's one thing to bury a file deeper and  
deeper in a file system - and still be able to find and access it (and  
these days, search for it). It's another to think about migrating an  
actor from RAM to disk, in a way that retains its ability to respond to  
the infrequent message.

The other area I worry about is exponential growth in network traffic  
and CPU cycles - assuming that a lot of documents will never completely  
"die" - maybe an update will come in once a week, or once a month, or  
they'll get searched every once in a while - as the number of processes  
increases, the amount of traffic will as well.

> Just keep in mind that in the end, you will almost certainly end up  
> doing some form of manual GC activities. Again, the Heroku gang can  
> probably provide a whole bunch of pointers on this…
>

Can you say a bit more about what it is about Heroku that I should be  
looking at? At first glance, it seems like a very different environment  
than what we're talking about here (or are you thinking about manual  
housekeeping for the virtual environment?).

And.. re. "Bugs (Fred has a great writeup on this somewhere)" - Fred  
who? (Maybe I can find it by googling!)

Thanks Very Much,

Miles

--  
In theory, there is no difference between theory and practice.
In practice, there is. .... Yogi Berra
