[erlang-questions] [ANN] Erlang/SP v0.0.1

Jay Nelson jay@REDACTED
Sun Sep 16 11:21:37 CEST 2012


On Sep 16, 2012, at 10:18 AM, Zabrane Mickael wrote:
> 
> Could you please provide us with more real-world examples that could be solved by Erlang/SP?

I plan to add a simple tutorial to show how to use the existing patterns,
based on parsing an HTTP request sent by a browser.
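
As a quick taste of the parsing involved (this is not the tutorial and not
a co-op, just the built-in packet decoder applied to the request line):

parse_request_line(Bin) ->
    %% erlang:decode_packet/3 with http_bin splits off the request line.
    case erlang:decode_packet(http_bin, Bin, []) of
        {ok, {http_request, Method, Uri, Version}, Rest} ->
            {Method, Uri, Version, Rest};
        Other ->
            Other    %% {more, Length} or {error, Reason}
    end.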

> 
> Finally, what advantages does the esp_cache (https://github.com/duomark/erlangsp/tree/master/apps/examples/esp_cache) offer compared to ordinary cache implementations?
> 

Flexibility, concurrency, and a lesson that what works in imperative
approaches is not necessarily the best approach for functional
languages. I believe most people don't yet have the hardware to see
an advantage from this type of cache in the general case.

This was a proof-of-concept that exercised the library and tried to
challenge the assumptions built into the initial API. It was based
on https://github.com/mattsta/pcache, which was in turn based on an
earlier paper I had written for ICFP 2006 in response to an erlang list
question about how to implement an LRU cache (my answer was: don't;
think in terms of erlang's preferences and make a concurrent cache
with each datum implementing an idle timeout).
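
As a rough sketch of that idea in plain Erlang (this is neither the
erlangsp nor the pcache API; module and message names are invented),
each cached datum is its own process and simply evicts itself after an
idle timeout, so no central LRU bookkeeping is needed:

-module(idle_datum).
-export([start/2, get/1]).

%% One process per cached value; it expires after IdleMs of inactivity.
start(Value, IdleMs) ->
    spawn(fun() -> loop(Value, IdleMs) end).

%% Synchronous fetch from a datum process.
get(DatumPid) ->
    DatumPid ! {get, self()},
    receive
        {datum, DatumPid, Value} -> {ok, Value}
    after 5000 -> {error, timeout}
    end.

loop(Value, IdleMs) ->
    receive
        {get, From} ->
            From ! {datum, self(), Value},
            loop(Value, IdleMs)
    after IdleMs ->
            expired    %% no request within IdleMs: the datum evicts itself
    end.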

The esp_cache implementation is not yet complete: it lacks a default
timeout for expiration. That timeout should be produced by the M:F(A)
that generates the datum, so it can be based on the cost of the
generation function and the memory size of the value, and can be set
independently for each datum.
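
One hypothetical way that could look (this is not the current esp_cache
API, and db:load_user/1 is just a stand-in generator): the M:F(A)
measures its own cost and size and returns the value together with an
expiration:

%% Generator that picks its own expiration from the cost of producing
%% the value and the memory it occupies.
fetch_user(UserId) ->
    {Micros, Value} = timer:tc(db, load_user, [UserId]),
    Bytes = erlang:external_size(Value),
    BaseMs = 60000,
    CostBonusMs = Micros div 100,     %% roughly 1s longer per 0.1s of work
    SizePenaltyMs = Bytes div 1024,   %% 1ms shorter per KB of payload
    TimeoutMs = max(BaseMs, BaseMs + CostBonusMs - SizePenaltyMs),
    {Value, TimeoutMs}.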

I chose this example to test co-ops because:

1) It has a fanout design
2) It can quickly use a lot of processes
3) It required both static nodes and dynamic nodes
4) It required a circular data path (inserting a new node to the directory)
5) It will be interesting to watch when visualization is available
6) It demonstrates simple concurrency control via a configurable number of workers (a sketch of this fanout follows the list)
7) The directory is a serial bottleneck that can be made parallel declaratively in a future co-ops release
8) The RELEASE project (http://release.softlab.ntua.gr/bencherl/index.html) may provide insight into how it scales
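
Here is the fanout idea from points 1 and 6 as a tiny sketch in plain
Erlang (not the co-ops API; the names are invented): a configurable pool
of identical workers, with requests spread across them by hashing the key.

-module(fanout_demo).
-export([start/1, request/3]).

%% Start a configurable number of identical workers.
start(NumWorkers) ->
    list_to_tuple([spawn(fun worker/0) || _ <- lists:seq(1, NumWorkers)]).

%% Route a request to one worker, chosen by hashing the key.
request(Workers, Key, Fun) ->
    Idx = erlang:phash2(Key, tuple_size(Workers)) + 1,
    element(Idx, Workers) ! {work, self(), Fun},
    receive {done, Result} -> Result end.

worker() ->
    receive
        {work, From, Fun} ->
            From ! {done, Fun()},
            worker()
    end.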

I'm not sure that this caching scheme is particularly beneficial in a production
environment, but it is flexible and allows for experimenting with the replacement
policy and adaptiveness of the cache, as well as completely separating the
logic of managing the active data items from the logic of caching. I suspect it
will be invaluable as a teaching tool once visualization is available, as students
could make many easily isolated alterations and observe the resulting
change in caching behavior. The use of a process for a cached entry allows
more than just static data to be the target of caching.
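
For example (again just a plain Erlang sketch with invented names, not
esp_cache itself), the datum loop sketched earlier needs only one extra
clause to hold a live, updatable value instead of a static one:

updatable_loop(Value, IdleMs) ->
    receive
        {get, From} ->
            From ! {datum, self(), Value},
            updatable_loop(Value, IdleMs);
        {put, NewValue} ->
            %% the cached entry is a live process, so it can accept updates
            updatable_loop(NewValue, IdleMs)
    after IdleMs ->
            expired
    end.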

Some examples of lessons that can be taught:

1) The M:F(A) delay to generate a cached item is out of band from other requests
2) What happens when a new, slow to obtain datum spikes in requests and directory races arise?
3) How do different directory structures compare (a single process/function is all that is changed)
4) No replacement policy other than idle expiration is provided; add an external policy that injects events
5) Mirror request statistics in the population of cached items by replicating a datum process after N requests
6) Show how a requester of a cached item can asynchronously cause it to be handed to a different receiver (see the sketch after this list)
7) Implement a lookup API on the resulting delivered datum process, so that each cached item is a collection of data
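
Lesson 6 comes almost for free with the idle_datum sketch above: nothing
forces the reply target to be the caller, so a requester can ask the
datum process to deliver its value to some other receiver (names are
still invented, not the esp_cache API):

%% Ask DatumPid to send its value to Receiver instead of to the caller.
hand_off(DatumPid, Receiver) when is_pid(Receiver) ->
    DatumPid ! {get, Receiver},
    ok.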

This approach _may_ be useful in production when most of the following are true:

1) You have many cores (at least 10% of the cache population?) 
2) Extracting a datum is very expensive either in time or memory
3) The structure of each datum is large, complex, or is itself another entire data structure with its own lookup API,
or 3) The datum is a dynamically changing value whose latest value is needed by multiple receivers
4) Communicating the datum via erlang messages is not too expensive in terms of copying
5) The rate at which data is requested does not exceed erlang's messaging speed
6) Most concurrent requests route to different cached datum instances (use replicated processes to balance hotspots; a sketch follows the list)
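
As a sketch of point 6 (invented names again): a hot item can be
replicated into several identical datum processes, with requesters
spread across the replicas, here by hashing the requester's pid so the
choice is cheap and stable.

%% Pick one replica of a hot datum per requester and ask it for the value.
get_hot(Replicas, From) when is_list(Replicas), Replicas =/= [] ->
    Idx = erlang:phash2(From, length(Replicas)) + 1,
    lists:nth(Idx, Replicas) ! {get, From},
    ok.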

Don't forget that co-ops can output to other co-ops, so you may cascade any of the above enhanced caches!

jay



