[erlang-questions] How to do this in Erlang -- any suggestions ?

Mihai Balea mihai@REDACTED
Sun Jun 12 18:18:52 CEST 2011


On Jun 12, 2011, at 11:38 AM, Banibrata Dutta wrote:

> Great questions, and they certainly need clarification.
> cc'ing the group so that others can contribute too.
> 
> On Sun, Jun 12, 2011 at 8:48 PM, Mihai Balea <mihai@REDACTED> wrote:
> 
> On Jun 12, 2011, at 10:51 AM, Banibrata Dutta wrote:
> 
>> Prematurely sent.
>> 
>> On Sun, Jun 12, 2011 at 7:59 PM, Banibrata Dutta <banibrata.dutta@REDACTED> wrote:
>> 
>> What would be a good way to correlate asynchronous events and spot patterns over a sliding window (defined, for example, by number of events elapsed or by time elapsed), with millions of events occurring simultaneously, using Erlang?
>> 
>> The set of possible events is known, and any unknown event is just flagged as 'unknown' (so all unknowns are similar). The set of possible event patterns can be enumerated, but is possibly quite a large set of patterns.
>> 
>> I was wondering what approach could be taken to implement such a thing in pure Erlang. My initial thoughts were along the lines of maintaining FSMs per event source, but with so many events and so many possible/valid patterns, the thing seems kind of unwieldy. Also, I'd like a non-programmer to be able to define new events and valid event patterns.
>> 
>> I believe 'Complex Event Processing' is quite likely to be the standard approach for such things, as I've found from some posts, and solutions for it exist in the Java world. But both as an academic exercise (for the fun of learning) and for a potentially simpler and better solution, I would like to try doing this in Erlang.
> 
> I think you need to define your problem better.  
> 
> Sure, let me try.
>  
> What exactly do you mean by "millions of events occurring simultaneously"?
> 
> Okay, so I can say something like 500 events/second handled for correlation would be a more realistic number.
>  
> At exactly the same time?
> 
> Yes... some of the events might be from the same source, spaced by as little as 50ms, but most are from different sources. There could be some hierarchical relationship between sources. It is a very typical network management scenario. E.g. a faulty port on a switch could cause hundreds of destination-unreachable events, application response timeouts, heartbeat losses, etc.
>  
> Millions of events per second? Minute? Is that peak rate, average rate or minimum rate?
> 
> Okay, I got over-enthusiastic :-) . Say 100 events/second typical, 500 events/second peak, no real minimum.
> 
> What exactly is a pattern?
> 
> Node-A failed, power in room-X where Node-A is kept failed, Nodes B, C, D which are served through Node-A became unreachable, due to which Services L & M became unavailable, and due to which another dependent service N started giving inconsistent answers. So this is a pattern. However, in this case there's a possibility that the power failure had nothing to do with Node-A's failure, as backup power was available.
> 
> Another pattern is: power in room-X failed, then Node-A failed, leading to the failure of only Node D, because somehow Nodes B & C were dynamically configured to reroute.
> 
> What do you mean by "quite a large set of patterns"? Hundreds, thousands, millions?
> 
> Several hundreds is a distinct possibility, and thousands are not impossible, but millions -- probably not.
>  
> How long is that sliding window?
> 
> From a few minutes (for certain types of events) to a few days (for other types of events).
>  
> Can patterns encompass events coming from multiple sources or just one source?
> 
> Yes, indeed. However, in this case there needs to be a pre-defined "relationship" between the event sources, e.g. some sense of "topology" exists. It is likely that only 2% of the event sources are interrelated.
>  
> Are patterns concerned only with event ordering and occurrence, or are there timing issues involved as well?
>  
> Ordering, Timing, or any kind of causal relationship.


Okay, that's a bit more descriptive :)

First of all, don't dismiss your idea of using FSMs; however, you might want to make them a bit more flexible. Maybe have a bunch of processes running pattern recognizers, say one process per pattern or class of patterns. Maybe you can filter events based on source, so certain recognizers will only get certain events. If you have a low correlation between event sources, maybe you can even design your system to distribute the processing of unrelated events to different nodes in a cluster (a big scalability gain if you can do that). Another way of approaching this would be to duplicate the event stream to multiple nodes in your cluster and have each node look only for certain subsets of patterns.
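
To make the "one process per pattern, filtered by source" idea concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the module name `recognizer`, the `{Source, Type, TimestampMs}` event shape, and the notion that a pattern is just an ordered list of event types that must all appear within a time window. A real recognizer would likely need richer matching (topology, causality), so treat this as a starting point only.

```erlang
%% Hypothetical sketch: one process per pattern, fed only the events
%% whose source it subscribed to. All names here are illustrative.
-module(recognizer).
-export([start/4, route/2]).

%% Sources:  list of source ids this recognizer wants to see.
%% Pattern:  ordered list of event types that must occur in that order.
%% WindowMs: sliding window length in milliseconds.
%% Notify:   pid that receives {matched, Pattern} on a match.
start(Sources, Pattern, WindowMs, Notify) ->
    Pid = spawn(fun() -> loop(Pattern, WindowMs, Notify, []) end),
    {Pid, Sources}.

%% Deliver an event only to recognizers subscribed to its source.
%% Event = {Source, Type, TimestampMs}.
route({Source, _Type, _Ts} = Event, Recognizers) ->
    [Pid ! {event, Event} || {Pid, Sources} <- Recognizers,
                             lists:member(Source, Sources)],
    ok.

loop(Pattern, WindowMs, Notify, Window0) ->
    receive
        {event, {_Source, _Type, Now} = Event} ->
            %% Drop events that have slid out of the window, then append.
            Kept = [E || {_, _, T0} = E <- Window0, Now - T0 =< WindowMs],
            Window = Kept ++ [Event],
            Types = [Ty || {_, Ty, _} <- Window],
            Next = case matches(Pattern, Types) of
                       true ->
                           Notify ! {matched, Pattern},
                           [];          % start over after a match
                       false ->
                           Window
                   end,
            loop(Pattern, WindowMs, Notify, Next);
        stop ->
            ok
    end.

%% True if Pattern occurs as an in-order subsequence of Types.
matches([], _) -> true;
matches(_, []) -> false;
matches([T | Ps], [T | Ts]) -> matches(Ps, Ts);
matches(Ps, [_ | Ts]) -> matches(Ps, Ts).
```

As a usage illustration, a recognizer started with `recognizer:start([room_x, node_a], [power_fail, node_down], 60000, self())` and fed `{room_x, power_fail, 1000}` followed by `{node_a, node_down, 1500}` through `route/2` should send `{matched, [power_fail, node_down]}` back to the caller. Because `route/2` only needs pids, the same shape would also work if the recognizers lived on other nodes of a cluster.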

However, for a peak of 500 events/sec, you most likely can get away with running everything on a decently powerful machine (speaking from experience, we had a relatively similar, though simpler, system running just fine on a dual-core server, handling up to 800 events/sec).

If you intend to let non-programmers define events and patterns, you'll probably want to define some sort of DSL. Try as best you can to make it purely declarative; that way you can probably get away with files containing plain Erlang terms.
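
As a rough illustration of the "files containing Erlang terms" approach, the sketch below reads such a file with the standard `file:consult/1`. The file name `patterns.terms`, the `{event, ...}` and `{pattern, ...}` tuple shapes, and the module name `pattern_dsl` are all invented for this example; the actual vocabulary would depend on what your patterns need to express.

```erlang
%% patterns.terms (hypothetical file a non-programmer would edit):
%%
%%   {event, power_fail, [{source_type, room}]}.
%%   {event, node_down,  [{source_type, node}]}.
%%   {pattern, power_then_node,
%%       [{sequence, [power_fail, node_down]},
%%        {window_ms, 300000}]}.

-module(pattern_dsl).
-export([load/1]).

%% Read every term in the file and split them into event and
%% pattern definitions. Terms of any other shape are ignored.
load(Path) ->
    case file:consult(Path) of
        {ok, Terms} ->
            Events   = [{Name, Props} || {event, Name, Props} <- Terms],
            Patterns = [{Name, Props} || {pattern, Name, Props} <- Terms],
            {ok, Events, Patterns};
        {error, Reason} ->
            {error, Reason}
    end.
```

Since the file holds only data, loading it cannot run arbitrary code, and a syntax mistake surfaces as an `{error, Reason}` tuple from `file:consult/1` rather than a crash somewhere in the recognizers.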

Just a bunch of thoughts; hopefully this helps.

Mihai
