[erlang-questions] [ANN] Rivus - Erlang Complex Event Processing

Darach Ennis darach@REDACTED
Mon Jan 6 03:31:36 CET 2014


Hi Vasco,

[Resent to all]

Looks like you're off to a good start here towards a SQL-based CEP engine
for Erlang/OTP.
There are arguably 4 key constructs that Rivus would need to support to
qualify as a CEP engine:

1. Continuous Query
2. Windows (aggregation)
3. 'Complex' pattern matching (eg: temporal patterns across streams, stream
combinators, ...)
4. A domain specific language.

So, Rivus would basically qualify in principle under 1, 2 and 4 here but
not entirely under 3. I saw no support for combinators, temporal pattern
matching, state (tables or variables), concurrency or data parallelism.
However, as the syntax allows the definition of multiple correlations it
could be said that simultaneity (of queries) is supported - and this is
essential.

A potential issue, depending on your target audience, is that the DSL
doesn't leverage the facilities in core Erlang for expressive pattern
matching and tuple processing, nor its native support for concurrency and
parallelism, which could be exposed as concurrency and data parallelism of
queries. StreamBase (note: I used to work for them so I'm probably biased)
supports concurrency and (data) parallelism in its SQL and visual
(flow-based) languages, and in the hands of an expert (eg: someone intimate
with the runtime) this is very powerful.

Apama is worth looking at - its runtime was inspired by the Erlang
concurrency model (allegedly) and its MonitorScript language supports
processes.

In CEP languages and environments where these and more features are
provided, they don't actually help the poor CEP developer - they hinder.
Large complex CEP algorithms are difficult to evolve, maintain and support
in most organisations. The tools typically have weak debugging and
refactoring support, if any, and the DSLs aren't standard and often have
constructs peculiar to the lineage of the technology (it may have started
out life as an active database or as a log processing tool or as a captured
packet analyser ...).

StreamBase, by far, has the best tooling (yup, I'm biased), IDE, debugging
and refactoring support. But a significant component of any successful CEP
solution is native code. A question I've been grappling with is: which bits
of CEP would be useful within Erlang as a library or service? I've
experimented with what I think are the two most useful, namely: aggregate
window processing and data flow algorithm definition. Rather than defining
a SQL dialect to enshrine the conditions under which they can be leveraged,
plain old Erlang provides a richer environment for these to be used
fruitfully. Why write a DSL when Erlang is a 'real language' - one that
makes pattern matching easy, with distribution and concurrency built in if
you need it?

https://github.com/darach/beam-erl

Embeddable data flow library. Branch, Pipe, Combine, Filter and Transform
data.
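
As a rough illustration of what such a data flow can look like in plain
Erlang, here is a minimal sketch. It is not beam-erl's actual API: the
module name flow_sketch and every function in it are hypothetical, and a
"stage" is assumed to be a fun that maps one event to a list of zero or
more events.

```erlang
-module(flow_sketch).
-export([filter/1, transform/1, pipe/1, branch/1, run/2, demo/0]).

%% Hypothetical sketch, not the beam-erl API.
%% A stage is a fun(Event) -> [Event]: zero or more output events.

%% Filter: keep the event only if the predicate holds.
filter(Pred) ->
    fun(E) -> case Pred(E) of true -> [E]; false -> [] end end.

%% Transform: map each event to exactly one new event.
transform(F) ->
    fun(E) -> [F(E)] end.

%% Pipe: feed every output of one stage into the next stage.
pipe(Stages) ->
    fun(E) ->
        lists:foldl(fun(Stage, Events) -> lists:flatmap(Stage, Events) end,
                    [E], Stages)
    end.

%% Branch: send the same event down several flows and combine the results.
branch(Flows) ->
    fun(E) -> lists:flatmap(fun(Flow) -> Flow(E) end, Flows) end.

%% Run a flow over a list of input events.
run(Flow, Events) ->
    lists:flatmap(Flow, Events).

demo() ->
    Evens = filter(fun(X) -> X rem 2 =:= 0 end),
    Flow = pipe([Evens,
                 branch([transform(fun(X) -> {double, X * 2} end),
                         transform(fun(X) -> {square, X * X} end)])]),
    run(Flow, [1, 2, 3, 4]).
```

Calling flow_sketch:demo() returns
[{double,4},{square,4},{double,8},{square,16}]. Because stages are ordinary
funs, any of them could be moved into its own process if concurrency is
needed.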

https://github.com/darach/eep-erl

Sliding, Tumbling, Periodic and Monotonic aggregation.

In the case of aggregation, once the window semantics are defined the
window functions can be provided by extension (use an Erlang behaviour). So
you could use this to allow user defined (aggregate) functions in your SQL
dialect. The SQL statements and clauses could be compiled to an
intermediary form that supports a more primitive flow language and runtime,
making building out the CEP engine a little bit easier. This also favors
plugging in user defined constructs, allowing the language itself to be
extended through user defined operations.
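
The idea of fixing the window semantics and supplying the aggregate
function through a behaviour can be sketched as follows. This is an
assumption-laden illustration, not eep-erl's actual interface: the module
name tumbling_sketch and the callback names init/0, accumulate/2 and emit/1
are made up for the example, and the window is a count-based tumbling
window. To stay in a single file, the module both declares the callbacks
and implements them for a simple sum aggregate.

```erlang
-module(tumbling_sketch).
-export([new/2, push/2, demo/0]).
-export([init/0, accumulate/2, emit/1]).

%% Hypothetical aggregate contract (not eep-erl's actual behaviour).
-callback init() -> State :: term().
-callback accumulate(State :: term(), Event :: term()) -> NewState :: term().
-callback emit(State :: term()) -> Result :: term().

%% Create a tumbling window of Size events over aggregate module Mod.
new(Mod, Size) when is_atom(Mod), is_integer(Size), Size > 0 ->
    {Mod, Size, 0, Mod:init()}.

%% Push one event into the window.
%% Returns {noop, Window} while the window is filling, or
%% {emit, Result, FreshWindow} when the window closes.
push(Event, {Mod, Size, Count, State}) ->
    State1 = Mod:accumulate(State, Event),
    case Count + 1 of
        Size -> {emit, Mod:emit(State1), new(Mod, Size)};
        N    -> {noop, {Mod, Size, N, State1}}
    end.

%% Example aggregate: sum of the events in the window.
init() -> 0.
accumulate(Sum, X) -> Sum + X.
emit(Sum) -> Sum.

%% Sum seven events through a window of size 3.
demo() ->
    Step = fun(E, {Acc, W}) ->
               case push(E, W) of
                   {emit, R, W1} -> {[R | Acc], W1};
                   {noop, W1}    -> {Acc, W1}
               end
           end,
    {Results, _Open} = lists:foldl(Step, {[], new(?MODULE, 3)},
                                   [1, 2, 3, 4, 5, 6, 7]),
    lists:reverse(Results).
```

Calling tumbling_sketch:demo() returns [6,15]: two full windows are emitted
and the seventh event stays in the still-open window. A user defined
aggregate in a SQL dialect could then map to any module implementing the
same three callbacks.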

I haven't implemented temporal pattern matching or interesting complex
combinators because I haven't needed them myself of late. Even the window
aggregation is simple. Some CEP engines allow aggregates of multiple
dimensions (data, time, value, predicate expression) but these can
typically be (far) more easily defined through composition.

Judging from the SQL dialect syntax, the need so far is for expressivity in
filtering / detection based on fairly simple windowed events or windowed
temporal processing of event data. Complex processing of multiple event
sources against complex scenarios (eg: near real-time collusion or layering
fraud detection in capital markets market abuse and surveillance) without
extending the language dialect would be hard/impossible in its current
form.

It would be interesting to hear what the plans for the language are and
if/where plain old Erlang/OTP could be leveraged to extend the capabilities
of the engine. One concern though is that aggregation windows are based on
folsom, which in turn depends on ETS. If any of your queries contain large
numbers of aggregate operations - you'll spend a lot of time in ETS. If
your intermediate operation results aren't critical - that's a lot of
overhead you can avoid by providing your own aggregation/windowing logic
(or fork/steal eep-erl's).

Promising start to a SQL based CEP in Erlang/OTP! Thanks for sharing so
early in its development!

Cheers,

Darach.


On Mon, Jan 6, 2014 at 1:02 AM, Vassil Kolarov <vasco@REDACTED> wrote:

> Hi all,
>
> I'd like to announce a project called 'Rivus'.  The goal of the project is
> implementation of complex events processing application in Erlang, which
> uses a DSL similar to ESPER's EPL. It is in a very early stage, but could
> be considered sort of a "MVP".
>
> Here is the GitHub repository: https://github.com/vascokk/rivus_cep
>
> Hope you'll find it interesting and useful.
>
> Best regards,
> Vasco