[erlang-questions] Using processes to implement business logic
Camille Troillard
lists@REDACTED
Fri Jan 30 14:11:14 CET 2015
Hi Craig,
Thank you for your answer; it is of incredible value.
I foresee more questions, related to distribution... when I get there.
Cam
On 30 Jan 2015, at 13:27, zxq9 <zxq9@REDACTED> wrote:
> On Friday, 30 January 2015 11:52:27, Camille Troillard wrote:
>> Hi,
>>
>> I am looking for opinions about using processes to encapsulate the state of
>> business entities.
>>
>> It looks like, in the context of our problem, we should have more advantages
>> implementing our domain model using processes rather than simple Erlang
>> records. It also appears that processes will act as a nice cache layer in
>> front of the persistent storage.
>>
>> So, what are your experiences?
>
> I've found processes to be extremely flexible with regard to representing
> business entity state. There are a few things to consider before you can gain
> much from process-based encapsulation of state, though.
>
> A determination must be made about what a useful granularity is for your
> business entities. In the case I deal with I have found it useful to start
> with a completely normalized relational data schema and build a hierarchy of
> structures useful to users up from there. It looks something like this (a
> rough record-syntax sketch follows the list):
>
> * Elementary record
> - As low as it gets; a schema of normalized relations.
>
> * Record
> - Practically useful assembly of elementary records and other records.
>
> * Document
> - Wraps whatever level of record the user wants to deal with in display,
> export and editing rules. This is the essence of a client-side application
> (regardless of what language or paradigm the client is written in -- I've toyed
> with wxErlang for this, but Qt has sort of been a necessity because of ease of
> cross platform deployment).
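>
> To make that concrete, here is a rough sketch in plain record syntax (every
> name below is invented purely for illustration):
>
>   %% Elementary records: rows straight out of the normalized schema.
>   -record(person,  {id, family_name, given_name}).
>   -record(address, {id, person_id, line1, city, country}).
>
>   %% A "record" in the sense above: a practically useful assembly.
>   -record(contact, {person :: #person{},
>                     addrs  :: [#address{}]}).
>
>   %% A "document": a record wrapped in display/export/editing rules.
>   -record(contact_doc, {contact    :: #contact{},
>                         edit_rules :: map(),
>                         render     :: fun((#contact{}) -> iodata())}).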
>
> One form of business logic is encapsulated by the relational rules and the
> shape of the relational schema. A choice has to be made whether to make
> changes cascade at the database level or within application server code. There
> is no "right" answer to the question of what level to propagate data updates,
> but the shape of the data being declared to follow a strict normalized
> relational schema is important if extensibility is a concern (and with
> business data it always is).
>
> My choice has been to propagate notification of changes among record processes
> according to whatever other processes or external entities are subscribed to
> update notifications, but have the database schema cascade changes to foreign
> keys on its own (normalized relations primarily consist of primary keys and
> foreign keys, though). This choice forces a commitment to having client code
> (or record processes) calculate derived values, and to using the database rules
> only for data integrity enforcement.
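>
> A tiny, purely illustrative example of what "derived values live in Erlang"
> means (the record and function names are made up):
>
>   -record(line, {id, qty, price}).
>
>   order_total(Lines) ->
>       lists:sum([Q * P || #line{qty = Q, price = P} <- Lines]).
>
>   %% Called when a change notification for one line arrives; the database is
>   %% only trusted to keep the keys consistent, never to compute the total.
>   on_line_changed(NewLine, Lines0) ->
>       Lines = lists:keystore(NewLine#line.id, #line.id, Lines0, NewLine),
>       {order_total(Lines), Lines}.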
>
> Where before I had used materialized views to cache sets of data, I now use
> mnesia as a denormalized cache. Mnesia cannot store tables larger than 2GB,
> but this has not been a practical limitation within a single
> installation/client site (so long as BLOBs are stored as files, and only
> references to them are stored in the database rows). If this ever does become
> a limitation a caching strategy other than general memoization will become
> useful, but I've not hit any walls yet.
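>
> For what it's worth, the cache side looks roughly like this (table and field
> names are invented; only file paths to BLOBs go into the row):
>
>   -record(doc_cache, {doc_id, fields = #{}, blob_paths = []}).
>
>   init_cache() ->
>       mnesia:create_table(doc_cache,
>                           [{attributes, record_info(fields, doc_cache)},
>                            {ram_copies, [node()]}]).
>
>   cache_put(DocId, Fields, BlobPaths) ->
>       mnesia:dirty_write(#doc_cache{doc_id     = DocId,
>                                     fields     = Fields,
>                                     blob_paths = BlobPaths}).
>
>   cache_get(DocId) ->
>       case mnesia:dirty_read(doc_cache, DocId) of
>           [Row] -> {ok, Row};
>           []    -> miss
>       end.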
>
> Records that a user has "open" or "active" are instantiated as processes.
> These records subscribe to the records they depend on so they receive/push
> updates among each other. In this way User A using Client A can update some
> data element X, and X will notify its underlying record process, which will
> propagate the change across the system downward to the underlying records and
> database, and upward to User B on Client B who has a related document open.
> This can take some getting used to for users who have grown accustomed to the
> typical "refresh the web page to see updates" form of editing or single-user
> business applications. (At the moment these living records exist on the
> application server, but it could be a delegated task if the clients were also
> Erlang nodes (but not a part of the server's cluster), if each table's owning
> process managed the subscription system instead of each record. Just haven't
> gotten that far yet.)
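>
> In case it helps, the shape of such a "live" record process is roughly the
> following sketch (module, message and field names are all hypothetical, and
> error handling is omitted):
>
>   -module(live_record).
>   -behaviour(gen_server).
>   -export([start_link/2, subscribe/1, update/2]).
>   -export([init/1, handle_call/3, handle_cast/2, handle_info/2]).
>
>   -record(state, {id, data, deps = [], subs = []}).
>
>   start_link(Id, Deps) -> gen_server:start_link(?MODULE, {Id, Deps}, []).
>
>   subscribe(Rec)    -> gen_server:call(Rec, {subscribe, self()}).
>   update(Rec, Data) -> gen_server:cast(Rec, {update, Data}).
>
>   init({Id, Deps}) ->
>       %% Register interest in every record this one is assembled from.
>       [ok = subscribe(Dep) || Dep <- Deps],
>       {ok, #state{id = Id, deps = Deps}}.
>
>   handle_call({subscribe, Pid}, _From, State = #state{subs = Subs}) ->
>       {reply, ok, State#state{subs = [Pid | Subs]}}.
>
>   handle_cast({update, Data}, State) ->
>       %% A user edited this record: store and fan out to subscribers.
>       notify(State#state.subs, State#state.id, Data),
>       {noreply, State#state{data = Data}}.
>
>   handle_info({changed, _DepId, _What}, State) ->
>       %% A dependency changed: recompute whatever is derived, then fan out.
>       notify(State#state.subs, State#state.id, State#state.data),
>       {noreply, State}.
>
>   notify(Subs, Id, What) ->
>       [Pid ! {changed, Id, What} || Pid <- Subs],
>       ok.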
>
> This sort of data handling requires a lot of consideration about what
> "normalization" means, and also care when defining the record schemas. From
> records, though, it is easy to write OOP GUI code, or process-based wxErlang
> GUI code (which is easier, but harder to deploy on Windows, and impossible on
> mobile just now) without your head exploding, and gets you past the "Object-
> Relational Mismatch" problem. The tradeoff is all that thought that goes into
> both the relational/elementary record schema and the aggregate record schemas,
> which turn out to look very different. It requires a considerable amount of
> time to get folks who have only ever used an ORM framework on track with doing
> Objects-as-processes/records and elementary records as normalized relations --
> I have not found a magic shortcut to this yet.
>
> You will not get the schemas right the first time, or the second. Any data
> that is "just obvious" at first will probably prove to be non-trivial at its
> root. For example:
> - tracking people's names in different languages
> - properly dealing with scripts instead of just "languages"
> - doing addresses + location properly
> - making the intuitive leap that families are more like contract organizations
> which cover a time span instead of a simple {Husband, Wife, [Kids]} tuple
> - business relationship tracking
> - event timelines
> - non-Western units of measure
> - anything to do with calendars
> - etc.
>
> Even without all this architecture and just beginning with a relatively dirty,
> denormalized schema in mnesia or ETS tables it is possible to see how much
> more interesting "live" records defined as processes that are aware of their
> interdependencies can be. Combining this with a subscribe/publish model is
> very natural in Erlang. But even with a smallish store of business data you
> will have to find a way to distinguish between an "active" record and one that
> needs to reside latent as a collection of rows in tables. If you instantiate
> everything you can quickly find yourself trying to spawn not a few tens of
> thousands, but millions of processes (I think this is why you ask your next
> question below).
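>
> One way to keep records latent until needed, sketched with the built-in
> global registry (deps_of/1 is a hypothetical lookup, and a real system would
> want gproc or a table-owning process to close the register/lookup race):
>
>   ensure_open(Id) ->
>       case global:whereis_name({record, Id}) of
>           undefined ->
>               {ok, Pid} = live_record:start_link(Id, deps_of(Id)),
>               yes = global:register_name({record, Id}, Pid),
>               Pid;
>           Pid when is_pid(Pid) ->
>               Pid
>       end.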
>
> Making each table or type of record a table- or store-owning process and doing
> pub/sub at that level may be a golden compromise or might wind up creating
> bottlenecks. This is part of my thinking behind making the client-side code
> Erlang also, because it seems like a very smooth method of delegation. The
> only way to really know is through experimentation. I imagine that there is
> probably a golden balance somewhere in the middle, but I haven't had to locate
> it yet, and in any case I am still discovering ways to do things.
>
> One thing that is obvious, though, is that my method of writing data
> definitions could be refined a bit and interpreted to generate much of the
> simpler record Erlang code, the SQL definitions, and probably the ASN.1
> definitions also (btw, it turns out things like JSON are not sufficient for
> doing business data reliably, and XML is a different sort of nightmare --
> boring, old, stodgy ASN.1 is the right tool in this case). Leveraging the
> information in the data definitions more completely would make experimentation
> a lot faster. As with anything else, it's a time/money tradeoff, and one I am
> not in a position to make in my favor yet.
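>
> Purely as an illustration of what I mean by "interpreting" a data
> definition, something as small as the following term could be walked to emit
> the -record() declaration, the SQL DDL and the ASN.1 type (field names and
> type atoms are invented):
>
>   {person,
>    [{id,          integer, [primary_key]},
>     {family_name, utf8,    [not_null]},
>     {given_name,  utf8,    []},
>     {born,        date,    []}]}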
>
>> Now another question... given this “actor” based approach, I am having
>> difficulty figuring out a proper way of dealing with process lifetimes.
>> How would you do this in practice? Manually, or implement simple garbage
>> collection, reference counting, ...?
>
> Whenever a record's subscription count hits zero, it retires. This is a form
> of reference counting that is a natural outcome of the subscription "open a
> document" and "close a document/crash/drop connection" actions. So far this
> has been entirely adequate.
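>
> In code terms it is no more than an extra clause on the hypothetical
> live_record sketch above: when the last subscriber goes, the process stops.
>
>   handle_call({unsubscribe, Pid}, _From, State = #state{subs = Subs0}) ->
>       case lists:delete(Pid, Subs0) of
>           []   -> {stop, normal, ok, State#state{subs = []}};
>           Subs -> {reply, ok, State#state{subs = Subs}}
>       end.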
>
> I've written this in a bit of a rush; hopefully I explained more than I
> confused. There are a million more things to discover about how to make a
> system like this do more of the heavy lifting and deliver better value to users.
>
> -Craig