[erlang-questions] Robustness problems when using records [WAS: Re: Question/Alternative on Frames Proposal [Warning: Long]]

Tom Parker thpr@REDACTED
Sat May 19 22:29:39 CEST 2012


Hmm.  Given Richard's reference to "version skew" I had figured this issue had long since been clearly articulated.  Perhaps that is not a good assumption on my part.


Let us use the following assumptions:

(1) Erlang intends to deliver a robust system.  For purposes of this discussion, robust is defined to mean that a wrong answer (ignoring a fault in the programmer's logic) must not be produced.  Returning errors is acceptable, but a miscalculation is not.
(2) Erlang is a programming language used to build massively scalable soft real-time systems with requirements on high availability. (quoted from erlang.org)
(3) A data structure with a name->value relationship is useful for software developers, has material value when shared across different modules, and should be robust when shared across different modules.
(4) Halting a set of virtual machines for an undefined period of time to perform an update violates either high availability or soft real-time, depending on your point of view.
(5) The solution needs to be practical, and not require an ISO9000-grade manual of steps to make sure you maintain robustness.

If you do not agree with these assumptions, you might as well stop now.  Whether my statement is right or wrong turns entirely on the 5 assumptions above (actually, you can throw out either 4 or 5, but not both).

Given the above, I believe records force you to choose to break one of the following items:
(1) Robustness
(2) Soft Real-Time
(3) High Availability
(4) Practicality


I do not believe this can be solved by Makefiles.  In section 4 of his document (one can specifically refer to question 7), Richard treats the issue with records as one of compilation (What must be done to C? Recompile C).  However, just like "things that are not tested don't work", code which is compiled and not deployed doesn't exist.  Therefore the problem (for records) is not simply one of compilation, but one of compilation *and concurrent deployment of the new version of C with the new version of P*.


Therefore, I continue to assert my statement below is correct as written.  Records are the issue: specifically, the field names are destroyed at compile time, before they ever reach the VM.  Here is the counter-example showing why Make-like functionality is not sufficient to meet the assumptions above for the Erlang language/platform when records are the method used for the name->value relationship:

(1) Define a record, let's call it (shorthand, not real syntax) #rectangle{x, y, w, h}, place this into geom.hrl
(2) Compile geomlibrary and geomconsumer referencing geom.hrl
(3) Start distributed application with geomlibrary on machine#1 and geomconsumer on machine#2
(4) Modify #rectangle in geom.hrl to be {w, h, x, y}
(5) Recompile, and Makefile technology ensures both geomlibrary and geomconsumer are rebuilt.
(6) Deploy new BEAM files to a running (HA) application

Let us start with the assumption that deployment of geomlibrary and geomconsumer are independent.  There is a window of time in which the first module (whichever it is) is deployed and the second module is not.  This window of time, due to potential network and CPU latencies on the machines, is not strictly definable.  In the window of time between deployments, there is a robustness issue, as the wrong answer will be produced: whichever side is updated first, the code that reads w and h picks up the tuple positions in which the other side stored x and y, so an area function would calculate x*y during that period of time regardless of whether P or C was deployed first.  Therefore, we have a known robustness vulnerability if deployment is independent.  Therefore, deployment must be concurrent.
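
To make the window concrete, here is a minimal sketch of the skew.  It is an illustration only: the module name skew_demo and its function names are made up for this example, and the tuples are written out by hand to mirror what the compiler generates from the two versions of geom.hrl (a record becomes a tuple tagged with the record name, with fields in declaration order).

```erlang
-module(skew_demo).
-export([run/0, new_v1/4, area_v2/1]).

%% Old geom.hrl: -record(rectangle, {x, y, w, h}).
%% Code compiled against it builds this tagged tuple.
new_v1(X, Y, W, H) -> {rectangle, X, Y, W, H}.

%% New geom.hrl: -record(rectangle, {w, h, x, y}).
%% Code compiled against it reads w and h from positions 2 and 3.
area_v2({rectangle, W, H, _X, _Y}) -> W * H.

run() ->
    R = new_v1(10, 20, 3, 4),   % x=10, y=20, w=3, h=4
    area_v2(R).                 % matches without error, returns x*y
```

Calling skew_demo:run() returns 200 (x*y) rather than 12 (w*h).  No exception is raised, because the tag and the tuple size are the same in both versions.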

One option is to shut down the application and restart.  This violates the high availability requirement.

The other option is to make the deployment of multiple BEAM files to a running application a unit of work.  In order to do that, you have to halt any use of all of the modules involved in that unit of work across all of the machines potentially running those modules, and wait for acknowledgement back from all machines before any are allowed to proceed.  If any machine times out, you must not deploy the change to any of the machines.  This introduces a number of issues involved with guarantees of robustness across machines and "soft real time" response times... and has the ability to halt the application up to the defined timeout period, which may need to be substantial for a large scale application.  If you extend the problem to consider machines that are communicating over network connections and NOT sharing a cookie, this unit of work consideration becomes a nearly intractable problem of coordination and trust across potentially disparate organizations... in having to define, coordinate and trust a "shared unit of work".

So records fail for at least one of the following reasons:
(1) You have failed the robustness requirement by not doing a unit of work deployment and therefore have a window of time in which wrong answers (but no error) can be produced.
(2) You have failed the high availability requirement by forcing a full restart of the application.
(3) You have failed the soft real-time requirement due to unit of work deployment of BEAM files across a distributed application.
(4) You have failed practicality of deployment of BEAM files across a distributed application due to coordinating a "shared unit of work" across untrusted systems.

This choice of which of the important aspects of Erlang must be sacrificed is one of the fundamental reasons why frames or some equivalent proposal to have a name->value relationship _in the VM_ is required in order to maintain the ability to build robust, massively scalable soft real-time systems with requirements on high availability.  Frames, or some equivalent, ensure the name locations make it to the VM and the deployment of recompiled modules can be independent and robust.
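
As a rough illustration of what "names that reach the VM" buys, here is a sketch using Erlang maps.  Maps are not frames; they were added in OTP 17, after this thread, and are used here only as a stand-in for "some equivalent".  The module name named_demo is made up for the example.

```erlang
-module(named_demo).
-export([run/0, area/1]).

%% Field names are part of the run-time value, so the reader looks
%% fields up by name and never depends on their position.
area(#{w := W, h := H}) -> W * H.

run() ->
    R = #{x => 10, y => 20, w => 3, h => 4},
    area(R).
```

named_demo:run() returns 12 however the writer chose to order the keys, so the reading and writing modules can be redeployed independently without the silent x*y result.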
 
--Tom 

--
Tom Parker
thpr@REDACTED


________________________________
 From: Kostis Sagonas
To: erlang-questions@REDACTED 
Sent: Saturday, May 19, 2012 11:26 AM
Subject: [erlang-questions] Robustness problems when using records [WAS: Re: Question/Alternative on Frames Proposal [Warning: Long]]
 
On 05/18/2012 02:21 AM, Tom Parker wrote:
> ....
> 
> Where I think we would vehemently agree: I expect Erlang to be robust.
> That's why I'm even here. An issue with records where you can compile
> two files (using the same .hrl) and end up with a result that (a)
> compiles, (b) doesn't produce an error and (c) produces the wrong
> answer... is a serious issue. It distinctly shows the shortcomings of
> records.

Not really related to the frame discussion, but I would like to point out that the above statement is wrong. The situation you describe does not show shortcomings of records; instead it shows shortcomings of programming with .hrl files and without an appropriate 'make'-like utility to track dependencies between files.

Records are not without problems, but IMO this is not one of them. This is a problem of using .hrl files and choosing to program in a way which is not disciplined and thus error prone. There is a very simple way of avoiding this problem that IMO is a very nice way of programming: Use records as abstract data types and have them in a single module that exports appropriate getters and setters for manipulating fields of the record. If, for whatever reason, you do not like this way of using records and want the ability to perform pattern matching and field extraction in more than one module as you do today with records, better make sure you use an appropriate Makefile (or equivalent) for compiling your application. It's not the fault of record syntax if you do not use such a mechanism! (Makefiles are technology of the 70's after all...)
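
The abstract data type style described above might look like the following sketch.  The module name, the accessor names, and the choice of a rectangle are assumptions made for illustration; nothing here is taken from an actual library.

```erlang
-module(rectangle).
-export([new/4, width/1, height/1, area/1]).
-export_type([rectangle/0]).

%% The record is private to this module; no .hrl file is shared.
-record(rectangle, {x, y, w, h}).
-opaque rectangle() :: #rectangle{}.

%% Constructor: the only place outside code can create the value.
new(X, Y, W, H) -> #rectangle{x = X, y = Y, w = W, h = H}.

%% Getters: other modules never see the field positions.
width(#rectangle{w = W}) -> W.
height(#rectangle{h = H}) -> H.

area(#rectangle{w = W, h = H}) -> W * H.
```

For example, rectangle:area(rectangle:new(10, 20, 3, 4)) returns 12.  Reordering the fields then only requires recompiling this one module, at least on a single node.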

Kostis