[erlang-questions] List Question

Andrew McIntyre andrew@REDACTED
Wed Aug 9 01:40:22 CEST 2017


Hello Richard,

The OO implementation we have is in two parts:

1. parse into a tree with no knowledge of semantics

2. Use a specific Object to provide an interface to the data based on
its type


I am involved in the standards process, and there is a blog post here
for anyone interested:

https://kb.medical-objects.com.au/display/PUB/HL7v2+parsing


Essentially, the plan is to do the same in Erlang:

1. Parse into a tree

2. Create a unit (module) that has functions for reading the known
values for a specific segment. These are autogenerated, e.g. a msh.erl
that has a msh:Sending_Facility() function

3. Use reader functions that allow the message to become more complex
but permit old functions to continue to work
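
A minimal sketch of how these three steps might look, not the actual
design. The module name hl7_sketch, the tree shape and every function
name are assumptions made up for illustration. It assumes the default
delimiters (| and ^) with CR between segments, and ignores repetitions,
subcomponents and escape sequences. Erlang function names have to start
with a lowercase letter, so the accessor is written sending_facility.

```erlang
-module(hl7_sketch).
-export([parse/1, field/3, sending_facility/1]).

%% Step 1: parse into a tree with no knowledge of semantics.
%% Result: [{SegmentName, [Field]}], where each Field is a list of
%% component binaries. Assumption: default delimiters, CR-terminated
%% segments; repetitions (~), subcomponents (&) and escapes are ignored.
parse(Msg) when is_binary(Msg) ->
    [parse_segment(S) || S <- binary:split(Msg, <<"\r">>, [global]),
                         S =/= <<>>].

%% MSH-1 is the field separator itself and MSH-2 holds the encoding
%% characters, so MSH-2 is kept raw instead of being split on "^".
parse_segment(<<"MSH|", Rest/binary>>) ->
    [Enc | Fields] = binary:split(Rest, <<"|">>, [global]),
    {<<"MSH">>, [[<<"|">>], [Enc] | [components(F) || F <- Fields]]};
parse_segment(Seg) ->
    [Name | Fields] = binary:split(Seg, <<"|">>, [global]),
    {Name, [components(F) || F <- Fields]}.

components(Field) ->
    binary:split(Field, <<"^">>, [global]).

%% Step 3: generic reader. Returns field N of the first segment called
%% Name, or undefined when the segment or the field is absent.
field(Tree, Name, N) when is_integer(N), N >= 1 ->
    case lists:keyfind(Name, 1, Tree) of
        {Name, Fields} when N =< length(Fields) -> lists:nth(N, Fields);
        _ -> undefined
    end.

%% Step 2: the kind of accessor that could be generated into msh.erl.
%% Sending Facility is MSH-4. Only the first component is read, so the
%% function keeps working if the field later gains extra components.
sending_facility(Tree) ->
    first_component(field(Tree, <<"MSH">>, 4)).

first_component(undefined) -> undefined;
first_component([First | _]) -> First.
```

With this sketch, sending_facility/1 returns the same value whether
MSH-4 is sent as "Clinic A" or as "Clinic A^1.2.3^ISO", which is the
sense in which old functions continue to work as messages grow.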


Andrew

Wednesday, August 9, 2017, 9:27:24 AM, you wrote:

RAOK> From what I've been able to glean about the HL7 message
RAOK> format, there are two aspects:
RAOK> * the basic syntax is the multi-level delimited thingy but
RAOK> * there is also *semantics*, predefined message types,
RAOK>   an assortment of data types, rules for mapping data
RAOK>   types to trees and so on.

RAOK> From a quick look at the Elixir HL7 parser, they have
RAOK> taken steps to handle some (but perhaps not all) of the
RAOK> *semantics* of HL7 and don't just give a tree of strings,
RAOK> but more structured data.

RAOK> Parsing the multi-level delimited syntax is trivial.
RAOK> Dealing with the semantics is not.
RAOK> I think that in figuring out for yourself how to work
RAOK> with HL7 messages in Erlang, the starting point would
RAOK> be
RAOK>  - what message types do you want to handle?
RAOK>  - what kinds of data occur in them?
RAOK>  - how do you want to represent those kinds of
RAOK>    data in Erlang?
RAOK>  - do you actually want to represent *every* field
RAOK>    at all?  Some might not be relevant to you.
RAOK>  - would you be streaming messages through a
RAOK>    system (like some sort of pub/sub queueing
RAOK>    middleware), summarising messages, storing
RAOK>    them, or what?
RAOK>  - what does a type declaration for a message type
RAOK>    look like in HL7?  Is there some way to automatically
RAOK>    derive parsing code from that?

RAOK> What I'm getting at with the last point is that there
RAOK> is ASN.1 support for Erlang.  Give it an ASN.1
RAOK> definition, and you get Erlang code out the other end.
RAOK> I am particularly thinking of the PADS project:

RAOK> PADS: Processing Arbitrary Data Streams

RAOK> Kathleen Fisher and Bob Gruber, AT&T Labs

RAOK> Slides in ppt http://homepages.inf.ed.ac.uk/wadler/xmlbinding/

RAOK> Transactional data streams, such as sequences of stock-market
RAOK> buy/sell orders, credit-card purchase records, web server
RAOK> entries, and electronic fund transfer orders, can be mined very
RAOK> profitably. As an example, researchers at AT&T have built
RAOK> customer profiles from streams of call-detail records to significant financial effect.

RAOK> Often such streams are high-volume: AT&T's call-detail stream
RAOK> contains roughly 300 million calls per day requiring
RAOK> approximately 7GBs of storage space. Typically, such stream data
RAOK> arrives ``as is'' in ad hoc formats with poor documentation. In
RAOK> addition, the data frequently contains errors. The appropriate
RAOK> response to such errors is application-specific. Some
RAOK> applications can simply discard unexpected or erroneous values
RAOK> and continue processing. For other applications, however, errors
RAOK> in the data can be the most interesting part of the data.

RAOK> Understanding a new data source and producing a suitable parser
RAOK> are crucial first steps in any use of such data. Unfortunately,
RAOK> writing parsers for this kind of data is a difficult task, both
RAOK> tedious and error-prone. It is complicated by lack of
RAOK> documentation, convoluted encodings designed to save space, the
RAOK> need to handle errors robustly, and the need to produce
RAOK> efficient code to cope with the scale of the stream. Often, the
RAOK> hard-won understanding of the data ends up embedded in parsing
RAOK> code, making long-term maintenance difficult for the original
RAOK> writer and sharing the knowledge with others nearly impossible.

RAOK> The goal of the PADS project is to provide languages and tools
RAOK> for simplifying data processing. We have a preliminary design of
RAOK> a declarative data-description language, PADSL, expressive
RAOK> enough to describe the data feeds we see at AT&T in practice,
RAOK> including ASCII, binary, EBCDIC, Cobol, and mixed data formats.
RAOK> From PADSL we generate a tunable C library with functions for
RAOK> parsing, manipulating, and summarizing the data. In joint work
RAOK> with Mary Fernandez and Ricardo Medel, we are working to
RAOK> integrate PADS and XQuery to support declarative querying of
RAOK> data sources with PADS descriptions.

RAOK> --------------------
RAOK> The PADS project moved from AT&T to
RAOK> http://pads.cs.tufts.edu/doc.html

RAOK> I say this in all seriousness: if I had a need to process
RAOK> GB of HL7 data, I would start by seeing if PADS was adequate
RAOK> to describe it, and if so I'd write an HL7->Erlang data
RAOK> translator in C or ML (as PADS has C and ML versions).
RAOK> If not, I'd see what ideas I could steal from PADS.

RAOK> Using a declarative data language to describe the message types
RAOK> I was interested in would be an up-front cost, but it would
RAOK> hugely simplify later maintenance.






-- 
Best regards,
 Andrew                             mailto:andrew@REDACTED

sent from a real computer




