[erlang-questions] Erlang and/or MPI

Fri Jun 28 03:16:20 CEST 2013

On 28/06/2013, at 1:40 AM, David Mercer wrote:

> On Wednesday, June 26, 2013, Richard A. O'Keefe wrote:
> 
>> There's another idea which has often been applied in Haskell, and that's
>> writing a program in a high level language that _generates_ low level
>> code.
> 
> Am I wrong in thinking LISP would be a good language for this (to write in, not to generate to)?  LISP programs are data, after all.

Lisp is of course a wonderful language for writing programs that
write programs.  Erlang programs are _also_ data, though not as
conveniently, which is one of the reasons why Lisp-Flavoured Erlang
(LFE) exists.

The Haskell people claim that an advantage of using Haskell for
meta-programming is its type system.  Debugging code written by
hand is hard enough; debugging code that was written by a program
is harder, *unless* something about the way it was generated gives
you high confidence in it.
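
To make "a program that writes a program" concrete, here is a minimal
sketch in Python (chosen only for brevity).  It is not how Atom or any
Haskell library works, and it has none of the type-system guarantees
mentioned above.  The names Var, Const, BinOp, emit and c_function are
made up for illustration.  The point is only that the generator builds
the C text from a checked data structure rather than by pasting strings.

```python
import math
from dataclasses import dataclass

# Hypothetical expression tree; these node types are illustrative only.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Const:
    value: float

@dataclass(frozen=True)
class BinOp:
    op: str        # one of "+", "-", "*"
    left: object
    right: object

def emit(expr):
    """Render an expression tree as a fully parenthesised C expression."""
    if isinstance(expr, Var):
        return expr.name
    if isinstance(expr, Const):
        if not math.isfinite(expr.value):
            raise ValueError("only finite constants can be written as C literals")
        return repr(float(expr.value))
    if isinstance(expr, BinOp):
        if expr.op not in ("+", "-", "*"):
            raise ValueError("unsupported operator: %r" % (expr.op,))
        return "(" + emit(expr.left) + " " + expr.op + " " + emit(expr.right) + ")"
    raise TypeError("not an expression node: %r" % (expr,))

def free_vars(expr):
    """Collect the names of all variables used in the expression."""
    if isinstance(expr, Var):
        return {expr.name}
    if isinstance(expr, BinOp):
        return free_vars(expr.left) | free_vars(expr.right)
    return set()

def c_function(name, expr):
    """Generate the text of a C function taking every free variable as a double."""
    params = ", ".join("double " + v for v in sorted(free_vars(expr))) or "void"
    return "double %s(%s) {\n    return %s;\n}\n" % (name, params, emit(expr))

if __name__ == "__main__":
    axpy = BinOp("+", BinOp("*", Var("a"), Var("x")), Var("y"))
    print(c_function("axpy", axpy))
```

Because every operator is parenthesised and every unknown node is
rejected, a whole class of precedence and typo bugs cannot reach the
generated C.  That is a (much weaker) version of the confidence argument.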

For example, Atom
(http://hackage.haskell.org/package/atom-1.0.12)
had this text in the message announcing the 0.0.2 release
(http://www.haskell.org/pipermail/haskell-cafe/2009-April/060602.html):

  Experiences with our Eaton project:
  - 5K lines of Haskell/atom replaced 120K lines of matlab, simulink,
    and visual basic.
  - 2 months to port simulink design to atom.
  - 3K lines of atom generates 22K lines of embedded C.
  - Design composed of 450 atomic state transition rules.
  - Rules with execution periods from 1ms to 10s all scheduled at
    compile time to a 1 ms main loop.
  - 3 minute compilation time from atom source to ECU.
* - Atom design clears electronic/sw testing on first pass.
* - Currently in vehicle testing with no major issues.

where the asterisks are mine.

For floating-point calculations, Repa
(http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Repa_Tutorial)
looks interesting.  Quoting that tutorial,
  Repa is a Haskell library for high performance, regular,
  multi-dimensional parallel arrays.  All numeric data is
  stored unboxed and functions written with the Repa combinators
  are automatically parallel (provided you supply "+RTS -N" on
  the command line when running the program).

As for Fortran, if it is written in an antique style, it _can_ be
painful to read.   If it is written using the modern features of
the language, it _can_ be quite easy to read.

I suggest that the single most important issue is getting the code
*right* in the first place.  I've just spent a day debugging some
code that manipulated quaternions.  There were two bugs:
-- my code for computing the absolute value while avoiding overflows
   and loss of precision was stupidly broken.  My fault.
-- the source I had consulted for the definition of the logarithm
   of a quaternion didn't bother mentioning that the formula they gave
   implied a division by zero for quaternions with a zero vector part
   and was outright wrong for negative real quaternions, (-w,0,0,0)
   with w > 0.  Their fault.
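
For the record, here is a sketch in Python of how those two cases can
be handled.  It is not the code I was debugging, and the function names
are made up.  The norm is scaled by the largest component so the squares
cannot overflow.  The logarithm uses the usual formula
log q = (ln |q|, (v/|v|) * theta), with theta = atan2(|v|, w), and
treats a zero vector part separately.  For a negative real quaternion
any unit vector u gives a valid logarithm (ln |w|, pi*u); picking the
i axis below is an arbitrary assumption, by analogy with the complex
logarithm.

```python
import math

def quat_abs(q):
    """Norm of q = (w, x, y, z), scaled to avoid overflow and underflow."""
    m = max(abs(c) for c in q)
    if m == 0.0 or math.isinf(m):
        return m
    # Every ratio c/m lies in [-1, 1], so the squares cannot overflow.
    return m * math.sqrt(sum((c / m) ** 2 for c in q))

def quat_log(q):
    """One logarithm of q = (w, x, y, z), returned as a 4-tuple."""
    w, x, y, z = q
    n = quat_abs(q)
    if n == 0.0:
        raise ValueError("log of the zero quaternion is undefined")
    vnorm = quat_abs((0.0, x, y, z))
    if vnorm == 0.0:
        # Zero vector part: v/|v| would be 0/0, so handle it explicitly.
        if w > 0.0:
            return (math.log(w), 0.0, 0.0, 0.0)
        # Negative real: the direction is not determined; the i axis is
        # an arbitrary choice (an assumption of this sketch).
        return (math.log(-w), math.pi, 0.0, 0.0)
    theta = math.atan2(vnorm, w)   # angle between q and the real axis
    s = theta / vnorm
    return (math.log(n), x * s, y * s, z * s)

if __name__ == "__main__":
    print(quat_abs((3e200, 4e200, 0.0, 0.0)))
    print(quat_log((-2.0, 0.0, 0.0, 0.0)))
```

Using atan2(|v|, w) rather than acos(w/|q|) is a deliberate choice: it
stays accurate when q is nearly real, which is exactly where the
trouble is.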

I think my favourite example was a clustering algorithm by some
highly respected researchers where the published pseudocode ran
smack into an uninitialised variable on the first iteration.

Array-crunching code can be *astonishingly* hard to get right.
I believe that this is because an uncommonly high proportion of
trivial mistakes result in syntactically legal (but wrong!) code.
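
A small made-up illustration in Python: the two functions below differ
only in the order of two subscripts.  Both are perfectly legal, both
run without complaint on square matrices, and they even agree whenever
the second matrix happens to be symmetric, so a careless test will not
tell them apart.  (The function names are mine, for illustration only.)

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def matmul_swapped(a, b):
    """Deliberately wrong: b[j][k] instead of b[k][j].

    For square matrices this runs fine but computes a times the
    transpose of b.
    """
    n = len(a)
    return [[sum(a[i][k] * b[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    symmetric = [[0, 1], [1, 0]]
    general = [[1, 2], [0, 1]]
    # The bug is invisible on the symmetric input ...
    print(matmul(a, symmetric) == matmul_swapped(a, symmetric))
    # ... and only shows up on the general one.
    print(matmul(a, general) == matmul_swapped(a, general))
```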

So I claim that the right language to *start* a project like this
in is the language that will let you code and debug the algorithm
most quickly.  Heck, it could well be Matlab or R.  See for
example http://cran.r-project.org/web/views/HighPerformanceComputing.html