[erlang-questions] Keeping massive concurrency when interfacing with C

Tue Oct 4 05:28:05 CEST 2011

On Mon, Oct 3, 2011 at 10:05 PM, John Smith <emailregaccount@REDACTED> wrote:
> Sorry, I should've explained in more detail what we're trying to do.
> That would help, eh? :)
>
> In a nutshell, our goal is to take a portfolio of securities (namely
> bonds and derivatives), and calculate a risk/return analysis for each
> security. For risk, interest rate shock, and for return, future cash
> flows. There are different kinds of analyses you could perform.
>
> Here's a more concrete example. Pretend you're an insurance company.
> You have to pay out benefits to your customers, so you take their
> money and make investments with it, hoping for a (positive) return, of
> course. Quite often insurance companies will buy bonds, especially if
> there are restrictions on what they can invest in (e.g., AAA only).
>
> You need to have an idea of what your risk and return are. What's
> going to happen to the value of your portfolio if yields rise or fall?
> Ideally you want to know what your cash flows will look like in the
> future, so you can have a reasonable idea of what shape you'll be in
> depending on the outcome.
>
> One such calculation would involve shocking the yield curve (yields
> plotted against maturity). If yields rise 100 basis points, what
> happens to your portfolio? If they fall instead, how far would they
> need to fall before any of your callable bonds started being
> redeemed?
>
> Part of the reason why I think Erlang would work out well is the
> calculations for each security are independent of each other -- it's
> an embarrassingly parallel problem. My goal was to spawn a process for
> each scenario of a security. Depending on how many securities and
> scenarios you want to calculate, there could be tens or hundreds of
> thousands, which is why I would be spawning so many processes (I would
> distribute these across multiple machines of course, but we would have
> only a few servers at most to start off with).
>
> Because Erlang is so efficient at creating and executing thousands of
> processes, I thought it would be feasible to create that many to do
> real work, but the impression I get is maybe it's not such a great
> idea when you have only a few dozen cores available to you.
>
> CGS, could you explain how the dynamic library would work in more
> detail? I was thinking it could work like that, but I wasn't actually
> sure how it would be implemented. For example, if two Erlang processes
> invoke the same shared library, does the OS simply copy each function
> call to its own stack frame so the data is kept separate, and only one
> copy of the code is used? I could see in that case then how 20,000
> Erlang processes could all share the same library, since it minimizes
> the amount of memory used.
>
> David, the solution you described is new to me. Are there any
> resources I can read to learn more?
>
> Joe (your book is sitting on my desk as well =]), it's rather
> interesting that Erlang was purposely slowed down to allow for
> on-the-fly code changes. Could you explain why? I'm curious.
>
> We are still in the R&D phase (you could say), so I'm not quite sure
> yet which specific category the number crunching will fall into (I
> wouldn't be surprised if there are matrices, however). I think what
> I'll do is write the most intensive parts in both Erlang and C, and
> compare the two. I'd prefer to stick purely with Erlang though!
>
> We have neither purchased any equipment yet nor written the final
> code, so I'm pretty flexible to whatever the best solution would be
> using Erlang. Maybe next year I can pick up one of those 20K core
> machines =)

Given your description above, I'd probably just write the first
version of your application in Erlang. Normally I'm all for NIFs,
but your scenario doesn't strike me as the best fit (at least not
without first measuring the native Erlang version, which will be
easier to code and maintain initially).
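
As a rough sketch of what that first pure-Erlang version could look
like, here is one process per {security, scenario} pair, with the
parent collecting the results. The module name, the {bond, Yield,
CashFlows} tuple and the flat-yield discounting in price/2 are all
made up for illustration; they are placeholders, not a real pricing
model.

```erlang
-module(risk_sketch).
-export([run/2, price/2]).

%% Hypothetical placeholder valuation: present value of a list of
%% {YearsFromNow, Amount} cash flows, discounted at a flat yield that
%% has been shifted by ShockBps basis points.
price({bond, Yield, CashFlows}, ShockBps) ->
    Y = Yield + ShockBps / 10000,
    lists:sum([Amount / math:pow(1 + Y, T) || {T, Amount} <- CashFlows]).

%% Spawn one linked process per {Security, Shock} pair, then gather
%% the results in the same order the pairs were generated.
run(Securities, Shocks) ->
    Parent = self(),
    Jobs = [begin
                Ref = make_ref(),
                spawn_link(fun() -> Parent ! {Ref, price(S, Shock)} end),
                {Ref, S, Shock}
            end || S <- Securities, Shock <- Shocks],
    %% Ref is already bound here, so each receive waits for the reply
    %% that belongs to that particular job.
    [receive {Ref, Value} -> {S, Shock, Value} end || {Ref, S, Shock} <- Jobs].
```

For example, risk_sketch:run([{bond, 0.05, [{1, 5.0}, {2, 105.0}]}],
[-100, 0, 100]) returns one {Security, Shock, Value} tuple per shock.
Spawning tens of thousands of such processes is cheap; the scheduler
spreads them over however many cores you have.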

The reason here is that there's a noticeable cost to passing data
across the Erlang/(Driver|NIF|CNode) boundary, so whatever you do on
the C side has to be enough faster to more than make up for it. A
good example here is from Kevin Smith's talk at the last Erlang
Factory SF on using CUDA cards for numerical computations (he's
illustrating the CUDA memory transfer overhead, but the same basic
idea applies to passing data from Erlang to C).

Given that your examples sound (to my non-financially familiar brain)
like small calculations on lots of data, you might be pleasantly
surprised by the performance you'll get just from using Erlang across
a large number of cores. And even if you find out later that you'd
benefit from a small NIF that does your calculation in C using a
request queue, that's just as well, because you'll have measured that
you need it and will know exactly how much you're saving by using C.
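
One way to get that number is to time the candidate function before
deciding anything. The helper below is only a sketch (the module and
function names are made up); it uses timer:tc/1 to report the average
wall-clock microseconds per call, so you can run it once on the Erlang
implementation and again on a NIF-backed one if you ever write it.

```erlang
-module(bench_sketch).
-export([avg_us/2]).

%% Average wall-clock microseconds per call of the zero-arity Fun,
%% measured over N consecutive calls in the calling process.
avg_us(Fun, N) when is_function(Fun, 0), is_integer(N), N > 0 ->
    {Micros, ok} = timer:tc(fun() -> repeat(Fun, N) end),
    Micros / N.

%% Call Fun N times, discarding its result.
repeat(_Fun, 0) -> ok;
repeat(Fun, N) -> _ = Fun(), repeat(Fun, N - 1).
```

Something like bench_sketch:avg_us(fun() -> my_calc(Args) end, 100000)
(my_calc/1 being whatever you are measuring) is enough for a first
comparison. For a NIF the figure includes the cost of crossing the
boundary, which is exactly the part you want to see.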

Paul