[erlang-questions] Sending a large Erlang content to a set of remote nodes
Olivier BOUDEVILLE
olivier.boudeville@REDACTED
Fri Mar 29 10:46:54 CET 2013
Hello Joe,
erlang@REDACTED wrote on 28/03/2013 18:44:08:
> How big is the set of remote VMs (how many dozens)?
> Is the communication bandwidth between machines symmetric?
>
> Once two machines have got a copy they could *both* send the data to a
> third.
> Machine one sends the first half, machine two the second. Now three
> machines have a copy.
>
> Now three machines can send a copy to a fourth, the first can send
> the first third, ...
> and so on.
>
> Look up epidemic gossip protocols.
>
> This is a very nice exercise in parallel programming.
Yes, I agree; I had thought of a similar "peer-to-peer" mechanism (like
the one Bob hinted at), but for my current use case I think it would be
a bit of overkill.
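For what it is worth, here is a minimal sketch of how such a scheme spreads a file. It is written in Python for brevity and shows a simplified variant, not Joe's exact chunk-splitting proposal: in each round, every node that already holds the whole file forwards it to one node that does not. The function name and the numbering of nodes from 0 (the initial sender) are assumptions made for illustration only.

```python
def doubling_schedule(node_count):
    """Return a list of rounds; each round is a list of (sender, receiver) pairs.

    Node 0 is assumed to hold the file initially. In every round, each
    current holder sends the whole file to at most one node still lacking it.
    """
    holders = [0]
    pending = list(range(1, node_count))
    rounds = []
    while pending:
        transfers = []
        # Iterate over a snapshot: nodes served in this round only start
        # forwarding in the next one.
        for sender in list(holders):
            if not pending:
                break
            receiver = pending.pop(0)
            transfers.append((sender, receiver))
        holders.extend(receiver for _, receiver in transfers)
        rounds.append(transfers)
    return rounds


if __name__ == "__main__":
    for index, transfers in enumerate(doubling_schedule(5), start=1):
        print("round", index, transfers)
```

With this pairing the number of holders doubles each round, so for example 5 nodes are covered in 3 rounds.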
To give more context, the setting is a distributed discrete-time
simulation engine (Sim-Diasca) running on an HPC cluster (hence indeed with
rather homogeneous hosts and symmetric network links, and at least a few
dozen hosts), where sendfile will now be used (replacing the
previous solutions) for at least two purposes:
- during the deployment phase: sending a compressed archive,
containing the simulator code and data, to all computing nodes
(as no prerequisite is expected to be available on them beforehand);
currently the simulation archive is usually rather small, and various
delays mean that the parallel deployment processes are not really
synchronised (hence a "multi-sendfile", i.e. one file read and multiple TCP
sends, would not really be useful there);
- during a recovery phase: to address the reliability issues that
could arise in future large-scale simulations (despite a good MTBF for
each core/host/link, a sufficiently large number of cores makes
failures almost certain in longer simulations), the user will be able to
specify the maximum number (k) of hosts that may be lost simultaneously
during the simulation without crashing it. For that, each
time a simulation milestone is met, each node (currently one node per
host) is to send a compressed file containing a serialization of its
full state (mostly the state of its model instances) to the k nodes
securing it, so that a simulation rollback can be performed in case
of up to k simultaneous crashes. The size of each file should be roughly
that of the RAM of the corresponding node (some gigabytes),
so sendfile will be very useful there. A "multi-sendfile" (reading the
serialization file once and sending it to the k securing nodes
simultaneously) could also help, but it is low-priority for us,
and the current sendfile already seems a very good solution
(thanks Tuncer!). Moreover, as during this phase each node will send its
state to its k securing nodes and, reciprocally, will receive the
serialization data from the k nodes it secures, a fairly uniform,
already-saturating network load should exist by design.
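The "multi-sendfile" idea mentioned in both items (one file read, several TCP sends) can be sketched as follows. This is only an illustrative Python sketch, not the Erlang sendfile API: it copies each chunk through user space, so it does not get the zero-copy benefit of sendfile. The function name, the chunk size and the use of already-connected sockets are assumptions.

```python
CHUNK_SIZE = 64 * 1024  # arbitrary chunk size, chosen for illustration


def send_file_to_all(path, sockets, chunk_size=CHUNK_SIZE):
    """Read the file at 'path' once and send every chunk to all sockets.

    'sockets' is a sequence of connected, blocking stream sockets (one per
    recipient). Returns the number of bytes read from the file.
    """
    total = 0
    with open(path, "rb") as source:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break  # end of file
            # The same in-memory chunk is reused for every recipient,
            # so the disk is read only once.
            for recipient in sockets:
                recipient.sendall(chunk)
            total += len(chunk)
    return total
```

Because the sends are sequential and blocking, the slowest recipient paces all the others; a real implementation would probably want one sender process per recipient, fed with the chunks read once.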
But if, in the future, there were ever an Erlang-based generic, efficient,
transparent peer-to-peer file-exchange service between a set of nodes, of
course I would gladly integrate it :0)
Best regards,
Olivier.
>
> Cheers
>
> /Joe
>
>
>
> I was searching for a solution that would be
> reliable/simple/efficient to do so (preferably in that order),
> knowing that these terms could either be kept in the RAM of the
> sender or, maybe preferably (the size of the data being probably
> roughly on par with the local RAM), as a compressed file on disk.
>
> Currently I send a compressed binary archive in a basic
> Erlang message, but I think this is not good practice (e.g. maybe the
> kernel ticks are not sent "out of band", and their being delayed by larger
> archives could trigger spurious time-outs). I imagine sendfile with
> enough async threads could be a good candidate, however I am unsure
> that the same content (either as a whole or by chunks) could be read
> once, yet be sent to multiple recipients.
>
> Any idea?
>
> Thanks in advance for any hint!
>
> Best regards,
>
> Olivier.
> ---------------------------
> Olivier Boudeville
>
> EDF R&D : 1, avenue du Général de Gaulle, 92140 Clamart, France
> Département SINETICS, groupe ASICS (I2A), bureau B-226
> Office : +33 1 47 65 59 58 / Mobile : +33 6 16 83 37 22 / Fax : +33
> 1 47 65 27 13
>
> This message and any attachments (the 'Message') are intended solely
> for the addressees. The information contained in this Message is
> confidential. Any use of information contained in this Message not
> in accord with its purpose, any dissemination or disclosure, either
> whole or partial, is prohibited except formal approval.
> If you are not the addressee, you may not copy, forward, disclose or
> use any part of it. If you have received this message in error,
> please delete it and all copies from your system and notify the
> sender immediately by return message.
> E-mail communication cannot be guaranteed to be timely secure, error
> or virus-free.