[erlang-questions] Non-blocking BEAM code loading?

Mon Nov 7 18:23:29 CET 2011

Neat! A JIT prototype.  What is the story there?  Is that actively being
worked on somewhere by someone?  Can anyone provide details or an idea of
when we might see it?

Thanks,

-Anthony

On Mon, Nov 07, 2011 at 01:12:38PM +0100, Lukas Larsson wrote:
> It is NOT planned for R15B. It is, however, something the JIT prototype will
> need, so we will have to implement it as part of that work.
> On Nov 7, 2011 9:48 AM, "Tino Breddin" <tino.breddin@REDACTED> wrote:
> 
> > As Paolo mentioned, there is an optimization of the code upgrade strategy
> > on the roadmap for R15B, AFAIR. Currently, not only the code upgrade itself
> > but also every other task in the system is handled by a single core while
> > the upgrade runs. This means that on a heavily loaded multi-core system, all
> > load must be handled by that single core for the duration of the upgrade.
> > This might cause the delay you are seeing.
> >
> > T
> >
> > On Nov 6, 2011, at 4:33 PM, Paolo Negri wrote:
> >
> > > We run an application with thousands of long-lived processes, and
> > > we see the system blocking on code purge during code updates.
> > > I remember that Kenneth Lundin announced at the recent Erlang User
> > > Conference that something related to code loading optimization is on
> > > the Erlang roadmap; hopefully the slides will be published soon [1]. If I
> > > remember correctly, the change was about spreading code purge across
> > > all available cores, whereas currently a single core performs the
> > > operation.
> > >
> > > We also use the trick of compiling data into modules to push it into
> > > the constant pool. In our case, however, we have thousands of small
> > > terms (rendered as one function clause per term), and loading these
> > > modules doesn't seem to block, though I guess that in our case the
> > > overall size is much less than 60MB.
> > >
> > > [1] http://www.erlang-factory.com/conference/ErlangUserConference2011/speakers/KennethLundin
> > >
> > > Paolo
> > >
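The "one function clause per term" trick Paolo describes might look roughly like the sketch below. It is only an illustration under stated assumptions: the module name kv_mod, the function names load/2 and lookup/1, and the 'undefined' fallback are hypothetical and not taken from the thread, and keys are assumed to be simple terms such as atoms, integers, or tuples of those.

```erlang
-module(kv_mod).
-export([load/2]).

%% Hypothetical helper: compile a list of {Key, Value} pairs into module Mod
%% so that Mod:lookup(Key) returns Value, or 'undefined' for unknown keys.
%% Each pair becomes one function clause; the values end up as literals in
%% the loaded module instead of living on a process heap.
load(Mod, Pairs) when is_atom(Mod), is_list(Pairs) ->
    Clauses =
        [{clause, 3, [erl_parse:abstract(K)], [], [erl_parse:abstract(V)]}
         || {K, V} <- Pairs]
        ++ [{clause, 3, [{var, 3, '_'}], [], [{atom, 3, undefined}]}],
    Forms = [{attribute, 1, module, Mod},
             {attribute, 2, export, [{lookup, 1}]},
             {function, 3, lookup, 1, Clauses}],
    {ok, Mod, Bin} = compile:forms(Forms),
    %% Drop any previous old version first. Note that code:purge/1 kills
    %% processes that are still executing that old version.
    code:purge(Mod),
    {module, Mod} = code:load_binary(Mod, atom_to_list(Mod) ++ ".erl", Bin),
    ok.
```

Calling kv_mod:load(my_table, [{a, 1}, {b, 2}]) would then make my_table:lookup(a) available to every process on the node. Each call to load/2 is a full code load, so it goes through the same loading and purge machinery discussed in this thread.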
> > > On Sun, Nov 6, 2011 at 5:02 AM, Bob Ippolito <bob@REDACTED> wrote:
> > >> Normally just a few hundred. Purge isn't the slow part for us, and I
> > >> don't believe that it blocks at all (not that I noticed).
> > >>
> > >> On Saturday, November 5, 2011, Robert Virding
> > >> <robert.virding@REDACTED> wrote:
> > >>> If you have many processes then code loading can take a noticeable
> > >>> time. The code server must purge old versions of a module, which it
> > >>> does by going through all processes, checking whether each one is
> > >>> running the old code and, if so, killing it. I don't know if this
> > >>> blocks all the schedulers and if so why, but it can take a noticeable
> > >>> time to do.
> > >>>
> > >>> Robert
> > >>>
> > >>>
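The per-process check Robert describes can be observed from ordinary Erlang code. The following is a minimal sketch, not the code server's actual implementation; the module name purge_check and both function names are hypothetical. It relies only on erlang:processes/0, erlang:check_process_code/2 and code:soft_purge/1.

```erlang
-module(purge_check).
-export([old_code_users/1, try_purge/1]).

%% Pids of local processes that are still executing old code for Mod.
%% This is the same kind of scan over all processes that makes purging
%% take longer as the number of processes grows.
old_code_users(Mod) when is_atom(Mod) ->
    [Pid || Pid <- erlang:processes(),
            erlang:check_process_code(Pid, Mod) =:= true].

%% Purge old code only if no process is running it. Unlike code:purge/1,
%% code:soft_purge/1 never kills a process; it returns false instead.
try_purge(Mod) when is_atom(Mod) ->
    code:soft_purge(Mod).
```

Running old_code_users/1 before an upgrade gives a rough idea of how many processes a hard purge would kill.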
> > >>> ________________________________
> > >>>
> > >>> ETS is no good for our use case. We have ~60MB worth of uncompressed
> > >>> serialized terms (nested gb_trees mostly) that we need live in a given
> > >>> request. We traverse it very quickly and end up with a very small list
> > >>> of terms as the result (essentially a filter on a nested structure). A
> > >>> no-copy ets would work, but since the work is so short-lived and the
> > >>> code is tightly associated with this structure, I think that our
> > >>> current solution is appropriate as long as we can fix the blocking.
> > >>>
> > >>> "declare constant" may also work, but I think it is more practical to
> > >>> just make code loading better in the short term (which has other
> > >>> benefits). You could implement "declare constant" on top of the code
> > >>> loader; we have a mochiglobal module in mochiweb that basically serves
> > >>> that purpose.
> > >>>
> > >>> Using a module is a convenient way to give concurrent access to the
> > >>> data to hundreds of simultaneous processes with minimal serialization.
> > >>>
> > >>> -bob
> > >>>
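The idea of sharing one large, slowly changing structure through a module, as Bob describes, could be sketched as follows. This is not mochiglobal's actual code, just an illustration of the same approach; the module name const_term and the functions store/2 and term/0 are hypothetical. It assumes the term contains no funs, pids, ports, or references, since erl_parse:abstract/1 cannot represent those.

```erlang
-module(const_term).
-export([store/2]).

%% Hypothetical helper: compile Term into module Mod as the literal result
%% of Mod:term/0. Readers get a reference to the module's literal data
%% rather than a private copy, which is the point of the trick.
store(Mod, Term) when is_atom(Mod) ->
    Forms = [{attribute, 1, module, Mod},
             {attribute, 2, export, [{term, 0}]},
             {function, 3, term, 0,
              [{clause, 3, [], [], [erl_parse:abstract(Term)]}]}],
    {ok, Mod, Bin} = compile:forms(Forms),
    %% Remove the previous old version, then load the new one. Every
    %% update therefore goes through the code loader.
    code:purge(Mod),
    {module, Mod} = code:load_binary(Mod, atom_to_list(Mod) ++ ".erl", Bin),
    ok.
```

For example, after const_term:store(my_data, gb_trees:from_orddict([{1, a}, {2, b}])), any process can call gb_trees:lookup(2, my_data:term()). With a structure of tens of megabytes, the compile and load steps are where the cost discussed in this thread shows up.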
> > >>> On Saturday, November 5, 2011, Björn-Egil Dahlberg
> > >>> <wallentin.dahlberg@REDACTED> wrote:
> > >>>> Yes, it is a simple way (and currently the only way) to push data to
> > >>>> the constant pool. You could use ETS instead. It would of course also
> > >>>> remove data from the heap and reduce GC copy strain, but it would
> > >>>> introduce a copy on every read.
> > >>>> Björn Gustavsson talked about introducing a "declare constant"
> > >>>> function earlier, but I don't know if he has done any work on it. The
> > >>>> use case was the same as yours: pushing lookup structures from
> > >>>> gb_trees and gb_sets. But solving code loading would probably be a
> > >>>> better priority.
> > >>>> I would like to think that the garbage collector should solve this.
> > >>>> Data sets which are read-only and live are tenured to a generational
> > >>>> heap and not included in minor GC phases. Putting the data in a
> > >>>> constant removes it altogether, of course, but I would like the
> > >>>> garbage collector to identify and handle this with generational
> > >>>> strategies. The trade-off is that generational heaps linger and may
> > >>>> hold dead data longer than necessary.
> > >>>>
> > >>>>
> > >>>> Den 5 november 2011 21:30 skrev Bob Ippolito <bob@REDACTED>:
> > >>>>>
> > >>>>> We abuse code loading "upgrades" so that we can share memory and
> > >>>>> reduce GC pressure for large data structures that do not change
> > >>>>> quickly (once every few minutes). Works great except for all the
> > >>>>> blocking!
> > >>>>>
> > >>>>> On Saturday, November 5, 2011, Björn-Egil Dahlberg
> > >>>>> <wallentin.dahlberg@REDACTED> wrote:
> > >>>>>> There is no other locking for code loading than blocking. This is an
> > >>>>>> optimization, of course, since locking mechanism overhead is removed
> > >>>>>> from the equation. Code loading is not used all that often in the
> > >>>>>> normal cases besides startups and upgrades.
> > >>>>>> That being said, there are plans to remove this "stop-the-world"
> > >>>>>> strategy since it is blocking other strategies and optimizations.
> > >>>>>> Also, we are well aware that blocking degrades performance when
> > >>>>>> loading new modules and does not agree with our concurrency policy.
> > >>>>>> I think we can lessen the time blocked in the current implementation,
> > >>>>>> but the blocking strategy should (and probably will) be removed.
> > >>>>>> Nothing planned as of yet though.
> > >>>>>> Regards,
> > >>>>>> Björn-Egil
> > >>>>>>
> > >>>>>> 2011/11/5 Bob Ippolito <bob@REDACTED>
> > >>>>>>>
> > >>>>>>> We've found a bottleneck in some of our systems: when we load in
> > >>>>>>> large new modules there is a noticeable pause (1+ seconds) that
> > >>>>>>> blocks all of the schedulers. It looks like this is because the
> > >>>>>>> erlang:load_binary/2 BIF blocks SMP before it does anything at all.
> > >>>>>>>
> > >>>>>>> It would be a big win for us if more of this happened without
> > >>>>>>> blocking the VM; there's a lot of busy work in loading a module that
> > >>>>>>> shouldn't need any locking. For example, decompressing and decoding
> > >>>>>>> the literal table is probably where our code spends almost all of
> > >>>>>>> its time.
> > >>>>>>>
> > >>>>>>> There aren't a lot of comments for why it needs to lock the VM,
> > >>>>>>> especially for the whole of load_binary. Are there any hidden
> > >>>>>>> gotchas in here that I should know about before giving it a try?
> > >>>>>>> I'm unable to find much where the block is actually necessary, but
> > >>>>>>> I am not very
> > >>>
> > >> _______________________________________________
> > >> erlang-questions mailing list
> > >> erlang-questions@REDACTED
> > >> http://erlang.org/mailman/listinfo/erlang-questions
> > >>
> > >>
> > >
> > >
> > >
> > > --
> > > Engineering
> > > http://www.wooga.com | phone +49-30-8962 5058  | fax +49-30-8964 9064
> > >
> > > wooga GmbH | Saarbruecker Str. 38 | 10405 Berlin | Germany
> > > Sitz der Gesellschaft: Berlin; HRB 117846 B
> > > Registergericht Berlin-Charlottenburg
> > > Geschaeftsfuehrung: Jens Begemann, Philipp Moeser

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <anthonym@REDACTED>