[erlang-questions] [eeps] New EEP: setrlimit(2) analogue for Erlang

Ville Tuulos <>
Fri Feb 8 02:28:06 CET 2013


On Thu, Feb 7, 2013 at 2:24 PM, Björn-Egil Dahlberg
<> wrote:
> 2013/2/7 Ville Tuulos <>
>>
>> On Thu, Feb 7, 2013 at 7:27 AM, Björn-Egil Dahlberg <>
>> wrote:
>> > I dug out what I wrote a year ago ..
>> >
>> > eep-draft:
>> >
>> > https://github.com/psyeugenic/eep/blob/egil/system_limits/eeps/eep-00xx.md
>> >
>> > Reference implementation:
>> > https://github.com/psyeugenic/otp/commits/egil/limits-system-gc/OTP-9856
>> > Remember, this is a prototype and a reference implementation.
>> >
>> > There are a couple of issues that are not addressed, or at least left open-ended.
>>
>> Looks great! I truly hope this will get accepted sooner rather than
>> later, at least as an experimental feature.
>
>
> Well, as I said previously, I still have some issues with this approach.
> Envision the goal and then see if this is a good first step.
> Meaning: if we want to include other resource limits that are not
> process-oriented, how would we then design the interface? A general
> resource-limits interface, perhaps?

The general case is a can of worms. It took 5-6 years for Linux to get
process containers [1] into decent shape.

I hope that a consensus could be reached that a relatively simple
max-heap limit would be useful enough on its own for now, without
having to worry about all the other cases (ref. Tony's email). The
interface could be pretty much as in EEP 0042.

If someone ever had the courage to propose a sane, generic resource-limit
interface, maybe after the max-heap limit had been battle-hardened over
several years, it shouldn't be too difficult to unify or deprecate the
old options.
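
To make this concrete, the per-process variant could look roughly like
the following. This is a hypothetical sketch based on the reference
implementation's `limits` process flag; the option names may well
change, and it does not run on stock OTP:

```erlang
%% Hypothetical sketch (names per the eep-draft's reference
%% implementation): a worker caps its own heap, and a parent that
%% traps exits observes the kill, so a supervisor could restart it.
start() ->
    process_flag(trap_exit, true),
    Pid = spawn_link(fun() ->
              %% Soft limit: exceeding ~1 M words marks the process
              %% for termination by an exit signal.
              erlang:process_flag(limits, [{heap_size, 1024 * 1024}]),
              grow([])
          end),
    receive
        {'EXIT', Pid, Reason} -> {worker_killed, Reason}
    end.

%% A rogue loop that grows its heap without bound.
grow(Acc) ->
    grow([lists:seq(1, 1000) | Acc]).
```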

>> Does the proposal cover refc binaries as well? Almost all interesting
>> use cases that I can imagine for limits involve binaries.
>
>
> My implementation doesn't cover it. It covers process memory blocks, i.e.
> everything except refc binaries (well, their headers, the procbins, are
> covered).
>
> Binaries are special beasts and need special care. The easiest way to
> implement refc binary limits would be to use the virtual binary heap concept
> already present in the gc.
>
> By using vheaps we would count all the memory referenced by procbins in
> the process heap. Now, here it gets interesting. Several procbins may
> refer to the same refc binary blob, and that memory would be counted
> several times. You would have to take special accounting care if you
> want refc uniqueness, which would make it kind of expensive. You can't
> mark the binary as counted, since other processes might be GC:ed and
> they might also reference it, so you would have to make some other
> accommodations.
>
> Other than that you can just use the same idea as heap_size limits. The
> checks are similar, and the API could be too: erlang:process_flag(limits,
> [{bin_vheap_size, Size}]).

If I understood correctly, bin_vheap_size sounds like exactly what I
would need. I think the semantics should be strictly related to
_creation_ of binaries, not sharing. The idea is to block a rogue
process from creating tons of binaries, which is hard or impossible to
do now.

On the other hand, it is easy for a process to discard large binaries
from its inbox, if it so wishes. The limits could either ignore
binaries whose reference count is greater than 1, or count them only
against the limit of the process that originally created them, if
possible.
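
The creator-pays rule could be sketched like this. Note this is doubly
hypothetical: the `limits`/`bin_vheap_size` option exists only in the
reference implementation, and charging only the creator is my
suggestion, not something the prototype does:

```erlang
%% Hypothetical sketch: only the process that *creates* a refc binary
%% is charged against its bin_vheap_size limit; a process that merely
%% receives a reference to the shared blob is not.
producer(Consumer) ->
    erlang:process_flag(limits, [{bin_vheap_size, 16 * 1024 * 1024}]),
    Big = binary:copy(<<0>>, 32 * 1024 * 1024), % creator exceeds its limit
    Consumer ! {blob, Big}.                     % receiver is not charged
```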

I guess I would need to read the source to understand the issue with
double-counting :) One optimization could be this: if the pessimistic
assumption that every procbin refers to a separate refc blob results in
a number below the limit, we know that there is nothing to worry about,
which should be the typical case.
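
That pessimistic check can be expressed in plain Erlang, with procbins
modelled as {BlobId, Size} pairs (the module and function names are
made up for illustration):

```erlang
-module(vheap_check).
-export([within_limit/2]).

%% The pessimistic sum counts every procbin as if it referenced a
%% distinct blob; the exact sum deduplicates by BlobId. Since the
%% pessimistic sum is always >= the exact one, a pessimistic sum below
%% the limit proves the exact usage is below it too, so the expensive
%% deduplication is skipped in the typical case.
within_limit(ProcBins, Limit) ->
    Pessimistic = lists:sum([Size || {_Id, Size} <- ProcBins]),
    case Pessimistic =< Limit of
        true  -> true;                          % cheap common case
        false -> exact_sum(ProcBins) =< Limit   % dedup only when needed
    end.

%% ukeysort/2 keeps one procbin per BlobId, so shared blobs are
%% counted once.
exact_sum(ProcBins) ->
    lists:sum([Size || {_Id, Size} <- lists:ukeysort(1, ProcBins)]).
```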

Ville

[1] http://lwn.net/Articles/236038/

> // Björn-Egil
>
>>
>> Here's one test case:
>>
>> 1. Create a web server in Erlang e.g. with Mochiweb, Cowboy, Inets.
>> 2. Create a request handler that expects to receive Zip files, which
>> it extracts with zip:extract(Request, [memory]).
>> 3. Create a zip bomb [1]: dd if=/dev/zero bs=1M count=8000 | zip req.zip -
>> 4. POST the small req.zip to the web server.
>> 5. See the VM go down in flames.
>>
>> Obviously the per-process limits would elegantly solve this problem if
>> they covered binaries as well. A far less elegant solution would be to
>> handle the unsafe decompression in an external process via open_port
>> (inefficient), or to implement a sophisticated alternative to the zip
>> module that enforces the limits itself (inefficient, annoying).
>>
>> I understand that limiting the message queue / ets / NIFs can be
>> trickier. Just covering the basic max-heap with binaries would be a
>> good starting point.
>>
>> Ville
>>
>> [1] http://en.wikipedia.org/wiki/Zip_bomb
>>
>> > * Should processes be able to set limits on other processes? I
>> > think not, though my draft argues for it. It introduces unnecessary
>> > constraints on erts and hinders performance. 'save_calls' is such
>> > an option.
>> >
>> > * ets - if a table grows beyond some limit, whom should we punish?
>> > The inserter? The owner? What would be the rationale? We cannot just
>> > punish the inserter: the ets table is still there, taking a lot of
>> > memory, and no other process could insert into it either - they
>> > would be killed as well. Remove the owner, and hence the table (and
>> > its potential heir)? What kind of problems would arise then? Limits
>> > should be tied into a supervision strategy that restarts the whole
>> > thing.
>> >
>> > * In my draft and reference implementation I use soft limits. Once
>> > a process reaches its limit, it will be marked for termination by
>> > an exit signal. The trouble here is that there is no real guarantee
>> > for how long this will take. A process can continue appending to a
>> > binary for a short while and still end the beam with an OOM. (If I
>> > remember correctly, you have to schedule out to terminate a process
>> > in SMP, so you need to bump all reductions. But not all things
>> > handle return values from the garbage collector, most notably the
>> > append_binary instruction.) There may be other issues as well.
>> >
>> > * Message queues. In the current implementation of message queues
>> > we have two queues: an inner one, which is locked by the receiver
>> > process while executing, and an outer one, which other processes
>> > use so that they do not compete for a message-queue lock with the
>> > executing process. When the inner queue is depleted, the receiver
>> > process locks the outer queue and moves the entire thing to the
>> > inner one. Rinse and repeat. The only guarantee our implementation
>> > has to ensure is signal order between two processes, so in the
>> > future we might have several queues to improve performance. If you
>> > introduce monitoring of the total number of messages in the
>> > abstracted queue (all the queues), this will most probably kill any
>> > sort of scalability. For instance, a sender would not be allowed to
>> > check the inner queue for this reason. Would a "fast" counter check
>> > in the inner queue be allowed? Perhaps, if it is fast enough, but
>> > any sort of bookkeeping costs performance. If we introduce even
>> > more queues for scalability reasons, this will cost even more.
>> >
>> > * What about other memory users? Drivers? NIFs?
>> >
>> > I do believe in increments in development, as long as it is a path
>> > to the envisioned goal. And to reiterate, I'm not convinced that
>> > limits on just processes are the way to go. I think a complete
>> > monitoring system should be envisioned, not just for processes.
>> >
>> > // Björn-Egil
>> >
>> > On 2013-02-06 23:03, Richard O'Keefe wrote:
>> >
>> > Just today, I saw Matthew Evans'
>> >
>> >       This pertains to a feature I would like to see
>> >       in Erlang.  The ability to set an optional
>> >       "memory limit" when a process and ETS table is
>> >       created (and maybe a global optional per-process
>> >       limit when the VM is started).  I've seen a few
>> >       cases where, due to software bugs, a process size
>> >       grows and grows; unfortunately as things stand
>> >       today the result is your entire VM crashing -
>> >       hopefully leaving you with a crash_dump.
>> >
>> >       Having such a limit could cause the process to
>> >       terminate (producing an OOM crash report in
>> >       erlang.log) and the crashing process could be
>> >       handled with supervisor rules.  Even better you
>> >       can envisage setting the limits artificially low
>> >       during testing to catch these types of bugs early on.
>> >
>> > in my mailbox.  I have seen too many such e-mail messages.
>> > Here's a specific proposal.  It's time _something_ was done
>> > about this kind of problem.  I don't expect that my EEP is
>> > the best way to deal with it, but at least there's going to
>> > be something for people to point to.
>> >
>> >
>> >
>> > _______________________________________________
>> > eeps mailing list
>> > 
>> > http://erlang.org/mailman/listinfo/eeps
>> >
>> >
>> >
>> > _______________________________________________
>> > erlang-questions mailing list
>> > 
>> > http://erlang.org/mailman/listinfo/erlang-questions
>> >
>
>


