[erlang-questions] [eeps] New EEP: setrlimit(2) analogue for Erlang

Fri Feb 8 00:25:48 CET 2013

For what it is worth.
I located my old implementation (three years old :-)

https://github.com/tonyrog/otp/tree/limits/

Below is a very brief description (I presented some early stuff for OTP around then).
I do not think that max_message_queue_len is implemented;
it is not really defined yet. Too many options here.

/Tony

Resource Limits in Erlang
=========================

# Why?

- Ability to detect and kill runaway processes. 
- Detect and kill zombies.
- Basis of safe mode framework.
- Excellent to use in debugging.

# How?

- Limits are checked at context switch and garbage collection time. 
- Relatively light weight.
- Create limits by using spawn_opt.
- Limits are inherited by spawned processes.
- If a limit is reached a signal 'system_limit' is raised.
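
The branch performs these checks inside the VM, which stock OTP cannot do. As a rough user-level approximation only, a watchdog process can poll `erlang:process_info/2` and send an exit signal when a threshold is crossed. The module name `limit_watchdog`, the function `spawn_limited/2`, and the 10 ms polling interval are assumptions made for this sketch and are not part of the branch. Note that `process_info(Pid, memory)` does not count ETS tables, so it covers less than the memory limit described in this note.

```erlang
-module(limit_watchdog).
-export([spawn_limited/2]).

%% Hypothetical helper (not in OTP, not in the limits branch).
%% Spawns Fun plus a watchdog that polls the memory of the new process.
spawn_limited(Fun, MaxBytes) when is_function(Fun, 0), is_integer(MaxBytes) ->
    Pid = spawn(Fun),
    _Watchdog = spawn(fun() -> watch(Pid, MaxBytes) end),
    Pid.

%% Poll until the process is gone or exceeds MaxBytes.
watch(Pid, MaxBytes) ->
    case erlang:process_info(Pid, memory) of
        undefined ->
            ok;                           % process has already terminated
        {memory, Bytes} when Bytes > MaxBytes ->
            %% Exit signal with reason system_limit; a process that
            %% traps exits receives it as a message instead of dying.
            exit(Pid, system_limit);
        {memory, _} ->
            timer:sleep(10),              % polling interval (assumption)
            watch(Pid, MaxBytes)
    end.
```

Unlike a check at context switch or garbage collection, polling can miss a fast burst of allocation between two samples, so this is only suitable for experiments.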

## max\_memory
Limit the amount of memory a process may use. This accounts for all memory a
process is using at any given time, including heap, stack, tables,
links and messages. The memory is shared among all processes spawned by the
process that was limited. spawn\_opt can be used to give
away some of that memory to child processes.

## max\_time
Controls how many milliseconds a process may use in "wall clock" time.
Created (sub-)processes will only be able to run for the remaining time.

## max\_cpu
Controls how many milliseconds a process may run in cpu time. 
The cpu time is consumed by the process and all its spawned (sub-)processes.

## max\_reductions
A lighter version of max\_cpu. One reduction corresponds approximately
to one function call.

## max\_processes
Limit the number of (sub-)processes that may be running at any given time.

## max\_ports
Limit the number of ports that may be open at any given time.

## max\_tables
Limit the number of ets tables that may be open at any given time.

## max\_message\_queue\_len
Limit the size of the message queue. Who dies when the limit is reached?
Either sender or receiver? Maybe add a dangerous block option?

# process\_info

process\_info is used to read the current limits and the
"remaining" quota.
process\_info(Pid, max\_cpu) reads the number of
milliseconds set for execution, while process\_info(Pid, remaining\_cpu)
returns how many cpu milliseconds remain.
The items include: max\_process, max\_ports, max\_tables, max\_memory,
max\_reductions, max\_message\_queue\_len, max\_cpu, max\_time,
remaining\_process, remaining\_ports, remaining\_tables, remaining\_memory,
remaining\_reductions, remaining\_message\_queue\_len,
remaining\_cpu, remaining\_time
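
Since these items exist only in the limits branch, code that wants to run on both a patched and a stock VM would need to guard the call. A minimal sketch, assuming a stock VM rejects the unknown items with `badarg`; the module name `quota` and the function `read/2` are hypothetical and not part of the branch:

```erlang
-module(quota).
-export([read/2]).

%% Hypothetical wrapper: read one process_info item, tolerating VMs
%% that do not know the limit items (e.g. max_cpu, remaining_cpu).
read(Pid, Item) when is_pid(Pid), is_atom(Item) ->
    try erlang:process_info(Pid, Item) of
        {Item, Value} -> {ok, Value};
        []            -> {ok, []};        % registered_name on an unnamed process
        undefined     -> {error, noproc}  % process is not alive
    catch
        error:badarg  -> {error, unsupported}  % item unknown to this VM
    end.
```

On a stock VM, `quota:read(self(), message_queue_len)` returns `{ok, N}` for some integer `N`, while `quota:read(self(), remaining_cpu)` returns `{error, unsupported}`.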

On 7 feb 2013, at 16:27, Björn-Egil Dahlberg <egil@REDACTED> wrote:

> I dug out what I wrote a year ago ..
> 
> eep-draft:
> https://github.com/psyeugenic/eep/blob/egil/system_limits/eeps/eep-00xx.md
> 
> Reference implementation:
> https://github.com/psyeugenic/otp/commits/egil/limits-system-gc/OTP-9856
> Remember, this is a prototype and a reference implementation.
> 
> There are a couple of issues not addressed, or at least left open-ended.
> 
> * Should processes be able to set limits on other processes? I think not though my draft argues for it. It introduces unnecessary restraints on erts and hinders performance. 'save_calls' is such an option.
> 
> * ets - if your table increases beyond some limit. Who should we punish? The inserter? The owner? What would be the rationale? We cannot just punish the inserter, the ets table is still there taking a lot of memory and no other process could insert into the table. They would be killed as well. Remove the owner and hence the table (and potential heir)? What kind of problems would arise then? Limits should be tied into a supervision strategy and restart the whole thing.
> 
> * In my draft and reference implementation I use soft limits. Once a process reaches its limit it will be marked for termination by an exit signal. The trouble here is that there is no real guarantee for how long this will take. A process can continue appending to a binary for a short while and still end the beam with OOM. (If I remember correctly, you have to schedule out to terminate a process in SMP, thus you need to bump all reductions. But not all things handle return values from the garbage collector, most notably within the append_binary instruction.) There may be other issues as well.
> 
> * Message queues. In the current implementation of message queues we have two queues: an inner one, which is locked by the receiver process while executing, and an outer one, which other processes will use and thus not compete for a message queue lock with the executing process. When the inner queue is depleted the receiver process will lock the outer queue and move the entire thing to the inner one. Rinse and repeat. The only guarantee we have to ensure with our implementation is signal order between two processes. So, in the future we might have several queues to improve performance. If you introduce monitoring of the total number of messages in the abstracted queue (all the queues), this will most probably kill any sort of scalability. For instance, a sender would not be allowed to check the inner queue for this reason. Would a "fast" counter check in the inner queue be allowed? Perhaps if it is fast enough, but any sort of bookkeeping costs performance. If we introduce even more queues for scalability reasons this will cost even more.
> 
> * What about other memory users? Drivers? NIFs?
> 
> I do believe in increments in development as long as it is a path to the envisioned goal.
> And to reiterate, I'm not convinced that limits on just processes are the way to go. I think a complete monitoring system should be envisioned, not just for processes.
> 
> // Björn-Egil
> 
> On 2013-02-06 23:03, Richard O'Keefe wrote:
>> Just today, I saw Matthew Evans'
>> 
>> 	This pertains to a feature I would like to see
>> 	in Erlang.  The ability to set an optional
>> 	"memory limit" when a process and ETS table is
>> 	created (and maybe a global optional per-process
>> 	limit when the VM is started).  I've seen a few
>> 	cases where, due to software bugs, a process size
>> 	grows and grows; unfortunately as things stand
>> 	today the result is your entire VM crashing -
>> 	hopefully leaving you with a crash_dump. 
>> 
>> 	Having such a limit could cause the process to
>> 	terminate (producing an OOM crash report in
>> 	erlang.log) and the crashing process could be
>> 	handled with supervisor rules.  Even better you
>> 	can envisage setting the limits artificially low
>> 	during testing to catch these types of bugs early on.
>> 
>> in my mailbox.  I have seen too many such e-mail messages.
>> Here's a specific proposal.  It's time _something_ was done
>> about this kind of problem.  I don't expect that my EEP is
>> the best way to deal with it, but at least there's going to
>> be something for people to point to.

"Installing applications can lead to corruption over time. Applications gradually write over each other's libraries, partial upgrades occur, user and system errors happen, and minute changes may be unnoticeable and difficult to fix"
