BUG: fatal interaction between application:ensure_all_started(A) and permit(B, false)

Sat Mar 20 13:45:19 CET 2021

On Sat, Mar 20, 2021 at 8:46 AM Ulf Wiger <ulf@REDACTED> wrote:
>
> I had the brilliant idea of using application permissions for a particular use case. This seemed to work perfectly, until I ran `rebar3 shell`, and spotted some disturbing behavior.
>
> The bug, apparently, is that `application:ensure_all_started(A)` ends up busy-looping if A depends on B and permission(B) -> false. What's worse, on each call to start(B), the application controller notices the permission flag, returns `ok`, and inserts an entry in its internal `start_p_false` list. This amounts to a memory leak.
>
> I commented on it in a tweet, then decided to try to find the source, esp. since I suspected `application:ensure_all_started/1`.
>
> https://twitter.com/uwiger/status/1372944356781531136
>
> In short, if permission(B) -> false, what happens is:
> start(A) -> {error, {not_started, B}}
> start(B) -> ok
> start(A) -> {error, {not_started, B}}
> ... [repeat endlessly]
>
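
The trace above can be mimicked with a toy model. This is only a sketch, not the real application controller: the module name `permit_loop_model`, the map-based state, and the `MaxRounds` cap are all assumptions made for illustration. The apps `a` and `b` stand in for A and B, and the cap exists only so the example terminates.

```erlang
-module(permit_loop_model).
-export([run/1, start/2]).

%% Toy model, NOT OTP code. The state is a map holding a stand-in for
%% the controller's start_p_false list. App a depends on b, and b has
%% its permission set to false.

%% b never actually runs, so starting a always fails.
start(a, State) ->
    {{error, {not_started, b}}, State};
%% Starting b "succeeds", but each call records one more pending entry.
start(b, #{start_p_false := Pending} = State) ->
    {ok, State#{start_p_false := [b | Pending]}}.

%% Retry loop in the style described above, capped at MaxRounds.
%% Returns {gave_up, NumberOfPendingEntries} once the cap is reached.
run(MaxRounds) ->
    loop(a, MaxRounds, #{start_p_false => []}).

loop(_App, 0, #{start_p_false := Pending}) ->
    {gave_up, length(Pending)};
loop(App, N, State0) ->
    case start(App, State0) of
        {ok, _State1} ->
            ok;
        {{error, {not_started, Dep}}, State1} ->
            %% Start the missing dependency, then try App again.
            {ok, State2} = start(Dep, State1),
            loop(App, N - 1, State2)
    end.
```

In this model, `permit_loop_model:run(3)` returns `{gave_up, 3}`: one pending entry per round, which is the growth pattern described above. Without the cap, the loop would never exit.
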
> Now, it could be fixed by adding a permission check in the looping function, but this raises the question of what should happen in the above case. Three alternatives:
>
> 1. ensure_all_started(A) returns {error, {not_permitted, B}}, or something
> 2. the call hangs until the flag(s) change, but start(B) is only called once.
> 3. Warn against the use of permissions in the docs, and deprecate them.
>
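
Alternative 1 could be approximated outside OTP along the following lines. This is a hedged sketch, not a proposed patch: the module name `permit_aware_start` is hypothetical, and `{not_permitted, App}` is simply the error shape suggested in the list above. It assumes that an application whose `start/1` returned `ok` but which is absent from `application:which_applications()` was held back by its permission flag; an application that stops immediately after starting would be misreported the same way.

```erlang
-module(permit_aware_start).
-export([ensure_all_started/1]).

%% Start App and its dependencies, but give up with an error instead
%% of retrying when a dependency "starts" without actually running.
ensure_all_started(App) ->
    case application:start(App) of
        ok ->
            confirm_running(App);
        {error, {already_started, App}} ->
            ok;
        {error, {not_started, Dep}} ->
            %% Start the missing dependency first, then retry App.
            case ensure_all_started(Dep) of
                ok -> ensure_all_started(App);
                {error, _} = Error -> Error
            end;
        {error, _} = Error ->
            Error
    end.

%% start/1 returned ok; check that the application is really running.
confirm_running(App) ->
    case lists:keymember(App, 1, application:which_applications()) of
        true -> ok;
        false -> {error, {not_permitted, App}}
    end.
```

Note that this still calls `start/1` once for the non-permitted application, so it does not avoid the single `start_p_false` entry; it only avoids the repeated calls.
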
> I'm assuming that most of you may not even know about permissions. They were introduced back in 1996-97 (I believe), when I and Martin Björklund were going back and forth on how to support distributed applications and cluster control. Eventually, this led to dist_ac and the protocol being defined, so that users could write a controller app taking control of an application and giving instructions on where it should run. In the AXD301, this was done by the RCM application. I believe I talked about it at the EUC 1997, but it's hard to find information about that on the web. :)
>
> Anyway, permissions were left in the API, and ARE documented.
>
> Thoughts?

I know we've used the permissions mechanism occasionally during
maintenance or live upgrades. Off-hand I don't know if we'd want
alternative 1 or 2 (my colleague Daniel Szoboszlay might know more
about this).

/Mikael