[erlang-questions] obviously no bugs? (Re: Alternative supervision approaches)

Thu Jun 26 21:10:16 CEST 2014

> I was referring to something formally based on an FSM and detecting the
> current environment and state of the node before restarting the children
> and not just using timeouts. Not sure if that was clear based on your
> response.

whatever the mechanism, the point to me is that i get the impression that:

a) it failed at all in the first place and b) it seems like it is not
always an expected failure and c) the "fix" is to restart & pray and
then if the prayers don't work then wait a little longer or do some
as-yet-unspecified random jiggery-pokery of the 'environment' and
start praying again.

i'm not trying to say that formal methods are the be-all-end-all. but
i do wish that the state of the art-and-practice was more into formal
methods and model checking and all that jazz. by which i think i mean
that i wish the barriers to using those things were lower. i ass/ume
the effort required to go that route is large and always seen as bad
ROI unless one is working on NASA or healthcare type stuff.

i just wonder how many other people have similar pie-in-the-sky
day-dream wishes or if the standard groupthink is, "eh, whatever,
restarts will be enough to get us shipping and maybe making a profit!"
i mean i personally wouldn't be able to use model checking because i'm
for the most part utterly ignorant wrt the tools. so i would
practically pragmatically firmly be in the "eh, whatever, restarts
will be enough" camp!

on the other flipper, if one were programming with CSP or some such
then in theory you'd "simply" run FDR over your code and get answers
about deadlock for "free". vs. doing something in TLA and then
maintaining it vs. the "real" code.
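
to make the "answers about deadlock" idea a bit more concrete, here is a
toy sketch in python. it is not CSP and makes no claim to resemble what FDR
does internally; it just walks every reachable state of a tiny hand-written
model and reports states where nobody can move and somebody isn't done.
the processes P and Q, the lock names and the function names are all made
up for illustration.

```python
from collections import deque

# Hypothetical toy model: each process is a list of ("acquire" | "release", lock) steps.
# P and Q take the same two locks in opposite order, the textbook deadlock recipe.
P = [("acquire", "a"), ("acquire", "b"), ("release", "b"), ("release", "a")]
Q = [("acquire", "b"), ("acquire", "a"), ("release", "a"), ("release", "b")]


def successors(state, procs):
    """Yield every state reachable from `state` by one step of one process."""
    pcs, held = state  # pcs: program counters; held: frozenset of (lock, owner)
    owners = dict(held)
    for i, proc in enumerate(procs):
        if pcs[i] == len(proc):
            continue  # this process has already finished
        op, lock = proc[pcs[i]]
        if op == "acquire" and lock in owners:
            continue  # blocked: the lock is currently held
        new_owners = dict(owners)
        if op == "acquire":
            new_owners[lock] = i
        else:
            del new_owners[lock]  # assumes the model only releases locks it holds
        new_pcs = pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:]
        yield (new_pcs, frozenset(new_owners.items()))


def find_deadlocks(procs):
    """Breadth-first search of the whole state space, collecting stuck states."""
    start = (tuple(0 for _ in procs), frozenset())
    seen = {start}
    queue = deque([start])
    deadlocks = []
    while queue:
        state = queue.popleft()
        nxt = list(successors(state, procs))
        finished = all(pc == len(p) for pc, p in zip(state[0], procs))
        if not nxt and not finished:
            deadlocks.append(state)  # nobody can step, yet work remains
        for s in nxt:
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return deadlocks


if __name__ == "__main__":
    print(find_deadlocks([P, Q]))  # opposite lock order
    print(find_deadlocks([P, P]))  # same lock order
```

with [P, Q] the search finds the one state where each process holds one
lock and waits for the other; with [P, P] it finds none. the catch is the
same one as with TLA: this is a model of the code, not the code, so it
still has to be kept in sync with the "real" thing by hand.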

etc.

yes i am a blithering rambler.