[erlang-questions] CACM June 2017, Kode Vicious
Richard A. O'Keefe
ok@REDACTED
Mon Jul 3 04:41:05 CEST 2017
The June 2017 issue of CACM turned up on the common-room
table today, and as usual, I turned to Kode Vicious.
This directed my attention to
https://blog.acolyer.org/2016/10/06/simple-testing-can-prevent-most-critical-failures/
which summarises a 2014 Usenix paper I'd missed,
https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf
To summarise the summary:
92% of the catastrophic failures in distributed systems
that Ding Yuan, Yu Luo, Xin Zhuang, Guilherme Renna Rodrigues,
Xu Zhao, Yongle Zhang, Pranay U. Jain, and Michael Stumm
studied "are the result of incorrect handling of nonfatal
errors explicitly signalled in software."
and from the paper:
"We found the majority of catastrophic failures could easily
have been prevented by performing simple testing on error
handling code - the last line of defense - even without an
understanding of the software design."
The second quote is basically saying that tests should
actually exercise the exception handlers.
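To make that concrete, here is a minimal sketch (in Python, not Erlang) of
what exercising a handler in a test could look like. The names read_config,
working_opener and failing_opener are hypothetical and made up for this
illustration; they do not come from the paper or the blog post. The idea is
simply to inject the nonfatal error so the handler runs.

```python
import io


def read_config(opener, path):
    """Return the file's text, or "" when the file cannot be opened."""
    try:
        with opener(path) as f:
            return f.read()
    except OSError:
        # The error-handling path: rarely reached in normal runs,
        # so it only gets tested if a test forces the error.
        return ""


def working_opener(path):
    # Test double standing in for a successful open().
    return io.StringIO("key=value")


def failing_opener(path):
    # Test double that explicitly signals a nonfatal error.
    raise OSError("injected failure")


# Normal path and error path are both executed.
normal = read_config(working_opener, "app.conf")
fallback = read_config(failing_opener, "app.conf")
```

The second call is the one that matters here: without it the except branch
would never be run before production.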
Reading this made me realise just what a big deal
"Let it Crash!" is. It is literally unthinkable
for most programmers, due to the way we teach them
using languages like Java. Nobody seems to be able
to think "hey, if we get a lot of crashes due to
'sloppy' exception handling code, maybe we shouldn't
*have* exception handlers." And it's sobering to
realise that *I* would probably never have had this
insight. I am just awed by the mind that could think
such a thing.
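For readers who have not met the idea, a rough sketch of its shape follows.
It is in Python and is only an analogy, assuming a hypothetical supervise
function and a made-up flaky worker; real Erlang/OTP supervisors work with
processes and are considerably richer. The worker has no error handling at
all, and the one handler in the program makes no attempt at repair: it
throws the crashed worker away and starts a fresh one.

```python
def supervise(make_worker, max_restarts=3):
    """Run fresh workers until one succeeds or the restart budget is spent.

    Returns (result, number_of_crashes).
    """
    crashes = 0
    while True:
        worker = make_worker()  # fresh state on every (re)start
        try:
            return worker(), crashes
        except Exception:
            # No repair is attempted: discard the worker and restart,
            # or give up once the budget is exhausted.
            crashes += 1
            if crashes > max_restarts:
                raise


def flaky_worker_factory():
    """Hypothetical factory: workers crash twice, then succeed."""
    attempts = {"n": 0}

    def make_worker():
        attempts["n"] += 1
        n = attempts["n"]

        def worker():
            # Plain code with no handlers; it just crashes.
            if n < 3:
                raise RuntimeError(f"crash on attempt {n}")
            return "done"

        return worker

    return make_worker


result, crashes = supervise(flaky_worker_factory())
```

Here the third worker succeeds after two crashes; a worker that always
crashes would exhaust the budget and the last exception would propagate.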
Viva Erlang! Semper floreat et crescat!
(Another article in that issue,
"Too Big NOT to Fail, Embrace failure so it does not embrace you",
may also be of interest.)