[erlang-questions] Handling Crash Reports at scale

Peer Stritzinger peerst@REDACTED
Tue Jun 4 13:48:04 CEST 2013


I'm not sure what you mean by defensive code, but just to be sure:

You should never let error reporting influence how you handle faults in 
your software.  Defensive code is a code smell in Erlang.

What you could do is have an error logger handler that matches 
certain frequently occurring errors and ignores them.  Even better, it 
should count them (the count of something uninteresting is often 
interesting).
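
A minimal sketch of such a handler, assuming the classic error_logger 
event format in which proc_lib crash reports arrive as 
{error_report, GroupLeader, {Pid, crash_report, [OwnReport, LinkReport]}}. 
The module name noise_filter_h, the helper classify_reason/1 and the two 
reason patterns are hypothetical; substitute whatever your service 
actually sees.  Known patterns are only counted, everything else is 
printed.

```erlang
-module(noise_filter_h).
-behaviour(gen_event).

-export([install/0, counts/0]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

%% Attach this handler to the standard error_logger event manager.
install() ->
    error_logger:add_report_handler(?MODULE, []).

%% Fetch the current counters as a map of Class => Count.
counts() ->
    gen_event:call(error_logger, ?MODULE, counts).

%% The handler state is just the map of counters.
init(_Args) ->
    {ok, #{}}.

handle_event({error_report, _GL, {_Pid, crash_report, Report}}, Counts) ->
    case classify(Report) of
        unknown ->
            %% Not a known pattern: print it so a human can look at it.
            io:format(user, "unclassified crash: ~p~n", [Report]),
            {ok, Counts};
        Class ->
            %% Known, uninteresting pattern: count it and stay silent.
            {ok, maps:update_with(Class, fun(N) -> N + 1 end, 1, Counts)}
    end;
handle_event(_Other, Counts) ->
    {ok, Counts}.

handle_call(counts, Counts) ->
    {ok, Counts, Counts};
handle_call(_Request, Counts) ->
    {ok, {error, unknown_request}, Counts}.

handle_info(_Info, Counts) ->
    {ok, Counts}.

terminate(_Reason, _Counts) ->
    ok.

code_change(_OldVsn, Counts, _Extra) ->
    {ok, Counts}.

%% Pull the exit reason out of the crashed process's own report.
classify([Own | _]) when is_list(Own) ->
    case lists:keyfind(error_info, 1, Own) of
        {error_info, {_Class, Reason, _Stack}} -> classify_reason(Reason);
        _ -> unknown
    end;
classify(_) ->
    unknown.

%% Hypothetical patterns, for illustration only.
classify_reason({bad_client_data, _}) -> bad_client_data;
classify_reason({timeout, _})         -> timeout;
classify_reason(_)                    -> unknown.
```

Install it with noise_filter_h:install() and read the counters with 
noise_filter_h:counts().  Note that a gen_event handler cannot hide 
events from the other installed handlers, so this only keeps the log 
sparse if it replaces the handler that writes the reports you read.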

For counting and other metrics you might want to have a look at folsom 
https://github.com/boundary/folsom

-- Peer


On 2013-06-03 17:29:54 +0000, ANTHONY MOLINARO said:

> I'd recommend addressing the other crashes with "defensive code".  By 
> capturing and categorizing certain types of errors and clearing those 
> from your logs you'll be able to better see true errors.  I usually 
> have a two phase approach.  First phase, I capture and log the types of 
> errors.  Second, once I see that certain types are regular, I replace 
> those with metrics (via mondemand).  Once you have a stream of errors you can 
> then monitor the rates, plus uncaught errors will make it into your 
> logs which should remain very sparse.
> 
> -Anthony
> 
> On Jun 3, 2013, at 9:11 AM, Ransom Richardson <ransomr@REDACTED> wrote:
> Are there tools/procedures that are recommended for processing crash 
> reports from a service running at scale?
> 
> Currently we have a limited deployment and I look through all of the 
> crash reports by hand. Some are very useful for finding actual bugs in 
> our code. But other crashes are the result of clients sending bad 
> data, strange timing issues (mostly not in our code), etc., and are not 
> actionable. As we prepare to scale up our service, I'm wondering how to 
> continue to get the value from the interesting crash reports without 
> having to look through all of the uninteresting ones. 
> 
> I haven't found rb to be very useful for finding the new/interesting 
> crashes. Are there effective ways that people are using it?
> 
> Are there other tools for parsing and grouping crash reports to make it 
> easy to find new/interesting ones?
> 
> thanks,
> Ransom