[erlang-questions] Handling Crash Reports at scale

Mon Jun 3 18:11:16 CEST 2013

Are there tools/procedures that are recommended for processing crash reports from a service running at scale?

Currently we have a limited deployment and I look through all of the crash reports by hand. Some are very useful for finding actual bugs in our code. But other crashes are the result of clients sending bad data, strange timing issues (mostly not in our code), etc., and are not actionable. As we prepare to scale up our service, I'm wondering how to continue to get the value from the interesting crash reports without having to look through all of the uninteresting ones.

I haven't found rb to be very useful for finding the new/interesting crashes. Are there effective ways that people are using it?

Are there other tools for parsing and grouping crash reports to make it easy to find new/interesting ones?

thanks,
Ransom
