July 25, 2009

Introducing exception signatures

At work we develop, maintain and host a web solution which consists of a fair amount of assemblies and spans across several websites. Although we might like to think that we develop perfect code that certainly isn't always the case. And as with all non-djb applications ours sometimes generates a few exceptions.

Now please don't think of us as bad people, exceptions are an unfortunate part of nature and not all exceptions are the result of our bad programming, quite often we get exceptions from external libraries, from asp.net (viewstate exceptions, forbidden input exceptions and so on).

Keeping track of exceptions

Quite early on in our development process we decided that we needed a way of reacting to these exceptions as they happened. Our solution was to implement global exception handlers which sent an email to us whenever an unhandled exception occurred.

This worked great and in our early days it was more than enough. We could immediately react to an exception and we could often deploy a fix rapidly.

If you don't keep track of your exception at all you're probably in trouble. See this post by Jeff Atwood for a better explanation of why than I could ever come up with.

Keeping track of large amounts of exceptions

When our sites had been live for a while and started to get some serious use we noticed that it was getting really hard to keep track of all incoming exception mails. From looking at the stack trace we had no idea if an exception had been resolved, if someone was working on it, it's priority and so on. Troubled by the need to keep a bunch of stack traces in (my) memory I started thinking about ways of solving the problem.

My idea was to convert the exceptions into issue tickets. We use Trac at work so I started looking at simply redirecting all of our exception email there. Getting the exception mails converted to tickets wasn't really a problem but it didn't help at all with organization since every exception spawned a new ticket and we still had to manually merge the duplicates.

Enter Exception Signatures

In order to enable organization I started looking at generating a fingerprint for every exception. Some sort of string generated by looking at the stack traces of the exception, its inner exceptions and meta information of each exception. I wanted a solution which could be intelligent enough to differentiate between two exceptions thrown from the same place but for different reasons (i.e., the algorithm should look at the entire exception chain instead of only the base exception target site).

Handling dynamic data in exception messages

I also wanted it to be able to detect and disregard dynamic data in exception messages. Ever heard of Exception.Data? That's where we store extra information that we want to tack along for debugging purposes. Exception.Data is great but sometimes it's to much work and you just want to throw in some debugging data in the Exception.Message string like this:

public void RefundPurchase() {
    if(PurchaseState != "completed")
        throw new InvalidOperationException(
            "Invalid state for refund: " + PurchaseState
        );
}

In order to handle these cases I made the algorithm aware of three "pretty standard" ways of including dynamic data in exception messages.

  1. Text after a colon ':'. Invalid state for refund: inprogress
  2. Text within single and double quotes. Invalid state for refund "inprogress"
  3. Text within balanced parenthesis, brackets and braces. Invalid state for refund (state was inprogress)

Handling localized exceptions

I also wanted localized exceptions to produce the same signature when possible. We have a windows app used by some of our customers and we wanted exceptions from them to produce the same signature regardless of their OS language. To accommodate this I added some custom code for exceptions which contain an ErrorCode (SocketException and Win32Exception) and use that instead of the message.

Human semi-friendly signatures

My last requirement was that the signatures generated would be short so that we could display them to our users when they encounter one of our error-pages. This enables our customer service to provide status updates to calling customers by simply looking up the signature in trac and looking at its status.

As such the signatures are no-crazy-signs 8 character hex strings, like "617c1de5" and "1570d6d7"

Conclusion and possible solution

I think I've covered all my requirements and I must say that it has worked great for us. All of our exceptions end up in trac where we can prioritize, assign and catalogue them. We even have a great way of knowing when exceptions that we think we've squashed re-appear because then the ticket just get re-opened.

Jeff Atwood wrote about using exception logs as a de-facto to do list a while ago and that's exactly what we have accomplished using this technique. We even did it long before he published his post but no one will ever believe that =)

What about ELMAH?

I have not used ELMAH personally but I've certainly heard many great things about it. From what I've read about ELMAH it seems that they have some great features for filtering unwanted exceptions but I haven't seen anything about grouping exceptions. The ideal thing would of course be if existing frameworks like ELMAH used this implementation to provide simple grouping.

Our end result

trac-tickets

image

This is an example of how it looks when we encounter an exception in our system. Note that although the exception was thrown twice there’s only one ticket.

Get the code and abuse it

This is the part which makes my stomach ache... The code, complete with a couple of unit tests and an example program plus more information is available at http://www.freakcode.com/projects/exception-signatures.

Licensing information

kick it on DotNetKicks.com