The key is a tuple of the elements that I think can uniquely identify a crash, at least to a practical extent, so that crashes caused by the same bug are not included more than once. It’s meant to be opaque to the user of the class, so it can easily be changed to reflect different heuristics without breaking existing code.
A more flexible implementation would be to have a set of classes of key objects to choose from, each with a different heuristic, coded in the comparison operator. For now I’ll leave that for a future version, if the need ever arises.
So far this simple implementation using a tuple has worked well for me. But if you need something different, just derive from the Crash class and reimplement the key() method.
To disable duplicate detection completely, subclass Crash and return self from the key() method.
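As a rough sketch of both customizations: the Crash class below is only a stand-in written for this example, and its attribute names, the two subclass names, and the count_unique() helper are all hypothetical. Only the idea of overriding key() comes from the text above.

```python
class Crash(object):
    """Stand-in for the real Crash class; the attributes are assumptions."""

    def __init__(self, event_code, exception_code, pc_label, debug_string=None):
        self.event_code = event_code
        self.exception_code = exception_code
        self.pc_label = pc_label
        self.debug_string = debug_string

    def key(self):
        # Default heuristic: an opaque tuple of identifying elements.
        return (self.event_code, self.exception_code,
                self.pc_label, self.debug_string)


class CrashByLocation(Crash):
    """Hypothetical alternative heuristic: only the crash location matters."""

    def key(self):
        return (self.pc_label,)


class CrashAlwaysUnique(Crash):
    """Disables duplicate detection: every crash object is its own key."""

    def key(self):
        # Objects hash and compare by identity unless told otherwise,
        # so no two crashes will ever share a key.
        return self


def count_unique(crashes):
    # A container only needs key() to return something hashable.
    return len({crash.key() for crash in crashes})
```

Since callers only ever compare and hash whatever key() returns, swapping the heuristic does not require touching the code that stores the crashes.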
This is what I chose to include in the key and why:
Event code and exception code:
Wouldn’t make sense not to include them. :)
Program counter (EIP/RIP):
The same fault in different places in the code most likely means different bugs. However, different faults in the same place are not necessarily the same bug, so we can’t rely on this alone.
To avoid problems with DLL relocations, a label is used whenever possible.
Stack trace (EIP/RIP values only):
This heuristic is actually meant to detect different ways of triggering the same bug, rather than different bugs. But it’s also useful to detect heap overflows, since all of them will be triggered at the same set of EIPs (where the heap routines are located) but coming from different parent functions.
To avoid problems with DLL relocations, labels are used whenever possible.
Debug string:
Different debug strings most likely mean different bugs. There’s a catch: if the debug string is generated from something else (like the value of some variable we don’t care about), this heuristic may fail and give us more crashes than we really wanted. This is the case for strings generated by heaps in debug mode, as they often include the heap chunk addresses. If this becomes a problem, you can filter out the unwanted debug string events before passing them to the container.
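To make the list above more concrete, here is a minimal sketch of how such a key could be built and used to drop duplicates. It is not the actual implementation: the function names, the "module+0xoffset" label format, and the modules mapping (module name to base address and size) are assumptions made up for this example.

```python
def label_for(address, modules):
    """Return 'module+0xoffset' if the address falls inside a known module,
    otherwise the raw address. `modules` maps name -> (base, size)."""
    for name, (base, size) in sorted(modules.items()):
        if base <= address < base + size:
            return "%s+0x%x" % (name, address - base)
    return address


def make_key(event_code, exception_code, pc, stack_trace, debug_string, modules):
    """Build a hashable key from the elements listed above."""
    return (
        event_code,
        exception_code,
        label_for(pc, modules),                                  # program counter
        tuple(label_for(addr, modules) for addr in stack_trace),  # stack trace
        debug_string,
    )


def add_crash(container, key, crash):
    """Store the crash only if no crash with the same key was seen before."""
    if key in container:
        return False
    container[key] = crash
    return True


# The same DLL loaded at two different base addresses in two runs.
run1 = {"target.dll": (0x10000000, 0x10000)}
run2 = {"target.dll": (0x20000000, 0x10000)}

k1 = make_key(1, 0xC0000005, 0x10001234, [0x10001234, 0x10002000], None, run1)
k2 = make_key(1, 0xC0000005, 0x20001234, [0x20001234, 0x20002000], None, run2)
# Same faulting address, but reached from a different parent function.
k3 = make_key(1, 0xC0000005, 0x10001234, [0x10001234, 0x10003000], None, run1)
```

Here k1 and k2 compare equal even though the DLL was relocated between runs, while k3 differs because the stack trace shows a different caller.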
This is what I chose NOT to include in the key and why:
Exception address:
Most exceptions caught are page faults, and in that case we’re more interested in the program counter. A page fault is generally triggered by corrupting a pointer, and the corrupted value itself isn’t really useful for uniquely identifying the crash it produces.
Then again, I still want to review this heuristic for each specific type of exception for the next version, to make sure it’s not getting too many false negatives. I didn’t give much thought to scenarios other than page faults when I thought about this one. :(
First chance or second chance:
Generally, second chance exceptions are exactly the same as first chance exceptions; they simply mean the application didn’t handle them. Depending on the application you’re debugging, you could be interested in logging either first chance or second chance exceptions only, but rarely both.
Process and thread IDs:
One might say, two processes could crash at the same address because of different bugs. But the problem is, the process and thread IDs are dependent on a particular execution of the target application, and we want to be able to compare crashes from multiple executions.
Stack contents and register values:
Both are most likely to contain garbage we’re not interested in, and many of the values depend on a particular execution of the application.
By ignoring them we might be missing different ways to trigger the same bug, though.