I've found a number of good links describing various aspects of exception/event logging:
and then here's an IBM Redbook that provides some nice theory on the subject of Event Management and Best Practices.
And then there's this one here that I think needs some clarification and warning ...
He speaks of handling the errors "centrally", and what he appears to mean by that is using a single uniform method (or collection of methods). No problem, provided you remember his other statement about using available context, and design your application infrastructure accordingly. That is, if you have an exception way down at the bottom of your process stack, make sure you can convey all the relevant information you need to report, identify, and (hopefully) resolve the problem all the way back up, and that the central event handler can handle it correctly. This may mean defining a serializable wrapper class to encapsulate a set of contextual properties that can then be passed back up the chain, possibly across remoting boundaries, etc.
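To make the wrapper idea concrete, here is a minimal sketch in Python, not anything taken from the article under discussion. The names `ErrorContext`, `ContextualError`, `load_order`, and `central_handler` are all hypothetical, and JSON stands in for whatever serialization format your remoting layer actually uses. It assumes every contextual property is itself JSON-serializable.

```python
import json
import traceback
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class ErrorContext:
    """Hypothetical serializable wrapper for the context around an exception."""
    message: str
    exception_type: str
    stack_trace: str
    tier: str  # which layer raised it, e.g. "database", "app-server", "web"
    properties: dict = field(default_factory=dict)  # must be JSON-serializable
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    @classmethod
    def from_exception(cls, exc, tier, **properties):
        # Capture everything needed to report and identify the problem later.
        trace = traceback.format_exception(type(exc), exc, exc.__traceback__)
        return cls(
            message=str(exc),
            exception_type=type(exc).__name__,
            stack_trace="".join(trace),
            tier=tier,
            properties=properties,
        )

    def to_json(self):
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, payload):
        return cls(**json.loads(payload))


class ContextualError(Exception):
    """Carries an ErrorContext back up the call chain."""

    def __init__(self, context):
        super().__init__(context.message)
        self.context = context


def load_order(order_id):
    # Stand-in for code at the bottom of the process stack.
    try:
        raise ValueError(f"order {order_id} not found")
    except ValueError as exc:
        context = ErrorContext.from_exception(exc, tier="database", order_id=order_id)
        raise ContextualError(context) from exc


def central_handler(exc):
    # Stand-in for the single uniform handler: it receives the full context
    # and returns a payload that can cross a process or machine boundary.
    return exc.context.to_json()
```

A caller at the top of the stack would catch `ContextualError`, pass it to `central_handler`, and could later rebuild the object with `ErrorContext.from_json(payload)` for automated analysis.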
He also suggests reporting errors to the Windows Event Logs. "I'm going to have to go ahead and sort of disagree with [him] there."
There are many issues I have with writing to the event logs:
- For the most part you're stuck with text. It's nice if you can serialize objects that are relevant to the event/exception context and have them available for automated analysis. Sure, you could serialize them up as XML instead of binary, if you want to take the performance hit, but you still don't have a good way of deserializing and accessing them programmatically.
- If your application is running on one machine, no problem. If you're on a web farm, do you really want to have to check every machine's event log to get the information for whatever exception a specific customer is screaming about? And that's assuming you actually did use a centralized handler and don't also have to look on multiple tiers: application servers, database servers, whatever. Sure, you can use a third-party app like Rutherford suggests to monitor them all for you, but quis custodiet ipsos custodes?
- If your centralized exception handler is at the web level (as it probably has to be unless you don't care about web-level exceptions), are you really dumping your exception information to the web server's event log? With as much context information as possible so the exception can be tracked down? To a machine in your security DMZ?
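For contrast with the per-machine, text-only event log, here is a rough sketch of the kind of alternative these objections point toward: every tier writes structured records to one shared store, tagged with the host and tier they came from. This is my own illustration under stated assumptions, not something the article proposes. An in-memory SQLite database stands in for a real central database that would live outside the DMZ, and the names `open_store`, `record_event`, and `events_for_customer` are hypothetical, as are the sample hosts and customers.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    # Assumption: one shared store for all machines; SQLite is only a stand-in.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " id INTEGER PRIMARY KEY,"
        " host TEXT NOT NULL,"
        " tier TEXT NOT NULL,"
        " customer_id TEXT,"
        " payload TEXT NOT NULL)"
    )
    return conn


def record_event(conn, host, tier, customer_id, context):
    # The context is stored as JSON so it can be deserialized later,
    # rather than flattened into free text.
    conn.execute(
        "INSERT INTO events (host, tier, customer_id, payload) VALUES (?, ?, ?, ?)",
        (host, tier, customer_id, json.dumps(context)),
    )
    conn.commit()


def events_for_customer(conn, customer_id):
    # One query covers every machine and tier that reported for this customer.
    rows = conn.execute(
        "SELECT host, tier, payload FROM events WHERE customer_id = ? ORDER BY id",
        (customer_id,),
    )
    return [(host, tier, json.loads(payload)) for host, tier, payload in rows]


store = open_store()
record_event(store, "web-03", "web", "cust-17", {"url": "/checkout", "error": "timeout"})
record_event(store, "db-01", "database", "cust-17", {"query": "load_order", "order_id": 42})
record_event(store, "web-01", "web", "cust-99", {"url": "/login", "error": "bad token"})

for host, tier, context in events_for_customer(store, "cust-17"):
    print(host, tier, context)
```

The point of the sketch is only that a single lookup by customer replaces a tour of every machine's event log, and that the stored context comes back as data rather than text.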