Servicelog Source Available

Source code for the servicelog library and utilities is now available from the linux-diag project on SourceForge. User-level documentation for servicelog (PDF) is available on SourceForge as well.

How does servicelog differ from other logging mechanisms, such as syslog? It is intended to store only entries that are relevant to servicing the system. It introduces the concept of a serviceable event: a single servicelog entry that contains enough information to identify a failure and to indicate how to repair it. This information typically includes:

  • a short description of the failure, including a reference code
  • identification of the physical location of a failing component (via location code, for example)
  • indication of severity of the failure and/or priority of the repair
  • pointers to documented procedures for repairing the failure (for example, PCI hotplug instructions for replacing a failing PCI adapter)
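
To make the list above concrete, here is a minimal sketch of what such a record could look like as a data structure. It is not the servicelog library's actual API or on-disk format: the class name, field names, severity levels, reference code and location code are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Severity(Enum):
    """Hypothetical severity scale; servicelog's real levels may differ."""
    INFO = 1
    WARNING = 2
    ERROR = 3
    FATAL = 4


@dataclass
class ServiceableEvent:
    """Illustrative record mirroring the four items listed above."""
    refcode: str                 # reference code identifying the failure
    description: str             # short description of the failure
    location: str                # physical location code of the failing part
    severity: Severity           # severity of the failure / repair priority
    procedures: List[str] = field(default_factory=list)  # repair procedure pointers
    closed: bool = False         # set once a repair action has fixed the failure

    def summary(self) -> str:
        """Return a one-line, human-readable summary of the event."""
        state = "closed" if self.closed else "open"
        return (f"[{self.severity.name}] {self.refcode} at {self.location}: "
                f"{self.description} ({state})")


# Made-up values, for illustration only.
event = ServiceableEvent(
    refcode="EX000001",
    description="PCI adapter failure",
    location="U0.1-P1-I3",
    severity=Severity.ERROR,
    procedures=["PCI hotplug instructions for replacing a failing PCI adapter"],
)
print(event.summary())
```

Keeping the repair pointers in the same record as the failure description is what lets a single entry be acted on without consulting any other log.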

System management tools can register to be notified when new serviceable events are created. For example, the Service Focal Point on the Hardware Management Console is updated when a serviceable event is logged on a Linux partition on System p. When a failure is fixed (for example, a failed PCI adapter is replaced via a hotplug action), a repair action should be logged to servicelog, which causes all of the relevant open serviceable events to be marked as “closed” (i.e., fixed). The result is a complete history of all of the failures that have occurred on a system, as well as all of the repair actions that have taken place.
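
The notify-and-close life cycle described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not how servicelog is implemented: the ServiceLog class, its method names and the slot names are hypothetical, and "relevant" events are assumed here to be those that share the repaired component's location.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Event:
    """Minimal stand-in for a serviceable event (hypothetical fields)."""
    event_id: int
    location: str
    description: str
    closed: bool = False


class ServiceLog:
    """Toy in-memory log: notifies listeners and closes events on repair."""

    def __init__(self) -> None:
        self.events: List[Event] = []
        self.repairs: List[Tuple[str, str]] = []
        self._listeners: List[Callable[[Event], None]] = []

    def register(self, callback: Callable[[Event], None]) -> None:
        """Register a tool to be called for every new serviceable event."""
        self._listeners.append(callback)

    def log_event(self, location: str, description: str) -> Event:
        """Record a new open event and notify all registered listeners."""
        event = Event(len(self.events) + 1, location, description)
        self.events.append(event)
        for callback in self._listeners:
            callback(event)
        return event

    def log_repair(self, location: str, note: str) -> List[Event]:
        """Record a repair and close open events at the same location.

        Matching by location is an assumption made for this sketch.
        """
        self.repairs.append((location, note))
        fixed = [e for e in self.events
                 if e.location == location and not e.closed]
        for e in fixed:
            e.closed = True
        return fixed


log = ServiceLog()
seen: List[Event] = []
log.register(seen.append)            # a management tool subscribing to events

log.log_event("slot-1", "PCI adapter failure")
log.log_event("slot-2", "fan failure")
closed = log.log_repair("slot-1", "adapter replaced via hotplug")

print([e.event_id for e in closed])  # events closed by the repair
print([e.closed for e in log.events])
```

Closed events stay in the log rather than being deleted, so the failures and the repair actions together form the system's service history.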

Servicelog is particularly useful with Linux on System p right now. The superior First Failure Data Capture (FFDC) facilities provided by System p produce very informative servicelog entries covering a wide range of possible platform failures, and each reference code and repair procedure is documented in IBM’s eServer hardware InfoCenter.