Hardware Inventory with lsvpd

VPD, Vital Product Data, is information associated with system hardware that is useful for easing system configuration and service. The lsvpd package for Linux provides commands that can be used to retrieve an inventory of system hardware, along with the VPD associated with each device. The lsvpd package installs three commands: lsvpd (“list VPD”), lscfg (“list configuration”), and lsmcode (“list microcode”). Of the three, lscfg produces the most human-readable output; lsvpd and lsmcode produce output that is easier for scripts and applications to parse.

The lsvpd package requires the libvpd library. The libvpd library can also be used to retrieve inventory data from within an application; in fact, that’s how lsvpd, lscfg, and lsmcode work.

Types of Vital Product Data

Running lscfg by itself will list each device, along with its location code. More detailed VPD for each device on that list can be obtained by running “lscfg -vl <device>”. The following examples illustrate the type of data that can be retrieved with the lsvpd package:

# lscfg -vl eth0
  eth0             U787A.001.DNZ00Z5-P1-T5
                                         Port 1 - IBM 2 PORT 10/100/1000
                                         Base-TX PCI-X Adapter (14108902)

        Manufacturer................Intel Corporation
        Machine Type and Model......82546EB Gigabit Ethernet Controller
                                    (Copper)
        Network Address.............00096b6b0591
        Device Specific.(YL)........U787A.001.DNZ00Z5-P1-T5

The description, manufacturer, model number, MAC address, and location code of the eth0 device are all noted in the output. Here is another example, for a hard drive:

# lscfg -vl sda
  sda              U787A.001.DNZ00Z5-P1-T10-L8-L0
                                         16 Bit LVD SCSI Disk Drive (73400 MB)

        Manufacturer................IBM
        Machine Type and Model......ST373453LC
        FRU Number..................00P2685
        ROS Level and ID............43353141
        Serial Number...............0007EA3B
        EC Level....................H12094
        Part Number.................00P2684
        Device Specific.(Z0)........000003129F00013E
        Device Specific.(Z1)........0626C51A
        Device Specific.(Z2)........0002
        Device Specific.(Z3)........04112
        Device Specific.(Z4)........0001
        Device Specific.(Z5)........22
        Device Specific.(Z6)........H12094
        Device Specific.(YL)........U787A.001.DNZ00Z5-P1-T10-L8-L0

The location code, description, model, and manufacturer are all there, along with the FRU and part numbers (for ordering new parts), the serial number of the device, and its current microcode level (“ROS Level and ID”).
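
Because each detail line in the output above is a label, a run of dots, and a value, it is fairly easy to pick apart in a script. The following is a minimal Python sketch, assuming that layout holds; the parse_lscfg function and the FIELD pattern are names made up for this illustration and are not part of the lsvpd package. It does not handle values that wrap onto a second line (like the “(Copper)” in the eth0 example).

```python
import re

# Sample text copied from the sda example above (a few fields omitted).
# On a real system this text could come from running `lscfg -vl sda`.
SAMPLE = """\
  sda              U787A.001.DNZ00Z5-P1-T10-L8-L0
                                         16 Bit LVD SCSI Disk Drive (73400 MB)

        Manufacturer................IBM
        Machine Type and Model......ST373453LC
        FRU Number..................00P2685
        ROS Level and ID............43353141
        Serial Number...............0007EA3B
        Device Specific.(Z0)........000003129F00013E
        Device Specific.(YL)........U787A.001.DNZ00Z5-P1-T10-L8-L0
"""

# Assumed layout: indentation, a label, two or more dots, then the value.
FIELD = re.compile(r"^\s+(?P<key>\S.*?)\.{2,}(?P<value>[^.].*)$")


def parse_lscfg(text):
    """Return a dict mapping each field label to its value."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group("key")] = match.group("value").strip()
    return fields


vpd = parse_lscfg(SAMPLE)
print(vpd["Serial Number"], vpd["ROS Level and ID"])
```

For anything beyond a quick script, the lsvpd command's output (or the libvpd library) is probably the better starting point, since lscfg is meant for human readers.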

The -A flag to lsmcode will list all the microcode levels on the system, including the system firmware level:

# lsmcode -A
sys0!system:SF240_320 (t) SF220_051 (p) SF240_320 (t)|service:
sg6 1:255:255:255 !570B001.0FC93FFC0FC93FFC
sg5 1:0:4:0 sdd !HUS103036FL3800.0FC94004100698C86F7374
sg4 1:0:3:0 sdc !HUS103036FL3800.0FC942E40FC942E40620
sg3 0:255:255:255 !570B001.0FC940040FC940040FC93FF410193860
sg2 0:0:15:0 !VSBPD3E   U4SCSI.0FC9420C0FC9420C0620
sg1 0:0:5:0 sdb !ST336607LC.0FC9420C0FC9420C0620
sg0 0:0:3:0 sda !ST373453LC.0FC942040FC942040620

See my previous article on pSeries and System p firmware for a description of the dual firmware banks, and information on updating your system firmware level. Currently, device microcode must be updated using a microcode update utility specific to the device in question (iprutils for the onboard RAID SCSI HBAs on POWER5, for example).

Refreshing the VPD Database

Unfortunately, the data in the lsvpd database can become stale as devices are added or changed (via hotplug or DLPAR, for example). Running /usr/sbin/vpdupdate will cause the data to be refreshed. The developers of lsvpd are currently working on having vpdupdate run automatically in response to hotplug events.

Other Tools for Hardware Inventory

Besides lsvpd, there are several other Linux tools that can assist with hardware inventory for system configuration or service:

  • HAL (Hardware Abstraction Layer): run hal-device for a list of devices
  • Open Firmware device tree (on Power): stored in /proc/device-tree
  • The sysfs filesystem (usually mounted on /sys)

The Esteemed Semicolon, Neglected

I have never used semicolons. They don’t do anything, don’t suggest anything. – Kurt Vonnegut

Those words probably didn’t single-handedly relegate the semicolon to the realm of suggestively-winking emoticons, but they seem to effectively reflect the zeitgeist concerning that beleaguered character. Sure, the semicolon is difficult to use: the clauses on each side must be independent, except when it is used to separate the elements of a list. If the clauses are independent, why not just make separate sentences? And why would one use a semicolon (rather than the omnipresent comma) to delineate the items of a list?

You may have noticed that I use semicolons. Frequently. Too often, some might say. For a while, it bothered me that I saw so many semicolons in my e-mail, but not often in the e-mail messages that I received. (Well, it didn’t bother me enough to lose sleep, but I digress.) Perhaps, I thought, the language of Henry James and Jorge Luis Borges was becoming archaic, barely translatable to modern language, like Beowulf.

And then I found a continuous and copious paean to the semicolon: http://www.oneletterwords.com/weblog/?c=Semicolon

I am buoyed by that blog. I will continue to use that most mysterious of characters, in all its wondrous glory, when the pause of the period is too pregnant and that of the comma is not pregnant enough, when the interplay, the tension, between two independent clauses is so overt that their separation does them a disservice.

Epilogue

And then there’s the two-spaces-after-a-period “rule” that seems to be falling by the wayside, especially given that web browsers, in a monomaniacal ambition to sanitize web corpora, will convert all double spaces to singles. (Unless you choose to use the ever so intuitive &nbsp; sequence. (Hey! A semicolon!)) Some interns of mine a few summers ago did not even know that they were supposed to put two spaces after a period; presumably, they once learned it in seventh grade, but were never forced to apply it.

Though this conversion happens en masse, and without permission, I will continue to jab that spacebar twice in response to a period, and let the automation of modern editors and browsers erroneously sanitize them. A useless act of defiance on my part.

Pop quiz: how many spaces after a semicolon? Maybe that’s another reason that semicolon sightings in the wild have become so rare. (Answer: one space. But two spaces after a colon. Intuitive, no?)

Epilogue’s Epilogue

I’m not actually a grammar disciplinarian. The point of language is to communicate; living languages are vibrant, and adjust over time to reflect societal changes. Those developments should be accepted as inevitable, even enjoyed as the addition of a new flavor to an old recipe. I just find it to be a shame that certain characters fall into disfavor merely because they are slightly more difficult to use. Laziness has only rarely resulted in worthwhile mutation. (“O RLY?” you might reply. Yes. Really.) Variances in sentence length and structure are one of the things that can make language pleasurable, rather than strictly utilitarian.

Oh, and I find it easier to visually parse sentences when they are separated by dual spaces. Perhaps your experience differs.

Location Codes on POWER

Hardware location codes provide a method for mapping connectors or logical functions to their physical locations in the system’s enclosure. For example, they allow you to easily identify the physical port on the system that corresponds with eth0, or to identify which of the fans on the system is failing. Besides hardware inventory and diagnostics, location codes are also used on HMCs (hardware management consoles) and IVMs (integrated virtualization managers) to identify the devices to be assigned to (or moved between) logical partitions.

Starting with POWER5, System p and System i use the same format for location codes. The POWER4 (and POWER3 and JS20) location codes are much shorter (and differ between pSeries and iSeries), but you’ll be able to deal with them once you get the hang of POWER5 (and later) location codes.

In case it matters, location codes are limited to 80 characters in length (to fit on the display on the operator’s panel on the front of the system). They are composed only of uppercase characters, digits, periods, and dashes. They consist of labels separated by dashes. This is an example of a location code representing a PCI slot:

U787A.001.DNZ00Z5-P1-C3

The initial U indicates that the label represents a “unit” (like a CEC, central electronics complex, or an I/O drawer). The DNZ00Z5 at the end of the label is the serial number of the unit. The “P1” indicates the first planar on the system (i.e., the motherboard). The “C3” indicates the third PCI slot on that planar. Labels on the system chassis will help you find the C3 slot; the slot might also include an LED that can be turned on to identify it (with the usysident command, to be detailed in a near-future article).

Some of the other labels you may run into with location codes:

  • An: the nth air handler (fan, blower, etc.)
  • En: the nth electrical unit (power supply)
  • Dn: the nth device (hard drive, SES device, etc.)
  • Ln: the nth logical path (IDE address, SCSI target, fibre channel LUN, etc.)
  • Tn: the nth port (on a multi-port ethernet adapter, for example)
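
The structure described above (a unit label followed by dash-separated labels, each a letter and a number) can be sketched in a few lines of Python. This is only an illustration under the rules stated in this article; parse_location_code and LABEL_KINDS are hypothetical names, and the table covers only the label types mentioned here.

```python
import re

# Label letters taken from this article; other letters are reported as "unknown".
LABEL_KINDS = {
    "A": "air handler",
    "C": "PCI slot",
    "D": "device",
    "E": "electrical unit",
    "L": "logical path",
    "P": "planar",
    "T": "port",
}


def parse_location_code(code):
    """Split a POWER5-style location code into (kind, label) pairs."""
    # At most 80 characters; only uppercase letters, digits, periods, dashes.
    if len(code) > 80 or not re.fullmatch(r"[A-Z0-9.-]+", code):
        raise ValueError(f"not a valid location code: {code!r}")
    unit, *labels = code.split("-")
    parts = [("unit", unit)]
    for label in labels:
        match = re.fullmatch(r"([A-Z])(\d+)", label)
        kind = LABEL_KINDS.get(match.group(1), "unknown") if match else "unknown"
        parts.append((kind, label))
    return parts


for kind, label in parse_location_code("U787A.001.DNZ00Z5-P1-C3"):
    print(f"{label:20} {kind}")
```

The same function applied to a disk's location code, such as U787A.001.DNZ00Z5-P1-T10-L8-L0, yields a planar, a port, and two logical paths.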

In other articles on this site, I explain how you can obtain and use location codes using hardware inventory and diagnostic tools. Quick preview: you can use the lscfg command to view a list of devices with their location codes, and several diagnostic tools will provide the location codes of failing components.

[Edit 2007-10-13:  Updated with pointer to article about lsvpd for hardware inventory.]

Kernel Markers in Linux

I feel like I’m parroting LWN a bit recently, but I like to point out items that are of particular interest from a RAS perspective. To that end, I thought I’d mention that Kernel Markers are making good headway in the 2.6.24 kernel.

These markers allow probe points to be specified in the kernel. Each probe point can be active, meaning that a probe is attached to that point, or inactive, meaning that the probe point is not currently “of interest.” When a marked point is encountered, if a probe is attached to the marker, that probe will be invoked with a set of parameters that are specified by the marker. If there is no probe attached to the marker, execution continues as normal. Therefore, unused marker points are mostly dormant.

“Mostly” dormant? That means that there is a very tiny bit of execution overhead each time a marker with no attached probe is encountered. Instrumentation with a moderate number of markers would have a small impact on execution performance and a somewhat larger impact on space (due to the extra data structures needed for markers).
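
To make the active/inactive distinction concrete, here is a toy model in Python. It is emphatically not the kernel API (which is in C, and is described in the kernel documentation); the Marker class and its attach, detach, and hit methods are invented for this sketch. The model holds a single probe per marker as a simplification.

```python
class Marker:
    """Toy model of a marker: a named probe point with one probe slot."""

    def __init__(self, name):
        self.name = name
        self.probe = None  # inactive until a probe is attached

    def attach(self, probe):
        if self.probe is not None:
            raise RuntimeError(f"marker {self.name!r} already has a probe")
        self.probe = probe

    def detach(self):
        self.probe = None

    def hit(self, *args):
        # This check is the small cost paid even when no probe is attached.
        if self.probe is not None:
            self.probe(self.name, *args)


log = []
marker = Marker("subsystem_event")

marker.hit(1, "ignored")  # no probe attached: execution simply continues
marker.attach(lambda name, *args: log.append((name, args)))
marker.hit(2, "seen")     # probe is invoked with the marker's parameters
marker.detach()
marker.hit(3, "ignored again")

print(log)
```

Only the second hit is recorded; the other two cost nothing more than the check for an attached probe.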

This is certainly a boon for SystemTap; presumably, these markers will be used to make it easier to use and more maintainable.

What does this mean from the user’s perspective? A better instrumented kernel means richer, more interesting monitoring and analysis tools. Performance analysis tools like filemon, tprof, curt, and splat are available on AIX because of its lightweight trace instrumentation, in which each tracepoint can be activated or deactivated on the fly. Kernel markers provide an equivalent mechanism in Linux to enable similar higher-level tools.

I think this also opens up some interesting opportunities for aspect-oriented programming. For example, instead of using printk for logging, markers could be placed at points of interest for logging applications. That way, any (potentially 3rd party) logging application could attach to those points, and then be invoked with a severity flag and a log message each time the kernel wishes to log some information. All kinds of data aggregation and mining could occur; one could even apply something like machine learning to do some more complex processing. This can currently be done with syslog, but I think this kind of AOP would make these tools easier to develop.

One disadvantage, though, unless I’m misreading the documentation, is that only one probe can be attached to a marker. This limits the utility of markers for robust AOP, and could potentially limit their use for advanced performance tools (or at least make it harder to implement those performance tools).

For details on using these trace markers in the Linux kernel, see Documentation/Markers.txt in the kernel source.