Friday, May 1, 2009

System Firmware on pSeries and System p


Most PowerPC-based systems since POWER4, including blades from the JS20 onward, have dual firmware banks. The intent is to allow an administrator to update the system’s firmware in one bank while preserving the previous firmware image in the other bank. Should there be a problem with the firmware update process, or an issue with the new image for any reason, the administrator can revert to the older, known-safe image.

During normal operation, the system is booted off of the “temporary” bank (sometimes called the temp side or t-side), and the contents of the temporary side are the same as those of the “permanent” bank (a.k.a. the perm side or p-side).

When a firmware update occurs, the new image is copied into the t-side. If the t-side image is different from the p-side image, the t-side will be copied to the p-side before the t-side is overwritten (i.e. the current production image will be backed up to the permanent bank). The system will then attempt to reboot to the new image on the t-side. If the boot succeeds, and the new image works well, it can be “committed” to the p-side. If the system does not boot on the t-side (due to a corrupted image, for example), it will automatically boot onto the p-side. At that point, the t-side image can be “rejected” by overwriting it with the known-safe p-side image. The system should then be booted off of the t-side.

Given those properties, it has always seemed to me that the sides are misnamed; I find it useful to think of the temporary side as the production side, and the permanent side as the backup side.

temporary == production
permanent == backup (older, known-safe image)
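
To make the update, commit, and reject behavior described above a little more concrete, here is a toy model in shell. It does not touch any firmware; it just tracks two variables. The function and variable names (fw_update, fw_commit, fw_reject, T_SIDE, P_SIDE) are made up for illustration, and the model assumes the banks behave exactly as described in this post.

```shell
#!/bin/sh
# Toy model of the two firmware banks. Nothing here talks to real firmware;
# all names are hypothetical and exist only to illustrate the flow.

T_SIDE="SF220_051"   # temporary bank == production image
P_SIDE="SF220_051"   # permanent bank == backup image

# fw_update <new-level>: back up the t-side if it differs from the p-side,
# then overwrite the t-side with the new image.
fw_update() {
    if [ "$T_SIDE" != "$P_SIDE" ]; then
        P_SIDE=$T_SIDE
    fi
    T_SIDE=$1
}

# fw_commit: the new image works, so copy the t-side over the p-side.
fw_commit() {
    P_SIDE=$T_SIDE
}

# fw_reject: the new image is bad, so restore the t-side from the p-side.
fw_reject() {
    T_SIDE=$P_SIDE
}
```

Calling fw_update SF240_320 on the initial state leaves the new level on the t-side and the old level untouched on the p-side; fw_commit or fw_reject then brings the two banks back in sync, in one direction or the other.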

Viewing the Current Firmware Levels

The current firmware levels can be viewed by running /usr/sbin/lsmcode -A (lsmcode is part of the lsvpd open source package):

# lsmcode -A
sys0!system:SF240_320 (t) SF220_051 (p) SF240_320 (t)service:

The above output indicates that the temporary bank contains the firmware level SF240_320, and the permanent bank contains SF220_051. The third entry after the sys0!system: tag indicates that the system is currently booted off of the temporary side. Before a new firmware update operation can be attempted, the t-side image should be committed, overwriting the older p-side image.
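
If you want to pull those three values out in a script, something like the following may help. It is only a sketch: parse_lsmcode is a hypothetical helper name, and the pattern assumes the sys0!system: line has exactly the layout shown above, which may differ between lsvpd versions.

```shell
#!/bin/sh
# parse_lsmcode <line>
# Hypothetical helper: takes one line of `lsmcode -A` output in the form
#   sys0!system:<t-level> (t) <p-level> (p) <booted-level> (<t|p>)...
# and prints "<t-level> <p-level> <booted-side>".
# Prints nothing if the line does not match the assumed layout.
parse_lsmcode() {
    printf '%s\n' "$1" | sed -n \
        's/^sys0!system:\([^ ]*\) (t) \([^ ]*\) (p) \([^ ]*\) (\([tp]\)).*$/\1 \2 \4/p'
}

# Possible usage on a live system (not run here):
#   parse_lsmcode "$(/usr/sbin/lsmcode -A | grep '^sys0!system:')"
```

When the first two fields differ, the t-side has not yet been committed.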

I wrote about the serv_config command-line utility in my previous post; this command is the easiest method to determine whether your system is currently operating off of the temporary or permanent side. Run /usr/sbin/serv_config -e sp-current-flash-image. If it prints 0, the p-side is booted; if it prints 1, the t-side is booted. Both the lsmcode and serv_config commands will work on any Linux partition on the system.
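
For scripts, the 0/1 value can be turned into something more readable. A minimal sketch, assuming serv_config prints a bare 0 or 1 as described above (flash_side_name is a hypothetical helper name):

```shell
#!/bin/sh
# flash_side_name <value>
# Hypothetical helper: maps the value printed by
#   serv_config -e sp-current-flash-image
# to the name of the booted side. Returns non-zero for anything else.
flash_side_name() {
    case "$1" in
        0) echo "permanent" ;;
        1) echo "temporary" ;;
        *) echo "unknown"; return 1 ;;
    esac
}

# Possible usage on a live system (not run here):
#   flash_side_name "$(/usr/sbin/serv_config -e sp-current-flash-image)"
```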

Managing System Firmware

The update_flash command (part of the powerpc-utils-papr open source package) provides the ability to manage system firmware from the Linux command-line on POWER systems. There are restrictions as to which partitions can be used to update system firmware:

If the system is managed by an HMC, firmware should be updated from there (via the Licensed Internal Code management screens). In some cases, firmware updates performed via the HMC will be concurrent (meaning that the system and the partitions do not need to be restarted in order to recognize and begin using the updated firmware level).
If the system is partitioned, only the partition that has been granted service authority can perform firmware updates.
If the system is not partitioned, and not HMC managed, the update_flash command is the only method for updating system firmware.

New firmware images can be downloaded from http://techsupport.services.ibm.com/server/mdownload/. The following operations can be performed with the update_flash command:

Validate that the image stored in a file appears to be uncorrupted: update_flash -v -f <filename>
Perform an update with the image stored in a file: update_flash -f <filename>
Commit the t-side to the p-side (when it has been determined that the production image is safe): update_flash -c
Reject the current image on the t-side (overwrite it with the image in the p-side, because the t-side image is unsafe): update_flash -r

The first three commands should only be used when the system is booted on the t-side; the last can only be used when the system is booted on the p-side.
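
That rule is easy to get wrong at the command line, so a wrapper script might check it before calling update_flash. The following is a sketch only: op_allowed and its operation names (validate, update, commit, reject) are hypothetical, and it encodes nothing beyond the rule stated above.

```shell
#!/bin/sh
# op_allowed <operation> <booted-side>
# Hypothetical helper. operation: validate | update | commit | reject
#                      booted-side: t | p
# Succeeds only if the operation is appropriate for the booted side.
op_allowed() {
    case "$1:$2" in
        validate:t|update:t|commit:t) return 0 ;;  # t-side booted
        reject:p)                     return 0 ;;  # p-side booted
        *)                            return 1 ;;
    esac
}

# Possible usage on a live system (not run here), with "t" or "p"
# determined beforehand from lsmcode or serv_config:
#   if op_allowed commit "$side"; then /usr/sbin/update_flash -c; fi
```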
