Power6 High-End System Firmware

Applies to: 9125-F2A

This document provides information about the installation of Licensed Machine or Licensed Internal Code, which is sometimes referred to generically as microcode or firmware.


1.0 Systems Affected

This package provides firmware for Power 575 (9125-F2A) Servers only. Do not use on any other systems.

The firmware level in this package is ES340_095_042.


2.0 Cautions and Important Information

Installing power code and system firmware

Attention: 9125-F2A servers should be evaluated for ECA845 installation prior to performing firmware upgrades. Contact your service provider for more information about ECA845.

2.1 Cautions

PowerVM Active Memory Sharing

Attention: After this level of firmware is installed, the platform must be powered off, then powered on, to activate the PowerVM Active Memory Sharing function.

Attention: If ES340_095 has been installed and the new PowerVM Active Memory Sharing function has been activated, the active memory sharing pool must be deactivated and deleted before the system firmware is back-leveled. IBM does not recommend back-leveling the system firmware.

2.2 Important Information

HMC-Managed Systems

This firmware level requires HMC V7 R3.4.0 with MH01186 (SP3).

For information concerning HMC releases and to access the HMC code packages, go to the following URL:
http://www14.software.ibm.com/webapp/set2/sas/f/hmcl/home.html

NOTE: You must be logged in as hscroot in order for the firmware installation to complete correctly.

IPv6 Support and Limitations

IPv6 (Internet Protocol version 6) is supported in the System Management Services (SMS) in this level of system firmware. There are several limitations that should be considered.

When configuring a network interface card (NIC) for remote IPL, only the most recently configured protocol (IPv4 or IPv6) is retained. For example, if the network interface card was previously configured with IPv4 information and is now being configured with IPv6 information, the IPv4 configuration information is discarded.

A single network interface card may only be chosen once for the boot device list. In other words, the interface cannot be configured for the IPv6 protocol and for the IPv4 protocol at the same time.


3.0 Firmware Information and Description

Use the following examples as a reference to determine whether your installation will be concurrent or disruptive.

Note: The concurrent levels of system firmware may, on occasion, contain fixes that are known as deferred. These deferred fixes can be installed concurrently, but will not be activated until the next IPL. Deferred fixes, if any, will be identified in the "Firmware Update Descriptions" table of this document. For deferred fixes within a service pack, only the fixes in the service pack which cannot be concurrently activated are deferred.

Note: The file names and service pack levels used in the following examples are for clarification only, and are not necessarily levels that have been, or will be released.

System firmware file naming convention:

01ESXXX_YYY_ZZZ

Where XXX = release level
      YYY = service pack level
      ZZZ = last disruptive service pack level

NOTE: Values of service pack and last disruptive service pack level (YYY and ZZZ) are only unique within a release level (XXX). For example, 01ES330_067_045 and 01ES340_067_053 are different service packs.

An installation is disruptive if:

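The release levels (XXX) are different.
Example: Currently installed release is ES330, new release is ES340.

The service pack level (YYY) and the last disruptive service pack level (ZZZ) are equal.
Example: ES330_120_120 is disruptive, no matter what level of ES330 is currently installed on the system.

The service pack level (YYY) currently installed on the system is lower than the last disruptive service pack level (ZZZ) of the service pack to be installed.
Example: Currently installed service pack is ES330_120_120 and new service pack is ES330_152_130.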

An installation is concurrent if:

The release level (XXX) is the same, and
The service pack level (YYY) currently installed on your system is the same as or higher than the last disruptive service pack level (ZZZ) of the service pack to be installed.

Example: Currently installed service pack is ES330_126_120,
new service pack is ES330_143_120.
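
The disruptive and concurrent cases described above can be sketched in code. The following Python is a minimal illustration only, not an IBM tool: the names FirmwareLevel, parse_level, and is_disruptive are hypothetical, and the levels in the usage note are the illustrative ones from the examples above.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    release: str          # XXX, the release level
    service_pack: int     # YYY, the service pack level
    last_disruptive: int  # ZZZ, the last disruptive service pack level


# Accepts "01ES330_126_120", "ES330_126_120", or either form ending in ".rpm".
_LEVEL_RE = re.compile(r"^(?:01)?ES(\d{3})_(\d{3})_(\d{3})(?:\.rpm)?$")


def parse_level(name: str) -> FirmwareLevel:
    """Split a firmware level name into its XXX, YYY, and ZZZ parts."""
    match = _LEVEL_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognized firmware level: {name!r}")
    release, service_pack, last_disruptive = match.groups()
    return FirmwareLevel(release, int(service_pack), int(last_disruptive))


def is_disruptive(installed: str, new: str) -> bool:
    """Return True if installing `new` over `installed` is disruptive."""
    cur, nxt = parse_level(installed), parse_level(new)
    # A change of release level (XXX) is always disruptive.
    if cur.release != nxt.release:
        return True
    # A service pack whose YYY equals its ZZZ is itself a disruptive level.
    if nxt.service_pack == nxt.last_disruptive:
        return True
    # Otherwise the installed YYY must be at or above the new ZZZ
    # for the installation to be concurrent.
    return cur.service_pack < nxt.last_disruptive
```

With these definitions, is_disruptive("ES330_126_120", "ES330_143_120") returns False (concurrent), while is_disruptive("ES330_120_120", "ES330_152_130") returns True (disruptive).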

Firmware Information and Update Description

For information about previous firmware release levels, see Section 7.0 Firmware History.

 
Filename              Size       Checksum
01ES340_095_042.rpm   24949359   02114
 
ES340
ES340_095_042

08/20/09

Impact: Function      Severity: Special Attention

System firmware changes that affect all systems

  • DEFERRED:  This fix corrects the handling of a specific processor instruction sequence that was generated on a particular heavily-tuned High Performance Computing (HPC) application. This specific instruction sequence has the potential to produce an incorrect result. This instruction sequence has only been observed in a single HPC application.  However, it is strongly recommended that you apply this fix. 
System firmware changes that affect certain systems
  • A problem was fixed that, under certain rare circumstances, caused a partition to crash when a 24" InfiniBand I/O drawer (feature code 5797 or 5798) was concurrently added.  When this problem occurred, rebooting the system was required to recover.
  • On systems running system firmware ES340_075 and Active Memory Sharing, a problem was fixed that might have caused a partition to lose I/O entitlement after the partition was moved from one system to another using PowerVM Mobility.
  • On systems running system firmware ES340_075 and Active Memory Sharing, a problem was fixed that might have caused a partition to fail to boot with SRC B700F103 if the partition had more than 24 virtual processors assigned to it.
  • On systems with F/C 5802 or 5877 drawers attached, a problem was fixed that prevented an I/O slot's power LED from accurately reflecting the state of the I/O slot in a 5802 or 5877 drawer, under certain circumstances.
  • On systems with external I/O towers attached, the firmware was enhanced so that the system will not crash when SRC B7006981 is logged for certain types of I/O hardware failures. 
ES340_075_042

06/22/09

Impact: Function      Severity: Special Attention

New features and functions:

- DEFERRED: Support for F/C 5803 (24" I/O drawer) and F/C 5873 (diskless 24" I/O drawer).

Attention: After this level of firmware is installed, the platform must be powered off, then powered on, before the 5803 or 5873 I/O drawer is added to the system.

- DEFERRED: Support for PowerVM Active Memory Sharing.

Attention: After this level of firmware is installed, the platform must be powered off, then powered on, to activate the PowerVM Active Memory Sharing function.

Attention: If ES340_075 has been installed and the new PowerVM Active Memory Sharing function has been activated, the active memory sharing pool must be deactivated and deleted before the system firmware is back-leveled. IBM does not recommend back-leveling the system firmware.

System firmware changes that affect all systems:

  • A problem was fixed that caused hardware to be deconfigured when the system encountered network errors, even though the SRCs were being logged as informational.
  • A problem was fixed that caused the detailed data at the end of an "early power off warning type 5" AIX error log entry to be filled with invalid data instead of zeros.
  • On systems running firmware release ES340, a problem was fixed that caused data in the platform dump to be invalid.
  • The firmware was enhanced so that SRCs B181720D, B1818A13, and B1818A0F, and occasionally a service processor dump, will not be generated when the service processor's two Ethernet interfaces are on the same subnet. (This is an invalid configuration.)
  • The firmware was enhanced such that error logs with relevant information will be created when a system crashes under certain circumstances, rather than a generic SRC (B1813410), with very little debug information, being logged.
  • A problem was fixed that caused the system to hang when terminating if the system had been in power save mode.
  • The firmware was enhanced so that a call home will be made if the hypervisor issues a "terminate immediate" interrupt.
  • A problem was fixed that caused incorrect field replaceable unit (FRU) part numbers to be returned for the BPC scroll assembly, UEPO panel and the CEC MDA scroll assembly.
  • The firmware was enhanced so that the service processor only logs SRC B1A38B24 when a valid network setup error is found. The callouts for this SRC were also improved. 
System firmware changes that affect certain systems:
  • In systems using InfiniBand switches for processor clustering, a problem was fixed that caused packets to be dropped under certain circumstances.
  • A problem was fixed that caused the migration of a partition with more than 900 virtual slots defined, from a system running firmware ES320 to a system running firmware ES340, to fail.
  • On systems using on/off (temporary) memory capacity on demand (COD), the firmware was enhanced to improve memory COD's interaction with other tools (such as Inventory Scout in AIX), and to make the billing process easier. 
ES340_061_042

04/20/09

Impact: Function      Severity: HIPER

System firmware changes that affect all systems:

  • HIPER: The firmware was enhanced to improve the service processor's capability to recover from bad bits in the flash memory. A predictive error, or an unrecoverable error, will be logged against the card that contains the system firmware if the number of correctable or uncorrectable errors exceeds the threshold.
  • HIPER: A problem was fixed that caused nodes to guard out processor cores, or checkstop, during the transition to nominal voltage from "power save" mode.
  • A problem was fixed that caused the service processor diagnostics to report a "TOD (time-of-day) overflow" error, instead of an uncorrectable memory error, when failures occurred on memory DIMMs.
  • A problem was fixed that prevented the service processor from automatically booting from the permanent (or P) side if the temporary (or T) side of the firmware flash was corrupted. When the problem occurred, the service processor stopped instead of booting from the P side.
  • A problem was fixed that might have caused the system to crash when a processor was dynamically removed when the system was running.
  • The firmware was enhanced so that VPD corruption on the Anchor (VPD) card will be corrected by the firmware, rather than requiring the Anchor card to be replaced.
  • A problem was fixed that caused the system to crash, under certain circumstances, with SRC B112E504 being logged, followed by SRC B181C350, when a system dump was initiated.
  • A problem was fixed that caused a partition being migrated to crash on the target system.
  • On systems running the ES340 release of system firmware, a problem was fixed that caused an abort code to be logged in the virtual input/output system (VIOS) error log on the source system after a successful partition migration.
  • A problem was fixed that caused a partition being migrated to become unresponsive on the target system when firmware-assisted dump was enabled.
  • The firmware was enhanced so that SRC BA210012 will not generate a call home when logged.
  • A problem was fixed that caused hardware to be deconfigured when the system encountered network errors, even though the SRCs were being logged as informational.
System firmware changes that affect certain systems:
  • On systems with external I/O drawers, a problem was fixed that could cause the system to hang on checkpoint C700406E during a "warm" reboot (a reboot in which the processor drawer is power-cycled but the I/O drawers are not).
  • On systems with virtual fiber channel disks, a problem was fixed that prevented the system management services (SMS) from displaying the virtual fiber channel disks if the virtual fiber channel server reported that any of them were reserved. 
ES340_042_042

01/16/09

Impact: Function       Severity: HIPER

System firmware changes that affect all systems:

  • HIPER: A problem was fixed that caused nodes to guard out processor cores, or checkstop, during the transition to nominal voltage from "power save" mode.
  • The firmware was enhanced so that SRC B1xx3409, which indicates an invalid state change (such as pushing the power on button twice quickly) will be logged as informational instead of predictive, and will not call home.
  • A problem was fixed that caused a service processor dump to be taken and SRC B181EF88 to be logged, even though the operation of the system was not affected.
  • A problem was fixed that caused the location codes for multi-port PCI adapters, such as the 4-port Ethernet adapters, to be incorrect.
  • On systems with multiple host channel adapter (HCA) cards, a problem was fixed that caused logical ports on the HCA cards to be intermittently inactive.
System firmware changes that affect certain systems:
  • In networks using a time server, a problem was fixed that caused the date on a client system to be reset to 1969 if the client system lost power.

4.0 How to Determine Currently Installed Firmware Level

You can view the server's current firmware level on the Advanced System Management Interface (ASMI) Welcome pane. It appears in the top right corner. Example: ES340_095.


5.0 Downloading the Firmware Package

Follow the instructions on the web page. You must read and agree to the license agreement to obtain the firmware packages.

Note: If your HMC is not connected to the Internet, you will need to download the new firmware level to a CD-ROM or FTP server.


6.0 Installing the Firmware

The method used to install new firmware will depend on the release level of firmware which is currently installed on your server. The release level can be determined by the prefix of the new firmware's filename.

Example: ESXXX_YYY_ZZZ

Where XXX = release level
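
As a minimal sketch of reading the release level from a name, the following Python extracts XXX from either a package file name or a level as shown on the ASMI Welcome pane, and reports whether the two share a release. The function names release_level and same_release are hypothetical and are not part of any IBM tool.

```python
import re

# Matches the release level XXX at the start of names such as "ES340_095"
# (as shown on the ASMI Welcome pane) or "01ES340_095_042.rpm" (a package file).
_RELEASE_RE = re.compile(r"^(?:01)?ES(\d{3})_")


def release_level(name: str) -> str:
    """Return the three-digit release level XXX found in `name`."""
    match = _RELEASE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"cannot find a release level in {name!r}")
    return match.group(1)


def same_release(installed: str, package: str) -> bool:
    """Return True if the installed level and the package share a release."""
    return release_level(installed) == release_level(package)
```

For example, same_release("ES340_075", "01ES340_095_042.rpm") returns True, while same_release("ES330_092", "01ES340_095_042.rpm") returns False.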

Instructions for installing firmware updates and upgrades can be found at http://publib.boulder.ibm.com/infocenter/systems/scope/hw/topic/ipha1/updupdates.htm


7.0 Firmware History

 
ES330
ES330_095_078

08/31/09

Impact:  Usability         Severity:  HIPER

System firmware changes that affect all systems

  • DEFERRED:  This fix corrects the handling of a specific processor instruction sequence that was generated on a particular heavily-tuned High Performance Computing (HPC) application. This specific instruction sequence has the potential to produce an incorrect result. This instruction sequence has only been observed in a single HPC application.  However, it is strongly recommended that you apply this fix.
  • HIPER:  A problem was fixed that caused the migration of a partition using shared processors to fail with a reason code of 4180043, or caused the source system to hang or crash.
  • A problem was fixed that caused SRC 1000911B to be erroneously logged during a reset/reload of the service processor.
System firmware changes that affect certain systems
  • On systems with 7311-D11, 7314-G30, 5790, or 5796 19" drawers attached, a problem was fixed that caused SRC 10009138 to be erroneously logged.
Concurrent maintenance (CM) firmware fixes
  • A problem was fixed that caused SRC B7005603 to be erroneously logged when a F/C 5802 or 5877 drawer was concurrently added.
ES330_092_078

05/18/09

Impact: Availability      Severity: HIPER

System firmware changes that affect all systems:

  • HIPER: The firmware was enhanced to improve the service processor's capability to recover from bad bits in the flash memory. A predictive error, or an unrecoverable error, will be logged against the card that contains the system firmware if the number of correctable or uncorrectable errors exceeds the threshold.
  • A problem was fixed that prevented the service processor from automatically booting from the permanent (or P) side if the temporary (or T) side of the firmware flash was corrupted. When the problem occurred, the service processor stopped instead of booting from the P side.
  • The firmware was enhanced so that SRC B1xxE458 (with word 6=0000E42B) will be logged as informational instead of generating a call home.
  • A problem was fixed that caused the system to crash, under certain circumstances, with SRC B112E504 being logged, followed by SRC B181C350, when a system dump was initiated.
  • A problem was fixed that caused a partition being migrated to become unresponsive on the target system when firmware-assisted dump was enabled.
  • A problem was fixed that caused hardware to be deconfigured when the system encountered network errors, even though the SRCs were being logged as informational.
  • A problem was fixed that caused the detailed data at the end of an "early power off warning type 5" AIX error log entry to be filled with invalid data instead of zeros.
  • A problem was fixed that caused a partition being migrated to crash on the target system.
  • A problem was fixed that might cause a system to crash with SRC B170E504 when a processor was dynamically deconfigured.
  • The firmware was enhanced such that when data is written to the VPD (Anchor) card, the results are verified, resulting in fewer VPD cards being replaced.
System firmware changes that affect certain systems
  • In systems using InfiniBand switches for processor clustering, a problem was fixed that caused packets to be dropped under certain circumstances.
ES330_078_078

01/15/09

Impact: Function       Severity: HIPER

This level is a disruptive update from any ES330 firmware level. The system should be powered off before installing this level of system firmware. If this level is installed when the system is running, the CECs will be rebooted, causing all partitions to be terminated, and a reboot will be required. 

System firmware changes that affect all systems:

  • DEFERRED and HIPER: The system initialization settings were changed to reduce the likelihood of a system crash under extremely rare circumstances.
  • HIPER: A problem was fixed that caused a system to fail to reboot after a B1xxE504 SRC was logged, due to a processor interconnection bus failure. The same SRC, B1xxE504, was logged when the reboot failed.
  • A problem was fixed that might, if a platform dump occurred, have caused a reset/reload of the service processor, and the platform dump to be corrupted.
  • A problem was fixed that caused incorrect field replaceable unit (FRU) part numbers to be returned for the BPF scroll assembly and the UEPO panel.
  • A problem was fixed that prevented the system from rebooting if an error occurred during a memory-preserving IPL.
  • The firmware was enhanced so that a call home will be made if the hypervisor issues a "terminate immediate" interrupt.
  • The firmware's redundant bit steering logic was enhanced to improve performance.
  • A problem was fixed that caused the location codes for multi-port PCI adapters, such as the 4-port Ethernet adapters, to be incorrect.
  • A problem was fixed that prevented service processor and hypervisor error log entries from being reported to the operating system after a successful partition migration. This problem only affected the partition that was migrated.
  • On systems running AIX or Linux, a problem was fixed that, under certain rare circumstances, might cause the operating system to crash.
  • A problem was fixed that, in certain configurations, caused the removal of a host Ethernet adapter (HEA) port to fail when using a dynamic LPAR (DLPAR) operation.
  • A problem was fixed that, under certain rare circumstances, caused the hypervisor to crash when it was booting with SRC B6000103 being logged.
  • A problem was fixed that, under certain circumstances, prevented the operating system from recovering a PCI-E adapter on which a temporary enhanced error handling (EEH) error occurred.
  • A problem was fixed that, under certain rarely occurring circumstances, caused the system to crash if an L2 or L3 cache failure was not discovered and repaired when it initially occurred.
  • A problem was fixed that caused the service processor diagnostics to call out a processor as the failing item, instead of the memory DIMMs, when a large number of memory error correction coding (ECC) errors occurred.
  • A problem was fixed that caused the wrong field replaceable unit (FRU) to be called out when SRC B152F109, which indicates a problem with the NVRAM in a bulk power controller (BPC), was logged.
  • A problem was fixed that might cause a default catch to occur when booting from an iSCSI device.
System firmware changes that affect certain systems:
  • On systems with a host Ethernet adapter (HEA) or host channel adapter (HCA) assigned to a Linux partition, a problem was fixed that prevented the partition from booting if 512 GB, 1 TB, or 1.5 TB of memory was assigned to the partition. When this problem occurred, SRC B700F105 was logged.
  • In systems with clustered processors, various problems were fixed in the InfiniBand interconnection networks.
  • A problem was fixed that, under certain circumstances, caused an AIX or Linux partition to fail to boot with SRC D200E0AF being logged.
  • On systems with external I/O frames, a problem was fixed that might have prevented the firmware from "unthrottling" processors after entering power save mode.
ES330_046_034

08/28/08

Impact: Function       Severity: HIPER

System firmware changes that affect all systems:

  • DEFERRED and HIPER: A problem was fixed in which, under certain rarely occurring circumstances, an application could cause a processor to go into an error state and the system to crash.
  • HIPER: A problem was fixed that caused the system to terminate abnormally with SRC B131E504.
  • HIPER: A problem was fixed that might cause a partition to crash during a partition migration before the migration was complete.
  • A problem was fixed that caused the location codes of multi-port PCI adapters, such as a 4-port Ethernet card, to be displayed incorrectly.
  • A problem was fixed that caused multiple instances of SRC B1818A03 and B1818A0A to be logged erroneously, and multiple calls home to be made, during a frame connection reset.
  • A problem was fixed that caused SRC B1819506 to be erroneously generated, and a call home to be made, when service processor (or system controller) error log entries were generated faster than they could be processed.
  • A problem was fixed that caused the hardware management console (HMC) to show an "Incomplete" state after it attempted to read a file with an incorrect size from the service processor (or system controller). This problem also occurred if the "factory configuration" option was used on the advanced system management interface (ASMI) menus.
  • Enhancements were made to the firmware to improve the FRU callouts for certain types of failures of the time-of-day clock circuitry.
  • A problem was fixed that prevented a dump file larger than 4 GB from being successfully off-loaded to the hardware management console (HMC).
  • On systems with redundant bulk power controllers, a problem was fixed that caused the hardware management console (HMC) to get stuck at "Pending Authentication" for one of the bulk power controllers (BPCs).
  • On systems with I/O drawers attached, a problem was fixed that might have caused some I/O slots in the drawers not to be configured when the system was booted.
  • In systems with clustered processors, various problems were fixed in the InfiniBand interconnection networks.
  • A problem was fixed that caused the location codes of the external InfiniBand ports on a 5791 I/O drawer with the InfiniBand interface to be reported incorrectly on the HMC.
  • A problem was fixed that caused SRC B7006971 to be generated because the firmware was incorrectly performing operations on PCI-Express I/O adapters during dynamic LPAR (DLPAR) operations on memory.
  • A problem was fixed that might have caused an out-of-memory condition in the hypervisor, with SRC B7000200 being logged.
  • A problem was fixed in the thermal management firmware that caused SRCs B1812635 and B1812636 to be logged, and the system or node to run in low power mode when it should have been in nominal, or nominal when it should have been in low power mode.
  • A problem was fixed that caused SRC B1818A10 to be erroneously generated after a successful installation of system firmware.
  • A problem was fixed that caused the AIX commands "lsmcode" and "diag" to fail after a partition migration.
  • A problem was fixed that caused the message "BA330000malloc error!" to be displayed on the operating system console after a partition migration, even though SRC BA330000 had not been logged. When this problem occurred, the partition migration appeared to be successful. However, a process within the partition was either hung or had failed, and in most cases the partition had to be rebooted to fully recover.
  • A problem was fixed that caused the status of the connection between the hardware management console (HMC) and the service processor to be set to an invalid state. This might cause problems when the HMC and service processor tried to communicate.
  • A problem was fixed that caused partitions that were being rebooted to hang at D200E0AF after a concurrent firmware update under certain circumstances. 
ES330_034_034

06/10/08

Impact: Function      Severity: HIPER

This level is a disruptive update from the prior level, ES330_018. The system should be powered off before installing this level of system firmware. If this level is installed when the system is running, the CECs will be rebooted, causing all partitions to be terminated, and a reboot will be required. 

System firmware changes that affect all systems:

  • HIPER: A problem was fixed that caused a concurrent firmware installation to hang with SRC BA00E840 being logged. This problem may also cause a partition migration to hang, under certain circumstances, with the same SRC, BA00E840, being logged. This SRC will be logged when this level of firmware is installed and will generate a call home; it should be ignored. It will not be logged during subsequent installations.
  • HIPER: The processor initialization settings were changed to reduce the likelihood of a processor going into an error state and causing a checkstop or system crash.
  • HIPER: A problem was fixed that caused large numbers of enhanced error handling (EEH) errors to be logged against the 4-port gigabit Ethernet adapter, F/C 5740, under certain circumstances.
  • A problem was fixed that caused the /tmp directory on the system controllers and the service processor in the bulk power controller (BPC) to fill up, which resulted in an out-of-memory condition. When this problem occurred, the system controllers or service processor in the BPC usually performed a reset/reload. This is one possible cause of SRC B1817201 being logged.
  • A problem was fixed in the repair and verify (R and V) function on the HMC that caused an unnecessary shutdown of the processor node when an error was logged against a bulk power regulator (BPR).
  • A problem was fixed that caused a partition using a host channel adapter (HCA) or host Ethernet adapter (HEA) to appear to hang (with progress code D200C1FF being displayed) before successfully shutting down. The amount of time the partition appeared to hang depended on the amount of memory assigned to the partition and the usage of HCA or HEA.
  • A problem was fixed that prevented the HMC from connecting to the managed system if the HMC's DHCP server IP range was changed while the managed system was running.
  • The firmware was enhanced so that the IDs "celogin1" and "celogin2" allow an authorized service provider to log into the bulk power controller (BPC).
  • The firmware was enhanced to improve the system memory error recovery.
  • The firmware was enhanced so that the contents of /tmp are included when a service processor dump is taken.
  • A problem was fixed in the hypervisor that might cause a partition migration to fail.
  • The firmware was enhanced so that:
    • A failure when writing VPD to a P6 processor will cause the node to be deconfigured rather than terminating the system.
    • The failure of a VPD write operation will not corrupt the VPD table. Such corruption could lead to unnecessary system downtime and unnecessary FRU replacement.
System firmware changes that affect certain systems:
  • On systems using QLogic InfiniBand switches, a problem was fixed that caused the PortInfo:linkWidthActive and PortInfo:linkSpeedActive to be inaccurately stored and displayed on the display of subnet parameters.
ES330_018_018

05/13/08

Impact: New     Severity: New
  • GA Level


8.0 Change History

 
Date Description
Sept 9, 2009 Added information in Section 2.2 pertaining to the HMC level required for this firmware level.