Power7 High-End System Firmware

Applies to: 9119-FHB

This document provides information about the installation of Licensed Machine or Licensed Internal Code, which is sometimes referred to generically as microcode or firmware.

1.0 Systems Affected
1.1 Minimum HMC Code Level
1.2 Minimum SDMC Code Level
2.0 Important Information
3.0 Firmware Information and Description
4.0 How to Determine Currently Installed Firmware Level
5.0 Downloading the Firmware Package
6.0 Installing the Firmware
7.0 Firmware History

1.0 Systems Affected

This package provides firmware for Power 795 (9119-FHB) Servers only.

The firmware level in this package is:

AH730_066

1.1 Minimum HMC Code Level

This section describes the "Minimum HMC Code Level" required by the System Firmware to complete the firmware installation process. Before you start the system firmware update, the HMC level must be equal to or higher than the "Minimum HMC Code Level". If the HMC managing the server targeted for the System Firmware update is running a code level lower than the "Minimum HMC Code Level", the firmware update will not proceed.

The Minimum HMC Code level for this firmware is: HMC V7 R7.3.0 (PTF MH01255 or MH01256) with PTF MH01257 (Mandatory efix).

Although the Minimum HMC Code level for this firmware is listed above, HMC level V7 R7.3.0 with PTF MH01265 (Service Pack 1) and PTF MH01272 (Fix for V7R7.3.1), or higher is suggested for this firmware level.
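
The level comparison described above can be sketched as follows. This is a minimal illustration, assuming the level is available as a string in the "V7 R7.3.0" form used in this document; the function names are hypothetical, and the sketch does not check which PTFs are installed.

```python
import re

def parse_level(text):
    """Parse a level string such as 'V7 R7.3.0' into a tuple of integers."""
    match = re.fullmatch(r"\s*V(\d+)\s*R(\d+)\.(\d+)\.(\d+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized level string: {text!r}")
    return tuple(int(part) for part in match.groups())

# Minimum HMC level stated in this document (PTF requirements are not modeled).
MINIMUM_HMC_LEVEL = parse_level("V7 R7.3.0")

def meets_minimum(installed, minimum=MINIMUM_HMC_LEVEL):
    """Return True if the installed level is equal to or higher than the minimum."""
    return parse_level(installed) >= minimum

if __name__ == "__main__":
    print(meets_minimum("V7 R7.3.0"))  # equal to the minimum
    print(meets_minimum("V7 R7.2.0"))  # lower than the minimum
```

Tuples compare element by element, so version, release, and modification fields are compared in order without any string handling.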

For information concerning HMC releases and the latest PTFs, go to the following URL to access Fix Central:
http://www-933.ibm.com/support/fixcentral/

For specific fix level information on key components of IBM Power Systems running the AIX, IBM i and Linux operating systems, we suggest using the Fix Level Recommendation Tool (FLRT):
http://www14.software.ibm.com/webapp/set2/flrt/home

NOTE: You must be logged in as hscroot in order for the firmware installation to complete correctly.

1.2 Minimum SDMC Code Level

This section describes the "Minimum Systems Director Management Console (SDMC) Code Level" required by the System Firmware to complete the firmware installation process. Before you start the system firmware update, the SDMC level must be equal to or higher than the "Minimum SDMC Code Level". If the SDMC managing the server targeted for the System Firmware update is running a code level lower than the "Minimum SDMC Code Level", the firmware update will not proceed.

The Minimum SDMC Code level for this firmware is: SDMC V6 R7.3.0 with Mandatory PTF MF53082.

Although the Minimum SDMC Code level for this firmware is listed above, SDMC level V6 R7.3.0 with PTFs MF53082, MF53525 and MF53657, or higher is suggested for this firmware level.

For information concerning SDMC releases and the latest PTFs, go to the following URL to access Fix Central:
http://www-933.ibm.com/support/fixcentral/

For specific fix level information on key components of IBM Power Systems running the AIX, IBM i and Linux operating systems, we suggest using the Fix Level Recommendation Tool (FLRT):
http://www14.software.ibm.com/webapp/set2/flrt/home

2.0 Important Information

Prior to this service pack, the system firmware was not properly enforcing the system partition processor limit (SPPL) attribute for shared processor partitions. This service pack fixes that enforcement to ensure that shared processor partitions comply with the limit for virtual processors when the SPPL setting is 24 or 32.

You will be affected by this change if you have the following configuration:
   - 795 class server (model 9119-FHB)
   - The server has 3 or fewer books, or the server has 4 or more books and the SPPL attribute is set to 24 or 32.
   - The server has 24 processor cores per book and you have configured more than 24 virtual processors for a shared processor partition.
   - The server has 32 processor cores per book and you have configured more than 32 virtual processors for a shared processor partition.

After this service pack is installed, the behavior of the shared processor partitions that exceed the SPPL attribute will change as follows:

- Partition activation:
   - Partitions will continue to boot and reboot successfully unless the minimum number of virtual processors is greater than the SPPL.
   - Partitions that are activated will limit the number of active virtual processors to no more than the SPPL limit.

- Partition configuration:
   - Errors that are logged when the SPPL is exceeded can result in HMC errors HSCLA4D6 and HSC0A4D6.
   - Attempts to change the number of virtual processors or entitled processing units via a profile or dynamic LPAR change will be subject to the SPPL setting of 24 or 32.
      For example, if the SPPL is set to 32 and your shared processor partition is configured with 40 virtual processors, you must reduce the number of virtual processors to 32 or fewer for the change to be successful.
   - If you create a new shared processor partition, the number of virtual processors must not exceed the SPPL value.

- Partition mobility:
   - A partition must comply with the SPPL of the target server.

- Partition hibernation (suspend/resume):
   - If you have suspended partitions that have exceeded the SPPL limit and install this service pack, you will not be able to successfully resume those suspended partitions. You should ensure all suspended partitions comply with (have virtual processors fewer than or equal to) the new SPPL limit before installing this service pack.
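
The activation and configuration rules above can be sketched as follows. This is an illustrative model only, not the firmware's implementation; the function and parameter names are hypothetical.

```python
def active_virtual_processors(min_vps, desired_vps, sppl):
    """Model partition activation under the SPPL rules described above.

    Activation fails if the minimum number of virtual processors is greater
    than the SPPL; otherwise the active count is capped at the SPPL.
    """
    if min_vps > sppl:
        raise ValueError("minimum virtual processors exceed the SPPL")
    return min(desired_vps, sppl)

def change_allowed(requested_vps, sppl):
    """A profile or dynamic LPAR change must not exceed the SPPL."""
    return requested_vps <= sppl

if __name__ == "__main__":
    # Example from the text: SPPL of 32, partition configured with 40 virtual processors.
    print(active_virtual_processors(min_vps=1, desired_vps=40, sppl=32))
    print(change_allowed(40, 32))
    print(change_allowed(32, 32))
```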

Downgrading firmware from any given release level to an earlier release level is not recommended.
If you feel that it is necessary to downgrade the firmware on your system to an earlier release level, please contact your next level of support.

IPv6 Support and Limitations

IPv6 (Internet Protocol version 6) is supported in the System Management Services (SMS) in this level of system firmware. There are several limitations that should be considered.

When configuring a network interface card (NIC) for remote IPL, only the most recently configured protocol (IPv4 or IPv6) is retained. For example, if the network interface card was previously configured with IPv4 information and is now being configured with IPv6 information, the IPv4 configuration information is discarded.

A single network interface card may only be chosen once for the boot device list. In other words, the interface cannot be configured for the IPv6 protocol and for the IPv4 protocol at the same time.

Memory Considerations for Firmware Upgrades

Firmware Release Level upgrades and Service Pack updates may consume additional system memory.
Server firmware requires memory to support the logical partitions on the server. The amount of memory required by the server firmware varies according to several factors.
Factors influencing server firmware memory requirements include the following:

- Number of logical partitions
- Partition environments of the logical partitions
- Number of physical and virtual I/O devices used by the logical partitions
- Maximum memory values given to the logical partitions

Generally, you can estimate the amount of memory required by server firmware to be approximately 8% of the system installed memory. The actual amount required will generally be less than 8%. However, there are some server models that require an absolute minimum amount of memory for server firmware, regardless of the previously mentioned considerations.
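
As a rough illustration of the 8% guideline above, the following sketch computes an upper estimate from the installed memory. The function name is hypothetical, and the result ignores any model-specific absolute minimum.

```python
def estimate_firmware_memory_gb(installed_memory_gb, fraction=0.08):
    """Estimate server firmware memory as a fraction of installed memory.

    The default fraction of 0.08 follows the approximate 8% guideline; the
    actual amount required will generally be less.
    """
    if installed_memory_gb < 0:
        raise ValueError("installed memory cannot be negative")
    return installed_memory_gb * fraction

if __name__ == "__main__":
    # A system with 1024 GB installed would budget roughly 82 GB at most.
    print(round(estimate_firmware_memory_gb(1024), 2))
```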

Additional information can be found at:
http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/topic/p7hat/iphatlparmemory.htm

3.0 Firmware Information and Description

Use the following examples as a reference to determine whether your installation will be concurrent or disruptive.

Note: The concurrent levels of system firmware may, on occasion, contain fixes that are known as Deferred and/or Partition-Deferred. Deferred fixes can be installed concurrently, but will not be activated until the next IPL. Partition-Deferred fixes can be installed concurrently, but will not be activated until a partition reactivate is performed. Deferred and/or Partition-Deferred fixes, if any, will be identified in the "Firmware Update Descriptions" table of this document. For these types of fixes (Deferred and/or Partition-Deferred) within a service pack, only the fixes in the service pack which cannot be concurrently activated are deferred.

Note: The file names and service pack levels used in the following examples are for clarification only, and are not necessarily levels that have been, or will be released.

System firmware file naming convention:

01AHXXX_YYY_ZZZ

XXX is the release level
YYY is the service pack level
ZZZ is the last disruptive service pack level

NOTE: Values of service pack and last disruptive service pack level (YYY and ZZZ) are only unique within a release level (XXX). For example, 01AH330_067_045 and 01AH340_067_053 are different service packs.

An installation is disruptive if:

- The release levels (XXX) are different.
   Example: Currently installed release is AH330, new release is AH340.

- The service pack level (YYY) and the last disruptive service pack level (ZZZ) are the same.
   Example: AH330_120_120 is disruptive, no matter what level of AH330 is currently installed on the system.

- The service pack level (YYY) currently installed on the system is lower than the last disruptive service pack level (ZZZ) of the service pack to be installed.
   Example: Currently installed service pack is AH330_120_120 and new service pack is AH330_152_130.

An installation is concurrent if:

- The release level (XXX) is the same, and
- The service pack level (YYY) currently installed on the system is the same or higher than the last disruptive service pack level (ZZZ) of the service pack to be installed.
   Example: Currently installed service pack is AH330_126_120, new service pack is AH330_143_120.
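
The rules above can be expressed as a short sketch. It assumes level names follow the 01AHXXX_YYY_ZZZ convention described in this section (the "01" prefix and ".rpm" suffix are treated as optional); the class and function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional

class FirmwareLevel(NamedTuple):
    release: str                    # XXX
    service_pack: int               # YYY
    last_disruptive: Optional[int]  # ZZZ (None if the name omits it)

_LEVEL_RE = re.compile(r"(?:01)?AH(\d{3})_(\d{3})(?:_(\d{3}))?(?:\.rpm)?")

def parse_level(name: str) -> FirmwareLevel:
    """Split a name such as 'AH330_143_120' into its three fields."""
    match = _LEVEL_RE.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"unrecognized firmware level: {name!r}")
    release, service_pack, last_disruptive = match.groups()
    zzz = None if last_disruptive is None else int(last_disruptive)
    return FirmwareLevel(release, int(service_pack), zzz)

def is_disruptive(installed: str, new: str) -> bool:
    """Apply the three 'disruptive' rules; otherwise the install is concurrent."""
    current = parse_level(installed)
    target = parse_level(new)
    if target.last_disruptive is None:
        raise ValueError("the new level must include ZZZ")
    if current.release != target.release:
        return True  # release levels differ
    if target.service_pack == target.last_disruptive:
        return True  # YYY equals ZZZ in the new service pack
    # Installed YYY is lower than the new pack's ZZZ.
    return current.service_pack < target.last_disruptive

if __name__ == "__main__":
    print(is_disruptive("AH330_120_120", "AH330_152_130"))  # disruptive example
    print(is_disruptive("AH330_126_120", "AH330_143_120"))  # concurrent example
```

The two calls at the end reuse the examples given in the text, so the first is disruptive and the second is concurrent.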


Filename               Size        Checksum
01AH730_066_035.rpm    50992509    09202

Note: The checksum can be found by running the AIX sum command against the rpm file (only the first 5 digits are listed).
For example: sum 01AH730_066_035.rpm
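
Where the AIX sum command is not available, the checksum can be approximated with the sketch below. It assumes that the default AIX sum algorithm is the byte-by-byte (BSD-style) 16-bit rotating checksum; this is an assumption to verify against the sum command itself. The function names are hypothetical.

```python
def bsd_sum(data: bytes, checksum: int = 0) -> int:
    """Update a 16-bit rotating checksum with the given bytes."""
    for byte in data:
        # Rotate right by one bit within 16 bits, then add the next byte.
        checksum = (checksum >> 1) + ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    return checksum

def bsd_sum_file(path: str, chunk_size: int = 65536) -> str:
    """Return the checksum of a file as a zero-padded 5-digit string."""
    checksum = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            checksum = bsd_sum(chunk, checksum)
    return "%05d" % checksum

if __name__ == "__main__":
    import sys
    # Usage: python bsd_sum.py 01AH730_066_035.rpm
    if len(sys.argv) > 1:
        print(bsd_sum_file(sys.argv[1]))
```

The pure-Python loop is slow on a file of this size, so expect the calculation to take some time.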

AH730

For Impact, Severity, and other firmware definitions, please refer to the 'Glossary of firmware terms' at the following URL:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs

AH730_066_035 12/08/11
Impact: Availability
Severity: HIPER - High Impact/PERvasive, should be installed as soon as possible.

System firmware changes that affect certain systems
- HIPER/Pervasive: On systems with a Virtual Input/Output (VIO) client running AIX, and with a F/C 5803 or 5873 I/O drawer attached, a problem was fixed that caused the system to crash with SRC B700F103.

AH730_065_035 11/22/11
Impact: Availability
Severity: HIPER - High Impact/PERvasive, should be installed as soon as possible.

System firmware changes that affect all systems
- HIPER/Pervasive: On systems running firmware level AH730_051, or AH730_058, a problem was fixed that caused the target server to hang, or go to the incomplete state on the management console, after a Live Partition Mobility (LPM) operation. This problem can also occur when a partition hibernation operation is done.

AH730_058_035 11/07/11
Impact: Availability
Severity: HIPER - High Impact/PERvasive, should be installed as soon as possible.

New Features and Functions
- Support for the PCIe2 1.8GB cache RAID SAS adapter (tri-port 6Gb), F/C 5913.

System firmware changes that affect all systems
- A problem was fixed that caused SRC B7005442 to be erroneously logged, and functional processor cores to be erroneously guarded out, when an error occurred in the operating system or an application.
- HIPER/Non-Pervasive: A problem was fixed that caused the system to crash with SRC B18187DA.
- A problem was fixed that prevented a partition from being activated with SRC B2006009.
- The firmware was enhanced to improve the field replaceable units (FRUs) called out when a clock failure occurs.
- HIPER/Pervasive: A problem was fixed that caused the managed system to go to the incomplete state with SRC B7000602, and have to be rebooted, if these conditions were met:
   - An inactive partition is present on the managed system.
   - A concurrent system firmware update to AH730_051 was done.
   - The inactive partition is deleted before being activated with the new firmware level, either by the user or a partition migration operation.
- A problem was fixed that caused a service processor firmware synchronization to fail with SRC E302F842 when:
   - A system firmware upgrade to a new release (from AH720_xxx to AH730_yyy, for example) was installed, then
   - A service processor card was replaced.

System firmware changes that affect certain systems
- On systems or logical partitions with a large number of virtual processors, a performance problem was fixed that prevented the utilization of the entitled capacity of partitions.
- A problem was fixed that caused a shared processor partition that is configured with two virtual processors, and an entitled capacity of 1.0 processors, to hang when only one processor is in the physical shared pool.
- A problem was fixed that caused the managed system's processors displayed by the HMC to be incorrect. This problem occurs when the system is booted when no partitions are defined, which for example can occur after an MES model upgrade.

Please see the "Important Information" section of this document for additional details about this Service Pack.

Concurrent hot add/repair maintenance firmware fixes
- On a system with mirrored memory, a problem was fixed that caused a hot node repair operation to fail.
- A problem was fixed that caused the host Ethernet adapters (HEA) to be in a non-functional state after a hot node add.

AH730_051_035 09/28/11
Impact: Performance
Severity: HIPER - High Impact/PERvasive, should be installed as soon as possible.

New Features and Functions
- Support for dynamic memory DIMM error handling improvements.

System firmware changes that affect all systems
- A problem was fixed that caused SRC B18138B7 to be erroneously logged, and the service processor to terminate, when errors were continuously logged due to failing hardware. This problem can cause both node controllers to terminate, which disables the node.
- The service processor firmware was enhanced to log an error, instead of causing a kernel panic, if a guard record was corrupted or truncated.
- A problem was fixed that caused the wrong error code to be logged when the memory test took longer than normal during system boot.
- A problem was fixed that caused the Advanced System Management Interface (ASMI) menus to be displayed in English no matter which language was selected.
- The firmware was enhanced to verify that no uncorrectable memory errors are present in all of a partition's memory when the hypervisor accesses that memory.
- The firmware was enhanced to reduce the number of times informational SRC 10009002 is logged when a system is booted.
- A problem was fixed that caused two calls home for the same error to be made when a platform dump was generated.
- A problem was fixed that caused unrecoverable SRC B181A809 to be erroneously logged.
- A problem was fixed that caused a system boot to terminate with unrecoverable SRC B181A403.
- A problem was fixed that prevented a platform system dump from being deleted when the file system space on the service processor was full.
- A problem was fixed that prevented an encapsulated state partition from being activated after a main store dump (MSD).
- A problem was fixed that caused a partition to fail to activate when the activation took place within an hour of the system being powered on. This problem is much more likely to occur on large systems with a large number of I/O slots.
- A problem was fixed that caused the system to terminate when rebooting after the power was removed, then reapplied.
- A problem was fixed that caused a firmware installation from the HMC with the "do not auto accept" option selected to fail.
- A problem was fixed that caused a partition to fail with SRC B170E540 when rebooting after an unrecoverable error was logged that impacted the partition's reserved memory area (RMA).
- A problem was fixed that caused SRCs B181156C and B181A40F to be erroneously logged after a service processor reset.
- The firmware was enhanced to delay the rebooting of a partition after an uncorrectable error (UE) is logged in the partition's memory. This gives the service processor sufficient time to guard out the memory in which the UE occurred.
- The firmware was enhanced to log SRC B181C3251 as informational rather than predictive.
- The firmware was enhanced to log SRC B1812A11 as informational, instead of "service action required", when the thermal/power management device (TPMD) is successfully reset.
- A problem was fixed that erroneously caused SRC B18186x1 to be logged and an FSP dump to be generated.
- The field replaceable unit (FRU) callouts were enhanced for SRC B181E550.
- A problem was fixed that caused a system's partition dates to revert back to 1969 after the service processor or its battery was replaced. This occurred regardless of whether or not the service processor's time-of-day (TOD) clock was correctly set during the service action.
- A problem was fixed that caused the system to crash with SRC B700F103.

System firmware changes that affect certain systems
- HIPER/Pervasive: On systems running VIOS, a problem was fixed that caused the system to crash with SRC B700F103.
- HIPER/Pervasive: On systems with processors that don't have memory associated with them, a problem was fixed that was degrading system performance.
- On systems running Advanced Memory Sharing (AMS), a problem was fixed that caused the system to crash during the creation of a logical partition (LPAR).
- On a system that terminates when in dynamic power save mode, a problem was fixed that caused SRCs B150B943, B113C660, and B113C661 to be erroneously logged when the system rebooted.
- On systems running more than 100 logical partitions, a problem was fixed that caused a concurrent firmware installation to fail.
- On systems running IBM i partitions, a problem was fixed that prevented IBM i partitions that were suspended from being reactivated after a main store dump (MSD).
- On systems running IBM i partitions, a problem was fixed that caused changing the processor weight on an IBM i partition to 255 to have no effect.
- On systems running Advanced Memory Sharing (AMS), a problem was fixed that prevented the virtual I/O server (VIOS) partition associated with an AMS pool from shutting down.
- On systems with partitions with dedicated memory assigned, a problem was fixed that caused a resume operation on a partition with dedicated memory to fail with HMC SRC HSC0A945.
- On systems running an IBM i partition with dedicated memory, and redundant virtual I/O server (VIOS) partitions, a problem was fixed that caused the resumption of the IBM i partition to fail if the hypervisor failed-over to the other VIOS partition while the IBM i partition was in hibernation.
- The firmware was enhanced to allow the installation of IBM i from the HMC command line interface (CLI) using the "chsysstate" command.
- On systems running shared processor partitions, a problem was fixed that caused a partition to hang until powered off and back on.
- On systems running the Advanced Energy Manager (AEM), a problem was fixed that caused the work rate calculation for a processor to be incorrect if the system dropped into safe mode.
- On systems from which a node has been removed, a problem was fixed that caused the node to continue to be listed when the Processing Unit Deconfiguration option is selected on the Advanced System Management Interface (ASMI) menus.
- On systems with an uninterruptible power supply (UPS) attached, a problem was fixed that caused the system to power cycle after a power failure, instead of waiting for power to be restored before powering on.
- A problem was fixed that prevented an automatic system reboot after a checkstop when a service processor fail over occurred during the checkstop recovery.
- On systems with F/C 1954 (4-port GB Ethernet adapter) installed, a problem was fixed that prevented the adapter from being configured during boot, and caused two B7006970 SRCs to be erroneously logged.
- On systems running VIOS, a problem was fixed that caused the location code in the output of the VIOS command "lsmap -npiv -all" to be incorrect.
- A problem was fixed that caused a partition migration operation to abort when the partition has more than 4096 virtual slots.
- On systems running Advanced Memory Sharing (AMS), the firmware was enhanced to reduce the time required to migrate an AMS partition.
- On systems running DB2 pureScale, a problem was fixed that caused intermittent remote direct memory access (RDMA) errors, and a core dump of the pureScale server process.
- On systems with processors that don't have memory associated with them, the firmware was enhanced to improve boot time and system performance.
- A problem was fixed that caused the system to appear to hang, and a service processor reset/reload to occur, when multiple hardware errors occurred.
- On systems running virtual switches, the firmware was enhanced to limit the number of partitions that have access to a particular vswitch.
- On systems with more than 1000 partitions, a problem was fixed that caused the error logs to be flooded with informational SRC B7005120 when all of the partitions are rebooted at the same time.
- On systems in which a service processor had been guarded out manually, a problem was fixed that caused the Deconfiguration Records option, which is under the System Service Aids in the Advanced System Management Interface (ASMI), to display null data for that service processor.
- On systems with redundant service processors, a problem was fixed that prevented a service processor fail-over from occurring.
- On systems with unequal amounts of memory in the processor books, and on which the system partition processor limit (SPPL) is set to max, a problem was fixed that prevented the hypervisor from choosing the domain with the smallest amount of memory when the user is packing memory into the domain.
- A problem was fixed that caused the message "Console Internal Error", and an FSP dump to be taken, when the Memory eRepair Data option was selected from the Advanced System Management Interface (ASMI) menu.
- The firmware was enhanced to allow the system to boot when one of the two clock cards does not power on.
- The firmware was enhanced to add a threshold to certain conditions that cause SRC B114E504 to be logged. This reduces the unnecessary replacement of processor modules.
- The firmware was enhanced to allow the Enhanced Cache Option (also known as Turbo Core) to be enabled when three or more processor nodes are present.
- A problem was fixed that caused SRC B1754201, with memory DIMMs in the FRU list, to be erroneously logged after the reset/reload of a node controller.
- The firmware was enhanced to allow the concurrent replacement of the secondary service processor even if the service processor redundancy policy is set to "disabled".

Concurrent hot add/repair maintenance firmware fixes
- On partitions running Red Hat Linux 6.1, a problem was fixed that caused a node evacuation operation to fail.
- A problem was fixed that caused a node repair operation to fail if it had to be restarted.
- A problem was fixed that caused the system to crash during a hot GX adapter repair.

4.0 How to Determine Currently Installed Firmware Level

You can view the server's current firmware level on the Advanced System Management Interface (ASMI) Welcome pane. It appears in the top right corner. Example: AH730_123.

5.0 Downloading the Firmware Package

Follow the instructions on Fix Central. You must read and agree to the license agreement to obtain the firmware packages.

Note: If your HMC or SDMC is not connected to the internet, you will need to download the new firmware level to a CD-ROM or FTP server.

6.0 Installing the Firmware

The method used to install new firmware will depend on the release level of firmware which is currently installed on your server. The release level can be determined by the prefix of the new firmware's filename.

Example: AHXXX_YYY_ZZZ

Where XXX = release level

If the release level will stay the same (Example: Level AH330_075_075 is currently installed and you are attempting to install level AH330_081_075), this is considered an update.
If the release level will change (Example: Level AH330_081_075 is currently installed and you are attempting to install level AH340_096_096), this is considered an upgrade.
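
The update/upgrade distinction above depends only on the release level (XXX), as the following sketch illustrates. It assumes level names in the AHXXX_YYY_ZZZ form shown in this section; the function names are hypothetical.

```python
import re

def release_level(name: str) -> str:
    """Extract the release level XXX from a name such as 'AH330_081_075'."""
    match = re.match(r"(?:01)?AH(\d{3})_", name.strip())
    if match is None:
        raise ValueError(f"unrecognized firmware level: {name!r}")
    return match.group(1)

def install_type(installed: str, new: str) -> str:
    """Return 'update' if the release level stays the same, else 'upgrade'."""
    if release_level(installed) == release_level(new):
        return "update"
    return "upgrade"

if __name__ == "__main__":
    # The two examples given in the text.
    print(install_type("AH330_075_075", "AH330_081_075"))
    print(install_type("AH330_081_075", "AH340_096_096"))
```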

Instructions for installing firmware updates and upgrades can be found at http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/p7ha1/updupdates.htm

IBM i Systems:
See "IBM Server Firmware and HMC Code Wizard":
http://www-912.ibm.com/s_dir/slkbase.NSF/DocNumber/408316083

NOTE: For all systems running with the IBM i Operating System, the following IBM i PTFs must be applied to all IBM i partitions prior to installing AH730_066:

V7R1M0 - MF51869
V6R1M1 - MF51864

These PTFs can be ordered through Fix Central.

7.0 Firmware History

The Firmware History can be reviewed at the following link:

http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/AH-Firmware-Hist.html