Power7 System Firmware

Applies to: 9119-FHB

This document provides information about the installation of Licensed Machine or Licensed Internal Code, which is sometimes referred to generically as microcode or firmware.



1.0 Systems Affected

This package provides firmware for Power 795 (9119-FHB) Servers only.

The firmware level in this package is: AH780_084_040 / FW780.60


1.1 Minimum HMC Code Level

This section is intended to describe the "Minimum HMC Code Level" required by the System Firmware to complete the firmware installation process. When installing the System Firmware, the HMC level must be equal to or higher than the "Minimum HMC Code Level" before starting the system firmware update.  If the HMC managing the server targeted for the System Firmware update is running a code level lower than the "Minimum HMC Code Level" the firmware update will not proceed.

The Minimum HMC Code level for this firmware is:  HMC V7 R7.9.0 (PTF MH01405) with mandatory efix (PTF MH01406).

Although the Minimum HMC Code level for this firmware is listed above,  HMC V7 R7.9.0 Service Pack 3  (PTF MH01546) with ifix (PTF MH01666) or higher is recommended.

Important: To avoid vulnerability to security or known HMC issues, the HMC should be updated to the above recommended level prior to installing this server firmware level.
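The level comparison described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the parser assumes levels written in the "V7 R7.9.0" or "V8R8.2.0" style used in this document, and it does not check the required PTFs (MH01405, MH01406), which must still be verified on the HMC itself.

```python
import re

# Minimum HMC code level quoted in this section (PTFs are not modeled here).
MINIMUM_HMC_LEVEL = "V7 R7.9.0"

def parse_hmc_level(text):
    """Turn a string such as 'V7 R7.9.0' or 'V8R8.2.0' into a comparable tuple."""
    match = re.search(r"V(\d+)\s*R(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized HMC level: {text!r}")
    return tuple(int(part) for part in match.groups())

def hmc_level_ok(installed, minimum=MINIMUM_HMC_LEVEL):
    """Return True when the installed level is equal to or higher than the minimum."""
    return parse_hmc_level(installed) >= parse_hmc_level(minimum)
```

For example, hmc_level_ok("V7 R7.8.0") is False, so under this sketch the firmware update would not be expected to proceed from that HMC.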

Notes: 

- V7 R790 SP3 :  HMC V7.R790 is the last HMC release to support HMC Models CR4, CR3, C07, C06, C05

- Upgrading the HMC to V7R7.9.0 is required prior to installing this firmware. This is due to the firmware containing support for Single Root I/O Virtualization (SR-IOV) adapters. An SR-IOV adapter can be configured in shared mode and be shared by multiple logical partitions at the same time. HMC supports the configuration of the logical ports assigned to partitions and supports the configuration, backup, and restore of the adapter and physical port properties.

For information concerning HMC releases and the latest PTFs,  go to the following URL to access Fix Central.
http://www-933.ibm.com/support/fixcentral/

For specific fix level information on key components of IBM Power Systems running the AIX, IBM i and Linux operating systems, we suggest using the Fix Level Recommendation Tool (FLRT):
http://www14.software.ibm.com/webapp/set2/flrt/home

NOTE: You must be logged in as hscroot in order for the firmware installation to complete correctly.

2.0 Important Information


Special Instructions for Upgrading to Server Firmware AH780:
Note: If the dual HMC is not disconnected prior to the upgrade, the upgrade will fail shortly after the "retrieving updates" stage with the following error:

HSCF0999 - Disconnect or power-off the sibling management console(s) from the following list and retry the update. After the update is complete, reconnect or power-on the sibling.
The requested update level can not be applied on the following server from this management console (<HMC performing the upgrade>) while the server is managed by multiple management consoles. management console(s).
<server MTMS>: Sibling console(s)

To disconnect the dual HMC before the upgrade:
- On the dual HMC, select HMC Management, then the Shut Down and Restart task.
- On the Shutdown or Restart panel, select Shutdown HMC and click OK.
- If the HMC is in a remote or "lights out" data center, the HMC can be disconnected from the server and frame.

 
See the following document for detailed information: http://www-01.ibm.com/support/docview.wss?uid=nas8N1010700


ECA Info:
Before upgrading your system from AH720 to AH730/AH760/AH780 firmware release, contact your authorized provider and ask about ECA 256 and ECA 303, as hardware may have to be upgraded.

SPPL NOTE:
In some previous firmware releases, the system firmware was not properly enforcing the system partition processor limit (SPPL) attribute for shared processor partitions.  This service pack fixes that enforcement to ensure that shared processor partitions comply with the limit for virtual processors when the SPPL setting is 24 or 32.

You will be affected by this change if you have the following configuration:
   - 795 class server (model 9119-FHB)
   - The server has 3 or fewer books, or the server has 4 or more books and the SPPL attribute is set to 24 or 32.
   - The server has 24 processor cores per book and you have configured more than 24 virtual processors for a shared processor partition.
   - The server has 32 processor cores per book and you have configured more than 32 virtual processors for a shared processor partition.

After this service pack is installed, the behavior of the shared processor partitions that exceed the SPPL attribute will change as follows:

- Partition activation:
   - Partitions will continue to boot and reboot successfully unless the minimum number of virtual processors is greater than the SPPL.
   - Partitions that are activated will limit the number of active virtual processors to no more than the SPPL limit.

- Partition configuration:
   - Errors that are logged when the SPPL is exceeded can result in HMC errors HSCLA4D6 and HSC0A4D6.
   - Attempts to change the number of virtual processors or entitled processing units via a profile or dynamic LPAR change will be subject to the SPPL setting of 24 or 32.
      For example, if the SPPL is set to 32 and your shared processor partition is configured with 40 virtual processors,   you must reduce the number of virtual processors to 32 or fewer for the change to be successful.
   - If you create a new shared processor partition, the number of virtual processors must not exceed the SPPL value.

- Partition mobility:
   -  A partition must comply with the SPPL of the target server.

- Partition hibernation (suspend/resume):
   - If you have suspended partitions that have exceeded the SPPL limit and install this service pack, you will not be able to successfully resume those suspended partitions.  You should ensure all suspended partitions comply with (have virtual processors fewer than or equal to) the new SPPL limit before installing this service pack.
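
The activation and configuration rules above can be expressed as a short Python sketch. The function names are hypothetical and the code only restates the behavior described in this note for an SPPL setting of 24 or 32. It does not query a real system.

```python
def active_virtual_processors(configured_vps, minimum_vps, sppl):
    """Return how many virtual processors are active after activation,
    or None when the partition cannot be activated.

    sppl is the system partition processor limit (24 or 32 in this note).
    """
    if minimum_vps > sppl:
        # Activation fails only when the minimum itself exceeds the SPPL.
        return None
    # Otherwise the partition boots, but active VPs are capped at the SPPL.
    return min(configured_vps, sppl)

def configuration_change_allowed(requested_vps, sppl):
    """Profile or dynamic LPAR changes must not exceed the SPPL."""
    return requested_vps <= sppl
```

With an SPPL of 32, a partition configured with 40 virtual processors and a minimum of 8 would activate with 32 active virtual processors, while a profile change that keeps 40 virtual processors would be rejected.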

Downgrading firmware from any given release level to an earlier release level is not recommended.
If you feel that it is necessary to downgrade the firmware on your system to an earlier release level, please contact your next level of support.

IPv6 Support and Limitations

IPv6 (Internet Protocol version 6) is supported in the System Management Services (SMS) in this level of system firmware. There are several limitations that should be considered.

When configuring a network interface card (NIC) for remote IPL, only the most recently configured protocol (IPv4 or IPv6) is retained. For example, if the network interface card was previously configured with IPv4 information and is now being configured with IPv6 information, the IPv4 configuration information is discarded.

A single network interface card may only be chosen once for the boot device list. In other words, the interface cannot be configured for the IPv6 protocol and for the IPv4 protocol at the same time.

Memory Considerations for Firmware Upgrades

Firmware Release Level upgrades and Service Pack updates may consume additional system memory.
Server firmware requires memory to support the logical partitions on the server. The amount of memory required by the server firmware varies according to several factors.
Factors influencing server firmware memory requirements include the following:
Generally, you can estimate the amount of memory required by server firmware to be approximately 8% of the system installed memory. The actual amount required will generally be less than 8%. However, there are some server models that require an absolute minimum amount of memory for server firmware, regardless of the previously mentioned considerations.
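
As a rough worked example of the 8% guideline, the following Python sketch computes the upper estimate for a given amount of installed memory. The function name and the 1024 GB figure are illustrative assumptions, and the model-specific absolute minimum mentioned above is not modeled.

```python
def firmware_memory_upper_estimate_gb(installed_gb, fraction=0.08):
    """Approximate upper bound on server firmware memory, in GB.

    Uses the 8% guideline from this section; the actual amount
    required will generally be lower.
    """
    return installed_gb * fraction

# A hypothetical server with 1024 GB installed: roughly 82 GB at most.
estimate = firmware_memory_upper_estimate_gb(1024)
```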

Additional information can be found at:
  http://www.ibm.com/support/knowledgecenter/9119-FHB/p7hat/iphatlparmemory.htm


3.0 Firmware Information and Description

Use the following examples as a reference to determine whether your installation will be concurrent or disruptive.

Note: The concurrent levels of system firmware may, on occasion, contain fixes that are known as Deferred and/or Partition-Deferred. Deferred fixes can be installed concurrently, but will not be activated until the next IPL. Partition-Deferred fixes can be installed concurrently, but will not be activated until a partition reactivate is performed.  Deferred and/or Partition-Deferred fixes, if any, will be identified in the "Firmware Update Descriptions" table of this document. For these types of fixes (Deferred and/or Partition-Deferred) within a service pack, only the fixes in the service pack which cannot be concurrently activated are deferred.

Note: The file names and service pack levels used in the following examples are for clarification only, and are not necessarily levels that have been, or will be released.

System firmware file naming convention:

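01AHXXX_YYY_ZZZ

- XXX is the release level
- YYY is the service pack level
- ZZZ is the last disruptive service pack level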

NOTE: Values of service pack and last disruptive service pack level (YYY and ZZZ) are only unique within a release level (XXX). For example, 01AH330_067_045 and 01AH340_067_053 are different service packs.

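An installation is disruptive if:

- The release levels (XXX) are different.
  Example: Currently installed release is AH330, new release is AH340.
- The service pack level (YYY) and the last disruptive service pack level (ZZZ) are equal.
  Example: AH330_120_120 is disruptive, no matter what level of AH330 is currently installed on the system.
- The service pack level (YYY) currently installed on the system is lower than the last disruptive service pack level (ZZZ) of the service pack to be installed.
  Example: Currently installed service pack is AH330_120_120 and new service pack is AH330_152_130.

An installation is concurrent if:

- The release level (XXX) is the same, and the service pack level (YYY) currently installed on the system is the same as or higher than the last disruptive service pack level (ZZZ) of the service pack to be installed.
  Example: Currently installed service pack is AH330_126_120, new service pack is AH330_143_120.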
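
The examples above can be checked mechanically. The following Python sketch parses level names of the form 01AHXXX_YYY_ZZZ and applies the rules as they are inferred from those examples. The names parse_level and is_disruptive are hypothetical, and the HMC remains the authority on whether a given update is actually disruptive.

```python
import re
from typing import NamedTuple

class FirmwareLevel(NamedTuple):
    release: str          # XXX, the release level
    service_pack: int     # YYY, the service pack level
    last_disruptive: int  # ZZZ, the last disruptive service pack level

# Accepts names with or without the leading "01" and a trailing ".rpm".
_LEVEL = re.compile(r"(?:01)?AH(\d{3})_(\d{3})_(\d{3})")

def parse_level(name):
    match = _LEVEL.match(name)
    if match is None:
        raise ValueError(f"unrecognized firmware level: {name!r}")
    release, service_pack, last_disruptive = match.groups()
    return FirmwareLevel(release, int(service_pack), int(last_disruptive))

def is_disruptive(installed, new):
    """Return True when installing `new` over `installed` would be disruptive."""
    current, target = parse_level(installed), parse_level(new)
    if current.release != target.release:
        return True   # different release levels
    if target.service_pack == target.last_disruptive:
        return True   # the new service pack is itself a disruptive level
    # Disruptive when the installed level predates the last disruptive level.
    return current.service_pack < target.last_disruptive
```

For instance, is_disruptive("AH330_126_120", "AH330_143_120") returns False, which matches the concurrent example.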

 
Filename                 Size        Checksum
01AH780_084_040.rpm      55400835    26933
   
Note: The checksum can be found by running the AIX sum command against the rpm file (only the first 5 digits are listed).
For example: sum 01AH780_084_040.rpm
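
Where the AIX sum command is not available, the checksum can be approximated with a short Python sketch. It assumes that the default algorithm of AIX sum is the classic 16-bit rotating byte-by-byte checksum (often called the BSD checksum). That assumption is not stated in this document, so treat any mismatch as a reason to re-check with sum on AIX rather than as proof of a bad download. The function names are hypothetical.

```python
def rotating_checksum(chunks):
    """16-bit rotating byte-by-byte checksum over an iterable of bytes objects."""
    checksum = 0
    for chunk in chunks:
        for byte in chunk:
            # Rotate the 16-bit value right by one bit, then add the next byte.
            checksum = (checksum >> 1) | ((checksum & 1) << 15)
            checksum = (checksum + byte) & 0xFFFF
    return checksum

def file_chunks(path, size=1 << 16):
    """Yield the contents of a file in fixed-size binary chunks."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(size)
            if not chunk:
                return
            yield chunk
```

The result of rotating_checksum(file_chunks("01AH780_084_040.rpm")) would then be compared with the value listed in the table above.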

AH780
For Impact, Severity, and other firmware definitions, refer to the 'Glossary of firmware terms' at the following URL:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs

The complete Firmware Fix History for this Release Level can be reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/AH-Firmware-Hist.html

AH780_084_040 / FW780.60

01/16/17
Impact:  Availability      Severity:  SPE

System firmware changes that affect all systems

  • A problem was fixed for a Live Partition Mobility migration that resulted in the source managed system going to the Hardware Management Console (HMC) Incomplete state after the migration to the target system was completed.  This problem is very rare and has only been detected once.  The problem trigger is that the source partition does not halt execution after the migration to the target system.   The HMC went to the Incomplete state for the source managed system when it failed to delete the source partition because the partition would not stop running.  When this problem occurred, the customer network was running very slowly and this may have contributed to the failure.  The recovery action is to re-IPL the source system but that will need to be done without the assistance of the HMC.  For each partition that has an OS running on the source system, shut down the partition from the OS.  Then from the Advanced System Management Interface (ASMI),  power off the managed system.  Alternatively, the system power button may also be used to do the power off.  If the HMC Incomplete state persists after the power off, the managed system should be rebuilt from the HMC.  For more information on HMC recovery steps, refer to this IBM Knowledge Center link: https://www.ibm.com/support/knowledgecenter/en/POWER7/p7eav/aremanagedsystemstate_incomplete.htm
  • A problem was fixed for a latency time of about 2 seconds being added to a target Live Partition Mobility (LPM) migration system when there is a latency time check failure.  With the fix, in the case of a latency time check failure, a much smaller default latency is used instead of two seconds.  This error would not be noticed if the customer system is using a NTP time server to maintain the time.
  • A problem was fixed for a shared processor pool partition showing an incorrect zero "Available Pool Processor" (APP) value after a concurrent firmware update.  The zero APP value means that no idle cycles are present in the shared processor pool but in this case it stays zero even when idle cycles are available.  This value can be displayed using the AIX "lparstat" command.  If this problem is encountered, the partitions in the affected shared processor pool can be dynamically moved to a different shared processor pool.  Before the dynamic move, the  "uncapped" partitions should be changed to "capped" to avoid a system hang. The old affected pool would continue to have the APP error until the system is re-IPLed.
  • A rare problem was fixed for a system hang that can occur when dynamically moving "uncapped" partitions to a different shared processor pool.  To prevent a system hang, the "uncapped" partitions should be changed to "capped" before doing the move.
  • A problem was fixed for a blank SRC in the LPA dump for user-initiated non-disruptive adjunct dumps.  The SRC is needed for problem determination and dump analysis.
  • A problem was fixed for incorrect error messages from the Advanced System Management Interface (ASMI) functions when the system is powered on but in the  "Incomplete State".  For this condition, ASMI was assuming the system was powered off because it could not communicate to the PowerVM hypervisor.  With the fix, the ASMI error messages will indicate that ASMI functions have failed because of the bad hypervisor connection instead of falsely stating that the system is powered off.
  • A problem was fixed for Live Partition Mobility (LPM) migrations from FW860.10 or FW860.11 to older levels of firmware.  Without the fix, subsequent DLPAR operations on Virtual Adapters fail with HMC error message HSCL294C, which contains text similar to the following:  "0931-007 You have specified an invalid drc_name."  This issue affects partitions installed with AIX 7.2 TL 1 and later.  Partitions installed with VIOS, IBM i, or earlier levels of AIX are not affected.

AH780_080_040 / FW780.50

06/29/16
Impact:  Availability      Severity:  SPE

New Features and Functions

  • Support was added for the Stevens6+ option of the internal tray loading DVD-ROM drive with F/C #EU13.  This is an 8X/24X(max) Slimline SATA DVD-ROM Drive.  The Stevens6+ option is a FRU hardware replacement for the Stevens3+.  MTM 7226-1U3 (Oliver)  FC 5757/5762/5763 attaches to IBM Power Systems and lists Stevens6+ as optional for Stevens3+.  If the Stevens6+  DVD drive is installed on the system without the required firmware support, the boot of an AIX partition will fail when the DVD is used as the load source.  Also, an IBM i partition cannot consistently boot from the DVD drive using D-mode IPL.  A SRC C2004130 may be logged for the load source not found error.
  • Support was added for systems to be able to automatically convert permanently activated resources (processor and memory) to  Mobile CoD resources for use in a Power Enterprise Pool (PEP).  The ability to do a CoD resource license conversion requires a minimum HMC level of V8R8.4.0.  More information on how to use a  PEP for a group of systems to share Mobile Capacity on Demand (CoD) processor resources and memory resources can be found in the IBM Knowledge Center at the following link: https://www.ibm.com/support/knowledgecenter/HW4M4/p8ha2/systempool_cod.htm.

System firmware changes that affect all systems

  • A problem was fixed for PCI adapters locking up when powered on.  The problem is rare but frequency varies with the specific adapter models.  A system power down and power up is required to get the adapter out of the locked state.
  • A security problem was fixed in OpenSSL for a possible service processor reset on a null pointer de-reference during RSA PSS signature verification. The Common Vulnerabilities and Exposures issue number is CVE-2015-3194.
  • A problem was fixed for hypervisor task failures in adjunct partitions with a SRC B7000602 reported in the error log.  These failures occur during adjunct partition reboots for concurrent firmware updates but are extremely rare and require a re-IPL of the system to recover from the task failure.  The adjunct partitions may be associated with the VIOS or I/O virtualization for the physical adapters such as done for SR-IOV.
  • A problem was fixed for a shortened "Grace Period" for "Out of Compliance" users of a Power Enterprise Pool (PEP).   The "Grace Period" is short by one hour, so the user has one less hour to resolve compliance issues before the HMC disallows any more borrowing of PEP resources.  For example, if the "Grace Period" should have been 48 hours as shown in the "Out of Compliance" message, it really is 47 hours in the hypervisor firmware.  The borrowing of PEP resources is not a common usage scenario.  It is most often found in Live Partition Mobility (LPM) migrations where PEP resources are borrowed from the source server and loaned to the target server.
  • A problem was fixed for the Advanced System Management Interface "Network Services/Network Configuration" "Reset Network Configuration" button that was not resetting the static routes to the default factory setting.  The manufacturing default is to have no static routes defined so the fix clears any static routes that had been added.  A circumvention to the problem is to use the ASMI "Network Services/Network Configuration/Static Route Configuration" "Delete" button before resetting the network configuration.
  • A problem was fixed for a sequence of two or more Live Partition Mobility migrations that caused a partition to crash with a SRC BA330000 logged (Memory allocation error in partition firmware).  The sequence of LPM migrations that can trigger the partition crash is as follows:
    The original source partition level can be any FW760.xx, FW763.xx, FW770.xx, FW773.xx, FW780.xx, or FW783.xx P7 level or any FW810.xx, FW820.xx, FW830.xx, or FW840.xx P8 level.  It is migrated first to a system running one of the following levels:
    1) FW730.70 or later 730 firmware or
    2) FW740.60 or later 740 firmware
    And then a second migration is needed to a system running one of the following levels:
    1) FW760.00 - FW760.20 or
    2) FW770.00 - FW770.10
    The twice-migrated system partition is now susceptible to the BA330000 partition crash during normal operations until the partition is rebooted.  If an additional LPM migration is done to any firmware level, the thrice-migrated partition is also susceptible to the partition crash until it is rebooted.
    With the fix applied, the susceptible partitions may still log multiple BA330000 errors but there will be no partition crash.  A reboot of the partition will stop the logging of the BA330000 SRC.

System firmware changes that affect certain systems

  • On systems having an IBM i partition with more than 64 cores, a performance problem was fixed with the choice of processor cores assigned to the partition.  This problem only pertains to the Power 780 (9179-MHD) and the Power 795 (9119-FHB).
  • On systems with a PowerVM Active Memory Sharing (AMS) partition with AIX  Level 7.2.0.0 or later with Firmware Assisted Dump enabled, a problem was fixed for a Restart Dump operation failing into KDB mode.  If "q" is entered to exit from KDB mode, the partition fails to start.  The AIX partition must be powered off and back on to recover.  The problem can be circumvented by disabling Firmware Assisted Dump (default is enabled in AIX 7.2).
  • For a system partition with more than 64 cores, a problem was fixed for Live Partition Mobility (LPM)  migration operations failing with HSCL365C.  The partition migration is stopped because the platform detects a firmware error anytime the partition has more than 64 cores.  This problem only pertains to the Power 780 (9179-MHD) and the Power 795 (9119-FHB).
  • On systems with dedicated processor partitions,  a problem was fixed for the dedicated processor partition becoming intermittently unresponsive. The problem can be circumvented by changing the partition to use shared processors.

Concurrent hot add/repair maintenance (CHARM) firmware fixes

  • DEFERRED:  A problem was fixed for an I/O performance slow-down that can occur after a concurrent repair of a GX bus I/O adapter with a Feature Code of #1808, #1816, #1914, #EN22, #EN23, or #EN25.  A re-IPL of the system after the concurrent repair operation corrects the I/O performance issue.  This fix requires an IPL of the system to take effect.

AH780_075_040 / FW780.40

12/16/15
Impact:  Availability      Severity:  SPE

New Features and Functions

  • Support was added to the service processor to allow control of Dynamic Power Mode from the Hardware Management Console (HMC).  This power mode allows modifying a processor frequency,  either to reduce energy consumption or to overclock the processor and boost the machine speed.   There are four power modes possible:
    1) Disable Power Saver mode – this is default.   No changes in the processor frequency and resource will operate at 100% of nominal processor frequencies at all times.
    2) Enable Static Power Saver mode – activates the Power Saver mode, fixing the processor frequency and voltage at a predetermined low-power mode.
    3) Enable Dynamic Power Saver (favor power) mode – guarantees power savings by limiting the maximum frequency of the system under high utilization.
    4) Enable Dynamic Power Saver (favor performance) mode – allows a higher frequency range at high utilization.
    There is existing support to control Dynamic Power Mode from the Advanced System Management Interface (ASMI) with the "System Configuration /Power Management/ Power Mode Setup" panel options.   With the new support, the HMC can also control the Dynamic Power Modes with CLI commands lspwrmgmt (list the current power mode configuration) and chpwrmgmt (change the power mode):
    chpwrmgmt -m managed-system -r sys -o {enable | disable}  [-t {static | dynamic_favor_perf | dynamic_favor_power | fixed_max_frequency}]  [--help]
    For more information on the HMC CLI chpwrmgmt command, see the following link in the IBM KnowledgeCenter:  (https://www-01.ibm.com/support/knowledgecenter/HW4L4/p8edm/chpwrmgmt.html).
    The HMC must be at V8R8.2.0  or later to have the Dynamic Power Mode feature.
  • Support was added to the Advanced System Management Interface (ASMI) to be able to add an IPv4 static route definition for each ethernet interface on the service processor.  Using a static route definition,  a Hardware Management Console (HMC) configured on a private subnet that is different from the service processor subnet is now able to connect to the service processor and manage the CEC.  A static route persists until it is deleted or until the service processor settings are restored to manufacturing defaults.  The static route is managed with the ASMI panel "Network Services/Network Configuration/Static Route Configuration" IPv4 radio button.  The "Add" button is used to add a static route (only one is allowed for each ethernet interface) and the "Delete" button is used to delete the static route.
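
For scripted use of the chpwrmgmt syntax shown in the Dynamic Power Mode item above, the argument list can be assembled and validated before it is sent to the HMC. The following Python sketch only builds the command from the options listed in that syntax line. The function name and the managed system name are hypothetical, and nothing here contacts an HMC.

```python
# Values accepted by -o and -t according to the syntax line in this document.
OPERATIONS = ("enable", "disable")
POWER_SAVER_TYPES = (
    "static",
    "dynamic_favor_perf",
    "dynamic_favor_power",
    "fixed_max_frequency",
)

def chpwrmgmt_command(managed_system, operation, saver_type=None):
    """Build the argument list for the HMC chpwrmgmt command."""
    if operation not in OPERATIONS:
        raise ValueError(f"operation must be one of {OPERATIONS}")
    argv = ["chpwrmgmt", "-m", managed_system, "-r", "sys", "-o", operation]
    if saver_type is not None:
        if saver_type not in POWER_SAVER_TYPES:
            raise ValueError(f"type must be one of {POWER_SAVER_TYPES}")
        argv += ["-t", saver_type]
    return argv

# Hypothetical managed system name, for illustration only.
command = chpwrmgmt_command("Server-9119-FHB-EXAMPLE", "enable", "dynamic_favor_perf")
```

The resulting list could be joined with spaces and run in an HMC command-line session. The lspwrmgmt command mentioned above can then be used to confirm the current power mode configuration.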

System firmware changes that affect all systems

  • For a partition that has been migrated with Live Partition Mobility (LPM) from FW730 to FW740 or later, a problem was fixed for a Main Storage Dump (MSD) IPL failing with SRC B2006008.  The MSD IPL can happen after a system failure and is used to collect failure data.  If the partition is rebooted anytime after the migration, the problem cannot happen.  The potential for the problem existed between the active migration and a partition reboot.
  • A problem was fixed for partial loss of Entitlement for On/Off Memory Capacity On Demand (also called Elastic COD).  Users with large amounts of Entitlement on the system of greater than "65535 GB * Days" could have had a truncation of the Entitlement value on a re-IPL of the system.  To recover lost Entitlement, the customer can request another On/Off Enablement Code from IBM support to "re-fill" their entitlement.
  • A problem was fixed for an incorrect restriction on the amount of "Unreturned"  resources allowed for a Power Enterprise Pool (PEP).  PEP allows for logical moving of resources (processors and memory) from one server to another.  Part of this is 'borrowing' resources from one server to move to another. This may result in "Unreturned" resources on the source server. The management console controls how many total "Unreturned" PEP resources can exist.  For this problem,  the user had some "Unreturned" PEP memory and asked to borrow more but this request was incorrectly refused by the hypervisor.
  • On systems where memory relocation (as done by using Live Partition Mobility (LPM) ) and a partition reboot are occurring simultaneously, a problem for a system termination was fixed.  The potential for the problem existed between the active migration and the partition reboot.
  • A problem was fixed in the hypervisor power off to protect from rare NVRAM corruption in the address space where the partition profiles are stored.  The B7005301 SRC is logged on the next IPL after the corruption that takes the system into the Hardware Management Console (HMC) recovery state.  The HMC found the partition profiles corrupted in NVRAM.  The HMC partition profile recovery procedure must be used to restore the partition profiles from the HMC.
  • A problem was fixed for a hypervisor adjunct partition that failed with "SRC B2009008 LP=32770" for an unexpected SR-IOV adapter configuration.  Without the fix, the system must be re-IPLed to correct the adjunct error.  This error is infrequent and can only occur if an adapter port configuration is being changed at the same time that error recovery is occurring for the adapter.
  • A security problem was fixed for an OpenSSL specially crafted X.509 certificate that could cause the service processor to reset in a denial-of-service (DOS) attack.  The Common Vulnerabilities and Exposures issue number is CVE-2015-1789.
  • A security problem was fixed in OpenSSL where a remote attacker could cause an infinite loop on the service processor using malformed Elliptic Curve parameters during the SSL authentication.  This would cause the service processor performance problems and also prevent new management console connections from being made.  To recover from this attack, a reset or power cycle of the service processor is needed after scheduling and completing a normal shutdown of running partitions.  The Common Vulnerabilities and Exposures issue number is CVE-2015-1788.
  • A security problem was fixed in the lighttpd server on the service processor where a remote attacker, while attempting authentication, could insert strings into the lighttpd server log file.  Under normal operations on the service processor, this does not impact anything because the log is disabled by default.  The Common Vulnerabilities and Exposures issue number is CVE-2015-3200.
  • A problem was fixed for a Network boot/install failure using bootp in a network with switches using the Spanning Tree Protocol (STP).  A Network boot/install using lpar_netboot on the management console was enhanced to allow the number of retries to be increased.  If the user is not using lpar_netboot, the number of bootp retries can be increased using the SMS menus.  If the SMS menus are not an option, the STP in the switch can be set up to allow packets to pass through while the switch is learning the network configuration.
  • A problem was fixed in the run-time abstraction services (RTAS) extended error handling (EEH) recovery for EEH events for SR-IOV Virtual Functions (VFs) to fully reconfigure the VF devices after an EEH event.  Since the physical adapter does recover from the EEH event itself, and there are no error logs generated, it might not be immediately apparent that the VF did not fully reconfigure.  This prevents certain PCIe settings from being established for interrupts and performance settings, leading to unexpected adapter behavior and errors in the partition.
  • For systems with an invalid P-side or T-side in the firmware, a problem was fixed in the partition firmware Run-Time Abstraction Services (RTAS) so that system Vital Product Data (VPD) is returned at least from the valid side instead of returning no VPD data.   This allows AIX host commands such as lsmcode, lsvpd, and lsattr that rely on the VPD data to work to some extent even if there is one bad code side.  Without the fix,  all the VPD data is blocked from the OS until the invalid code side is recovered by either rejecting the firmware update or attempting to update the system firmware again.
  • A problem was fixed that prevented a second management console from being added to the system.  In some cases, network outages caused defunct management console connection entries to remain in the service processor connection table, making connection slots unavailable for new management consoles.  A reset of the service processor could be used to remove the defunct entries and allow the second management console to connect.
  • A problem was fixed for some service processor error logs not getting reported to the OS partitions as needed.  The service processor was not checking for a successful completion code on the error log message send, so it was not doing retries of the send to the OS when that was needed to ensure that the OS received the message.
  • A problem was fixed for an incorrect call home for SRC B1818A0F.  There was no real problem so this call home should have been ignored.  This occurred when dynamic IP configurations were being done on the service processor and the DHCP server was not responding.  The correct solution was to fix the network configuration so that the DHCP server could be found on the network.

System firmware changes that affect certain systems

  • On systems using PowerVM with shared processor partitions that are configured as capped or in a shared processor pool, there was a problem found that delayed the dispatching of the virtual processors which caused performance to be degraded in some situations.  Partitions with dedicated processors are not affected.   The problem is rare and can be mitigated, until the service pack is applied, by creating a new shared processor AIX or Linux partition and booting it to the SMS prompt; there is no need to install an operating system on this partition.  Refer to help document http://www.ibm.com/support/docview.wss?uid=nas8N1020863 for additional details.
  • On a system with an IBM i partition running 7.2 or later with 4K sector disks,  a problem was fixed for a machine check that was incorrectly issued.
  • On a system with an AIX partition and a Linux partition, a problem was fixed for dynamically moving an adapter that uses DMA from the Linux partition to the AIX partition that caused the AIX partition to fail by going into KDB mode (0c20 crash).  The management console showed the following message for the partition operation:  "Dynamic move of I/O resources failed.  The I/O slot dynamic partitioning operation failed.".  The error was caused by Linux using 64K mappings for the DMA window and AIX using 4K mappings for the DMA window, causing incorrect calculations on the AIX partition when it received the adapter.  Until the fix is applied, the adapters that use DMA should only be moved from Linux to AIX when the partitions are powered off.
  • On systems with IBM i partitions that have a load source device with 4K sectors, a problem has been fixed for Mainstore Dump (MSD) failing with a B200F00C SRC.  Without the fix, the  IBM i 4K sector load source devices are not supported for MSD and always fail.
  • For non-HMC managed systems in Manufacturing Default Configuration (MDC) mode with a single host partition, a problem was fixed for missing dumps of type SYSDUMP, FSPDUMP, LOGDUMP, and RSCDUMP that were not off-loaded to the host OS.  This is an infrequent error caused by a timing error that causes the dump notification signal to the host OS to be lost.  The missing/pending dumps can be retrieved by rebooting the host OS partition.  The rebooted host OS will receive new notifications of the dumps that have to be off-loaded.
  • A problem was fixed that inadvertently allowed TurboCore to be configured on a system with less than three nodes. impacting performance by running with too few cores active.  TurboCore is also referred to as Extended Cache Option (ECO), or Enhanced Cache Mode.  In this mode, four of the eight cores are disabled in the processor card SCM, allowing for a larger L3 cache utilization by the remaining active cores.  This feature requires a system with at least three processor books or nodes, where all the processors are TurboCore capable (8-core).
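
The DMA window item above attributes the AIX failure to Linux using 64K mappings while AIX uses 4K mappings. The following is only an arithmetic sketch of why that mismatch matters; it is not the firmware's or either operating system's actual logic, and the 2 GiB window size and the function name are hypothetical values chosen for illustration.

```python
# Illustrative arithmetic only. The window size below is a hypothetical value,
# not one taken from the firmware or from any real adapter.
KIB = 1024


def dma_window_entries(window_bytes: int, page_bytes: int) -> int:
    """Return how many page-sized mappings are needed to cover a DMA window."""
    if window_bytes % page_bytes != 0:
        raise ValueError("window size must be a multiple of the page size")
    return window_bytes // page_bytes


WINDOW = 2 * 1024 * 1024 * KIB  # hypothetical 2 GiB DMA window

linux_entries = dma_window_entries(WINDOW, 64 * KIB)  # 64K mappings
aix_entries = dma_window_entries(WINDOW, 4 * KIB)     # 4K mappings

# The same window needs 16 times as many entries with 4K mappings, so sizes or
# offsets computed under one page size are wrong when read under the other.
print(linux_entries, aix_entries, aix_entries // linux_entries)
```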
AH780_071_040 / FW780.30

04/22/15
Impact:  Security      Severity:  SPE

System firmware changes that affect all systems

  • A problem was fixed for the iptables process consuming all available memory, causing an "out of memory" dump and reset/reload of the service processor.
  • A problem was fixed with the fspremote service tool to make it support TLSv1.2 connections to the service processor to be compatible with systems that had been fixed for the OpenSSL Padding Oracle On Downgraded Legacy Encryption (POODLE) vulnerabilities.  After the POODLE fix is installed, by default the system only allows secured connections from clients using the TLSv1.2 protocol.
  • A problem was fixed for performance dumps to speed their processing so they are able to handle partitions with a large number of processors configured.  Previously, for large systems, the performance dump took too long in collecting performance data to be useful in the debugging of some performance problems.
  • A problem was fixed to prevent a hypervisor task failure if multiple resource dumps running concurrently run out of dump buffer space.  The failed hypervisor task could prevent basic logical partition operations from working.
  • A problem was fixed for a partition deletion error on the management console with error code 0x4000E002 and message "...insufficient memory for PHYP".  The partition delete operation has been adjusted to accommodate the temporary increase in memory usage caused by memory fragmentation, allowing the delete operation to be successful.
  • A problem was fixed for I/O drawer MTMS updates where a hypervisor memory leak would cause reconfiguration operations to fail or cause resources to no longer show up for user configuration.
  • A security problem was fixed in OpenSSL where the service processor would, under certain conditions, accept Diffie-Hellman client certificates without the use of a private key, allowing a user to falsely authenticate.  The Common Vulnerabilities and Exposures issue number is CVE-2015-0205.
  • A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) messages.  A specially crafted DTLS message could exhaust all available memory and cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number is CVE-2015-0206.
  • A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) messages.  A specially crafted DTLS message could cause a null pointer de-reference and cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number is CVE-2014-3571.
  • A security problem was fixed in OpenSSL to fix multiple flaws in the parsing of X.509 certificates.  These flaws could be used to modify an X.509 certificate to produce a certificate with a different fingerprint without invalidating its signature, and possibly bypass fingerprint-based blacklisting.  The Common Vulnerabilities and Exposures issue number is CVE-2014-8275.
  • A security vulnerability, commonly referred to as GHOST, was fixed in the service processor glibc functions gethostbyname() and gethostbyname2() that allowed remote users of the functions to cause a buffer overflow and execute arbitrary code with the permissions of the server application.  There is no way to exploit this vulnerability on the service processor but it has been fixed to remove the vulnerability from the firmware.  The Common Vulnerabilities and Exposures issue number is CVE-2015-0235.
  • A problem was fixed in the Advanced System Management Interface (ASMI) to reword a confusing message for systems with no deconfigured resources.  The "System Service Aids/Deconfiguration Records" message text for this situation was changed from "Deconfiguration data is currently not available." to "No deconfigured resources found in the system."
  • A problem was fixed for a hypervisor deadlock that results in the system being in an "Incomplete state" as seen on the management console.  This deadlock is the result of two hypervisor tasks using the same locking mechanism for handling requests between the partitions and the management console.  Except for the loss of the management console control of the system, the system is operating normally when the "Incomplete state" occurs.
  • A security problem was fixed in OpenSSL where a remote attacker could crash the service processor with malformed Elliptic Curve private keys.  The Common Vulnerabilities and Exposures issue number is CVE-2015-0209.
  • A security problem was fixed in OpenSSL where a remote attacker could crash the service processor with a specially crafted X.509 certificate that causes an invalid pointer, out-of-bounds write, or a null pointer de-reference.  The Common Vulnerabilities and Exposures issue numbers are CVE-2015-0286,  CVE-2015-0287, and CVE-2015-0288.

System firmware changes that affect certain systems

  • On systems with redundant service processors and unlicensed cores, a problem was fixed with firmware update to prevent SRC B170B838 errors on unlicensed cores after an administrative failover (AFO) to the backup service processor.
  • On systems with redundant service processors, a problem was fixed for serviceable events being missing on the management console for the case of a backup service processor termination error.  The error log from the failed backup service processor did not get synchronized to the primary service processor.
  • On a system with redundant service processors, a problem was fixed for a bad pointer reference in the mailbox function during data synchronization between the two service processors.  The de-reference of the bad pointer caused a core dump, reset/reload, and fail-over to the backup service processor.
  • On systems using the Virtual I/O Server (VIOS) to share physical I/O resources among client logical partitions, a problem was fixed for memory relocation errors during page migrations for the virtual control blocks.  These errors caused a CEC termination with SRC B700F103.  The memory relocation could be part of the processing for the Dynamic Platform Optimizer (DPO), Active Memory Sharing (AMS) between partitions, mirrored memory defragmentation, or a concurrent FRU repair.
  • A problem was fixed that could result in unpredictable behavior if a memory UE is encountered while relocating the contents of a logical memory block during one of these operations:
    - Using concurrent maintenance to perform a hot repair of a node.
    - Reducing the size of an Active Memory Sharing (AMS) pool.
    - On systems using mirrored memory, using the memory mirroring optimization tool.
    - Performing a Dynamic Platform Optimizer (DPO) operation.
  • On systems using Virtual Shared Processor Pools (VSPP), a problem was fixed for an inaccurate pool idle count over a small sampling period.
  • A problem was fixed that could result in latency or timeout issues with I/O devices.  On systems using Power7+ processors (IBM Power 770 (9117-MMD), IBM Power 780 (9179-MHD), and IBM Power ESE (8412-EAD)), this issue only impacts shared processor partitions.
  • For a system with Virtual Trusted Platform Module (VTPM) partitions, a problem was fixed for a management console error that occurred while restoring a backup profile that caused the system to go to the management console "Incomplete state".  The failed system had a suspended VTPM partition and a B7000602 SRC logged.
  • On systems with redundant service processors, a problem was fixed to add a missing check for a broken FSI link-1 pin.  The broken FSI link-1 pin was detectable during fail-over attempts to the backup service processor which failed.

Concurrent hot add/repair maintenance (CHARM) firmware fixes

  • A problem was fixed for concurrent maintenance to prevent a hardware unavailable failure when doing consecutive concurrent remove and add operations to an I/O Hub adapter for a drawer. 
AH780_068_040 / FW780.21

01/07/15
Impact:  Security      Severity:  HIPER

System firmware changes that affect all systems

  • A security problem was fixed in OpenSSL for padding-oracle attacks known as Padding Oracle On Downgraded Legacy Encryption (POODLE).  This attack allows a man-in-the-middle attacker to obtain a plain text version of the encrypted session data. The Common Vulnerabilities and Exposures issue number is CVE-2014-3566.  The service processor POODLE fix is based on a selective disablement of SSLv3 using the Advanced System Management Interface (ASMI) "System Configuration/Security Configuration" menu options.  The Security Configuration options of "nist_sp800_131a", "nist_compat", and "legacy" affect the disablement of SSLv3 and determine the level of protection from POODLE.  The management console also requires a POODLE fix for APAR MB03867 (FIX FOR CVE-2014-3566 FOR HMC V7 R7.9.0 SP1 with PTF MH01484) to eliminate all vulnerability to POODLE and allow use of option 1 "nist_sp800_131a" as shown below:
    -1) nist_sp800_131a (SSLv3 disabled):  This highest level of security protection does not allow service processor clients to connect using SSLv3, thereby eliminating any possibility of a POODLE attack.  All clients must be capable of using TLS v1.2 to make the secured connections to the service processor to use this option.  This requires the management console to be at a minimum level that has a POODLE fix, such as HMC V7 R7.9.0 SP1 with POODLE PTF MH01484.
    -2) nist_compat (default mode - SSLv3 enabled for HMC):  This medium level of security protection disables SSLv3 (TLS v1.2 must be used instead) for the web browser sessions to ASMI and for the CIM clients and assures them of POODLE-free connections.  But the older management consoles are allowed to use SSLv3 to connect to the service processor.  This is intended to allow non-POODLE compliant HMC levels to be able to connect to the CEC servers until they can be planned and upgraded to the POODLE compliant HMC levels.  Running a non-POODLE compliant HMC to a service processor in this default mode will prevent the ASMI-proxy sessions from the HMC from connecting as these proxy sessions require SSLv3 support in ASMI.
    -3) legacy (SSLv3 enabled):  This basic level of security protection enables SSLv3 for all service processor client connections.  It relies on all clients being at POODLE fix compliant levels to provide full POODLE protection using the TLS Fallback Signaling Cipher Suite Value (TLS_FALLBACK_SCSV) to prevent fallback to vulnerable SSLv3 connections.  This legacy option is intended for customer sites on protected internal networks that have a large investment in older hardware that needs SSLv3 to make browser and HMC connections to the service processor.  The level of POODLE protection actually achieved in legacy mode is determined by the percentage of clients that are at the POODLE fix compliant levels.
  • A security problem was fixed in OpenSSL for memory leaks that allowed remote attackers to cause a denial of service (out of memory on the service processor). The Common Vulnerabilities and Exposures issue numbers are CVE-2014-3513 and CVE-2014-3567.
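
The three ASMI Security Configuration options described in the POODLE item above can be summarized as a small lookup table. The sketch below restates only what the text says about which client types may still use SSLv3 in each mode. The client labels ("hmc", "browser", "cim") and the function names are hypothetical, chosen for illustration. The helper at the end shows one way a Python client could refuse anything older than TLS v1.2 using the standard `ssl` module; it is not part of any IBM tool.

```python
import ssl

# Which client types may still connect with SSLv3 in each mode, as described
# in the release note. The client labels are hypothetical names.
SSLV3_ALLOWED = {
    "nist_sp800_131a": {"hmc": False, "browser": False, "cim": False},
    "nist_compat":     {"hmc": True,  "browser": False, "cim": False},
    "legacy":          {"hmc": True,  "browser": True,  "cim": True},
}


def sslv3_allowed(mode: str, client: str) -> bool:
    """Return True if the given client type may use SSLv3 in this mode."""
    return SSLV3_ALLOWED[mode][client]


def tls12_only_context() -> ssl.SSLContext:
    """Build a client-side context that rejects protocols older than TLS v1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


print(sslv3_allowed("nist_compat", "hmc"))      # older HMC levels may still use SSLv3
print(sslv3_allowed("nist_compat", "browser"))  # browser sessions to ASMI must use TLS v1.2
```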

System firmware changes that affect certain systems

  • HIPER/Pervasive:  On systems using PowerVM firmware, a performance problem was fixed that may affect shared processor partitions where there is a mixture of dedicated and shared processor partitions with virtual IO connections, such as virtual ethernet or Virtual IO Server (VIOS) hosting, between them.  In high availability cluster environments this problem may result in a split brain scenario.
AH780_066_040 / FW780.20

10/16/14
Impact: Data            Severity:  HIPER

New Features and Functions

  • Support was added for using the Mellanox ConnectX-3 Pro 10/40/56 GbE (Gigabit Ethernet) adapter as a network install device.

System firmware changes that affect all systems

  • A problem was fixed that caused the Advanced System Management Interface (ASMI) menu for Memory Low Power State to be displayed even though it is not applicable to the system.  These systems do not have the DIMM type required for memory low power state.
  • A problem was fixed that caused the Utility COD display of historical usage data to be truncated on the management console.
  • A problem was fixed for memory relocation failing during a partition reboot with SRC B700F103 logged.  The memory relocation could be part of the processing for the Dynamic Platform Optimizer (DPO), Active Memory Sharing (AMS) between partitions, mirrored memory defragmentation, or a concurrent FRU repair.
  • A problem was corrected that resulted in B7005300 error logs.
  • A problem was fixed for Utility COD Processors where incorrect SRCs A7004735 and A7004736 are logged when utility processors are activated.  The messages try to convey a problem that does not exist (no out of processor compliance condition actually exists).
  • A problem was fixed for the Advanced System Management Interface (ASMI) to change the Dynamic Platform Optimizer (DPO) VET capability setting from "False" to "True".  DPO is available on all systems to use without a license required.  Even though the VET for DPO was set to "False", it did not interfere with the running of DPO.
  • A problem was fixed for the Advanced System Management Interface (ASMI) that allowed possible cross-site request forgery (CSRF) exploitation of the ASMI user session to do unwanted tasks on the service processor.
  • A problem was fixed for I/O adapters so that BA400002 errors were changed to informational for memory boundary adjustments made to the size of DMA map-in requests.  These DMA size adjustments were marked as UE previously for a condition that is normal.
  • Multiple security problems were fixed in the Network Time Protocol (NTP) client for buffer overflows that could be exploited to execute arbitrary code on the service processor.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2009-1252 and CVE-2009-0159.
  • A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol that allowed a man-in-the-middle attacker, via a specially crafted fragmented handshake packet, to force a TLS/SSL server to use TLS 1.0, even if both the client and server supported newer protocol versions. The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3511.
  • A security problem was fixed in OpenSSL for formatting fields of security certificates without null-terminating the output strings.  This could be used to disclose portions of the program memory on the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3508.
  • Multiple security problems were fixed in the way that OpenSSL handled Datagram Transport Layer Security (DTLS) packets.  A specially crafted DTLS handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2014-3505, CVE-2014-3506 and CVE-2014-3507.
  • A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) ServerHello requests.  A specially crafted DTLS handshake packet with an included Supported EC Point Format extension could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3509.
  • A security problem was fixed in OpenSSL to prevent a denial of service by using an exploit of a null pointer de-reference during anonymous Diffie Hellman (DH) key exchange.  A specially crafted handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3510.
  • A problem was fixed that caused a service processor reset/reload and a SRC B1818601 error log during an IPL when adjusting the speeds of the system fans.  This problem would normally have a successful recovery with a good IPL of the system unless two other reset/reloads of the service processor had occurred within the last 15 minutes.
  • A security problem in GNU Bash was fixed to prevent arbitrary commands hidden in environment variables from being run during the start of a Bash shell.  Although GNU Bash is not actively used on the service processor, it does exist in a library so it has been fixed.  This is IBM Product Security Incident Response Team (PSIRT) issue #2211.  The Common Vulnerabilities and Exposures issue numbers for this problem are CVE-2014-6271, CVE-2014-7169, CVE-2014-7186, and CVE-2014-7187.

System firmware changes that affect certain systems

  • HIPER/Pervasive:  A problem was fixed in PowerVM where the effect of the problem is non-deterministic but may include an undetected corruption of data, although IBM test has not been able to make this condition occur. This problem is only possible if VIOS (Virtual I/O Server) version 2.2.3.x or later is installed and the following statement is true:  A Shared Ethernet Adapter (SEA) with fail over enabled is configured on the VIOS.
  • On systems that have Active Memory Sharing (AMS) partitions and deduplication enabled, a problem was fixed for not being able to resume a hibernated AMS partition.  Previously,  resuming a hibernated AMS partition could give checksum errors with SRC B7000202 logged and the partition would remain in the hibernated state.
  • On systems that have Active Memory Sharing (AMS) partitions, a problem was fixed for Dynamic Logical Partitioning (DLPAR) for a memory remove that leaves a logical memory block (LMB) in an unusable state until partition reboot.
  • On systems in IPv6 networks, a  problem was fixed for a network boot/install failing with SRC B2004158 and IP address resolution failing using neighbor solicitation to the partition firmware client.
  • On systems that have a boot disk located on a SAN,  a problem was fixed  where the SAN  boot disk would not be found on the default boot list  and then the boot disk would have to be selected from SMS menus.  This problem would normally  be seen for new partitions that had tape drives configured before the SAN boot disk.
  • On systems with a partition that has a 256MB Real Memory Offset (RMO) region size that has been migrated from a Power8 system to  Power7 or Power6 using Live Partition Mobility (LPM), a problem was fixed that caused a failure on the next boot of the partition with a BA210000 log with a CA000091 checkpoint just prior to the BA210000.  The fix dynamically adjusts the memory footprint of the partition to fit on the earlier Power systems.
  • On systems with redundant service processors, a problem was fixed in the run-time error failover to the backup service processor so it does not terminate on FRU support interface (FSI) errors.  In the case of FSI errors on the new primary service processor, the primary will do a reset/reload instead of a terminate.
    This fix does not pertain to the IBM Power ESE (8412-EAD).
  • On systems with mirrored memory and a Logical Memory Block (LMB) size of 16MB, a problem for an LMB memory leak during an IPL was fixed that caused partition configuration errors.
  • A problem was fixed for systems in networks using the Juniper 1GbE and 10GbE switches (F/Cs #1108, #1145, and #1151) to prevent network ping errors and boot from network (bootp) failures.  The Address Resolution Protocol (ARP) table information on the Juniper aggregated switches is not being shared between the switches and that causes problems for address resolution in certain network configurations.  Therefore, the CEC network stack code has been enhanced to add three gratuitous ARPs (ARP replies sent without a request received) before each ping and bootp request to ensure that all the network switches have the latest network information for the system.
  • For systems with an IBM i load source disk attached to an Emulex-based fibre channel adapter such as F/C #5735, a problem was fixed that caused an IBM i load source boot to fail with SRC B2006110 logged and a message to the boot console of "SPLIT-MEM Out of Room".  This problem occurred for load source disks that needed extra disk scans to be found, such as those attached to a port other than the first port of a fibre channel adapter (first port requires fewest disk scans).
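
The Juniper switch item above relies on gratuitous ARPs, which are ARP replies sent without a request having been received. As a rough illustration of what such a frame contains, the sketch below builds one with Python's `struct` module. It is not the CEC network stack code. The MAC and IP values are hypothetical, and setting the target hardware address to broadcast is an assumption (implementations differ on that field).

```python
import socket
import struct

ETHERTYPE_ARP = 0x0806
ARP_HTYPE_ETHERNET = 1
ARP_PTYPE_IPV4 = 0x0800
ARP_OP_REPLY = 2


def build_gratuitous_arp(mac: bytes, ip: str) -> bytes:
    """Build an Ethernet frame carrying an unsolicited ARP reply for ip."""
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    broadcast = b"\xff" * 6
    ip_bytes = socket.inet_aton(ip)
    # Ethernet header: destination, source, EtherType.
    ethernet = broadcast + mac + struct.pack("!H", ETHERTYPE_ARP)
    # ARP body: sender and target protocol addresses are both our own IP,
    # which is what makes the announcement "gratuitous".
    arp = struct.pack("!HHBBH", ARP_HTYPE_ETHERNET, ARP_PTYPE_IPV4, 6, 4, ARP_OP_REPLY)
    arp += mac + ip_bytes        # sender hardware / protocol address
    arp += broadcast + ip_bytes  # target hardware (assumed broadcast) / protocol address
    return ethernet + arp


# Hypothetical addresses for illustration only.
frame = build_gratuitous_arp(bytes.fromhex("020000000001"), "192.0.2.10")
print(len(frame), frame.hex())
```

Actually transmitting such a frame needs a raw socket and elevated privileges, which is outside the scope of this sketch.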
AH780_059_040 / FW780.11

06/23/14
Impact: Security         Severity:  HIPER

System firmware changes that affect all systems
  • HIPER/Pervasive:  A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol that allowed clients and servers, via a specially crafted handshake packet, to use weak keying material for communication.  A man-in-the-middle attacker could use this flaw to decrypt and modify traffic between the management console and the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0224.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL for a buffer overflow in the Datagram Transport Layer Security (DTLS) when handling invalid DTLS packet fragments.  This could be used to execute arbitrary code on the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0195.
  • HIPER/Pervasive:  Multiple security problems were fixed in the way that OpenSSL handled read and write buffers when the SSL_MODE_RELEASE_BUFFERS mode was enabled to prevent denial of service.  These could cause the service processor to reset or unexpectedly drop connections to the management console when processing certain SSL commands.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2010-5298 and CVE-2014-0198.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) ServerHello requests. A specially crafted DTLS handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0221.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial of service by using an exploit of a null pointer de-reference during anonymous Elliptic Curve Diffie Hellman (ECDH) key exchange.  A specially crafted handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3470.
  • A security problem was fixed in the service processor TCP/IP stack to discard illegal TCP/IP packets that have the SYN and FIN flags set at the same time.  An explicit packet discard was needed to prevent further processing of the packet that could result in a bypass of the iptables firewall rules.
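
The SYN and FIN check described in the TCP/IP stack item above can be illustrated with the standard TCP header layout, where the flags occupy byte 13 of the header (FIN is bit 0x01, SYN is bit 0x02). The sketch below is not the service processor's code. The function names are hypothetical, and the sample header is built only to exercise the check.

```python
import struct

TCP_FIN = 0x01
TCP_SYN = 0x02


def tcp_flags(header: bytes) -> int:
    """Return the flags byte from a TCP header of at least 20 bytes."""
    if len(header) < 20:
        raise ValueError("TCP header must be at least 20 bytes")
    return header[13]


def is_illegal_syn_fin(header: bytes) -> bool:
    """Return True when SYN and FIN are both set, a combination to discard."""
    both = TCP_SYN | TCP_FIN
    return tcp_flags(header) & both == both


def sample_header(flags: int) -> bytes:
    """Build a minimal 20-byte TCP header with hypothetical ports and numbers."""
    data_offset = 5 << 4  # 5 words of 32 bits, no options
    return struct.pack("!HHIIBBHHH", 40000, 443, 0, 0, data_offset, flags, 8192, 0, 0)


print(is_illegal_syn_fin(sample_header(TCP_SYN | TCP_FIN)))  # SYN+FIN: discard
print(is_illegal_syn_fin(sample_header(TCP_SYN)))            # plain SYN: allowed
```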
AH780_056_040 / FW780.10

04/25/14
Impact: Serviceability         Severity:  SPE

New Features and Functions

  • Support was added for monitored compliance of the Power Integrated Facility for Linux (IFL).  IFL is an optional lower cost per processor core activation for Linux-only workloads on IBM Power Systems.  Power IFL processor cores can be activated that are restricted to running Red Hat Linux or SUSE Linux.  In contrast, processor cores that are activated for general-purpose workloads can run any supported operating system.  Power IFL processor cores are enabled by feature code ELJ1 using Capacity Upgrade on Demand (CUoD).  Linux partitions can use IFL processors and the other processor cores but AIX and IBM i5/OS cannot use the IFL processors.  The IFL monitored compliance process will send customer alert messages to the management console if the system is out of compliance for the number of IFL processors and general-purpose workload processors that are in active use compared to the number that have been licensed.
    Power IFL and monitored compliance are not supported on the IBM Power ESE (8412-EAD) system because it has the AIX operating system only.

System firmware changes that affect all systems

  • A problem was fixed that prevented an HMC-managed system from being converted to manufacturing default configuration (MDC) mode when the management console command "lpcfgop -m <server> -o clear" failed to create the default partition.  The management console went to the incomplete state for this error.
  • A problem was fixed that logged an incorrect call home B7006956 NVRAM error during a power off of the system.  This error log indicates that the NVRAM of the system is in error and will be cleared on the next IPL of the system.  However, there is no NVRAM error and the error log was created because a reset/reload of the service processor occurred during the power off.
  • Help text for the Advanced System Management Interface (ASMI) "System Configuration/Hardware Deconfiguration/Clear All Deconfiguration Errors" menu option was enhanced to clarify that when selecting "Hardware Resources" value of "All hardware resources", the service processor deconfiguration data is not cleared.   The "Service processor" must be explicitly selected for that to be cleared.
  • A firmware code update problem was fixed that caused the Hardware Management Console (HMC) to go to "Incomplete State" for the system with SRC E302F880 when assignment of a partition universal unique identifier (UUID) failed for a partition that was already running.  This problem happens for disruptive code updates from pre-770 levels to 770 or later levels.
  • A problem was fixed that caused frequent SRC B1A38B24 error logs with a call home every 15 seconds when service processor network interfaces were incorrectly configured on the same subnet.  The frequency of the notification of the network subnet error has been reduced to once every 24 hours.
  • A problem was fixed that prevented guard error logs from being reported for FRUs that were guarded during the system power on.  This could happen if the same FRU had been previously reported as guarded on a different power on of the system.  The requirement is now met that guarded FRUs are logged on every power on of the system.
  • A problem was fixed for the Advanced System Management Interface (ASMI) "Login Profile/Change Password" menu where ASMI would fail with "Console Internal Error, status code 500" displayed on the web browser when an incorrect current password was entered.
  • A problem was fixed for a system with pool resources for a resource remove operation that caused the number of unreturned resources to become incorrect.  This problem occurred if the system first became out of compliance with overdue unreturned resources and then another remove of pool resources from the server was attempted.
  • A problem was fixed for the Advanced System Management Interface (ASMI)  "System Information/Firmware Maintenance History" menu option on the service processor to display the firmware maintenance history instead of the message  "No code update history log was found".
  • A problem was fixed for a Live Partition Mobility (LPM) suspend and transfer of a partition that caused the time of day to skip ahead to an incorrect value on the target system.  The problem only occurred when a suspended partition was migrated to a target CEC that had a hypervisor time that was later than the source CEC.
  • A problem was fixed for IBM Power Enterprise System Pools that prevented the management console from changing from the backup to the master role for the enterprise pool.  The following error message was displayed on management console:  "HSCL90F7 An internal error occurred trying to set a new master management console for the Power enterprise pool. Try the operation again.  If this error persists, contact your service representative."
    This defect does not pertain to the IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
  • A problem was fixed for Live Partition Mobility (LPM) where a 2x performance decrease occurs during the resume phase of the migration when migrating from a system with 780 or later firmware back to a system with a pre-780 level of firmware.

System firmware changes that affect certain systems

  • On systems with multiple CEC drawers or nodes, a problem was fixed in the service processor Advanced System Management Interface (ASMI) performance dump collection that only allowed performance data to be collected for the first node of the system.  The  "System Service Aids/Performance Dump" menu of the ASMI is used to work with the performance dump.
  • On systems involved in a series of consecutive Live Partition Mobility (LPM) operations, a memory leak problem was fixed in the run time abstraction service (RTAS) that caused a partition run time AIX crash with SRC 0c20.  Other possible symptoms include error logs with SRC BA330002 (RTAS memory allocation failure).
  • On systems running Dynamic Platform Optimizer (DPO) with one or more unlicensed processors, a problem was fixed where the system performance was significantly degraded during the DPO operation.  The amount of performance degradation was more for systems with larger numbers of unlicensed processors.
  • On systems with a redundant service processor, a problem was fixed where the service processor allowed a clock failover to occur without a SRC B158CC62 error log and without a hardware deconfiguration record for the failed clock source.  This resulted in the system running with only one clock source and without any alerts to warn that clock redundancy had been lost.
  • On systems with a management console and service processors configured with Internet Protocol version 6 (IPv6) addresses,  a problem was fixed that prevented the management console from discovering the service processor.  The Service Location Protocol (SLP) on the service processor was not being enabled for IPv6, so it was unable to respond to IPv6 queries.
  • On a system with a partition with an AIX and Linux boot source to support dual booting, a problem was fixed that caused the Host Ethernet Adapter (HEA) to be disabled when rebooting from Linux to AIX.  Linux had disabled interrupts for the HEA on power down, causing an error for AIX when it tried to use the HEA to access the network.
  • On a system with a disk device with multiple boot partitions, a problem was fixed that caused System Management Services (SMS) to list only one boot partition.  Even though only one boot partition was listed in SMS, the AIX bootlist command could still be used to boot from any boot partition.

Concurrent hot add/repair maintenance (CHARM) firmware fixes

  • On a system with sixteen or more logical partitions, a problem was fixed for a memory relocation error during concurrent hot node repair that caused a hang or a failure.  The problem can also be triggered by mirrored memory defragmentation on a system with selective memory mirroring.
AH780_054_040 / FW780.02

04/18/14
Impact: Security         Severity:  HIPER

System firmware changes that affect all systems
  • HIPER/Pervasive:  A  security problem was fixed in the OpenSSL Montgomery ladder implementation for the ECDSA (Elliptic Curve Digital Signature Algorithm) to protect sensitive information from being obtained with a flush and reload cache side-channel attack to recover ECDSA nonces from the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-0076.  The stolen ECDSA nonces could be used to decrypt the SSL sessions and compromise the Hardware Management Console (HMC) access password to the service processor.  Therefore, the HMC access password for the managed system should be changed after applying this fix.
  • HIPER/Pervasive:  A security problem was fixed in the OpenSSL Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS) to not allow Heartbeat Extension packets to trigger a buffer over-read to steal private keys for the encrypted sessions on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-0160 and it is also known as the Heartbleed vulnerability.  The stolen private keys could be used to decrypt the SSL sessions and compromise the Hardware Management Console (HMC) access password to the service processor.  Therefore, the HMC access password for the managed system should be changed after applying this fix.
  • A  security problem was fixed for the Lighttpd web server that allowed arbitrary SQL commands to be run on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-2323.
  • A security problem was fixed for the Lighttpd web server where improperly-structured URLs could be used to view arbitrary files on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-2324.
AH780_050_040 / FW780.01

03/10/14
Impact:  Data      Severity:  HIPER

System firmware changes that affect all systems

  • HIPER/Non-Pervasive:  A problem was fixed for a potential silent data corruption issue that may occur when a Live Partition Mobility (LPM) operation is performed from a system (source system) running a firmware level earlier than AH780_040 or AM780_040 to a system (target system) running AH780_040 or AM780_040.
AH780_040_040 / FW780.00

12/06/13
Impact:  New      Severity:  New

New Features and Functions

  • Support was added to the Virtual I/O Server (VIOS) for shared storage pool mirroring (RAID-1) using the virtual SCSI (VSCSI) storage adapter to provide redundancy for data storage.
  • Support was added to upgrade the service processor to OpenSSL version 1.0.1 and for compliance with National Institute of Standards and Technology (NIST) Special Publication 800-131A.  SP 800-131A compliance requires the use of stronger cryptographic keys and more robust cryptographic algorithms.
  • Support was added to the Management Console command line to allow configuring a shared control channel for multiple pairs of Shared Ethernet Adapters (SEAs).  This simplifies the control channel configuration to reduce network errors when the SEAs are in fail-over mode.
  • Support was added in Advanced System Management Interface (ASMI) to facilitate capture and reporting of debug data for system performance problems.  The  "System Service Aids/Performance Dump" menu was added to ASMI to perform this function.
  • Support was added to the Management Console for group-based LDAP authentication.
  • Partition Firmware was enhanced to be able to recognize and boot from disks formatted with the GUID Partition Table (GPT) format that are capable of being greater than 2TB in size.  GPT is a standard for the layout of the partition table on a physical hard disk, using globally unique identifiers (GUID), that does not have the 2TB limit that is imposed by the DOS partition format.
  • The call home data for every serviceable event of the system was enhanced to include information on every guarded element (processor, memory, I/O chip, etc.) and to contain the part numbers and location codes of the FRUs and the service processor de-configuration policy settings.
  • Support for IBM PCIe 3.0 x8 dual 4-port SAS RAID adapter with 12 GB cache with feature code EJ0L and CCIN 57CE.
  • Support for Dynamic Platform Optimizer (DPO) enhancements to show the logical partition current and potential affinity scores.  The Management Console has also been enhanced to show the partition scoring.  The operating system (OS) levels that support DPO:

                ◦ AIX 6.1 TL8 or later
                ◦ AIX 7.1 TL2 or later
                ◦ VIOS 2.2.2.0
                ◦ IBM i 7.1 PTF MF56058
                ◦ Linux RHEL7
                ◦ Linux SLES12

         Note: If DPO is used with an older version of the OS that predates the above levels, either:
                   - The partition needs to be rebooted after DPO completes to optimize placement, or
                   - The partition is excluded from participating in the DPO operation (through a command line option on the "optmem" command that is used to initiate a DPO operation).

  • Support was added to the Management Console and the Virtual I/O Server (VIOS) to provide the capability to enable and disable individual virtual Ethernet adapters from the management console.
  • Support for Management Console logical partition Universally Unique IDs (UUIDs) so that the HMC preserves the UUID for logical partitions on backup/restore and migration.
  • Support for Management Console command line to configure the ECC call home path for SSL proxy support.
  • Support for Management Console to minimize recovery state problems by using the hypervisor and VIOS configuration data to recreate partition data when needed.
  • Support for Management Console to provide scheduled operations to check if the partition affinity falls below a threshold and alert the user that Dynamic Platform Optimizer (DPO) is needed.
  • Support for enhanced platform serviceability to extend call home to include hardware in need of repair and to issue periodic service events to remind of failed hardware.
  • Support for IBM PCIe 3.0 x8 non-caching 2-port SAS RAID adapter with feature code EJ0J and CCIN 57B4.
  • Support for Virtual I/O Server (VIOS) to support 4K block size DASD as a virtual device.
  • Support for performance improvements for concurrent Live Partition Mobility (LPM) migrations.
  • Support for Management Console to handle all Virtual I/O Server (VIOS) configuration tasks and provide assistance in configuring partitions to use redundant VIOS.
  • Support for Management Console to maintain a profile that is synchronized with the current configuration of the system, including Dynamic Logical Partitioning (DLPAR) changes.
  • Support for Power System Pools allows for the aggregation of Capacity on Demand (CoD) resources, including processors and memory, to be moved from one pool server to any other pool server as needed.
  • Support for a Management Console Performance and Capacity Monitor (PCM) function to monitor and manage both physical and virtual resources.
  • Support for virtual server network (VSN) Phase 2 that delivers IEEE standard 802.1Qbg based on Virtual Ethernet Port Aggregator (VEPA) switching.  This supports the Management Console assignment of the VEPA switching mode to virtual Ethernet switches used by the virtual Ethernet adapters of the logical partitions.  The server properties in the Management Console will show the capability "Virtual Server Network Phase 2 Capable" as "True" for the system.
  • Support for Virtual I/O Server (VIOS) for an IBM i client data connection to a SIS64 device driver backed by VSCSI physical volumes.
  • Support for the Power 795 GX++ 1-port 4X Infiniband QDR adapter with CCIN 2B76 and feature code EN25.
  • Support was dropped for Secure Sockets Layer (SSL) protocol version 2 and SSL weak and medium cipher suites in the service processor web server (Lighttpd).  Unsupported web browser connections to the Advanced System Management Interface (ASMI) secured port 443 (using https://) will now be rejected if those browsers do not support SSL version 3.  Supported web browsers for Power7 ASMI are Netscape (version 9.0.0.4), Microsoft Internet Explorer (version 7.0), Mozilla Firefox (version 2.0.0.11), and Opera (version 9.24).
  • Support was added in Advanced System Management Interface (ASMI) "System Configuration/Firmware Update Policy" menu to detect and display the appropriate Firmware Update Policy (depending on whether system is HMC managed) instead of requiring the user to select the Firmware Update Policy.  The menu also displays the "Minimum Code Level Supported" value.
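
The 2TB limit of the DOS partition format mentioned in the GPT item above can be reproduced with a short calculation. The sketch below is illustrative only and is not part of the firmware. It assumes 512-byte logical sectors and relies on the DOS (MBR) partition table storing sector addresses as 32-bit values, while GPT uses 64-bit sector addresses. The function name is hypothetical.

```python
# Illustrative arithmetic only; the 512-byte sector size is an assumption.
SECTOR_SIZE_BYTES = 512
TIB = 2 ** 40  # one tebibyte in bytes


def max_addressable_bytes(lba_bits: int, sector_size: int = SECTOR_SIZE_BYTES) -> int:
    """Return the largest disk size addressable with lba_bits-wide sector numbers."""
    return (2 ** lba_bits) * sector_size


# DOS (MBR) partition tables use 32-bit sector addresses.
dos_limit = max_addressable_bytes(32)
# GPT uses 64-bit sector addresses.
gpt_limit = max_addressable_bytes(64)

print(f"DOS partition format limit: {dos_limit // TIB} TiB")
print(f"GPT limit is {gpt_limit // dos_limit} times larger")
```

With these assumptions the DOS format tops out at 2 TiB (about 2.2 TB in decimal units), which is why disks larger than that need GPT to be fully addressable.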

System firmware changes that affect all systems

  • A problem was fixed that caused a service processor kernel panic on an out-of-memory condition with SRC B181720D when an incorrect MTMS was specified for a frame in the Advanced System Management Interface (ASMI).
  • A problem was fixed that caused a service processor OmniOrb core dump with SRC B181EF88 logged.
  • A problem was fixed that caused the system attention LED to stay lit when a bad FRU was replaced.
  • A problem was fixed that caused a memory leak of 50 bytes of service processor memory for every call home operation.  This could potentially cause an out of memory condition for the service processor when running over an extended period of time without a reset.
  • A problem was fixed that caused an L2 cache error to not guard out the faulty processor, allowing the system to checkstop again on an error to the same faulty processor.
  • A problem was fixed that caused an HMC code update failure for the FSP on the accept operation with SRC B1811402, or caused the FSP to be unable to boot on the updated side.
  • A problem was fixed that caused a system checkstop during hypervisor time keeping services.
  • A problem was fixed that caused a built-in self test (BIST) for GX slots to create corrupt error log values that core dumped the service processor with a B18187DA.  The corruption was caused by a failure to initialize the BIST array to 0 before starting the tests.
  • The Hypervisor was enhanced to allow the system to continue to boot using the redundant Anchor (VPD) card, instead of stopping the Hypervisor boot and logging SRC B7004715,  when the primary Anchor card has been corrupted.
  • A problem was fixed with the Dynamic Platform Optimizer (DPO) that caused memory affinity to be incorrectly reported to the partitions before the memory was optimized.   When this occurs, the performance is impacted over what would have been gained with the optimized memory values.
  • A problem was fixed that caused a migrated partition to reboot during transfer to a target system running VIOS 2.2.2.0 or later. A manual reboot would be required if transferred to a target system running an earlier VIOS release. Migration recovery may also be necessary.
  • A problem was fixed that can cause Anchor (VPD) card corruption and A70047xx SRCs to be logged.  Note: If a serviceable event with SRC A7004715 is present or was logged previously, damage to the VPD card may have occurred. After the fix is applied, replacement of the Anchor VPD card is recommended in order to restore full redundancy.
  • The firmware was enhanced to display on the management console the correct number of concurrent Live Partition Mobility (LPM) operations that are supported.
  • A problem was fixed that caused a 1000911E platform event log (PEL) to be marked as not call home.  The PEL is now a call home to allow for correction.  This PEL is logged when the hypervisor has changed the Machine Type Model Serial Number (MTMS) of an external enclosure to UTMP.xxx.xxxx because it cannot read the vital product data (VPD), or the VPD has invalid characters, or if the MTMS is a duplicate to another enclosure.
  • A problem was fixed that caused the state of the Host Ethernet Adapter (HEA) port to be reported as down when the physical port is actually up.
  • When powering on a system partition, a problem was fixed that caused the partition universally unique identifier (UUID) to not get assigned, causing a B2006010 SRC in the error log.
  • For the sequence of a reboot of a system partition followed immediately by a power off of the partition, a problem was fixed where the hypervisor virtual service processor (VSP) incorrectly retained locks for the powered off partition, causing the CEC to go into recovery state during the next power on attempt.
  • A problem was fixed that caused an error log generated by the partition firmware to show conflicting firmware levels.  This problem occurs after a firmware update or a Live Partition Mobility (LPM) operation on the system.
  • A problem was fixed that caused the system attention LED to be lit without a corresponding SRC and error log for the event.  This problem typically occurs when an operating system on a partition terminates abnormally.
  • A problem was fixed that caused the slot index to be missing for virtual slot number 0 for the dynamic reconfiguration connector (DRC) name for virtual devices.  This error was visible from the management console when using commands such as "lshwres -r virtualio --rsubtype slot -m machine" to show the hardware resources for virtual devices.
  • A problem was fixed that caused a system checkstop with SRC B113E504 for a recoverable hardware fault.
  • A problem was fixed during resource dump processing that caused a read of an invalid system memory address and an SRC B181C141.  The invalid memory reference resulted from the service processor incorrectly referencing memory that had been relocated by the hypervisor.

System firmware changes that affect certain systems

  • On systems with a redundant service processor, a problem was fixed that caused fans to run at a high-speed after a failover to the sibling service processor.
  • On systems with a redundant service processor, a problem was fixed that prevented the deconfiguration details of a guarded sibling service processor from being shown in the Advanced System Management Interface (ASMI).
  • On systems with a redundant service processor, a problem was fixed that caused an SRC B150D15E to be erroneously logged after a failover to the sibling service processor.
  • When switching between turbocore and maxcore mode, a problem was fixed that caused the number of supported partitions to be reduced by 50%.
  • On systems in turbocore mode with unlicensed processors, a problem was fixed that caused an incorrect processor count.  The AIX command lparstat gave too high a value for "Active Physical CPUs in system" when it included unlicensed turbocore processors in the count instead of just counting the licensed processors.
  • A problem was fixed that was caused by an attempt to modify a virtual adapter from the management console command line when the command specifies it is an Ethernet adapter, but the virtual ID specified is for an adapter type other than Ethernet.  The managed system has to be rebooted to restore communications with the management console when this problem occurs; SRC B7000602 is also logged.
  • On systems running AIX or Linux, a problem was fixed that caused the operating system to halt when an InfiniBand Host Channel Adapter (HCA) failed or malfunctioned.
  • On systems running AIX or Linux, a hang in a Live Partition Mobility (LPM) migration for remote restart-capable partitions was fixed by adding a time-out for the required paging space to become available.  If after five minutes the required paging space is not available, the start migration command returns an error code of 0x40000042 (PagingSpaceNotReady) to the management console.
  • On systems running Dynamic Platform Optimizer (DPO) with no free memory,  a problem was fixed that caused the Hardware Management Console (HMC) lsmemopt command to report the wrong status of completed with no partitions affected.  It should have indicated that DPO failed due to insufficient free memory.  DPO can only run when there is free memory in the system.
  • On systems with partitions using physical shared processor pools, a problem was fixed that caused partition hangs if the shared processor pool was reduced to a single processor.
  • On a system running a Live Partition Mobility (LPM) operation, a problem was fixed that caused the partition to successfully appear on the target system, but hang with a 2005 SRC.
  • On systems using IPv6 addresses, the firmware was enhanced to reduce the time it takes to install an operating system using the Network Installation Manager (NIM).
  • On systems managed by a management console, a problem was fixed that caused a partition to become unresponsive when the AIX command "update_flash -s" is run.
  • On systems with turbo-core enabled that are a target of Live Partition Mobility (LPM),  a problem was fixed where cache properties were not recognized and SRCs BA280000 and BA250010 were reported.

Concurrent hot add/repair maintenance (CHARM) firmware fixes

  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail on an erroneously logged error for the service processor battery with  SRCs B15A3303, B15A3305, and  B181EA35 reported.
  • The firmware was enhanced to reduce the number of concurrent hot add/repair maintenance failures due to the operation timing out on fully-configured systems.
  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail if a memory channel failure on the CEC was followed by a service processor reset/reload.
  • A problem was fixed that caused SRC B15A3303  to be erroneously logged as a predictive error on the service processor sibling after a successful concurrent repair maintenance operation for the real-time clock (RTC) battery.
  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail with SRC B181C350.
  • A problem was fixed that prevented the I/O slot information from being presented on the management console after a concurrent node repair.
  • A problem was fixed that caused Capacity on Demand (CoD) "Out of Compliance" messages during concurrent maintenance operations when the system was actually in compliance for the licensed amount of resources in use.

4.0 How to Determine Currently Installed Firmware Level

You can view the server's current firmware level on the Advanced System Management Interface (ASMI) Welcome pane. It appears in the top right corner. Example: AH780_123.

5.0 Downloading the Firmware Package

Follow the instructions on Fix Central. You must read and agree to the license agreement to obtain the firmware packages.

Note: If your HMC or SDMC is not internet-connected, you will need to download the new firmware level to a CD-ROM or FTP server.


6.0 Installing the Firmware

The method used to install new firmware will depend on the release level of firmware which is currently installed on your server. The release level can be determined by the prefix of the new firmware's filename.

Example: AHXXX_YYY_ZZZ

Where XXX = release level
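
As a minimal sketch of how the release level can be read from a firmware level name, the following Python example splits a name such as AH780_054_040 into its fields and compares the release level of the installed firmware with that of the new firmware. It is not an IBM tool. The function and field names are hypothetical, and because this section defines only XXX, the YYY and ZZZ fields are kept as plain numbers without interpretation. The third field is treated as optional because the ASMI example in section 4.0 shows a level in the shorter form AH780_123.

```python
import re
from typing import NamedTuple, Optional


class FirmwareLevel(NamedTuple):
    prefix: str             # two-letter prefix, e.g. "AH"
    release: int            # XXX, the release level
    field_y: int            # YYY (meaning not defined in this section)
    field_z: Optional[int]  # ZZZ, absent in short names such as AH780_123


# Pattern assumed from the example AHXXX_YYY_ZZZ: two letters, then
# three-digit fields separated by underscores.
_LEVEL_RE = re.compile(r"^([A-Z]{2})(\d{3})_(\d{3})(?:_(\d{3}))?$")


def parse_level(name: str) -> FirmwareLevel:
    """Split a firmware level name into its fields."""
    match = _LEVEL_RE.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized firmware level name: {name!r}")
    prefix, xxx, yyy, zzz = match.groups()
    return FirmwareLevel(prefix, int(xxx), int(yyy), int(zzz) if zzz else None)


def same_release(installed: str, new: str) -> bool:
    """Return True when both names carry the same prefix and release level."""
    a, b = parse_level(installed), parse_level(new)
    return a.prefix == b.prefix and a.release == b.release


if __name__ == "__main__":
    print(parse_level("AH780_054_040"))
    print(same_release("AH780_040_040", "AH780_054_040"))
```

A check like same_release only indicates whether the installed and new firmware share a release level; the installation instructions linked below remain the authority on which installation method applies.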

Instructions for installing firmware updates and upgrades can be found at http://www.ibm.com/support/knowledgecenter/9119-FHB/p7ha1/updupdates.htm

IBM i Systems:
See "IBM Server Firmware and HMC Code Wizard":
http://www-912.ibm.com/s_dir/slkbase.NSF/DocNumber/408316083

NOTE:
For all systems running with the IBM i Operating System, the following IBM i PTFs must be applied to all IBM i partitions prior to installing AH780_084:
These PTFs can be ordered through Fix Central.

When ordering firmware for IBM i Operating System managed systems from Fix Central, choose "Select product", under Product Group specify "System i", under Product specify "IBM i", then Continue and specify the desired firmware PTF accordingly.

7.0 Firmware History

The complete Firmware Fix History for this Release level can be reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/AH-Firmware-Hist.html