Power10 System Firmware
Applies to: 9080-HEX
This document provides information about the installation of
Licensed Machine or Licensed Internal Code, which is sometimes referred
to generically as microcode or firmware.
1.0 Systems Affected
This package provides firmware for the IBM Power System E1080
(9080-HEX) server only.
The firmware level in this package is:
1.1 Minimum HMC Code Level
This section is intended to describe the "Minimum HMC Code Level"
required by the System Firmware to complete the firmware installation
process. When installing the System Firmware, the HMC level must be
equal to or higher than the "Minimum HMC Code Level" before starting
the system firmware update. If the HMC managing the server
targeted for the System Firmware update is running a code level lower
than the "Minimum HMC Code Level" the firmware update will not proceed.
The Minimum HMC Code levels for this firmware for HMC x86, ppc64
or ppc64le are listed below.
x86 - This term is used to reference the legacy HMC
that runs on x86/Intel/AMD hardware for the Virtual HMC that can run on
the Intel hypervisors (KVM, XEN, VMWare ESXi).
- The Minimum HMC Code level for this firmware is: HMC V10R1M1011 (PTF
MF69288).
ppc64 or ppc64le - describes the Linux code that is compiled to
run on Power-based servers or LPARs (Logical Partitions).
- The Minimum HMC Code level for this firmware is: HMC V10R1M1011 (PTF MF69289).
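The level comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not an IBM tool: the function names are hypothetical, the accepted string format ("V10R1M1011", optionally with spaces or a trailing ".2") is an assumption based on how levels are written in this document, and the required PTF (MF69288 or MF69289) is not checked.

```python
import re

# Minimum level taken from this document; PTF numbers are not checked here.
MINIMUM_HMC_LEVEL = "V10R1M1011"


def parse_hmc_level(level: str) -> tuple:
    """Split a level such as 'V10R1M1011' or 'V10 R1 M1011.2' into integers.

    The accepted format is an assumption for illustration only.
    """
    match = re.fullmatch(r"\s*V(\d+)\s*R(\d+)\s*M(\d+)(?:\.(\d+))?\s*", level)
    if match is None:
        raise ValueError(f"unrecognized HMC level: {level!r}")
    version, release, modification, fix = match.groups()
    return int(version), int(release), int(modification), int(fix or 0)


def hmc_level_is_sufficient(installed: str, minimum: str = MINIMUM_HMC_LEVEL) -> bool:
    """Return True if the installed HMC level is equal to or higher than the minimum."""
    return parse_hmc_level(installed) >= parse_hmc_level(minimum)
```

A level that fails this check corresponds to the case where the firmware update will not proceed.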
The Minimum HMC level supports the following HMC models:
HMC models: 7063-CR1 and 7063-CR2
x86 - KVM, XEN, VMWare ESXi (6.0/6.5)
ppc64le - vHMC on PowerVM (POWER8, POWER9, and POWER10 systems)
For information concerning HMC releases and the latest PTFs, go to the
following URL to access Fix Central:
http://www-933.ibm.com/support/fixcentral/
For specific fix level information on key components of IBM Power
Systems running the AIX, IBM i and Linux operating systems, we suggest
using the Fix Level Recommendation Tool (FLRT):
http://www14.software.ibm.com/webapp/set2/flrt/home
NOTES:
- You must be logged in as hscroot for the firmware installation to
complete correctly.
- Systems Director Management Console (SDMC) does not support this
System Firmware level.
2.0 Important Information
NovaLink levels earlier than "NovaLink 1.0.0.16 Feb 2020 release" with
partitions running certain SR-IOV capable adapters are NOT supported at
this firmware release.
NovaLink levels earlier than "NovaLink 1.0.0.16 Feb 2020 release" do
not support IO adapter FCs EC2R/EC2S, EC2T/EC2U, EC66/EC67 with FW1010
and later.
Live Partition Mobility (LPM) support restrictions for FW1010.00:
The LPM support restrictions for FW1010.00 have been removed for
FW1010.10 and later releases.
Note: Follow the IBM documentation article on the LPM firmware support
matrix for POWER10 for guidance on migrating between firmware levels:
https://www.ibm.com/docs/en/power10?topic=mobility-firmware-support-matrix-partition
Firmware Update Failure on Power10:
A system firmware update might fail with errors HSCF0180E and E302F854
logged on the Hardware Management Console (HMC) and System Reference
Code (SRC) B181303F logged on the flexible service processor (FSP).
See the link for further information: https://www.ibm.com/support/pages/node/6527300
2.1 IPv6 Support and Limitations
IPv6 (Internet Protocol version 6)
is supported in the System Management Services (SMS) in this level of
system firmware. There are several limitations that should be
considered.
When configuring a network interface
card (NIC) for remote IPL, only the most recently configured protocol
(IPv4 or IPv6) is retained. For example, if the network interface card
was previously configured with IPv4 information and is now being
configured with IPv6 information, the IPv4 configuration information is
discarded.
A single network interface card
may only be chosen once for the boot device list. In other words, the
interface cannot be configured for the IPv6 protocol and for the IPv4
protocol at the same time.
2.2 Concurrent Firmware Updates
Concurrent system firmware update is supported on HMC Managed Systems
only.
Ensure that there are no RMC connection issues for any system
partitions prior to applying the firmware update. If there is an
RMC connection failure to a partition during the firmware update, the
RMC connection will need to be restored and additional recovery actions
for that partition will be required to complete partition firmware
updates.
2.3 Memory Considerations for Firmware Upgrades
Firmware Release Level upgrades and
Service Pack updates may consume additional system memory.
Server firmware requires memory to
support the logical partitions on the server. The amount of memory
required by the server firmware varies according to several factors.
Factors influencing server firmware
memory requirements include the following:
- Number of logical partitions
- Partition environments of the logical
partitions
- Number of physical and virtual I/O devices
used by the logical partitions
- Maximum memory values given to the logical
partitions
Generally, you can estimate the
amount of memory required by server firmware to be approximately 8% of
the system installed memory. The actual amount required will generally
be less than 8%. However, there are some server models that require an
absolute minimum amount of memory for server firmware, regardless of
the previously mentioned considerations.
Additional information can be found
at:
https://www.ibm.com/support/knowledgecenter/9080-M9S/p9hat/p9hat_lparmemory.htm
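As a rough worked example of the 8% guideline, the sketch below computes a planning ceiling for server firmware memory. The function name and the use of gigabytes are assumptions for illustration. The result is only an approximate upper estimate, since the text notes that the actual amount is generally lower and that some models have an absolute minimum that this calculation does not capture.

```python
def firmware_memory_estimate_gb(installed_memory_gb: float, fraction: float = 0.08) -> float:
    """Approximate memory reserved for server firmware, using the 8% guideline.

    This ignores model-specific absolute minimums and the partition-related
    factors listed in this section; treat it as a planning figure only.
    """
    if installed_memory_gb < 0:
        raise ValueError("installed memory must not be negative")
    return installed_memory_gb * fraction


if __name__ == "__main__":
    # Example: a system with 4096 GB installed.
    print(f"{firmware_memory_estimate_gb(4096):.2f} GB")
```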
2.4 SBE Updates
Power10 servers contain SBEs (Self Boot Engines), which are used to
boot the system. An SBE is internal to each of the Power10 chips and
is used to "self boot" the chip. The SBE image is persistent and is
only reloaded if there is a system firmware update that contains an SBE
change. If there is an SBE change and the system firmware update is
concurrent, then the SBE update is delayed to the next IPL of the CEC,
which adds 3-5 minutes per processor chip in the system to that IPL.
If there is an SBE change and the system firmware update is disruptive,
then the SBE update adds 3-5 minutes per processor chip in the system
to the IPL. During the SBE update process, the HMC or op-panel will
display service processor code C1C3C213 for each of the SBEs being
updated. This is a normal progress code, and the system boot should
not be terminated by the user. The additional time is estimated at
12-20 minutes per drawer, or up to 48-80 minutes for a maximum
configuration.
The SBE image is updated with this
service pack.
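The time figures above follow from simple arithmetic, sketched here in Python. The function name is hypothetical. The value of four processor chips per drawer is not stated in this section; it is inferred from the ratio of the per-drawer estimate (12-20 minutes) to the per-chip estimate (3-5 minutes).

```python
def sbe_update_extra_minutes(processor_chips: int, per_chip_minutes=(3, 5)):
    """Return the (low, high) extra IPL minutes for an SBE update.

    Uses the 3-5 minutes per processor chip quoted in this section.
    """
    if processor_chips < 0:
        raise ValueError("processor chip count must not be negative")
    low, high = per_chip_minutes
    return processor_chips * low, processor_chips * high


# Assumption inferred from the 12-20 minutes per drawer figure.
CHIPS_PER_DRAWER = 4

if __name__ == "__main__":
    for drawers in (1, 4):
        low, high = sbe_update_extra_minutes(drawers * CHIPS_PER_DRAWER)
        print(f"{drawers} drawer(s): {low}-{high} additional minutes")
```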
3.0 Firmware Information
Use the following examples as a reference to determine whether your
installation will be concurrent or disruptive.
For systems that are not managed by an HMC, the installation of
system firmware is always disruptive.
Note: The concurrent levels
of system firmware may, on occasion, contain fixes that are known as
Deferred and/or Partition-Deferred. Deferred fixes can be installed
concurrently, but will not be activated until the next IPL.
Partition-Deferred fixes can be installed concurrently, but will not be
activated until a partition reactivate is performed. Deferred and/or
Partition-Deferred fixes, if any, will be identified in the "Firmware
Update Descriptions" table of this document. For these types of fixes (Deferred and/or
Partition-Deferred) within a service pack, only the fixes in the
service pack which cannot be concurrently activated are deferred.
Note: The file names and service pack levels used in the
following examples are for clarification only, and are not necessarily
levels that have been, or will be released.
System firmware file naming convention:
01VHxxx_yyy_zzz
- xxx is the release level
- yyy is the service pack level
- zzz is the last disruptive service pack level
NOTE: Values of service pack and last disruptive service pack
level (yyy and zzz) are only unique within a release level (xxx). For
example, 01VH900_040_040 and 01VH910_040_045 are different
service packs.
An installation is disruptive if:
- The release levels (xxx) are
different.
Example:
Currently installed release is 01VH900_040_040,
new release is 01VH910_050_050.
- The service pack level (yyy) and the last disruptive
service pack level (zzz) are the same.
Example: VH910_040_040
is disruptive, no matter what level of VH910 is currently installed on
the system.
- The service pack level (yyy) currently installed on the
system is lower than the last disruptive service pack level (zzz) of
the service pack to be installed.
Example:
Currently installed service pack is VH910_040_040 and new service pack
is VH910_050_045.
An installation is concurrent if:
The release level (xxx) is the same, and
The service pack level (yyy) currently installed on the system
is the same or higher than the last disruptive service pack level (zzz)
of the service pack to be installed.
Example: Currently installed service pack is VH910_040_040, new
service pack is VH910_041_040.
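The rules above can be expressed as a short Python sketch. The function and class names are hypothetical and this is not an IBM utility. It assumes level names follow the xxx_yyy_zzz convention shown in this section, with an optional two-digit prefix and an optional ".rpm" suffix. Cases this section does not describe, such as reinstalling the same level, are not handled specially.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    release: str          # release level with its prefix, e.g. "VH910"
    service_pack: int     # yyy
    last_disruptive: int  # zzz


_LEVEL_RE = re.compile(r"(?:\d{2})?([A-Z]{2}\d+)_(\d+)_(\d+)(?:\.rpm)?")


def parse_level(name: str) -> FirmwareLevel:
    """Parse names such as '01VH910_040_045' or 'VH910_040_045'."""
    match = _LEVEL_RE.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"unrecognized firmware level: {name!r}")
    release, service_pack, last_disruptive = match.groups()
    return FirmwareLevel(release, int(service_pack), int(last_disruptive))


def installation_type(installed: str, new: str, hmc_managed: bool = True) -> str:
    """Classify an installation as 'concurrent' or 'disruptive'."""
    current, target = parse_level(installed), parse_level(new)
    if not hmc_managed:
        return "disruptive"  # systems without an HMC are always disruptive
    if current.release != target.release:
        return "disruptive"  # release levels (xxx) differ
    if target.service_pack == target.last_disruptive:
        return "disruptive"  # yyy equals zzz in the new service pack
    if current.service_pack < target.last_disruptive:
        return "disruptive"  # installed yyy is below the new zzz
    return "concurrent"
```

The checks are ordered to mirror the list of disruptive conditions, so the function returns "concurrent" only when none of them applies.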
3.1 Firmware Information and Description
Filename | Size | Checksum | md5sum
01MH1010_117_094.rpm | 145050817 | 40853 | 7fa60c7e415946109516e424da30f953
Note: The Checksum can be found by running the AIX sum command against
the rpm file (only the first 5 digits are listed).
For example: sum 01MH1010_117_094.rpm
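Where the AIX sum command is not available, the md5sum and size values listed for the package can be checked with a short Python sketch like the one below. The function names are hypothetical, and the Size value is assumed to be in bytes. The sketch does not reproduce the AIX sum checksum.

```python
import hashlib
from pathlib import Path

# Values listed for 01MH1010_117_094.rpm in this document.
EXPECTED_MD5 = "7fa60c7e415946109516e424da30f953"
EXPECTED_SIZE = 145050817  # assumed to be bytes


def file_md5(path, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def package_matches(path, expected_md5: str = EXPECTED_MD5,
                    expected_size: int = EXPECTED_SIZE) -> bool:
    """Return True if the file size and MD5 digest match the expected values."""
    package = Path(path)
    if package.stat().st_size != expected_size:
        return False
    return file_md5(package) == expected_md5.lower()
```

For example, package_matches("01MH1010_117_094.rpm") returns False for a truncated or corrupted download.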
MH1010
For Impact, Severity, and other firmware definitions, refer to the
'Glossary of firmware terms' at the following URL:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
The complete Firmware Fix History for this Release Level can be
reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/MH-Firmware-Hist.html
MH1010_117_094 / FW1010.20
03/31/22
Impact: Availability
Severity: SPE
New Features and Functions
- Support was added for an Advanced System Management
Interface (ASMI) System Configuration panel option to disable or enable
the system Lateral Cast-Out function (LCO). LCO is enabled by
default and a change to disable it must be done at service processor
standby. POWER processor chips since POWER7 have a feature called
“Lateral Cast-Out” (LCO), enabled by default, where the contents of
data cast out of one core’s L3 can be written into another core’s
L3. Then if a core has a cache miss on its own L3, it can often
find the needed data block in another local core’s L3. This has the
useful effect of slightly increasing the length of time that a storage
block gets to stay in a chip’s cache, providing a performance boost for
most applications. However, for some applications such as SAP
HANA, the performance can be better if LCO is disabled. More
information on how LCO is being configured by SAP HANA can be found in
the SAP HANA on Power Advanced Operation Guide manual that can be
accessed using the following link:
http://ibm.biz/sap-linux-power-library
Follow the "SAP HANA Operation" link on this page to the "SAP HANA
Operation Guides" folder. In this folder, locate the updated
"SAP_HANA_on_Power_Advanced_Operation_Guide" manual that has a new
topic added of "Manage IBM Power Lateral Cast Out settings" which
provides the additional information.
The default behavior of the system (LCO enabled) will not change in any
way by this new feature. The customer will need to power off and
disable LCO in ASMI to get the new behavior.
- Support was added for Secure Boot for SUSE Linux Enterprise
Server (SLES) partitions. The SUSE Linux level must be SLES 15
SP4 or later. Without this feature, partitions with SLES 15 SP4
or later and which have the OS Secure Boot partition property set to
"Enabled and Enforced" will fail to boot. A workaround to this is
to change the partition's Secure Boot setting in the HMC partition
configuration to "Disabled" or "Enabled and Log only".
System firmware changes that affect all systems
- A problem was fixed for a possible unexpected SRC B1812641 logged if
the system is powered
off immediately after an IPL. The frequency of this problem is
expected to be very rare because systems are not normally powered off
immediately after powering on. If this SRC occurs in this
scenario, it can be ignored.
- A problem was fixed for a logical partition failing to boot
with an SRC B700F104 logged after a memory DDIMM power fault.
This is a rare problem needing a double failure on the Power Management
Integrated Circuit (PMIC) that handles memory DDIMM power regulation
for the OpenCAPI Memory Buffer (OCMB). A re-IPL of the system is
needed to recover from this problem.
- A problem was fixed for a firmware update error with
"HSCF0180E Operation failed" displayed on the HMC with error code
E302F854. This fix is only available for firmware updates from
FW1010.20 to a later service pack. For firmware updates from
earlier levels to FW1010.20, a failure is expected unless the following
circumvention is performed: On the firmware update from the HMC,
select the "Advanced options" to automatically accept the new code
level. This is the default setting for an HMC at PTF levels
MF69286 or MF69287 for HMC V10 R1 M1011.2. For earlier
levels of the HMC, the automatically accept option must be manually
changed to on when performing the code update because it defaults to
off. To do this, use the following steps:
1. When running the HMC code update wizard, click on "Advanced
options".
2. From "Advanced options", select "Install and Activate (Implied
retrieve)".
3. On the "Install and Activate" panel, you will see the guidance
text of "Select a LIC level type and accept option; then click
OK." The two accept options displayed are as follows:
o Automatically accept
o Do Not automatically accept
To prevent the problem from occurring, the "Automatically accept"
option must be selected.
- A problem was fixed for errors that can occur if doing a
Live Partition Mobility (LPM) migration and a Dynamic Platform
Optimizer (DPO) operation at the same time. The migration may
abort or the system or partition may crash. This problem requires
running multiple migrations and DPO at the same time. As a
circumvention, do not use DPO while doing LPM migrations.
- A problem was fixed for a system hypervisor hang and an
Incomplete state on the HMC after a logical partition (LPAR) is deleted
that has an active virtual session from another LPAR. This
problem happens every time an LPAR is deleted with an active virtual
session. This is a rare problem because virtual sessions from an
HMC (a more typical case) prevent an LPAR deletion until the virtual
session is closed, but virtual sessions originating from another LPAR
do not have the same check.
- A problem was fixed for vTPM 2.0 updates not being applied
concurrently on a firmware update. The updates are applied after
a reboot of the system.
- A problem was fixed for vague and misleading errors caused
by using an invalid logical partition (LP) id for a resource dump
request. With the fix, the invalid LP id is rejected immediately
as a user input error instead of being processed by the main storage
dump to create what appear to be severe errors.
- A problem was fixed for a partition with an SR-IOV logical
port (VF) having a delay in the start of the partition. If the
partition boot device is an SR-IOV logical port network device, this
issue may result in the partition failing to boot with SRCs BA180010
and BA155102 logged and then stuck on progress code SRC 2E49 for an AIX
partition. This problem is infrequent because it requires
multiple error conditions at the same time on the SR-IOV adapter.
To trigger this problem, multiple SR-IOV logical ports for the same
adapter must encounter EEH conditions at roughly the same time such
that a new logical port EEH condition is occurring while a previous EEH
condition's handling is almost complete but not notified to the
hypervisor yet. To recover from this problem, reboot the
partition.
- A problem was fixed for a secondary fault after a partition
creation error that could result in a Terminate Immediate (TI) of the
system with an SRC B700F103 logged. The failed partition creation
that triggers the secondary fault can be explicit or implicit.
One example of an implicit partition create is the ghost
partition created for a Live Partition Mobility (LPM) migration.
This type of partition can fail to create when there is insufficient
memory available for the hardware page table (HPT) for the new
partition.
- A problem was fixed for an I/O adapter slot error when
powering on the slot with SRC B4000202 and B400F104 logged. One
example where this problem has been seen is when moving an SR-IOV
adapter to shared mode. This problem is infrequent and can be
recovered by retrying the operation that failed, such as DLPAR,
starting the partition, or moving the SR-IOV adapter.
- A problem was fixed for a System Management Services (SMS)
iSCSI information panel being incorrect and an SMS abort when
navigating away from the panel. The iSCSI target and
initiator names are not shown. The configured IP addresses to be used
for an iSCSI boot are all zeroes even after they are set. Navigating
away from the iSCSI information panel causes an SMS abort. This
problem is triggered by setting an iSCSI disk alias in SMS menus then
attempting to show information with the following selection:
"Select Boot Options -> Configure Boot Device Order -> Select 1st
Boot Device -> Network -> ISCSI -> iscsi-disk1 ->
Information". The probability is low that this issue will be
encountered because it requires iSCSI disk aliases to be used for a
boot. Normally for an iSCSI boot disk, most users use a
fully qualified iSCSI OF device path which does not trigger the
problem. If an SMS abort does occur when navigating away from the
iSCSI information menu, the logical partition (LPAR) can be restarted
to SMS menus.
- A problem was fixed for a Hostboot hang during an IPL with
SRC BC141E2B logged. This is a very rare failure for a timing
problem involving multiple process threads. To recover from the
problem, do a re-IPL of the system.
- A problem was fixed for detecting a bad SBE SEEPROM with a
SEEPROM and processor callout with SRC BC102224 logged when an SBE
update is attempted and fails. The fix allows the boot to
continue on the old level of the SEEPROM. This is a rare
problem that only occurs with an SBE SEEPROM that cannot be
written. Without the fix, the IPL will loop and hang with issues
with the SBE update being continually logged.
- A problem was fixed for a clock error during the IPL that
should have been recoverable but instead failed the IPL with extra
error logs that included BC8A285E and B111B901. The trigger for
this problem requires a recoverable Hostboot IPL failure of some kind
to occur (such as a clock error) and specifically a situation that does
not result in a deconfiguration of Hostboot targets.
- A problem was fixed for a system hang caused by an Open
Memory Interface (OMI) memory loop. This is a very rare error
that can only occur if the OMI host memory controller data link
has gone into degraded bandwidth mode (x8->x4) because of another
error and it also requires a specific memory data pattern to be
transmitted when in this degraded mode for the problem to occur.
- A problem was fixed for an IPL failure involving a
processor that does not have any functional cores. For this rare
problem to occur, a processor with only one functional core must have
that core fail with a checkstop. Then on the ensuing post-dump
IPL, the error occurs during the deconfiguration of the failed
processor. This fix updates the Self Boot Engine (SBE).
- A problem was fixed for ASMI TTY menus allowing an
unsupported change in hypervisor mode to OPAL. This causes an IPL
failure with BB821410 logged if OPAL is selected. The hypervisor
mode is not user-selectable in POWER9 and POWER10. Instead,
the hypervisor mode is determined by the MTM of the system. With this
fix, the "Firmware Configuration" option in ASMI TTY menus is removed
so that it matches the options given by the ASMI GUI menus.
System firmware changes that affect certain systems
- For a system with an AIX or Linux partition, a problem was
fixed for a partition start failure for AIX or Linux with SRC BA54504D
logged. This problem occurs if the partition is an MDC default
partition with virtual Trusted Platform Module (vTPM) enabled. As
a circumvention, power off the system and disable vTPM using the HMC
GUI to change the default partition property for Virtualized Trusted
Platform Module (VTPM) to off.
- For a system with an IBM i partition in MDC mode, a problem
was fixed for a possible system hang if an HMC virtual IBM i console
fails to connect. A rare timing problem with a shared lock can
occur during the console connect attempt. This problem can be
recovered by a re-IPL of the system.
- For systems with Linux partitions, a problem was fixed for
Linux energy scale features not being enabled in Linux partitions for
POWER10. With the problem, Linux is prevented from knowing that
energy scale operations are available for use by the partition.
4.0 How to Determine The Currently Installed Firmware Level
You can view the server's
current firmware level on the Advanced System Management Interface
(ASMI) Welcome pane. It appears in the top right corner. Example:
MH1010_117.
5.0 Downloading the Firmware Package
Follow the instructions on Fix Central. You must read and agree to the
license agreement to obtain the firmware packages.
Note: If your HMC is not internet-connected, you will need to
download the new firmware level to a USB flash memory device or FTP
server.
6.0 Installing the Firmware
The method used to install new firmware will depend on the release
level of firmware which is currently installed on your server. The
release level can be determined by the prefix of the new firmware's
filename.
Example: MHxxx_yyy_zzz
Where xxx = release level
- If the release level will stay the same (Example: Level
VH920_040_040 is currently installed and you are attempting to install
level VH920_041_040) this is considered an update.
- If the release level will change (Example: Level VH900_040_040 is
currently installed and you are attempting to install level
VH920_050_050) this is considered an upgrade.
Instructions for installing firmware updates and upgrades can be found
at https://www.ibm.com/support/knowledgecenter/9080-M9S/p9eh6/p9eh6_updates_sys.htm
IBM i Systems:
For information concerning IBM i Systems, go to the following
URL to access Fix Central:
http://www-933.ibm.com/support/fixcentral/
Choose "Select product", under
Product Group specify "System i", under Product specify "IBM i", then
Continue and specify the desired firmware PTF accordingly.
HMC and NovaLink Co-Managed Systems (Disruptive firmware updates only):
A co-managed system is managed by HMC and NovaLink,
with one of the interfaces in the co-management master mode.
Instructions for installing firmware updates and upgrades on systems
co-managed by an HMC and NovaLink are the same as above for HMC
managed systems, since the firmware update must be done by the HMC in
the co-management master mode. Before the firmware update is
attempted, one must be sure that HMC is set in the master mode using
the steps at the following IBM KnowledgeCenter link for NovaLink
co-managed systems:
https://www.ibm.com/support/knowledgecenter/9009-22A/p9eig/p9eig_kickoff.htm
Then the firmware updates can proceed with the same steps as for
the HMC managed systems except the system must be powered off because
only a disruptive update is allowed. If a concurrent update
is attempted, the following error will occur: "HSCF0180E Operation
failed for <system name> (<system mtms>). The operation failed.
E302F861 is the error code:"
https://www.ibm.com/support/knowledgecenter/9009-22A/p9eh6/p9eh6_updates_sys.htm
7.0 Firmware History
The complete Firmware Fix History (including HIPER descriptions)
for this Release Level can be reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/MH-Firmware-Hist.html