Power6 High-End System Firmware
Applies to: 9125-F2A
This document provides information about the installation of Licensed
Machine or Licensed Internal Code, which is sometimes referred to
generically as microcode or firmware.
1.0 Systems Affected
This package provides firmware for Power 575 (9125-F2A) Servers only.
The firmware level in this package is:
1.1 Minimum HMC Code Level
This section describes the "Minimum HMC Code Level" required by the
System Firmware to complete the firmware installation process. Before
you start the system firmware update, the HMC level must be equal to or
higher than the "Minimum HMC Code Level". If the HMC managing the
server targeted for the System Firmware update is at a level lower than
the "Minimum HMC Code Level", the firmware update will not proceed.
The Minimum HMC Code level for this firmware is: HMC V7 R3.5.0 with PTFs MH01212 and
MH01217 (or higher).
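The "equal to or higher than" rule above can be sketched as a simple
version comparison. This is only an illustration: the function names are
hypothetical, the level string format ("V7 R3.5.0") is assumed from the
line above, and the required PTFs (MH01212 and MH01217) are not checked
here and must be verified separately on the HMC.

```python
import re

# Minimum level quoted in this section. The PTFs (MH01212, MH01217) are
# NOT modeled here; they have to be checked separately.
MINIMUM_HMC_LEVEL = "V7 R3.5.0"


def parse_hmc_level(text):
    """Turn a string such as 'V7 R3.5.0' into a comparable tuple (7, 3, 5, 0)."""
    match = re.fullmatch(r"\s*V(\d+)\s*R(\d+)\.(\d+)\.(\d+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized HMC level: {text!r}")
    return tuple(int(part) for part in match.groups())


def hmc_level_is_sufficient(installed, minimum=MINIMUM_HMC_LEVEL):
    """Return True if the installed level is equal to or higher than the minimum."""
    return parse_hmc_level(installed) >= parse_hmc_level(minimum)
```

For example, hmc_level_is_sufficient("V7 R3.4.0") is False, so an HMC at
that level would have to be updated before the firmware installation is
started.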
For specific HMC maintenance information (Service Pack, eFixes,
etc), we recommend using the Fix Level Recommendation Tool (FLRT):
http://www14.software.ibm.com/webapp/set2/flrt/home
For information concerning HMC releases and the latest PTFs, go to the
following URL to access the HMC code packages:
http://www14.software.ibm.com/webapp/set2/sas/f/hmcl/home.html
NOTE: You must be logged in as hscroot in order for the firmware
installation to complete correctly.
2.0 Important Information
IPv6 Support and Limitations
IPv6 (Internet Protocol version 6) is supported in the System Management
Services (SMS) in this level of system firmware. There are several
limitations that should be considered.
When configuring a network interface card (NIC) for remote IPL, only
the most recently configured protocol (IPv4 or IPv6) is retained. For
example, if the network interface card was previously configured with
IPv4 information and is now being configured with IPv6 information, the
IPv4 configuration information is discarded.
A single network interface card may only be chosen once for the boot
device list. In other words, the interface cannot be configured for the
IPv6 protocol and for the IPv4 protocol at the same time.
Memory Considerations for Firmware Upgrades
The increase in memory used by the firmware is due to the additional
functionality in later firmware releases.
3.0 Firmware Information and Description
Use the following examples as a reference to determine whether your
installation will be concurrent or disruptive.
Note: The concurrent levels of system firmware may, on occasion, contain
fixes that are known as deferred. These deferred fixes can be installed
concurrently, but will not be activated until the next IPL. Deferred
fixes, if any, will be identified in the "Firmware Update Descriptions"
table of this document. For deferred fixes within a service pack, only
the fixes in the service pack which cannot be concurrently activated are
deferred.
Note: The file names and service pack levels used in the following
examples are for clarification only, and are not necessarily levels that
have been, or will be released.
System firmware file naming convention:
01ESXXX_YYY_ZZZ
- XXX is the release level
- YYY is the service pack level
- ZZZ is the last disruptive service pack level
NOTE: Values of service pack and last disruptive service pack level
(YYY and ZZZ) are only unique within a release level (XXX). For example,
01ES330_067_045 and 01ES340_067_053 are different service packs.
An installation is disruptive if:
- The release levels (XXX) are different.
Example: Currently installed release is ES330, new release is ES340
- The service pack level (YYY) and the last disruptive service pack
level (ZZZ) are the same.
Example: ES330_120_120 is disruptive, no matter what level of ES330 is
currently installed on the system
- The service pack level (YYY) currently installed on the system is
lower than the last disruptive service pack level (ZZZ) of the service
pack to be installed.
Example: Currently installed service pack is ES330_120_120 and
new service pack is ES330_152_130
An installation is concurrent if:
- The release level (XXX) is the same, and
- The service pack level (YYY) currently installed on the system is the
same or higher than the last disruptive service pack level (ZZZ) of the
service pack to be installed.
Example: Currently installed service pack is ES330_126_120,
new service pack is ES330_143_120.
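The disruptive and concurrent rules above can be expressed as a short
decision function. The following is a minimal sketch, not an IBM tool:
the function names are hypothetical, and it assumes level names of the
form ESXXX_YYY_ZZZ (optionally with the leading "01" used in file
names), as described in the naming convention above.

```python
import re

# Matches level names such as "01ES330_067_045" or "ES330_067_045".
LEVEL_PATTERN = re.compile(r"(?:01)?ES(\d{3})_(\d{3})_(\d{3})")


def parse_level(name):
    """Return (release, service_pack, last_disruptive) as integers."""
    match = LEVEL_PATTERN.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"unrecognized firmware level: {name!r}")
    return tuple(int(field) for field in match.groups())


def classify_installation(installed, new):
    """Apply the rules listed above; returns 'disruptive' or 'concurrent'."""
    cur_release, cur_sp, _ = parse_level(installed)
    new_release, new_sp, new_last_disruptive = parse_level(new)
    if cur_release != new_release:
        return "disruptive"   # release levels (XXX) are different
    if new_sp == new_last_disruptive:
        return "disruptive"   # YYY and ZZZ of the new level are the same
    if cur_sp < new_last_disruptive:
        return "disruptive"   # installed YYY is lower than the new ZZZ
    return "concurrent"
```

With the example levels from this section,
classify_installation("ES330_126_120", "ES330_143_120") returns
"concurrent", and classify_installation("ES330_120_120",
"ES330_152_130") returns "disruptive".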
Firmware Information and Update Description
For information about previous firmware release levels, see Section
7.0 Firmware History.
Filename            | Size     | Checksum
01ES350_071_038.rpm | 23668624 | 25248
Note: The Checksum can be found by running the AIX sum command against
the rpm file (only the first 5 digits are listed).
For example: sum 01ES350_071_038.rpm
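If the rpm file has to be checked on a machine without the AIX sum
command, the checksum can be approximated in a few lines of Python. This
sketch rests on an assumption that should be verified against a real AIX
system: that the default AIX sum algorithm is the byte-by-byte 16-bit
rotating checksum (the same one used by BSD sum). The function names are
hypothetical.

```python
def rotating_sum16(chunks):
    """16-bit rotate-right-and-add checksum over an iterable of byte strings.

    ASSUMPTION: this is the byte-by-byte algorithm that the AIX sum
    command uses by default; confirm with the real command before relying
    on it.
    """
    checksum = 0
    for chunk in chunks:
        for byte in chunk:
            # Rotate the 16-bit accumulator right by one bit, then add the byte.
            checksum = (checksum >> 1) | ((checksum & 1) << 15)
            checksum = (checksum + byte) & 0xFFFF
    return checksum


def checksum_of_file(path, chunk_size=65536):
    """Read the file in chunks so that large rpm files are not loaded at once."""
    with open(path, "rb") as handle:
        return rotating_sum16(iter(lambda: handle.read(chunk_size), b""))
```

The value returned by checksum_of_file("01ES350_071_038.rpm") would then
be compared with the Checksum column of the table above. The real sum
command also prints a block count, which this sketch does not compute.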
ES350 |
ES350_071_038
06/30/10
|
Impact: Serviceability
Severity: SPE
System firmware changes that affect all systems:
- DEFERRED: A problem was fixed that could result in a system checkstop
while running floating point computations. Although this is a
high-impact problem, it has a very low probability of occurring.
- A problem was fixed that caused a call home to be erroneously made
with SRC B181E911, and a service processor dump to be taken
unnecessarily.
- A problem was fixed that caused the HMC to show a status of
"Incomplete" for the managed system, and numerous service processor
dumps to be generated.
- The firmware was enhanced to improve the callouts when NVRAM
corruption is detected in the bulk power controller's (BPC's) service
processor.
- On systems running ES350_xxx firmware, a problem was fixed that
prevented the reset/reload bit from being set correctly in a service
processor error log entry.
|
ES350_049_038
03/15/10
|
Impact: Serviceability
Severity: HIPER
System firmware changes that affect all systems:
- HIPER: A problem was fixed that caused the system
to crash if the server was running AIX and had a F/C 5802 or 5877
drawer (in a 19" rack), or F/C 5803 or 5873 drawer (in a 24" rack),
attached.
- DEFERRED:
This fix corrects the handling of a specific processor instruction
sequence that has the potential to result in undetected data
errors. This specific instruction sequence has only been observed
in a small number of highly tuned Floating Point intensive
applications. However, it is strongly recommended that this fix
be applied to all POWER6 systems. This fix has the potential to
decrease system performance on applications that make extensive use of
floating point divide, square root, or estimate instructions.
- A problem was fixed that prevented an SRC from being
recorded in the service processor dump produced by a host-initiated
reset.
- A problem was fixed that caused SRC 10009135, followed by
10009139, to be erroneously logged. These SRCs indicate a system
power control network (SPCN) loop is being broken, then re-established.
- A problem was fixed that, under certain rare circumstances,
caused a partition to hang when being shut down.
- A problem was fixed that caused the system to hang with
SRCs B182953C, B182954C and B17BE434 being logged.
- The firmware was enhanced to detect and handle 12X
InfiniBand I/O drawer cabling errors better.
- A problem was fixed that, under certain rare circumstances,
caused the system to become unresponsive and appear to hang when
page migration occurred on a PCIe slot.
System firmware changes that affect certain systems
- A problem was fixed that caused a virtual SCSI or virtual
fibre channel adapter to be seen by the operating system as not
bootable when it was added to a partition using a dynamic LPAR (DLPAR)
operation.
- On systems with a F/C 5802 or 5877 drawer attached, a
problem was fixed that could impact the performance of a 4-port
Ethernet adapter F/C 5272, 5275, 5279, 5280, 5525, 5526, or 5527
installed in that drawer.
- In partitions running AIX or Linux, a problem was fixed
that caused the addition of an I/O slot to a partition using a dynamic
LPAR (DLPAR) add operation to fail.
- On systems with shared processors, a problem was fixed that
caused the partitions to hang and become unresponsive for very short
periods of time.
- A problem was fixed that prevented the IPv6 DHCP address
from being displayed on the advanced system management interface (ASMI)
network configuration screens when IPv6 and DHCP were enabled.
This only occurred on systems with virtual LAN (VLAN) addresses (such
as eth0.30, eth0.31), and when IPv6 addresses were assigned to the
eth0.xx interface.
- On systems running redundant VIOS partitions, a problem was
fixed that prevented Ethernet traffic from being properly bridged
between the two partitions. This problem also prevented shared
Ethernet adapter failover from working correctly.
|
ES350_038_038
11/23/09
|
Impact: Function
Severity: Special Attention
New Features and Functions:
- Support for a USB-attached half-high 5.25" backup device using a
removable hard disk drive (HDD).
- Support for a platform dump that is not disruptive.
System firmware changes that affect all systems
- A problem was fixed that might cause a concurrent firmware maintenance
(CFM) operation to fail repeatedly when a large number of I/O loop
errors were being logged during the CFM operation.
- The firmware was enhanced to handle system dumps (SYSDUMPs) larger
than 4GB in size.
- On systems running system firmware release ES340, a problem was fixed
that caused a dynamic LPAR (DLPAR) operation on memory to fail until the
platform was rebooted.
- The firmware was enhanced to improve the performance of the F/C 5732,
5735, and 5769 PCI-E adapters.
- The firmware was enhanced such that SRCs B181F126, B181F127, and
B181F129 are correctly logged, and no longer cause unnecessary calls
home to be made.
- A problem was fixed that caused a repair and verify (R&V) operation
on the HMC to fail with the message "Exception encountered while
rendering panel as HTML".
- The firmware was enhanced such that when a certain type of hardware
failure occurs in a bulk power controller (BPC), the appropriate errors
will be logged instead of SRCs B1818601 and B1818611, which indicate a
firmware failure.
- A problem was fixed that caused SRC B181D312 to be erroneously logged,
and a call home to be erroneously made.
- On systems with 24" I/O drawers, a problem was fixed that might cause
a partition to crash, with a system reboot required for recovery, when a
F/C 5797 or 5798 drawer was concurrently added.
- The firmware was enhanced to more accurately describe the reason
memory was deconfigured on the advanced system management interface
(ASMI) memory deconfiguration screen.
|
4.0 How to Determine Currently Installed Firmware Level
You can view the server's current firmware level on the Advanced System
Management Interface (ASMI) Welcome pane. It appears in the top right
corner.
Example: ES350_038.
5.0 Downloading the Firmware Package
Follow the instructions on the web page. You must read and agree to the
license agreement to obtain the firmware packages.
Note: If your HMC is not internet-connected, you will need to download
the new firmware level to a CD-ROM or ftp server.
6.0 Installing the Firmware
The method used to install new firmware will depend on the release level
of firmware which is currently installed on your server. The release
level can be determined by the prefix of the new firmware's filename.
Example: ESXXX_YYY_ZZZ
Where XXX = release level
- If the release level will stay the same (Example: Level ES330_075_075
is currently installed and you are attempting to install level
ES330_081_075) this is considered an update.
- If the release level will change (Example: Level ES330_081_075 is
currently installed and you are attempting to install level
ES340_096_096) this is considered an upgrade.
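The update/upgrade distinction above depends only on the release-level
prefix, which can be sketched as follows. The helper names are
hypothetical, and the code assumes level names shaped like
ESXXX_YYY_ZZZ, optionally with the leading "01" that appears in rpm file
names.

```python
def release_level(level_name):
    """Extract the release level, e.g. 'ES330' from 'ES330_075_075'."""
    name = level_name.strip()
    if name.startswith("01"):
        # File names such as 01ES350_071_038.rpm carry a leading "01".
        name = name[2:]
    return name.split("_", 1)[0]


def installation_kind(installed, new):
    """Return 'update' if the release level stays the same, else 'upgrade'."""
    if release_level(installed) == release_level(new):
        return "update"
    return "upgrade"
```

Using the examples from this section,
installation_kind("ES330_075_075", "ES330_081_075") returns "update" and
installation_kind("ES330_081_075", "ES340_096_096") returns "upgrade".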
Instructions for installing firmware updates and upgrades can be found
at http://publib.boulder.ibm.com/infocenter/systems/scope/hw/topic/ipha1/updupdates.htm
7.0 Firmware History
ES340 |
ES340_123_042
05/24/10
|
Impact: Availability
Severity: ATT
System firmware changes that affect all systems
- The firmware was
enhanced to improve the callouts when NVRAM corruption is detected in
the bulk power controller's (BPC's) service processor.
System firmware changes that affect certain systems
- A problem was fixed that
caused the system to crash with SRC B7000103 when a concurrent
maintenance operation was performed on an I/O slot directly from a
partition (using AIX SMIT or IBM i HST).
- A problem was fixed that
caused a system or partition running Linux to crash when the
"serv_config -l" command was run.
|
ES340_120_042
04/07/10
|
Impact: Usability
Severity: SPE
System firmware changes that affect all systems
- DEFERRED:
This fix corrects the handling of a specific processor instruction
sequence that has the potential to result in undetected data
errors. This specific instruction sequence has only been observed
in a small number of highly tuned floating point-intensive
applications. However, it is strongly recommended that this fix
be applied to all POWER6 systems. This fix has the potential to
decrease system performance on applications that make extensive use of
floating point divide, square root, or estimate instructions.
- A problem was fixed that prevented
an SRC from being recorded in the service processor dump produced by a
host-initiated reset.
- A problem was fixed that
caused the system to become unresponsive and appear to hang when
page migration occurred on a PCIe slot.
System firmware changes that affect certain systems
- A problem was fixed that
caused a virtual SCSI or virtual fibre channel adapter to be seen by
the operating system as not bootable when it was added to a partition
using a dynamic LPAR (DLPAR) operation.
- In partitions running AIX or
Linux, a problem was fixed that caused the addition of an I/O slot to a
partition using a dynamic LPAR (DLPAR) add operation to fail.
- On systems running redundant
VIOS partitions, a problem was fixed that prevented Ethernet traffic
from being properly bridged between the two partitions. This
problem also prevented shared Ethernet adapter failover from working
correctly.
- On systems using InfiniBand switches for processor
clustering, a problem was fixed that caused InfiniBand ports to
intermittently drop out.
- On systems running active memory sharing (AMS), the
firmware was enhanced so that error messages indicating "out of
compliance" issues with the memory (HMC SRC HSCL031F) will not be
generated if the user allocates more memory than is installed in the
system. (Allocating more memory than is installed in the system
is supported in active memory sharing.)
|
ES340_112_042
12/16/09
|
Impact: Serviceability
Severity: HIPER
System firmware changes that affect all systems
- HIPER: A problem was fixed that might cause the system to crash if the
server is running AIX and has a F/C 5802 or 5877 drawer (in a 19" rack),
or F/C 5803 or 5873 drawer (in a 24" rack), attached.
- A problem was fixed that prevented an automatic reboot after some
types of memory-related crashes.
- The firmware was enhanced such that SRCs B181F126, B181F127, and
B181F129 are correctly handled, and no longer cause unnecessary calls
home to be made.
- A problem was fixed that caused the system to hang with SRCs B182953C,
B182954C, and B17BE434 being logged.
- A problem was fixed that caused SRC 10009135, followed by 10009139, to
be erroneously logged. These SRCs indicate a system power control
network (SPCN) loop is being broken, then re-established.
System firmware changes that affect certain systems
- On a single system running Oracle in multiple partitions, with
multiple IBM LHCAs connected in the same subnet, a problem was fixed
that caused the remaining partitions to lose their reliable datagram
socket (RDS) heartbeat connections after the reboot of a single
partition. There is a greater probability of encountering this problem
if the partition being rebooted has a large partition memory assigned to
it.
|
ES340_101_042
11/23/09
|
Impact: Serviceability
Severity: Attention
System firmware changes that affect all systems
- DEFERRED: The firmware was enhanced to reduce the number of
correctable errors (CEs) being erroneously logged against the memory bus
with SRC B124E504.
- The firmware was enhanced such that SRC B181F126 is correctly managed,
and no longer causes an unnecessary call home to be made.
|
ES340_095_042
08/20/09
|
Impact: Function
Severity: Special Attention
System firmware changes that affect all systems
- DEFERRED: This fix corrects the handling of a specific processor
instruction sequence that was generated on a particular heavily-tuned
High Performance Computing (HPC) application. This specific instruction
sequence has the potential to produce an incorrect result. This
instruction sequence has only been observed in a single HPC application.
However, it is strongly recommended that you apply this fix.
System firmware changes that affect certain systems
- A problem was fixed that under certain rare circumstances caused a
partition to crash when a 24" InfiniBand I/O drawer (feature code 5797
or 5798) was concurrently added. When this problem occurred, rebooting
the system was required to recover.
- On systems running system firmware ES340_075 and Active Memory
Sharing, a problem was fixed that might have caused a partition to lose
I/O entitlement after the partition was moved from one system to another
using PowerVM Mobility.
- On systems running system firmware ES340_075 and Active Memory
Sharing, a problem was fixed that might have caused a partition to fail
to boot with SRC B700F103 if the partition had more than 24 virtual
processors assigned to it.
- On systems with F/C 5802 or 5877 drawers attached, a problem was fixed
that prevented an I/O slot's power LED from accurately reflecting the
state of the I/O slot in a 5802 or 5877 drawer, under certain
circumstances.
- On systems with external I/O towers attached, the firmware was
enhanced so that the system will not crash when SRC B7006981 is logged
for certain types of I/O hardware failures.
|
ES340_075_042
06/22/09
|
Impact: Function
Severity: Special Attention
New features and functions:
- DEFERRED: Support for F/C 5803 (24" I/O drawer) and F/C 5873
(diskless 24" I/O drawer).
Attention: After this level of firmware is installed, the platform
must be powered off, then powered on, before the 5803 or 5873 I/O drawer
is added to the system.
- DEFERRED: Support for PowerVM Active Memory Sharing.
Attention: After this level of firmware is installed, the platform
must be powered off, then powered on to activate the PowerVM Active
Memory Sharing function.
Attention: If ES340_075 has been installed, and the new PowerVM Active
Memory Sharing function has been activated, and you want to back-level
the system firmware, the active memory sharing pool must be deactivated
and deleted prior to back-leveling the system firmware. IBM does not
recommend back-leveling the system firmware.
System firmware changes that affect all systems:
- A problem was fixed that caused hardware to be deconfigured when the
system encountered network errors, even though the SRCs were being
logged as informational.
- A problem was fixed that caused the detailed data at the end of an
"early power off warning type 5" AIX error log entry to be filled with
invalid data instead of zeros.
- On systems running firmware release ES340, a problem was fixed that
caused data in the platform dump to be invalid.
- The firmware was enhanced so that SRCs B181720D, B1818A13, and
B1818A0F, and occasionally a service processor dump, will not be
generated when the service processor's two Ethernet interfaces are on
the same subnet. (This is an invalid configuration.)
- The firmware was enhanced such that error logs with relevant
information will be created when a system crashes under certain
circumstances, rather than a generic SRC (B1813410), with very little
debug information, being logged.
- A problem was fixed that caused the system to hang when terminating if
the system had been in power save mode.
- The firmware was enhanced so that a call home will be made if the
hypervisor issues a "terminate immediate" interrupt.
- A problem was fixed that caused incorrect field replaceable unit (FRU)
part numbers to be returned for the BPC scroll assembly, UEPO panel and
the CEC MDA scroll assembly.
- The firmware was enhanced so that the service processor only logs SRC
B1A38B24 when a valid network setup error is found. The callouts for
this SRC were also improved.
- A problem was fixed that caused a system with I/O drawers attached to
crash, and a SYSDUMP to be taken, with SRCs B7000103 and B181D138 being
logged. Another symptom of this failure is informational SRC B7006970
entries constantly posting in the iqyylog.log.
System firmware changes that affect certain systems:
- In systems using InfiniBand switches for processor clustering, a
problem was fixed that caused packets to be dropped under certain
circumstances.
- A problem was fixed that caused the migration of a partition with more
than 900 virtual slots defined, from a system running firmware ES320 to
a system running firmware ES340, to fail.
- On systems using on/off (temporary) memory capacity on demand (COD),
the firmware was enhanced to improve memory COD's interaction with other
tools (such as Inventory Scout in AIX), and to make the billing process
easier.
|
ES340_061_042
04/20/09
|
Impact: Function
Severity: HIPER
System firmware changes that affect all systems:
- HIPER: The firmware was enhanced to improve the service processor's
capability to recover from bad bits in the flash memory. A predictive
error, or an unrecoverable error, will be logged against the card that
contains the system firmware if the number of correctable or
uncorrectable errors exceeds the threshold.
- HIPER: A problem was fixed that caused nodes to guard out processor
cores, or checkstop, during the transition to nominal voltage from
"power save" mode.
- A problem was fixed that caused the service processor diagnostics to
report a "TOD (time-of-day) overflow" error, instead of an uncorrectable
memory error, when failures occurred on memory DIMMs.
- A problem was fixed that prevented the service processor from
automatically booting from the permanent (or P) side if the temporary
(or T) side of the firmware flash was corrupted. When the problem
occurred, the service processor stopped instead of booting from the P
side.
- A problem was fixed that might have caused the system to crash when a
processor was dynamically removed when the system was running.
- The firmware was enhanced such that VPD data corruption in the Anchor
(VPD) card will be corrected by the firmware, rather than requiring the
Anchor card to be replaced.
- A problem was fixed that caused the system to crash, under certain
circumstances, with SRC B112E504 being logged, followed by SRC B181C350,
when a system dump was initiated.
- A problem was fixed that caused a partition being migrated to crash on
the target system.
- On systems running the ES340 release of system firmware, a problem was
fixed that caused an abort code to be logged in the virtual input/output
system (VIOS) error log on the source system after a successful
partition migration.
- A problem was fixed that caused a partition being migrated to become
unresponsive on the target system when firmware-assisted dump was
enabled.
- The firmware was enhanced so that SRC BA210012 will not generate a
call home when logged.
- A problem was fixed that caused hardware to be deconfigured when the
system encountered network errors, even though the SRCs were being
logged as informational.
System firmware changes that affect certain systems:
- On systems with external I/O drawers, a problem was fixed that could
cause the system to hang on checkpoint C700406E during a "warm" reboot
(a reboot in which the processor drawer is power-cycled but the I/O
drawers are not).
- On systems with virtual fiber channel disks, a problem was fixed that
prevented the system management services (SMS) from displaying the
virtual fiber channel disks if the virtual fiber channel server reported
that any of them were reserved.
|
ES340_042_042
01/16/09
|
Impact: Function
Severity: HIPER
System firmware changes that affect all systems:
- HIPER: A problem was fixed that caused nodes to guard out processor
cores, or checkstop, during the transition to nominal voltage from
"power save" mode.
- The firmware was enhanced so that SRC B1xx3409, which indicates an
invalid state change (such as pushing the power on button twice
quickly), will be logged as informational instead of predictive, and
will not call home.
- A problem was fixed that caused a service processor dump to be taken
and SRC B181EF88 to be logged, even though the operation of the system
was not affected.
- A problem was fixed that caused the location codes for multi-port PCI
adapters, such as the 4-port Ethernet adapters, to be incorrect.
- On systems with multiple host channel adapter (HCA) cards, a problem
was fixed that caused logical ports on the HCA cards to be
intermittently inactive.
- A problem was fixed that caused the system to crash and a SYSDUMP to
be taken, with SRCs B170E540, B181D138, or B700F105, with a bad PCI-E
adapter installed and in use, or while running a heavy network load.
System firmware changes that affect certain systems:
- In networks using a time server, a problem was fixed that caused the
date on a client system to be reset to 1969 if the client system lost
power.
|
ES330 |
ES330_104_078
04/26/10
|
Impact: Availability
Severity: ATT
System firmware changes that affect all systems
- DEFERRED:
This fix corrects the handling of a specific processor instruction
sequence that has the potential to result in undetected data
errors. This specific instruction sequence has only been observed
in a small number of highly tuned floating point-intensive
applications. However, it is strongly recommended that this fix
be applied to all POWER6 systems. This fix has the potential to
decrease system performance on applications that make extensive use of
floating point divide, square root, or estimate instructions.
- A problem was fixed that caused SRC
B1818A0A to be erroneously logged during a concurrent firmware update.
- The firmware was enhanced such that
SRCs B181F126, B181F127, and B181F129 are correctly logged, and no
longer cause unnecessary calls home to be made.
- The firmware was enhanced so that SRC B181720D,
and occasionally a service processor dump, will not be generated
when the service processor's two Ethernet interfaces are on the same
subnet. (This is an invalid configuration.)
- In partitions running AIX or Linux, a problem was fixed that, under
certain rare circumstances, caused the addition of an I/O slot to a
partition using a dynamic LPAR (DLPAR) add operation to fail.
- A problem was fixed that caused the
system to hang with SRCs B182953C, B182954C and B17BE434 being logged.
- A problem was fixed that caused SRC
B1818902 to be erroneously logged during a firmware installation.
System firmware changes that affect certain systems
- On partitions running AIX or Linux,
a problem was fixed that caused a dynamic LPAR (DLPAR) operation to add
an I/O slot to fail.
- On systems running redundant VIOS
partitions, a problem was fixed that prevented Ethernet traffic from
being properly bridged between the two partitions. This problem
also prevented shared Ethernet adapter failover from working correctly.
- On systems using
InfiniBand switches for processor clustering, a problem was fixed that
caused InfiniBand ports to intermittently drop out.
|
ES330_095_078
08/31/09
|
Impact: Usability
Severity: HIPER
System firmware changes that affect all systems
- DEFERRED: This fix corrects the handling of a specific processor
instruction sequence that was generated on a particular heavily-tuned
High Performance Computing (HPC) application. This specific instruction
sequence has the potential to produce an incorrect result. This
instruction sequence has only been observed in a single HPC application.
However, it is strongly recommended that you apply this fix.
- HIPER: A problem was fixed that caused the migration of a partition
using shared processors to fail with a reason code of 4180043, or caused
the source system to hang or crash.
- A problem was fixed that caused SRC 1000911B to be erroneously logged
during a reset/reload of the service processor.
System firmware changes that affect certain systems
- On systems with 7311-D11, 7314-G30, 5790, or 5796 19" drawers
attached, a problem was fixed that caused SRC 10009138 to be erroneously
logged.
Concurrent maintenance (CM) firmware fixes
- A problem was fixed that caused SRC B7005603 to be erroneously logged
when a F/C 5802 or 5877 drawer was concurrently added.
|
ES330_092_078
05/18/09
|
Impact: Availability
Severity: HIPER
System firmware changes that affect all systems:
- HIPER: The firmware was enhanced to improve the service processor's
capability to recover from bad bits in the flash memory. A predictive
error, or an unrecoverable error, will be logged against the card that
contains the system firmware if the number of correctable or
uncorrectable errors exceeds the threshold.
- A problem was fixed that prevented the service processor from
automatically booting from the permanent (or P) side if the temporary
(or T) side of the firmware flash was corrupted. When the problem
occurred, the service processor stopped instead of booting from the P
side.
- The firmware was enhanced so that SRC B1xxE458 (with word 6=0000E42B)
will be logged as informational instead of generating a call home.
- A problem was fixed that caused the system to crash, under certain
circumstances, with SRC B112E504 being logged, followed by SRC B181C350,
when a system dump was initiated.
- A problem was fixed that caused a partition being migrated to become
unresponsive on the target system when firmware-assisted dump was
enabled.
- A problem was fixed that caused hardware to be deconfigured when the
system encountered network errors, even though the SRCs were being
logged as informational.
- A problem was fixed that caused the detailed data at the end of an
"early power off warning type 5" AIX error log entry to be filled with
invalid data instead of zeros.
- A problem was fixed that caused a partition being migrated to crash on
the target system.
- A problem was fixed that might cause a system to crash with SRC
B170E504 when a processor was dynamically deconfigured.
- The firmware was enhanced such that when data is written to the VPD
(Anchor) card, the results are verified, resulting in fewer VPD cards
being replaced.
System firmware changes that affect certain systems
- In systems using InfiniBand switches for processor clustering, a
problem was fixed that caused packets to be dropped under certain
circumstances.
|
ES330_078_078
01/15/09
|
Impact: Function
Severity: HIPER
This level is a disruptive update from any ES330 firmware level.
The system should be powered off before installing this level of system
firmware. If this level is installed when the system is running, the
CECs will be rebooted, causing all partitions to be terminated, and a
reboot will be required.
System firmware changes that affect all systems:
- DEFERRED and HIPER: The system initialization
settings
were changed
to reduce the likelihood of a system crash under extremely rare
circumstances.
- HIPER: A problem was fixed that caused a system to
fail to reboot
after a B1xxE504 SRC was logged, due to a processor interconnection bus
failure. The same SRC, B1xxE504, was logged when the reboot failed.
- A problem was fixed that might, if a platform dump
occurred, have
caused
a reset/reload of the service processor, and the platform dump to be
corrupted.
- A problem was fixed that caused incorrect field replaceable
unit (FRU)
part numbers to be returned for the BPF scroll assembly and the UEPO
panel.
- A problem was fixed that prevented the system from
rebooting if an
error
occurred during a memory-preserving IPL.
- The firmware was enhanced so that a call home will be made
if the
hypervisor
issues a "terminate immediate" interrupt.
- The firmware's redundant bit steering logic was enhanced to
improve
performance.
- A problem was fixed that caused the location codes for
multi-port PCI
adapters,
such as the 4-port Ethernet adapters, to be incorrect.
- A problem was fixed that prevented service processor and
hypervisor
error
log entries from being reported to the operating system after a
successful
partition migration. This problem only affected the partition that was
migrated.
- On systems running AIX or Linux, a problem was fixed that,
under
certain
rare circumstances, might cause the operating system to crash.
- A problem was fixed that, in certain configurations, caused
the removal
of a host Ethernet adapter (HEA) port to fail when using a dynamic LPAR
(DLPAR) operation.
- A problem was fixed that, under certain rare circumstances,
caused the hypervisor to crash during boot, with SRC B6000103 being
logged.
- A problem was fixed that, under certain circumstances,
prevented the
operating
system from recovering a PCI-E adapter on which a temporary enhanced
error
handling (EEH) error occurred.
- A problem was fixed that, under certain rarely occurring
circumstances, caused the system to crash if an L2 or L3 cache failure
was not discovered and repaired when it initially occurred.
- A problem was fixed that caused the service processor
diagnostics to
call
out a processor as the failing item, instead of the memory DIMMs, when
a large number of memory error correction coding (ECC) errors occurred.
- A problem was fixed that caused the wrong field replaceable
unit (FRU)
to be called out when SRC B152F109, which indicates a problem with the
NVRAM in a bulk power controller (BPC), was logged.
- A problem was fixed that might cause a default catch to
occur when
booting
from an iSCSI device.
System firmware changes that affect certain systems:
- On systems with a host Ethernet adapter (HEA) or host
channel adapter
(HCA)
assigned to a Linux partition, a problem was fixed that prevented the
partition
from booting if 512 GB, 1 TB, or 1.5 TB of memory was assigned to the
partition.
When this problem occurred, SRC B700F105 was logged.
- In systems with clustered processors, various problems were
fixed in
the
InfiniBand interconnection networks.
- A problem was fixed that, under certain circumstances,
caused an AIX or
Linux partition to fail to boot with SRC D200E0AF being logged.
- On systems with external I/O frames, a problem was fixed
that might
have
prevented the firmware from "unthrottling" processors after entering
power
save mode.
|
ES330_046_034
08/28/08
|
Impact: Function
Severity: HIPER
System firmware changes that affect all systems:
- DEFERRED and HIPER: A problem was fixed that, under certain rarely
occurring circumstances, allowed an application to cause a processor to
go into an error state and the system to crash.
- HIPER: A problem was fixed that caused the system
to
terminate abnormally
with SRC B131E504.
- HIPER: A problem was fixed that might cause a
partition to crash
during a partition migration before the migration was complete.
- A problem was fixed that caused the location codes of
multi-port PCI
adapters,
such as a 4-port Ethernet card, to be displayed incorrectly.
- A problem was fixed that caused multiple instances of SRC
B1818A03 and
B1818A0A to be logged erroneously, and multiple calls home to be made,
during a frame connection reset.
- A problem was fixed that caused SRC B1819506 to be
erroneously
generated,
and a call home to be made, when service processor (or system
controller)
error log entries were generated faster than they could be processed.
- A problem was fixed that caused the hardware management
console (HMC)
to
show an "Incomplete" state after it attempted to read a file with an
incorrect
size from the service processor (or system controller). This problem
also
occurred if the "factory configuration" option was used on the advanced
system management interface (ASMI) menus.
- Enhancements were made to the firmware to improve the FRU
callouts for
certain types of failures of the time-of-day clock circuitry.
- A problem was fixed that prevented a dump file larger than
4 GB from
being
successfully off-loaded to the hardware management console (HMC).
- On systems with redundant bulk power controllers, a problem
was fixed
that
caused the hardware management console (HMC) to get stuck at "Pending
Authentication"
for one of the bulk power controllers (BPCs).
- On systems with I/O drawers attached, a problem was fixed
that might
have
caused some I/O slots in the drawers not to be configured when the
system
was booted.
- In systems with clustered processors, various problems were
fixed in
the
InfiniBand interconnection networks.
- A problem was fixed that caused the location codes of the
external
InfiniBand
ports on a 5791 I/O drawer with the InfiniBand interface to be reported
incorrectly on the HMC.
- A problem was fixed that caused SRC B7006971 to be
generated because
the
firmware was incorrectly performing operations on PCI-Express I/O
adapters
during dynamic LPAR (DLPAR) operations on memory.
- A problem was fixed that might have caused an out-of-memory
condition in the hypervisor, with SRC B7000200 being logged.
- A problem was fixed in the thermal management firmware that
caused SRCs
B1812635 and B1812636 to be logged, and the system or node to run in
low
power mode when it should have been in nominal, or nominal when it
should
have been in low power mode.
- A problem was fixed that caused SRC B1818A10 to be
erroneously
generated
after a successful installation of system firmware.
- A problem was fixed that caused the AIX commands "lsmcode"
and "diag"
to
fail after a partition migration.
- A problem was fixed that caused the message "BA330000malloc
error!" to
be displayed on the operating system console after a partition
migration,
even though SRC BA330000 had not been logged. When this problem
occurred,
the partition migration appeared to be successful. However, a process
within
the partition was either hung or had failed, and in most cases the
partition
had to be rebooted to fully recover.
- A problem was fixed that caused the status of the
connection between
the
hardware management console (HMC) and the service processor to be set
to
an invalid state. This might cause problems when the HMC and service
processor
tried to communicate.
- A problem was fixed that caused partitions that were being
rebooted to
hang at D200E0AF after a concurrent firmware update under certain
circumstances.
|
ES330_034_034
06/10/08
|
Impact: Function
Severity: HIPER
This level is a disruptive update from the prior level,
ES330_018.
The system should be powered off before installing this level of system
firmware. If this level is installed when the system is running, the
CECs
will be rebooted, causing all partitions to be terminated, and a reboot
will be required.
System firmware changes that affect all systems:
- HIPER: A problem was fixed that caused a
concurrent
firmware installation
to hang with SRC BA00E840 being logged. This problem may also cause a
partition
migration to hang, under certain circumstances, with the same SRC,
BA00E840,
being logged. This SRC will be logged when this level of firmware is
installed
and will generate a call home; it should be ignored. It will not be
logged
during subsequent installations.
- HIPER: The processor initialization settings were
changed to reduce
the likelihood of a processor going into an error state and causing a
checkstop
or system crash.
- HIPER: A problem was fixed that caused large
numbers
of enhanced
error handling (EEH) errors to be logged against the 4-port gigabit
Ethernet
adapter, F/C 5740, under certain circumstances.
- A problem was fixed that caused the /tmp directory on the
system
controllers
and the service processor in the bulk power controller (BPC) to fill
up, which resulted in an out-of-memory condition. When this problem
occurred,
the system controllers or service processor in the BPC usually
performed
a reset/reload. This is one possible cause of SRC B1817201 being
logged.
- A problem was fixed in the repair and verify (R and V)
function on the
HMC that caused an unnecessary shutdown of the processor node when an
error
was logged against a bulk power regulator (BPR).
- A problem was fixed that caused a partition using a host
channel
adapter
(HCA) or host Ethernet adapter (HEA) to appear to hang (with progress
code
D200C1FF being displayed) before successfully shutting down. The amount
of time the partition appeared to hang depended on the amount of memory
assigned to the partition and the usage of HCA or HEA.
- A problem was fixed that prevented the HMC from connecting to the
managed system if the HMC's DHCP server IP range was changed while the
managed system was running.
- The firmware was enhanced so that the IDs "celogin1" and
"celogin2"
allow
an authorized service provider to log into the bulk power controller
(BPC).
- The firmware was enhanced to improve the system memory
error recovery.
- The firmware was enhanced so that the contents of /tmp are
included
when
a service processor dump is taken.
- A problem was fixed in the hypervisor that might cause a
partition
migration
to fail.
- The firmware was enhanced so that:
- A failure when writing VPD to a P6 processor will cause
the node to be
deconfigured rather than terminating the system.
- The failure of a VPD write operation will not corrupt the VPD table.
Such corruption may lead to unnecessary system down-time and
unnecessary FRU replacement.
System firmware changes that affect certain systems:
- On systems using QLogic InfiniBand switches, a problem was fixed that
caused PortInfo:linkWidthActive and PortInfo:linkSpeedActive to be
inaccurately stored and shown on the display of subnet parameters.
|
ES330_018_018
05/13/08
|
Impact: New
Severity: New
|