AH720_101_064
05/20/11 |
Impact: Availability
Severity: HIPER - High Impact/PERvasive, Should be installed as soon as
possible.
System firmware changes that affect all systems
- HIPER: IBM testing has uncovered a potential undetected data
corruption
issue. The problem can occur in rare instances due to an issue in
the firmware and is most likely to impact hypervisor data. This
issue was discovered during internal IBM testing, and has not been
reported on any customer system. However, IBM recommends
that customers running on POWER7 systems with Ax720_090 and
earlier firmware move to Ax720_101. POWER7 systems running with
Ax710 firmware do not have an exposure to this issue, so no action is
recommended.
- HIPER: A problem was fixed that caused the hypervisor to delay
dispatching a
partition even though it was ready to run, which added latency (delays)
that adversely affected performance. This problem can affect
POWER7 systems running any level of Ax720 firmware prior to Ax720_101.
- A problem was fixed that caused certain service processor
error log entries with a severity of "predictive", and a failing
subsystem of "service processor firmware", to be erroneously converted
to "informational".
- A problem was fixed that caused three B181951C SRCs to be
erroneously logged, and the system IPL time to increase by as much as
an hour. This problem is more likely to occur on systems with
firmware level AL720_082 or AL720_090, AM720_084 or AM720_090, or
AH720_082 or AH720_090 installed.
- A problem was fixed that caused the EnergyScale firmware to
erroneously go into safe mode when processor 0 was guarded out.
- A problem was fixed that caused SRC B1812A61 to be
erroneously logged.
- A problem was fixed that prevented the setting of the boot
diagnostic level in the power on/off menu (in the Advanced System
Management Interface (ASMI)) from being shown correctly after it was
changed.
- A problem was fixed that prevented a system dump from being
off-loaded from the service processor. When this occurred,
additional dumps were not allowed.
- The firmware was enhanced so that a message is displayed if
setting the brand keyword in the ASMI menu (System Configuration ->
Program Vital Product Data -> System brand) fails because the
service processor is not in the correct state.
- The firmware was enhanced such that a call home is not made
when an error logged by the system controller, node controller, or
service processor is informational, or recovered, and the reset/reload
bit is set.
- A problem was fixed that caused multiple
DR_DMA_MIGRATE_FAIL entries in the AIX error log.
- A problem was fixed that caused SRC B7000803 to be
erroneously logged multiple times.
- A problem was fixed that prevented processor resources from
being moved to another partition by a DLPAR (dynamic LPAR) operation.
- A problem was fixed that prevented partitions from booting.
- A problem was fixed that caused the HMC component interval
activity report to always show 100% uncapped CPU available.
- A problem was fixed that caused incorrect data to be
displayed in the "Deconfiguration Records" menu option on the ASMI
(System Service Aids > Deconfiguration Records) when a service
processor was guarded out.
- A problem was fixed that prevented the hypervisor from
displaying updated vital product data (VPD) after the replacement of a
24" power subsystem part.
- A problem was fixed that caused an administrative service
processor fail-over (AFO), followed by another AFO without a reset in
between, to fail.
- A problem was fixed that caused a service processor fail-over to
fail, which resulted in a server outage. During a re-IPL of the system
controller (SC), the firmware tried to reconfigure the primary node
controller (NC) that was previously configured, and terminated after
30 minutes if the same primary NC was not found.
- A problem was fixed that caused VIOS partitions to fail to
boot.
- A problem was fixed that caused a partition suspend
operation to hang. When this problem occurred, all subsequent
suspend operations were locked out as well.
- A problem was fixed that could cause the target partition
to crash after a successful POWER6 to POWER7 partition migration.
Possible AIX error log entries include: label: DSI_PROC,
resource: SYSVMM, with description: "DATA STORAGE INTERRUPT,
PROCESSOR". Other partition-related crash descriptors may also be
logged.
- A problem was fixed that could cause AIX error log entries
following a successful partition migration. Possible AIX error
log entries include: label: RTAS_ERROR, resource: sysplanar0, with
description: "INTERNAL ERROR CODE". Other errors may also be
logged.
- A problem was fixed that caused the installation of some
versions of Linux to fail.
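
The two HIPER items at the top of this list give level-based guidance: Ax720_090 and earlier should move to Ax720_101, Ax710 is not exposed to the data corruption issue, and the dispatch delay affects any Ax720 level prior to Ax720_101. A minimal sketch of that check follows. The function names are hypothetical, and the level layout (two-letter prefix, three-digit release, three-digit level, optional third field) is an assumption read off strings such as AH720_101_064 in these notes. Releases other than Ax710 and Ax720 are not addressed by the notes, so the sketch reports them as not affected rather than guessing.

```python
import re

# Assumed layout, inferred from strings like "AH720_101_064" and "AM720_090":
# prefix (2 letters), release (3 digits), level (3 digits), optional third field.
LEVEL_RE = re.compile(r"^(A[A-Z])(\d{3})_(\d{3})(?:_(\d{3}))?$")


def parse_level(level):
    """Split a firmware level string into (prefix, release, service_level)."""
    match = LEVEL_RE.match(level.strip())
    if match is None:
        raise ValueError(f"unrecognized firmware level: {level!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def exposed_to_data_corruption(level):
    """True for Ax720_090 and earlier; Ax710 is stated to have no exposure."""
    _, release, service_level = parse_level(level)
    return release == 720 and service_level <= 90


def affected_by_dispatch_delay(level):
    """True for any Ax720 level prior to Ax720_101."""
    _, release, service_level = parse_level(level)
    return release == 720 and service_level < 101


if __name__ == "__main__":
    for installed in ("AH720_082_064", "AM720_090", "AL710_099", "AH720_101_064"):
        print(installed,
              exposed_to_data_corruption(installed),
              affected_by_dispatch_delay(installed))
```

The third field of the level string is accepted but not interpreted, since these notes do not define it.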
System firmware changes that affect certain systems
- On systems with two HMCs attached, a problem was fixed that
caused one of the HMCs to frequently go to an incomplete state.
- On systems running IBM i partitions, a problem was fixed
that caused a RAID array of SCSI disks to be exposed if an MES upgrade
was done, or a system plan was created.
- On systems running IBM i partitions, a problem was fixed
that caused SRC BA040030 to be erroneously logged, and a call home to
be made, even though the partition booted successfully.
- On systems using the host Ethernet adapter (HEA) function,
a problem was fixed that caused the HMC to erroneously report that
deleting a logical port had failed.
- On partitions running Active Memory Sharing (AMS), a
problem was fixed that prevented shutdown of a partition when all
paging VIOS partitions servicing the partition were hung and unable to
complete outstanding I/O operations.
- On systems running Active Memory Sharing (AMS), a problem
was fixed that caused an AMS partition to crash with SRC
B700F103. This problem may occur when reducing the size of the
AMS pool (or doing a hot node repair on a model MMB or MHB) at the same
time as dynamically creating an AMS partition, or changing an AMS
partition's maximum memory.
- A problem was fixed that caused AIX licensing issues when
migrating a partition from a POWER6 to a POWER7 system.
- The "USB Service Functions" option was removed from the
ASMI menus on 9117-MMB, 9179-MHB and 9119-FHB systems, which do not
support this function.
- On systems with a F/C 5803 or 5873 I/O expansion drawer, a
problem was fixed that caused SRC B7006907 to be erroneously logged.
- On systems with a F/C 5803 or 5873 I/O expansion drawer, a
problem was fixed that caused SRC 10009133 to be erroneously logged.
- On systems with a F/C 5803 or 5873 I/O expansion drawer, a
problem was fixed that caused SRCs 1402131B, 1402130A, and 1402130F to
be erroneously logged, and the "health check" that is done before the
HMC attempts to install firmware on the managed system, to fail.
- The ASMI menu was enhanced to generate an error message if
the turbo core option is enabled on a system with fewer than three
nodes.
- On systems at an altitude of 7001 feet (2134 meters) or
higher, a problem was fixed that caused numerous 11007221 and 11007222
SRCs to be erroneously logged.
- On systems with a F/C 5803 or 5873 I/O expansion drawer, a
problem was fixed that caused the lamp test on the HMC to turn off all
of the LEDs when the test was complete instead of returning them to
their original states.
- On systems on which a NIM installation is being set up
using the system management services (SMS) menus, the firmware was
changed to limit the packet size options to 512 and 1024 bytes.
- On systems with Selective Memory Mirroring and the Enhanced
Cache Option enabled, a problem was fixed that caused unpredictable
system behavior when a processor hardware failure occurred.
|
AH720_082_064
12/13/10 |
Impact: Function
Severity: HIPER - High Impact/PERvasive, Should be installed as soon as
possible.
New Features and Functions
- Support for partition suspend/resume. AIX 6.1 TL6 SP3
or later, or AIX 7.1 TL0 SP2 or later, is required for partition
suspend/resume.
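
The AIX prerequisite above (AIX 6.1 TL6 SP3 or later, or AIX 7.1 TL0 SP2 or later) can be sketched as a comparison on technology level and service pack. This is an illustration only. It assumes the level is supplied as a string in the form 6100-06-03-1048 (release, technology level, service pack, build date), and the function names are hypothetical. "Or later" is read here as a later technology level, or the same technology level with an equal or later service pack, within the same release.

```python
# Minimum (technology level, service pack) per release, taken from the note above.
MINIMUMS = {
    6100: (6, 3),  # AIX 6.1 TL6 SP3
    7100: (0, 2),  # AIX 7.1 TL0 SP2
}


def parse_oslevel(text):
    """Parse an assumed 'RRRR-TL-SP[-build]' string into (release, tl, sp)."""
    parts = text.strip().split("-")
    if len(parts) < 3:
        raise ValueError(f"unrecognized AIX level: {text!r}")
    return int(parts[0]), int(parts[1]), int(parts[2])


def meets_suspend_resume_minimum(text):
    """True if the level meets the stated minimum for its release."""
    release, tl, sp = parse_oslevel(text)
    minimum = MINIMUMS.get(release)
    if minimum is None:
        # Release not covered by the note; report False rather than guess.
        return False
    return (tl, sp) >= minimum


if __name__ == "__main__":
    for level in ("6100-06-02-1036", "6100-06-03-1048", "7100-00-02-1041"):
        print(level, meets_suspend_resume_minimum(level))
```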
System firmware changes that affect all systems
- HIPER: A problem was fixed that caused repeated reset/reloads of
the service processor, and fail-overs, to occur after a
hypervisor-initiated reset/reload of the service processor was
completed. That led to loss of communication between the service
processor and the hypervisor (indicated by SRC B182951C).
- A problem was fixed that caused disks that
were not bootable to be displayed in the system management services
boot menus. This problem also prevented the operating system
level from being displayed for bootable hard disks in the system
management services boot menus.
- A problem was fixed that
caused an error log indicating a dynamic LPAR (DLPAR) error when no
DLPAR operations were done, and unrecoverable SRCs BA180010 and
BA250010 to be erroneously logged, when a recoverable enhanced error
handling (EEH) error was logged on an I/O adapter.
- A problem was fixed
that caused a service processor reset/reload, a service processor dump
to be taken, and B181EF88 to be logged.
- The firmware was enhanced to
log a predictive SRC if the Ethernet cables are misplugged (swapped) on
a node controller.
- A problem was fixed that
caused the managed system to go to the incomplete state on the HMC.
- A problem was fixed that
caused the system to hang at C700406E during boot.
- A problem was fixed that
caused the platform to become unresponsive; this was indicated by an
"incomplete" state on the HMC. When this problem occurred, the
partitions on the managed system became unresponsive.
System firmware changes that affect certain systems
- On systems running AEM (Active Energy Manager), a problem was fixed
that caused the AEM to display erroneous power trending data if only
one line cord was connected, or if there was any type of command
failure getting power trending data from a BPC (bulk power controller).
- A problem was fixed that
caused a virtual SCSI or virtual fibre channel adapter to be seen by
the operating system as not bootable when it was added to a partition
using a dynamic LPAR (DLPAR) operation.
- A problem was fixed that
caused the system ID to change, which caused software licensing
problems, when a live partition mobility operation was done where the
target system was an 8203-E4A or an 8204-E8A.
- PARTITION-DEFERRED: A problem was fixed that caused SRC BA210000 to
be erroneously logged on the target system when a partition was moved
(using Live Partition Mobility) from a POWER7 system to a POWER6
system.
- A problem was fixed that
caused SRC BA280000 to be erroneously logged on the target system when
a partition was moved (using Live Partition Mobility) from a POWER7
system to a POWER6 system.
- A problem was fixed that
caused a partition to hang following a partition migration operation
(using Live Partition Mobility) from a system running Ax720 system
firmware to a system running Ex340, or older, system firmware.
- A problem was fixed that
caused a system or partition running Linux to crash when the
"serv_config -l" command was run.
- On systems using the HEA
broadcast/multicast application to send and receive millions of
packets, such as video streaming, the packet storm mitigation algorithm
was enhanced so that a packet will only be dropped when a packet storm
is detected.
- A problem was fixed that
caused a partition to fail to reboot with SRC B2001230 and word 3 =
000000BF. This failure can be seen on a partition that owns a
PCI, PCI-E, or PCI-X slot.
|