AH730_058_035
11/07/11
|
Impact: Availability
Severity: HIPER - High Impact/PERvasive, Should be installed as soon as
possible.
New Features and Functions
- Support for the PCIe2 1.8GB cache RAID SAS
adapter (tri-port 6Gb), F/C 5913.
System firmware changes that affect all systems
- A problem
was fixed that caused SRC B7005442 to be erroneously logged, and
functional processor cores to be erroneously guarded out, when an error
occurred in the operating system or an application.
- HIPER/Non-Pervasive:
A problem was fixed that
caused the system to crash with SRC B18187DA.
- A problem was fixed that
prevented a partition from being activated with SRC B2006009.
- The firmware was enhanced to
improve the field replaceable units (FRUs) called out when a clock
failure occurs.
- HIPER/Pervasive: A
problem was fixed
that caused the managed system to go to the incomplete state with SRC
B7000602, and to require a reboot, if all of these conditions were met:
- An inactive partition is
present on the managed system.
- A concurrent system firmware update to AH730_051 was done.
- The inactive partition is deleted before being activated with the new
firmware level, either by the user or a partition migration operation.
- A
problem was fixed that caused a service processor firmware
synchronization to fail with SRC E302F842 when:
- A system firmware upgrade to a
new release (from AH720_xxx to AH730_yyy, for example) was installed,
then
- A service processor card was
replaced.
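The three conditions listed above for the SRC B7000602 incomplete-state problem can be expressed as a simple check. This is only a sketch: the record type, its field names, and the function name are hypothetical and do not correspond to any HMC or firmware interface. It assumes the administrator already knows each partition's history around the update.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartitionHistory:
    """Hypothetical record; field names are illustrative, not HMC attributes."""
    name: str
    inactive_during_update: bool   # partition was inactive when the update ran
    activated_since_update: bool   # partition was activated on the new level
    deleted: bool                  # deleted by the user or by a partition migration


def exposed_to_b7000602(concurrent_update_to_ah730_051: bool,
                        partitions: List[PartitionHistory]) -> bool:
    """Return True only when all three documented conditions hold."""
    if not concurrent_update_to_ah730_051:
        return False
    return any(
        p.inactive_during_update and p.deleted and not p.activated_since_update
        for p in partitions
    )
```

A partition that was activated on the new firmware level before being deleted does not satisfy the third condition, so the check returns False for it.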
System firmware changes that affect certain systems
- On systems or logical partitions with a large
number of virtual processors, a performance problem was fixed that
prevented the utilization of the entitled capacity of partitions.
- A problem was
fixed that caused a shared processor partition that is configured with
two virtual processors, and an entitled capacity of 1.0 processors, to
hang when only one processor is in the physical shared pool.
- A problem was fixed that
caused the managed system's processors displayed by the HMC to be
incorrect. This problem occurs when the system is booted with no
partitions defined, which can occur, for example, after an MES model
upgrade.
- Please
see the
"Important Information"
section of this document for additional details about this Service Pack.
Concurrent hot add/repair
maintenance firmware fixes
- On a system with mirrored
memory, a problem was fixed that caused a hot node repair operation to
fail.
- A problem was fixed
that caused the host Ethernet adapters (HEA) to be in a non-functional
state after a hot node add.
|
AH730_051_035
09/28/11
|
Impact: Performance
Severity: HIPER - High Impact/PERvasive, Should be installed as soon as
possible.
New Features and Functions
- Support for dynamic memory DIMM error handling
improvements.
System firmware changes that affect all systems
- A problem was fixed
that caused SRC B18138B7 to be erroneously logged, and the service
processor to terminate, when errors were continuously logged due to
failing hardware. This problem can cause both node controllers to
terminate, which disables the node.
- The service
processor firmware was enhanced to log an error, instead of causing a
kernel panic, if a guard record was corrupted or truncated.
- A problem was fixed that
caused the wrong error code to be logged when the memory test took
longer than normal during system boot.
- A problem was fixed that
caused the Advanced System Management Interface (ASMI) menus to be
displayed in English no matter which language was selected.
- The
firmware was enhanced to verify that no uncorrectable memory errors are
present anywhere in a partition's memory when the hypervisor accesses
that memory.
- The firmware was enhanced to
reduce the number of times informational SRC 10009002 is logged when a
system is booted.
- A problem was fixed that
caused two calls home for the same error to be made when a platform
dump was generated.
- A problem was fixed that
caused unrecoverable SRC B181A809 to be erroneously logged.
- A problem was fixed that
caused a system boot to terminate with unrecoverable SRC B181A403.
- A problem was fixed that prevented
a platform system dump from being deleted when the file system space on
the service processor was full.
- A problem was fixed that prevented
an encapsulated state partition from being activated after a main store
dump (MSD).
- A problem was fixed that
caused a partition to fail to activate when the activation took place
within an hour of the system being powered on. This problem is
much more likely to occur on large systems with a large number of I/O
slots.
- A problem was fixed that
caused the system to terminate when rebooting after the power was
removed, then reapplied.
- A problem was fixed that
caused a firmware installation from the HMC with the "do not auto
accept" option selected to fail.
- A problem was fixed that
caused a partition to fail with SRC B170E540 when rebooting after an
unrecoverable error was logged that impacted the partition's reserved
memory area (RMA).
- A problem was fixed that
caused SRCs B181156C and B181A40F to be erroneously logged after a
service processor reset.
- The firmware was enhanced to
delay the rebooting of a partition after an uncorrectable error (UE) is
logged in the partition's memory. This gives the service
processor sufficient time to guard out the memory in which the UE
occurred.
- The firmware was enhanced to
log SRC B181C3251 as informational rather than predictive.
- The firmware was enhanced to
log SRC B1812A11 as informational, instead of "service action
required", when the thermal/power management device (TPMD) is
successfully reset.
- A problem was
fixed that erroneously caused SRC B18186x1 to be logged and an FSP dump
to be generated.
- The field replaceable unit
(FRU) callouts were enhanced for SRC B181E550.
- A problem was fixed that caused a system's partition
dates to revert to 1969 after the service processor or its battery
was replaced. This occurred regardless of whether the
service processor's time-of-day (TOD) clock was correctly set during
the service action.
- A problem was fixed that
caused the system to crash with SRC B700F103.
System firmware changes that affect certain systems
- HIPER/Pervasive: On
systems running
VIOS, a problem was fixed that caused the system to crash with SRC
B700F103.
- HIPER/Pervasive: On
systems with processors that don't have memory
associated with them, a problem was fixed that was degrading system
performance.
- On systems running Advanced
Memory Sharing (AMS), a problem was fixed that caused the system to
crash during the creation of a logical partition (LPAR).
- On a system that terminates
when in dynamic power save mode, a problem was fixed that caused SRCs
B150B943, B113C660, and B113C661 to be erroneously logged when the
system rebooted.
- On systems running more than
100 logical partitions, a problem was fixed that caused a concurrent
firmware installation to fail.
- On systems running IBM i
partitions, a problem was fixed that prevented IBM i partitions that
were suspended from being reactivated after a main store dump (MSD).
- On systems
running IBM i partitions, a problem was fixed that caused changing the
processor weight on an IBM i partition to 255 to have no effect.
- On systems running Advanced
Memory Sharing (AMS), a problem was fixed that prevented the virtual
I/O server (VIOS) partition associated with an AMS pool from shutting
down.
- On systems with partitions
with dedicated memory assigned, a problem was fixed that caused a
resume operation on a partition with dedicated memory to fail with HMC
SRC HSC0A945.
- On systems running an
IBM i partition with dedicated memory, and redundant virtual I/O server
(VIOS) partitions, a problem was fixed that caused the resumption of
the IBM i partition to fail if the hypervisor failed over to the other
VIOS partition while the IBM i partition was in hibernation.
- The firmware was enhanced to
allow the installation of IBM i from the HMC command line interface
(CLI) using the "chsysstate" command.
- On systems running
shared processor partitions, a problem was fixed that caused a
partition to hang until powered off and back on.
- On systems running the
Advanced Energy Manager (AEM), a problem was fixed that caused the work
rate calculation for a processor to be incorrect if the system dropped
into safe mode.
- On systems from which a node
has been removed, a problem was fixed that caused the node to continue
to be listed when the Processing Unit Deconfiguration option is
selected on the Advanced System Management Interface (ASMI) menus.
- On systems with an
uninterruptible power supply (UPS) attached, a problem was fixed that
caused the system to power cycle after a power failure, instead
of waiting for power to be restored before powering on.
- A problem was fixed that
prevented an automatic system reboot after a checkstop when a service
processor fail-over occurred during the checkstop recovery.
- On systems with F/C 1954
(4-port GB Ethernet adapter) installed, a problem was fixed that
prevented the adapter from being configured during boot, and caused two
B7006970 SRCs to be erroneously logged.
- On systems running VIOS, a
problem was fixed that caused the location code in the output of the
VIOS command "lsmap -npiv -all" to be incorrect.
- A problem was fixed that
caused a partition migration operation to abort when the partition has
more than 4096 virtual slots.
- On systems running
Advanced Memory Sharing (AMS), the firmware was enhanced to reduce the
time required to migrate an AMS partition.
- On systems running
DB2 pureScale, a problem was fixed that caused intermittent remote
direct memory access (RDMA) errors, and a core dump of the pureScale
server process.
- On systems with processors
that don't have memory associated with them, the firmware was enhanced
to improve boot time and system performance.
- A problem was fixed that
caused the system to appear to hang, and a service processor
reset/reload to occur, when multiple hardware errors occurred.
- On systems running virtual
switches, the firmware was enhanced to limit the number of partitions
that have access to a particular vswitch.
- On systems with more than
1000 partitions, a problem was fixed that caused the error logs to be
flooded with informational SRC B7005120 when all of the partitions are
rebooted at the same time.
- On systems in which a service
processor had been guarded out manually, a problem was fixed that
caused the Deconfiguration Records option, which is under the System
Service Aids in the Advanced System Management Interface (ASMI), to
display null data for that service processor.
- On systems with redundant
service processors, a problem was fixed that prevented a service
processor fail-over from occurring.
- On systems with unequal
amounts of memory in the processor books, and on which the system
processor partition limit is set to max, a problem was fixed that
prevented the hypervisor from choosing the domain with the smallest
amount of memory when the user is packing memory into the domain.
- A problem was fixed that
caused the message "Console Internal Error" to be displayed, and an FSP
dump to be taken, when the Memory eRepair Data option was selected from
the Advanced System Management Interface (ASMI) menu.
- The firmware was enhanced to
allow the system to boot when one of the two clock cards does not power
on.
- The firmware was enhanced to
add a threshold to certain conditions that cause SRC B114E504 to be
logged. This reduces the unnecessary replacement of processor
modules.
- The firmware was enhanced to
allow the Enhanced Cache Option (also known as Turbo Core) to be
enabled when three or more processor nodes are present.
- A problem was fixed that
caused SRC B1754201, with memory DIMMs in the FRU list, to be
erroneously logged after the reset/reload of a node controller.
- The firmware
was enhanced to allow the concurrent replacement of the secondary
service processor even if the service processor redundancy policy
is set to "disabled".
Concurrent hot add/repair
maintenance firmware fixes
- On partitions running Red Hat
Linux 6.1, a problem was fixed that caused a node evacuation operation
to fail.
- A problem was fixed that
caused a node repair operation to fail if it had to be restarted.
- A problem was fixed that caused
the system to crash during a hot GX adapter repair.
|
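The level identifiers that head each entry (AH730_058_035, AH730_051_035) can be split into their fields for comparison. The sketch below assumes the layout is release, then service pack level, then last disruptive service pack level; that reading of the third field is an assumption not stated in this section, and the type and function names are hypothetical.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    release: str           # e.g. "AH730"
    service_pack: int      # e.g. 58
    last_disruptive: int   # assumed meaning of the third field


# Two letters and three digits, then two three-digit fields.
_LEVEL_RE = re.compile(r"^([A-Z]{2}\d{3})_(\d{3})_(\d{3})$")


def parse_level(text: str) -> FirmwareLevel:
    """Split an identifier such as 'AH730_058_035' into its fields."""
    match = _LEVEL_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a firmware level identifier: {text!r}")
    release, service_pack, last_disruptive = match.groups()
    return FirmwareLevel(release, int(service_pack), int(last_disruptive))


def is_newer_service_pack(candidate: FirmwareLevel, installed: FirmwareLevel) -> bool:
    """True when both levels share a release and candidate has a higher service pack."""
    return (candidate.release == installed.release
            and candidate.service_pack > installed.service_pack)
```

For example, parsing the two identifiers in this section and comparing them shows that AH730_058_035 is a later service pack than AH730_051_035 within the same release.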