AM780
For Impact, Severity, and other firmware definitions, please
refer to the 'Glossary of firmware terms' at the following URL:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
The complete Firmware Fix History for this Release Level can be
reviewed at the following URL:
http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/AM-Firmware-Hist.html
|
AM780_080_040 / FW780.50
06/29/16 |
Impact: Availability
Severity: SPE
New Features and Functions
- Support was added for the Stevens6+ option of the internal
tray loading DVD-ROM drive with F/C #EU13. This is an 8X/24X(max)
Slimline SATA DVD-ROM Drive. The Stevens6+ option is a FRU
hardware replacement for the Stevens3+. MTM 7226-1U3
(Oliver) FC 5757/5762/5763 attaches to IBM Power Systems and
lists Stevens6+ as optional for Stevens3+. If the Stevens6+
DVD drive is installed on the system without the required firmware
support, the boot of an AIX partition will fail when the DVD is used as
the load source. Also, an IBM i partition cannot consistently
boot from the DVD drive using D-mode IPL. A SRC C2004130 may be
logged for the load source not found error.
- Support was added for systems to be able to automatically
convert permanently activated resources (processor and memory) to
Mobile CoD resources for use in a Power Enterprise Pool (PEP).
The ability to do a CoD resource license conversion requires an HMC
level of V8R8.4.0 or later. More information on how to use
a PEP for a group of systems to share Mobile Capacity on Demand
(CoD) processor resources and memory resources can be found in the IBM
Knowledge Center at the following link: https://www.ibm.com/support/knowledgecenter/HW4M4/p8ha2/systempool_cod.htm.
System firmware changes that affect all systems
- A problem was fixed for PCI adapters locking up when
powered on. The problem is rare but frequency varies with the
specific adapter models. A system power down and power up is
required to get the adapter out of the locked state.
- A security problem was fixed in OpenSSL for a possible
service processor reset on a null pointer de-reference during RSA PPS
signature verification. The Common Vulnerabilities and Exposures issue
number is CVE-2015-3194.
- A problem was fixed for hypervisor task failures in adjunct
partitions with a SRC B7000602 reported in the error log. These
failures occur during adjunct partition reboots for concurrent firmware
updates but are extremely rare and require a re-IPL of the system to
recover from the task failure. The adjunct partitions may be
associated with the VIOS or I/O virtualization for the physical
adapters such as done for SR-IOV.
- A problem was fixed for a shortened "Grace Period" for "Out
of Compliance" users of a Power Enterprise Pool (PEP). The
"Grace Period" is short by one hour, so the user has one less hour to
resolve compliance issues before the HMC disallows any more borrowing
of PEP resources. For example, if the "Grace Period" should have
been 48 hours as shown in the "Out of Compliance" message, it really is
47 hours in the hypervisor firmware. The borrowing of PEP
resources is not a common usage scenario. It is most often found
in Live Partition Mobility (LPM) migrations where PEP resources are
borrowed from the source server and loaned to the target server.
- A problem was fixed for the Advanced System Management
Interface "Network Services/Network Configuration" "Reset Network
Configuration" button that was not resetting the static routes to the
default factory setting. The manufacturing default is to have no
static routes defined so the fix clears any static routes that had been
added. A circumvention to the problem is to use the ASMI "Network
Services/Network Configuration/Static Route Configuration" "Delete"
button before resetting the network configuration.
- A problem was fixed for a sequence of two or more Live
Partition Mobility migrations that caused a partition to crash with a
SRC BA330000 logged (Memory allocation error in partition
firmware). The sequence of LPM migrations that can trigger the
partition crash is as follows:
The original source partition level can be any FW760.xx, FW763.xx,
FW770.xx, FW773.xx, FW780.xx, or FW783.xx P7 level or any FW810.xx,
FW820.xx, FW830.xx, or FW840.xx P8 level. It is migrated first to
a system running one of the following levels:
1) FW730.70 or later 730 firmware or
2) FW740.60 or later 740 firmware
And then a second migration is needed to a system running one of the
following levels:
1) FW760.00 - FW760.20 or
2) FW770.00 - FW770.10
The twice-migrated system partition is now susceptible to the BA330000
partition crash during normal operations until the partition is
rebooted. If an additional LPM migration is done to any firmware
level, the thrice-migrated partition is also susceptible to the
partition crash until it is rebooted.
With the fix applied, the susceptible partitions may still log multiple
BA330000 errors but there will be no partition crash. A reboot of
the partition will stop the logging of the BA330000 SRC.
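The migration sequence in the BA330000 item above can be checked mechanically. The following is a minimal sketch, assuming firmware levels are written as "FWnnn.ss" and that the partition was not rebooted between migrations; the function names are hypothetical and are not part of any IBM tool.

```python
import re

# Releases named in the text as possible original source levels (P7 and P8).
ORIGIN_RELEASES = {760, 763, 770, 773, 780, 783, 810, 820, 830, 840}


def parse_level(level):
    """Split a level such as 'FW760.20' into (760, 20)."""
    match = re.fullmatch(r"FW(\d+)\.(\d+)", level)
    if match is None:
        raise ValueError(f"unrecognized firmware level: {level!r}")
    return int(match.group(1)), int(match.group(2))


def is_first_hop(level):
    """FW730.70 or later 730 firmware, or FW740.60 or later 740 firmware."""
    release, service_pack = parse_level(level)
    return (release == 730 and service_pack >= 70) or (
        release == 740 and service_pack >= 60
    )


def is_second_hop(level):
    """FW760.00 - FW760.20, or FW770.00 - FW770.10."""
    release, service_pack = parse_level(level)
    return (release == 760 and service_pack <= 20) or (
        release == 770 and service_pack <= 10
    )


def is_susceptible(path):
    """path lists the levels the partition ran on, in order, without a reboot.

    Later migrations do not clear the exposure, so only the first
    three entries decide the result.
    """
    if len(path) < 3:
        return False
    origin_release, _ = parse_level(path[0])
    return (
        origin_release in ORIGIN_RELEASES
        and is_first_hop(path[1])
        and is_second_hop(path[2])
    )
```

For example, is_susceptible(["FW780.40", "FW730.70", "FW760.10"]) evaluates to True, while a path whose first migration lands on FW730.60 does not match the described sequence.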
System firmware changes that affect certain systems
- On systems having an IBM i partition with more than 64
cores, a performance problem was fixed with the choice of processor
cores assigned to the partition. This problem only pertains to
the Power 780 (9179-MHD) and the Power 795 (9119-FHB).
- On systems with a PowerVM Active Memory Sharing (AMS)
partition with AIX Level 7.2.0.0 or later with Firmware Assisted
Dump enabled, a problem was fixed for a Restart Dump operation failing
into KDB mode. If "q" is entered to exit from KDB mode, the
partition fails to start. The AIX partition must be powered off
and back on to recover. The problem can be circumvented by
disabling Firmware Assisted Dump (default is enabled in AIX 7.2).
- For a system partition with more than 64 cores, a problem
was fixed for Live Partition Mobility (LPM) migration operations
failing with HSCL365C. The partition migration is stopped because
the platform detects a firmware error anytime the partition has more
than 64 cores. This problem only pertains to the Power 780
(9179-MHD) and the Power 795 (9119-FHB).
- On systems with dedicated processor partitions, a
problem was fixed for the dedicated processor partition becoming
intermittently unresponsive. The problem can be circumvented by
changing the partition to use shared processors.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
- DEFERRED: A problem was fixed for an I/O performance slow-down that can occur after
a concurrent repair of a GX bus I/O adapter with a Feature Code of
#1808, #1816, #1914, #EN22, #EN23, or #EN25. A re-IPL of the
system after the concurrent repair operation corrects the I/O
performance issue. This fix requires an IPL of the system to take
effect.
|
AM780_075_040 / FW780.40
12/16/15 |
Impact: Availability
Severity: SPE
New Features and Functions
- Support was added to the service processor to allow control
of Dynamic Power Mode from the Hardware Management Console (HMC).
This power mode allows modifying a processor frequency, either to
reduce energy consumption or to overclock the processor and boost the
machine speed. There are four power modes possible:
1) Disable Power Saver mode – this is the default. The processor
frequency is not changed and resources operate at 100% of nominal
processor frequencies at all times.
2) Enable Static Power Saver mode – activates the Power Saver mode,
fixing the processor frequency and voltage at a predetermined low-power
mode.
3) Enable Dynamic Power Saver (favor power) mode – guarantees power
savings by limiting the maximum frequency of the system under peak
utilization.
4) Enable Dynamic Power Saver (favor performance) mode – allows a
higher frequency range at high utilization.
There is existing support to control Dynamic Power Mode from the
Advanced System Management Interface (ASMI) with the "System
Configuration/Power Management/Power Mode Setup" panel
options. With the new support, the HMC can also control the
Dynamic Power Modes with CLI commands lspwrmgmt (list the current power
mode configuration) and chpwrmgmt (change the power mode):
chpwrmgmt -m managed-system -r sys -o {enable | disable} [-t
{static | dynamic_favor_perf | dynamic_favor_power |
fixed_max_frequency}] [--help]
For more information on the HMC CLI chpwrmgmt command, see the
following link in the IBM Knowledge Center:
(https://www-01.ibm.com/support/knowledgecenter/HW4L4/p8edm/chpwrmgmt.html).
The HMC must be at V8R8.2.0 or later to have the Dynamic Power
Mode feature.
- Support was added to the Advanced System Management
Interface (ASMI) to be able to add an IPv4 static route definition for
each ethernet interface on the service processor. Using a static
route definition, a Hardware Management Console (HMC) configured
on a private subnet that is different from the service processor subnet
is now able to connect to the service processor and manage the
CEC. A static route persists until it is deleted or until the
service processor settings are restored to manufacturing
defaults. The static route is managed with the ASMI panel
"Network Services/Network Configuration/Static Route Configuration"
IPv4 radio button. The "Add" button is used to add a static route
(only one is allowed for each ethernet interface) and the "Delete"
button is used to delete the static route.
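The chpwrmgmt syntax quoted in the Dynamic Power Mode item above can be assembled programmatically before it is passed to the HMC command line. The following is a minimal sketch that only builds and validates the argument list from that quoted syntax; the helper name and the managed system name "my-system" are hypothetical, and the sketch does not contact an HMC.

```python
# Values accepted by -t, as listed in the syntax quoted in the text.
VALID_MODES = (
    "static",
    "dynamic_favor_perf",
    "dynamic_favor_power",
    "fixed_max_frequency",
)


def chpwrmgmt_args(managed_system, operation, mode=None):
    """Return the chpwrmgmt argument list for the quoted syntax.

    managed_system: name of the managed system (value for -m)
    operation: "enable" or "disable" (value for -o)
    mode: optional power mode (value for -t)
    """
    if operation not in ("enable", "disable"):
        raise ValueError(f"operation must be enable or disable: {operation!r}")
    args = ["chpwrmgmt", "-m", managed_system, "-r", "sys", "-o", operation]
    if mode is not None:
        if mode not in VALID_MODES:
            raise ValueError(f"unknown power mode: {mode!r}")
        args += ["-t", mode]
    return args


# Hypothetical usage: request Dynamic Power Saver (favor power).
command = " ".join(chpwrmgmt_args("my-system", "enable", "dynamic_favor_power"))
```

Validating the mode before the command is issued keeps a mistyped -t value from reaching the management console.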
System firmware changes that affect all systems
- For a partition that has been migrated with Live Partition
Mobility (LPM) from FW730 to FW740 or later, a problem was fixed for a
Main Storage Dump (MSD) IPL failing with SRC B2006008. The MSD
IPL can happen after a system failure and is used to collect failure
data. If the partition is rebooted anytime after the migration,
the problem cannot happen. The potential for the problem existed
between the active migration and a partition reboot.
- A problem was fixed for partial loss of Entitlement for
On/Off Memory Capacity On Demand (also called Elastic COD). Users
with large amounts of Entitlement on the system of greater than "65535
GB * Days" could have had a truncation of the Entitlement value on a
re-IPL of the system. To recover lost Entitlement, the customer
can request another On/Off Enablement Code from IBM support to
"re-fill" their entitlement.
- A problem was fixed for an incorrect restriction on the
amount of "Unreturned" resources allowed for a Power Enterprise
Pool (PEP). PEP allows for logical moving of resources
(processors and memory) from one server to another. Part of this
is 'borrowing' resources from one server to move to another. This may
result in "Unreturned" resources on the source server. The management
console controls how many total "Unreturned" PEP resources can
exist. For this problem, the user had some "Unreturned" PEP
memory and asked to borrow more but this request was incorrectly
refused by the hypervisor.
- On systems where memory relocation (as done by using Live
Partition Mobility (LPM) ) and a partition reboot are occurring
simultaneously, a problem for a system termination was fixed. The
potential for the problem existed between the active migration and the
partition reboot.
- A problem was fixed in the hypervisor power off to protect
from rare NVRAM corruption in the address space where the partition
profiles are stored. The B7005301 SRC is logged on the next IPL
after the corruption that takes the system into the Hardware Management
Console (HMC) recovery state. The HMC found the partition
profiles corrupted in NVRAM. The HMC partition profile recovery
procedure must be used to restore the partition profiles from the HMC.
- A problem was fixed for a hypervisor adjunct partition
that failed with "SRC B2009008 LP=32770" for an unexpected SR-IOV adapter
configuration. Without the fix, the system must be re-IPLed to
correct the adjunct error. This error is infrequent and can only
occur if an adapter port configuration is being changed at the same
time that error recovery is occurring for the adapter.
- A security problem was fixed for an OpenSSL specially
crafted X.509 certificate that could cause the service processor to
reset in a denial-of-service (DOS) attack. The Common
Vulnerabilities and Exposures issue number is CVE-2015-1789.
- A security problem was fixed in OpenSSL where a remote
attacker could cause an infinite loop on the service processor using
malformed Elliptic Curve parameters during the SSL
authentication. This would cause the service processor
performance problems and also prevent new management console
connections from being made. To recover from this attack, a reset
or power cycle of the service processor is needed after scheduling and
completing a normal shutdown of running partitions. The Common
Vulnerabilities and Exposures issue number is CVE-2015-1788.
- A security problem was fixed in the lighttpd server on the
service processor where a remote attacker, while attempting
authentication, could insert strings into the lighttpd server log
file. Under normal operations on the service processor, this does
not impact anything because the log is disabled by default. The
Common Vulnerabilities and Exposures issue number is CVE-2015-3200.
- A problem was fixed for a Network boot/install failure
using bootp in a network with switches using the Spanning Tree Protocol
(STP). A Network boot/install using lpar_netboot on the
management console was enhanced to allow the number of retries to be
increased. If the user is not using lpar_netboot, the number of
bootp retries can be increased using the SMS menus. If the SMS
menus are not an option, the STP in the switch can be set up to allow
packets to pass through while the switch is learning the network
configuration.
- A problem was fixed in the run-time abstraction services
(RTAS) extended error handling (EEH) recovery for EEH events for SR-IOV
Virtual Functions (VFs) to fully reconfigure the VF devices after an
EEH event. Since the physical adapter does recover from the EEH
event itself, and there are no error logs generated, it might not be
immediately apparent that the VF did not fully reconfigure. This
prevents certain PCIe settings from being established for interrupts
and performance settings, leading to unexpected adapter behavior and
errors in the partition.
- For systems with an invalid P-side or T-side in the
firmware, a problem was fixed in the partition firmware Run-Time
Abstraction Services (RTAS) so that system Vital Product Data (VPD) is
returned at least from the valid side instead of returning no VPD
data. This allows AIX host commands such as lsmcode, lsvpd,
and lsattr that rely on the VPD data to work to some extent even if
there is one bad code side. Without the fix, all the VPD
data is blocked from the OS until the invalid code side is recovered by
either rejecting the firmware update or attempting to update the system
firmware again.
- A problem was fixed that prevented a second management
console from being added to the system. In some cases, network
outages caused defunct management console connection entries to remain
in the service processor connection table, making connection slots
unavailable for new management consoles. A reset of the service
processor could be used to remove the defunct entries and allow the
second management console to connect.
- A problem was fixed for some service processor error logs
not getting reported to the OS partitions as needed. The service
processor was not checking for a successful completion code on the
error log message send, so it was not doing retries of the send to the
OS when that was needed to ensure that the OS received the message.
- A problem was fixed for an incorrect call home for SRC
B1818A0F. There was no real problem so this call home should have
been ignored. This occurred when dynamic IP configurations were
being done on the service processor and the DHCP server was not
responding. The correct solution was to fix the network
configuration so that the DHCP server could be found on the network.
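The service processor error log item above describes a send that was never checked for a successful completion code and so was never retried. The general pattern of checking the code and retrying can be sketched as follows; this is an illustration only, not the service processor implementation, and the SUCCESS value, the function names, and the retry limit are all assumptions.

```python
SUCCESS = 0  # assumed completion code for a delivered message


def send_with_retry(send, message, max_attempts=3):
    """Call send(message) until it reports SUCCESS or attempts run out.

    Returns the number of attempts used. Raises RuntimeError if every
    attempt returns a non-success completion code.
    """
    for attempt in range(1, max_attempts + 1):
        if send(message) == SUCCESS:
            return attempt
    raise RuntimeError(f"message not delivered after {max_attempts} attempts")


def make_flaky_sender(failures):
    """Hypothetical sender that fails a fixed number of times, then succeeds."""
    remaining = {"count": failures}

    def send(_message):
        if remaining["count"] > 0:
            remaining["count"] -= 1
            return 1  # any non-zero code stands for a failed send
        return SUCCESS

    return send
```

With a sender that fails twice, send_with_retry(make_flaky_sender(2), "error log entry") returns 3 because the third attempt succeeds.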
System firmware changes that affect certain systems
- On systems using PowerVM with shared processor partitions that are
configured as capped or in a shared processor pool, there was a problem
found that delayed the dispatching of the virtual processors which
caused performance to be degraded in some situations. Partitions
with dedicated processors are not affected. The problem is
rare and can be mitigated, until the service pack is applied, by
creating a new shared processor AIX or Linux partition and booting it
to the SMS prompt; there is no need to install an operating system on
this partition. Refer to help document http://www.ibm.com/support/docview.wss?uid=nas8N1020863
for additional details.
- On a system with an IBM i partition running 7.2 or later
with 4K sector disks, a problem was fixed for a machine
check incorrectly issued.
- On a system with an AIX partition and a Linux partition, a
problem was fixed for dynamically moving an adapter that uses DMA from
the Linux partition to the AIX partition that caused the AIX partition
to fail by going into KDB mode (0c20 crash). The management console showed
the following message for the partition operation: "Dynamic move
of I/O resources failed. The I/O slot dynamic partitioning
operation failed.". The error was caused by Linux using 64K
mappings for the DMA window and AIX using 4K mappings for the DMA
window, causing incorrect calculations on the AIX partition when it received the
adapter. Until the fix is applied, the adapters that use DMA
should only be moved from Linux to AIX when the partitions are powered
off.
- For Integrated Virtualization Manager (IVM) managed systems
with more than 64 active partitions, a problem was fixed for recovery
from Live Partition Mobility (LPM) errors. Without the fix, the
IVM managed system partition can appear to still be running LPM after
LPM has aborted, preventing retries of the LPM operation. In this
case, the partition must be stopped and restarted to clear the LPM
error state. The problem is not frequent because it requires a
failed LPM on a partition with a partition ID that is greater than
64. This defect only pertains to the IBM Power ESE (8412-EAD).
- On systems with IBM i partitions that have a load source
device with 4K sectors, a problem has been fixed for Mainstore Dump
(MSD) failing with a B200F00C SRC. Without the fix, the IBM
i 4K sector load source devices are not supported for MSD and always
fail.
- For non-HMC managed systems in Manufacturing Default
Configuration (MDC) mode with a single host partition, a problem was
fixed for missing dumps of type SYSDUMP, FSPDUMP, LOGDUMP, and RSCDUMP
that were not off-loaded to the host OS. This is an infrequent
error caused by a timing error that causes the dump notification signal
to the host OS to be lost. The missing/pending dumps can be
retrieved by rebooting the host OS partition. The rebooted host
OS will receive new notifications of the dumps that have to be
off-loaded.
- A problem was fixed for an IPL termination with a B150B10C
SRC and B121C770 error logs. This problem only occurred on a
multiple node system and does not pertain to the Power ESE
(8412-EAD). The problem was intermittent so a re-IPL of the CEC
normally resolved the problem.
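The DMA window item above attributes the AIX failure to Linux using 64K mappings and AIX using 4K mappings. The source does not describe the actual calculation, so the following is only an arithmetic sketch of how far two interpretations of the same window diverge; the entry count is a hypothetical value.

```python
PAGE_4K = 4 * 1024    # mapping size used by AIX, per the text
PAGE_64K = 64 * 1024  # mapping size used by Linux, per the text


def window_bytes(entries, page_size):
    """Bytes covered by a DMA window with the given number of mappings."""
    return entries * page_size


entries = 1024  # hypothetical number of mapping entries in the window

as_configured = window_bytes(entries, PAGE_64K)  # what Linux set up
as_assumed = window_bytes(entries, PAGE_4K)      # what a 4K reader computes

# The two views of the same window differ by the ratio of the page sizes.
mismatch_factor = as_configured // as_assumed
```

Here mismatch_factor is 16, the ratio of 64K to 4K, whatever entry count is chosen.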
|
AM780_071_040 / FW780.30
04/22/15 |
Impact: Security
Severity: SPE
System firmware changes that affect all systems
- A problem was fixed for the iptables process consuming all
available memory, causing an "out of memory" dump and reset/reload of
the service processor.
- A problem was fixed for the callout on power good (pgood)
fault SRC 11002634 so that it includes the CEC enclosure and the
failing FRU. Previously, the callout was missing the failing FRU.
- A problem was fixed with the fspremote service tool to make
it support TLSv1.2 connections to the service processor to be
compatible with systems that had been fixed for the OpenSSL Padding
Oracle On Downgraded Legacy Encryption (POODLE) vulnerabilities.
After the POODLE fix is installed, by default the system only allows
secured connections from clients using the TLSv1.2 protocol.
- A problem was fixed for performance dumps to speed its
processing so it is able to handle partitions with a large number of
processors configured. Previously, for large systems, the
performance dump took too long in collecting performance data to be
useful in the debugging of some performance problems.
- A problem was fixed for a faulty ambient temperature sensor
that triggered emergency power offs with SRC 11007203 or 11007203 even
though the temperature was not over the limit. If the ambient
temperatures are high now, the errors will be logged for call home
service but they will not trigger an emergency power off.
- A problem was fixed to prevent a hypervisor task failure if
multiple resource dumps running concurrently run out of dump buffer
space. The failed hypervisor task could prevent basic logical
partition operations from working.
- A problem was fixed for a partition deletion error on the
management console with error code 0x4000E002 and message
"...insufficient memory for PHYP". The partition delete operation
has been adjusted to accommodate the temporary increase in memory usage
caused by memory fragmentation, allowing the delete operation to be
successful.
- A problem was fixed for I/O drawer MTMS updates where a
hypervisor memory leak would cause reconfiguration operations to fail
or cause resources to no longer show up for user configuration.
- A security problem was fixed in OpenSSL where the service
processor would, under certain conditions, accept Diffie-Hellman client
certificates without the use of a private key, allowing a user to
falsely authenticate. The Common Vulnerabilities and Exposures
issue number is CVE-2015-0205.
- A security problem was fixed in OpenSSL to prevent a denial
of service when handling certain Datagram Transport Layer Security
(DTLS) messages. A specially crafted DTLS message could exhaust
all available memory and cause the service processor to reset.
The Common Vulnerabilities and Exposures issue number is CVE-2015-0206.
- A security problem was fixed in OpenSSL to prevent a denial
of service when handling certain Datagram Transport Layer Security
(DTLS) messages. A specially crafted DTLS message could do a
null pointer de-reference and cause the service processor to
reset. The Common Vulnerabilities and Exposures issue number is
CVE-2014-3571.
- A security problem was fixed in OpenSSL to fix multiple
flaws in the parsing of X.509 certificates. These flaws could be
used to modify an X.509 certificate to produce a certificate with a
different fingerprint without invalidating its signature, and possibly
bypass fingerprint-based blacklisting. The Common Vulnerabilities
and Exposures issue number is CVE-2014-8275.
- A security vulnerability, commonly referred to as GHOST,
was fixed in the service processor glibc functions gethostbyname() and
gethostbyname2() that allowed remote users of the functions to cause a
buffer overflow and execute arbitrary code with the permissions of the
server application. There is no way to exploit this vulnerability
on the service processor but it has been fixed to remove the
vulnerability from the firmware. The Common Vulnerabilities and
Exposures issue number is CVE-2015-0235.
- A problem was fixed in the Advanced System Management
Interface (ASMI) to reword a confusing message for systems with no
deconfigured resources. The "System Service Aids/Deconfiguration
Records" message text for this situation was changed from
"Deconfiguration data is currently not available." to "No deconfigured
resources found in the system."
- A problem was fixed for a hypervisor deadlock that results
in the system being in an "Incomplete state" as seen on the management
console. This deadlock is the result of two hypervisor tasks
using the same locking mechanism for handling requests between the
partitions and the management console. Except for the loss of the
management console control of the system, the system is operating
normally when the "Incomplete state" occurs.
- A security problem was fixed in OpenSSL where a remote
attacker could crash the service processor with malformed Elliptic
Curve private keys. The Common Vulnerabilities and Exposures
issue number is CVE-2015-0209.
- A security problem was fixed in OpenSSL where a remote
attacker could crash the service processor with a specially crafted
X.509 certificate that causes an invalid pointer, out-of-bounds write,
or a null pointer de-reference. The Common Vulnerabilities and
Exposures issue numbers are CVE-2015-0286, CVE-2015-0287, and
CVE-2015-0288.
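The fspremote item above notes that, after the POODLE fix, the system by default accepts secured connections only from clients using TLSv1.2. A client written with the Python standard library can be restricted in the same way. The following is a minimal sketch of such a client-side context; it is unrelated to how fspremote itself is implemented, and the function name is hypothetical.

```python
import ssl


def make_tls12_client_context():
    """Return a client context that refuses protocol versions below TLSv1.2."""
    context = ssl.create_default_context()
    # SSLv3, TLSv1.0 and TLSv1.1 are rejected during the handshake.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls12_client_context()
```

The context can then be passed to context.wrap_socket() with server_hostname set when opening the connection.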
System firmware changes that affect certain systems
- On systems with redundant service processors and unlicensed cores, a problem was fixed
with firmware update to prevent SRC B170B838 errors on unlicensed cores
after an administrative failover (AFO) to the backup service processor.
- On systems with redundant service processors, a problem was
fixed for serviceable events being missing on the management console
for the case of a backup service processor termination error. The
error log from the failed backup service processor did not get
synchronized to the primary service processor.
- On a system with redundant service processors, a problem
was fixed for bad pointer reference in the mailbox function during data
synchronization between the two service processors. The
de-reference of the bad pointer caused a core dump, reset/reload, and
fail-over to the backup service processor.
- On systems with a F/C 5802 or 5877 I/O drawer installed, a
problem was fixed for a hypervisor hang at progress code C7004091
during the IPL or hangs during serviceability tasks to the I/O drawer.
- On systems using the Virtual I/O Server (VIOS) to share
physical I/O resources among client logical partitions, a problem was
fixed for memory relocation errors during page migrations for the
virtual control blocks. These errors caused a CEC termination
with SRC B700F103. The memory relocation could be part of the
processing for the Dynamic Platform Optimizer (DPO), Active Memory
Sharing (AMS) between partitions, mirrored memory defragmentation, or a
concurrent FRU repair.
- A problem was fixed that could result in unpredictable
behavior if a memory UE is encountered while relocating the contents of
a logical memory block during one of these operations:
- Using concurrent maintenance to perform a hot repair of a node.
- Reducing the size of an Active Memory Sharing (AMS) pool.
- On systems using mirrored memory, using the memory mirroring
optimization tool.
- Performing a Dynamic Platform Optimizer (DPO) operation.
- On systems using Virtual Shared Processor Pools (VSPP), a
problem was fixed for an inaccurate pool idle count over a small
sampling period.
- A problem was fixed that could result in latency or timeout
issues with I/O devices. On systems using Power7+ processors (IBM
Power 770 (9117-MMD), IBM Power 780 (9179-MHD), and IBM Power ESE
(8412-EAD)), this issue only impacts shared processor partitions.
- For a system with Virtual Trusted Platform Module (VTPM)
partitions, a problem was fixed for a management console error
that occurred while restoring a backup profile that caused the system
to go to the management console "Incomplete state". The failed
system had a suspended VTPM partition and a B7000602 SRC logged.
- On systems with redundant service processors, a problem was
fixed to add a missing check for a broken FSI link-1 pin. The
broken FSI link-1 pin was detectable during fail-over attempts to the
backup service processor which failed.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
- A problem was fixed for concurrent maintenance to prevent a
hardware unavailable failure when doing consecutive concurrent remove
and add operations to an I/O Hub adapter for a drawer.
|
AM780_068_040 / FW780.21
01/07/15 |
Impact: Security
Severity: HIPER
System firmware changes that affect all systems
- A security problem was fixed in OpenSSL for padding-oracle
attacks known as Padding Oracle On Downgraded Legacy Encryption
(POODLE). This attack allows a man-in-the-middle attacker to
obtain a plain text version of the encrypted session data. The Common
Vulnerabilities and Exposures issue number is CVE-2014-3566. The
service processor POODLE fix is based on a selective disablement of
SSLv3 using the Advanced System Management Interface (ASMI) "System
Configuration/Security Configuration" menu options. The Security
Configuration options of "nist_sp800_131a", "nist_compat", and "legacy"
affect the disablement of SSLv3 and determine the level of
protection from POODLE. The management console also requires a
POODLE fix for APAR MB03867(FIX FOR CVE-2014-3566 FOR HMC V7 R7.9.0 SP1
with PTF MH01484) to eliminate all vulnerability to POODLE and allow
use of option 1 "nist_sp800_131a" as shown below:
-1) nist_sp800_131a (SSLv3 disabled): This highest level of
security protection does not allow service processor clients to connect
using SSLv3, thereby eliminating any possibility of a POODLE
attack. All clients must be capable of using TLS v1.2 to make the
secured connections to the service processor to use this option.
This requires the management console be at a minimum level that has a
POODLE fix such as HMC V7 R7.9.0 SP1 with POODLE PTF MH01484.
-2) nist_compat (default mode - SSLv3 enabled for HMC): This
medium level of security protection disables SSLv3 (TLS v1.2 must be
used instead) for the web browser sessions to ASMI and for the CIM
clients and assures them of POODLE-free connections. But the
older management consoles are allowed to use SSLv3 to connect to the
service processor. This is intended to allow non-POODLE compliant
HMC levels to be able to connect to the CEC servers until they can be
planned and upgraded to the POODLE compliant HMC levels. Running
a non-POODLE compliant HMC to a service processor in this default mode
will prevent the ASMI-proxy sessions from the HMC from connecting as
these proxy sessions require SSLv3 support in ASMI.
-3) legacy (SSLv3 enabled): This basic level of security
protection enables SSLv3 for all service processor client
connections. It relies on all clients being at POODLE fix
compliant levels to provide full POODLE protection using the TLS
Fallback Signaling Cipher Suite Value (TLS_FALLBACK_SCSV) to prevent
fallback to vulnerable SSLv3 connections. This legacy option is
intended for customer sites on protected internal networks that have a
large investment in older hardware that needs SSLv3 to make browser and
HMC connections to the service processor. The level of POODLE
protection actually achieved in legacy mode is determined by the
percentage of clients that are at the POODLE fix compliant levels.
- A security problem was fixed in OpenSSL for memory leaks
that allowed remote attackers to cause a denial of service (out of
memory on the service processor). The Common Vulnerabilities and
Exposures issue numbers are CVE-2014-3513 and CVE-2014-3567.
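The three Security Configuration options in the POODLE item above differ only in which client types may still connect with SSLv3. The following is a minimal sketch of that mapping as a lookup table; the client labels "browser", "cim", and "hmc" and the function name are hypothetical names chosen for the example, not ASMI identifiers.

```python
# Client types that may still use SSLv3 under each option, per the text.
SSLV3_ALLOWED = {
    "nist_sp800_131a": set(),               # SSLv3 disabled for every client
    "nist_compat": {"hmc"},                 # default: only older HMCs may use SSLv3
    "legacy": {"browser", "cim", "hmc"},    # SSLv3 enabled for all clients
}


def sslv3_allowed(mode, client):
    """Return True if the given client type may connect with SSLv3."""
    if mode not in SSLV3_ALLOWED:
        raise ValueError(f"unknown security configuration option: {mode!r}")
    return client in SSLV3_ALLOWED[mode]
```

For example, sslv3_allowed("nist_compat", "browser") is False, while sslv3_allowed("nist_compat", "hmc") is True.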
System firmware changes that affect certain systems
- HIPER/Pervasive: On systems using PowerVM firmware, a performance problem was fixed that
may affect shared processor partitions where there is a mixture of
dedicated and shared processor partitions with virtual IO connections,
such as virtual ethernet or Virtual IO Server (VIOS) hosting, between
them. In high availability cluster environments this problem may
result in a split brain scenario.
- On systems with redundant service processors, a
problem was fixed so that a backup memory clock failure with SRC
B120CC62 is handled without terminating the system running on the
primary memory clock.
|
AM780_066_040 / FW780.20
10/16/14 |
Impact: Data
Severity: HIPER
New Features and Functions
- Support was added for using the Mellanox ConnectX-3 Pro
10/40/56 GbE (Gigabit Ethernet) adapter as a network install device.
System firmware changes that affect all systems
- A problem was fixed that caused the Advanced System
Management Interface (ASMI) menu for Memory Low Power State to be
displayed even though it is not applicable to the system. These
systems do not have the DIMM type required for memory low power state.
- A problem was fixed that caused the Utility COD display of
historical usage data to be truncated on the management console.
- A problem was fixed for memory relocation failing during a
partition reboot with SRC B700F103 logged. The memory relocation
could be part of the processing for the Dynamic Platform Optimizer
(DPO), Active Memory Sharing (AMS) between partitions, mirrored memory
defragmentation, or a concurrent FRU repair.
- A problem was corrected that resulted in B7005300 error
logs.
- A problem was fixed for Utility COD Processors where
incorrect SRCs A7004735 and A7004736 are logged when utility processors
are activated. The messages try to convey a problem that does not
exist (no out of processor compliance condition actually exists).
- A problem was fixed for the Advanced System Management
Interface (ASMI) to change the Dynamic Platform Optimizer (DPO) VET
capability setting from "False" to "True". DPO is available on
all systems to use without a license required. Even though the
VET for DPO was set to "False", it did not interfere with the running
of DPO.
- A problem was fixed for the Advanced System Management
Interface (ASMI) that allowed possible cross-site request forgery
(CSRF) exploitation of the ASMI user session to do unwanted tasks on
the service processor.
- A problem was fixed for I/O adapters so that BA400002
errors were changed to informational for memory boundary adjustments
made to the size of DMA map-in requests. These DMA size
adjustments were marked as UE previously for a condition that is normal.
- Multiple security problems were fixed in the Network Time
Protocol (NTP) client for buffer overflows that could be exploited to
execute arbitrary code on the service processor. The Common
Vulnerabilities and Exposures issue numbers for these problems are
CVE-2009-1252 and CVE-2009-0159.
- A security problem was fixed in the OpenSSL (Secure Socket
Layer) protocol that allowed a man-in-the-middle attacker, via a
specially crafted fragmented handshake packet, to force a TLS/SSL
server to use TLS 1.0, even if both the client and server supported
newer protocol versions. The Common Vulnerabilities and Exposures issue
number for this problem is CVE-2014-3511.
- A security problem was fixed in OpenSSL for formatting
fields of security certificates without null-terminating the output
strings. This could be used to disclose portions of the program
memory on the service processor. The Common Vulnerabilities and
Exposures issue number for this problem is CVE-2014-3508.
- Multiple security problems were fixed in the way that
OpenSSL handled Datagram Transport Layer Security (DTLS) packets.
A specially crafted DTLS handshake packet could cause the service
processor to reset. The Common Vulnerabilities and Exposures
issue numbers for these problems are CVE-2014-3505, CVE-2014-3506 and
CVE-2014-3507.
- A security problem was fixed in OpenSSL to prevent a denial
of service when handling certain Datagram Transport Layer Security
(DTLS) ServerHello requests. A specially crafted DTLS handshake
packet with an included Supported EC Point Format extension could cause
the service processor to reset. The Common Vulnerabilities and
Exposures issue number for this problem is CVE-2014-3509.
- A security problem was fixed in OpenSSL to prevent a denial
of service by using an exploit of a null pointer de-reference during
anonymous Diffie Hellman (DH) key exchange. A specially crafted
handshake packet could cause the service processor to reset. The
Common Vulnerabilities and Exposures issue number for this problem is
CVE-2014-3510.
- A problem was fixed that caused a service processor
reset/reload and a SRC B1818601 error log during an IPL when adjusting
the speeds of the system fans. This problem would normally have a
successful recovery with a good IPL of the system unless two other
reset/reloads of the service processor had occurred within the last 15
minutes.
- A security problem in GNU Bash was fixed to prevent
arbitrary commands hidden in environment variables from being run
during the start of a Bash shell. Although GNU Bash is not
actively used on the service processor, it does exist in a library so
it has been fixed. This is IBM Product Security Incident Response
Team (PSIRT) issue #2211. The Common Vulnerabilities and
Exposures issue numbers for this problem are CVE-2014-6271,
CVE-2014-7169, CVE-2014-7186, and CVE-2014-7187.
System firmware changes that affect certain systems
- HIPER/Pervasive:
A problem was fixed in PowerVM where the effect of the problem is
non-deterministic but may include an undetected corruption of data,
although IBM test has not been able to make this condition occur. This
problem is only possible if VIOS (Virtual I/O Server) version 2.2.3.x
or later is installed and the following statement is true: A
Shared Ethernet Adapter (SEA) with fail over enabled is configured on
the VIOS.
- A problem was fixed for Live Partition Mobility (LPM)
migrations from Power7+ systems that use the nest accelerator (NX) for
compression and encryption usage that caused the migrated partition to
revert to software compression instead of using the NX hardware.
Some operating system negotiated functions may not operate correctly
and could impact performance.
This fix does not pertain to the IBM Power 770 (9117-MMB) or IBM Power
780 (9179-MHB) systems.
- A problem was fixed for performance slow-downs during Main
Storage Dump (MSD) that can happen when SR-IOV adapters are
updating. An option was added to MSD to prevent SR-IOV updates
during the dump.
This fix does not pertain to the IBM Power 770 (9117-MMB) or IBM Power
780 (9179-MHB) systems.
- On systems that have Active Memory Sharing (AMS) partitions
and deduplication enabled, a problem was fixed for not being able to
resume a hibernated AMS partition. Previously, resuming a
hibernated AMS partition could give checksum errors with SRC B7000202
logged and the partition would remain in the hibernated state.
- On systems that have Active Memory Sharing (AMS)
partitions, a problem was fixed for Dynamic Logical Partitioning
(DLPAR) for a memory remove that leaves a logical memory block (LMB) in
an unusable state until partition reboot.
- On systems with a partition with SR-IOV enabled, a problem
was fixed for a partition with one or more virtual functions (VFs),
also known as a port slice, causing the system to TI with SRC
B7000103 logged.
This fix does not pertain to the IBM Power 770 (9117-MMB) or IBM Power
780 (9179-MHB) systems.
- On systems with a partition with SR-IOV enabled, a
performance problem for concurrent updates was resolved by delaying
updates to SR-IOV firmware and I/O adapters as needed to minimize
impacts on running workloads. SR-IOV delayed fixes can be
activated immediately using the "Updating SR-IOV Firmware" procedure
from the IBM Knowledge Center:
IBM Power 770 (9117-MMD): http://www-01.ibm.com/support/knowledgecenter/9117-MMD/p7hb1/p7hb1_updating_sriov_firmware.htm
IBM Power 780 (9179-MHD): http://www-01.ibm.com/support/knowledgecenter/9179-MHD/p7hb1/p7hb1_updating_sriov_firmware.htm
IBM Power ESE (8412-EAD): http://www-01.ibm.com/support/knowledgecenter/8412-EAD/p7hb1/p7hb1_updating_sriov_firmware.htm
This fix does not pertain to the IBM Power 770 (9117-MMB) or IBM Power
780 (9179-MHB) systems.
- On systems in IPv6 networks, a problem was fixed for
a network boot/install failing with SRC B2004158 and IP address
resolution failing using neighbor solicitation to the partition
firmware client.
- On systems that have a boot disk located on a SAN, a
problem was fixed where the SAN boot disk would not be
found on the default boot list and then the boot disk would have
to be selected from SMS menus. This problem would normally
be seen for new partitions that had tape drives configured before the
SAN boot disk.
- On systems with a partition that has a 256MB Real Memory
Offset (RMO) region size that has been migrated from a Power8 system
to Power7 or Power6 using Live Partition Mobility (LPM), a
problem was fixed that caused a failure on the next boot of the
partition with a BA210000 log with a CA000091 checkpoint just prior to
the BA210000. The fix dynamically adjusts the memory footprint of
the partition to fit on the earlier Power systems.
- On systems with redundant service processors, a problem was
fixed in the run-time error failover to the backup service processor so
it does not terminate on FRU support interface (FSI) errors. In
the case of FSI errors on the new primary service processor, the
primary will do a reset/reload instead of a terminate.
This fix does not pertain to the IBM Power ESE (8412-EAD).
- On systems with mirrored memory and a Logical Memory Block
(LMB) size of 16MB, a problem for a LMB memory leak during an IPL was
fixed that caused partition configuration errors.
- A problem was fixed for systems in networks using the
Juniper 1GbE and 10GbE switches (F/Cs #1108, #1145, and #1151) to
prevent network ping errors and boot from network (bootp)
failures. The Address Resolution Protocol (ARP) table information
on the Juniper aggregated switches is not being shared between the
switches and that causes problems for address resolution in certain
network configurations. Therefore, the CEC network stack code has
been enhanced to add three gratuitous ARPs (ARP replies sent without a
request received) before each ping and bootp request to ensure that all
the network switches have the latest network information for the system.
- For systems with an IBM i load source disk attached to an
Emulex-based fibre channel adapter such as F/C #5735, a problem was
fixed that caused an IBM i load source boot to fail with SRC B2006110
logged and a message to the boot console of "SPLIT-MEM Out of
Room". This problem occurred for load source disks that needed
extra disk scans to be found, such as those attached to a port other
than the first port of a fibre channel adapter (first port requires
fewest disk scans).
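The gratuitous ARP behavior described in the Juniper switch fix above can be illustrated with a minimal sketch. It builds the raw bytes of an unsolicited ARP reply in which the sender and target protocol addresses are both the system's own IP address, and returns three such frames as would precede each ping or bootp request. The function names and the sample MAC and IP values are hypothetical, the choice of a broadcast target hardware address is one common convention, and nothing here is taken from the actual CEC network stack code.

```python
import struct

ETHERTYPE_ARP = 0x0806   # EtherType for ARP
ARP_OP_REPLY = 2         # ARP opcode for a reply
BROADCAST_MAC = b"\xff" * 6


def build_gratuitous_arp_reply(mac: bytes, ipv4: bytes) -> bytes:
    """Build an Ethernet frame carrying an unsolicited (gratuitous) ARP reply."""
    if len(mac) != 6 or len(ipv4) != 4:
        raise ValueError("expected a 6-byte MAC and a 4-byte IPv4 address")
    # Ethernet header: destination, source, EtherType.
    eth_header = BROADCAST_MAC + mac + struct.pack("!H", ETHERTYPE_ARP)
    # ARP fixed part: hardware type 1 (Ethernet), protocol type IPv4,
    # hardware length 6, protocol length 4, opcode 2 (reply).
    arp_fixed = struct.pack("!HHBBH", 1, 0x0800, 6, 4, ARP_OP_REPLY)
    # Sender and target protocol addresses are the same in a gratuitous ARP.
    arp_body = mac + ipv4 + BROADCAST_MAC + ipv4
    return eth_header + arp_fixed + arp_body


def gratuitous_arps_before_request(mac: bytes, ipv4: bytes, count: int = 3) -> list:
    """Return the frames that would be sent ahead of a ping or bootp request."""
    return [build_gratuitous_arp_reply(mac, ipv4) for _ in range(count)]
```

The sketch only constructs the frames; sending them would require a raw socket on the relevant interface.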
Concurrent hot add/repair maintenance (CHARM) firmware fixes
- A problem was fixed for a power off failure of an expansion
drawer (F/C 5802 or F/C 5877) during a concurrent repair. The
power off commands to the drawer are now tried again using the System
Power Control Network (SPCN) serial connection to the drawer to allow
the repair to continue.
|
AM780_059_040 / FW780.11
06/23/14 |
Impact: Security
Severity: HIPER
System firmware changes that affect all systems
- HIPER/Pervasive: A security
problem was fixed in the OpenSSL (Secure Socket Layer) protocol that
allowed clients and servers, via a specially crafted handshake packet,
to use weak keying material for communication. A
man-in-the-middle attacker could use this flaw to decrypt and modify
traffic between the management console and the service processor.
The Common Vulnerabilities and Exposures issue number for this problem
is CVE-2014-0224.
- HIPER/Pervasive:
A security problem was fixed in OpenSSL for a buffer overflow in the
Datagram Transport Layer Security (DTLS) when handling invalid DTLS
packet fragments. This could be used to execute arbitrary code on
the service processor. The Common Vulnerabilities and Exposures
issue number for this problem is CVE-2014-0195.
- HIPER/Pervasive:
Multiple security problems were fixed in the way that OpenSSL handled
read and write buffers when the SSL_MODE_RELEASE_BUFFERS mode was
enabled to prevent denial of service. These could cause the
service processor to reset or unexpectedly drop connections to the
management console when processing certain SSL commands. The
Common Vulnerabilities and Exposures issue numbers for these problems
are CVE-2010-5298 and CVE-2014-0198.
- HIPER/Pervasive:
A security problem was fixed in OpenSSL to prevent a denial of service
when handling certain Datagram Transport Layer Security (DTLS)
ServerHello requests. A specially crafted DTLS handshake packet could
cause the service processor to reset. The Common Vulnerabilities
and Exposures issue number for this problem is CVE-2014-0221.
- HIPER/Pervasive:
A security problem was fixed in OpenSSL to prevent a denial of service
by using an exploit of a null pointer de-reference during anonymous
Elliptic Curve Diffie Hellman (ECDH) key exchange. A specially
crafted handshake packet could cause the service processor to
reset. The Common Vulnerabilities and Exposures issue number for
this problem is CVE-2014-3470.
- A security problem was fixed in the service processor
TCP/IP stack to discard illegal TCP/IP packets that have the SYN and
FIN flags set at the same time. An explicit packet discard was
needed to prevent further processing of the packet that could result in
a bypass of the iptables firewall rules.
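As a rough illustration of the SYN and FIN check described in the last fix above, the following sketch tests the flags byte of a raw TCP header. The flag bit values and the header offset are the standard TCP ones; the function names are hypothetical and this is not the service processor's actual packet filter.

```python
# TCP flag bits as defined by the TCP specification.
TCP_FIN = 0x01
TCP_SYN = 0x02


def tcp_flags_from_header(tcp_header: bytes) -> int:
    """Extract the flags byte, which sits at offset 13 of a TCP header."""
    if len(tcp_header) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    return tcp_header[13]


def is_illegal_syn_fin(flags: int) -> bool:
    """Return True when SYN and FIN are both set, a combination with no legal use."""
    both = TCP_SYN | TCP_FIN
    return (flags & both) == both


def should_discard(tcp_header: bytes) -> bool:
    """Decide whether a packet should be dropped before any further processing."""
    return is_illegal_syn_fin(tcp_flags_from_header(tcp_header))
```

Dropping the packet at this point, rather than letting later rules evaluate it, mirrors the explicit discard that the fix describes.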
|
AM780_056_040 / FW780.10
04/25/14 |
Impact: Serviceability
Severity: SPE
New Features and Functions
- Support for the 9117-MMD, 9179-MHD and 8412-EAD systems.
- Support was added to the Virtual I/O Server (VIOS) for
shared storage pool mirroring (RAID-1) using the virtual SCSI (VSCSI)
storage adapter to provide redundancy for data storage.
This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power
780 (9179-MHB) systems.
- Support was added to the Management Console command line to
allow configuring a shared control channel for multiple pairs of Shared
Ethernet Adapters (SEAs). This simplifies the control channel
configuration to reduce network errors when the SEAs are in fail-over
mode.
This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power
780 (9179-MHB) systems.
- Support was added for Single Root I/O Virtualization
(SR-IOV) that enables the hypervisor to share a SR-IOV-capable
PCI-Express adapter across multiple partitions. The SR-IOV mode is
supported for the following Ethernet Network Interface Controller (NIC)
I/O adapters (SR-IOV supported in both native mode and through VIOS):
- F/C EN10 and CCIN 2C4C - Integrated Multi-function Card with Dual
10Gb Ethernet RJ45 and Copper Twinax
- F/C EN11 and CCIN 2C4D - Integrated Multi-function Card with Dual
10Gb Ethernet RJ45 and Short Range (SR) Optical
- F/C EN0H and CCIN 2B93 - PCI Express Generation 2 (PCIe2) 2x10Gb
FCoE 2x1Gb Ethernet SFP+ Adapter
- F/C EN0K and CCIN 2CC1 - PCI Express Generation 2 (PCIe2) 4-port
(10Gb FCoE & 1Gb Ethernet) SFP+Copper and RJ45
System firmware updates the adapter firmware level on these adapters to
1.1.58.4 when a supported adapter is placed into SR-IOV mode.
The SR-IOV mode for Ethernet NIC is supported on the following OS
levels:
- AIX 6.1Y TL3 SP2, or later
- AIX 7.1N TL3 SP2, or later
- IBM i 7.1 with TR8, or later
- SUSE Linux Enterprise Server 11 SP3
- Red Hat Enterprise Linux 6.5
- VIOS 2.2.3.2, or later
This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power
780 (9179-MHB) systems.
- Support was added to the Advanced System Management
Interface (ASMI) to provide a menu for "Power Supply Idle Mode".
Using the "Power Supply Idle Mode" menu, the power supplies can
be either set enabled to save power by idling power supplies when
possible or set disabled to keep all power supplies fully on and allow
a balanced load to be maintained on the power distribution units (PDUs)
of the system. Power supply idle mode enabled helps to reduce
overall power usage when the system load is very light by having one
power supply deliver all the power while the second power supply is
maintained in a low power state. All power supplies must be
present and have support for power supply idle mode before power supply
idle mode can be enabled.
Power Supply Idle Mode is not supported on IBM Power 770 (9117-MMB) and
IBM Power 780 (9179-MHB) systems.
- Support was added for monitored compliance of the Power
Integrated Facility for Linux (IFL). IFL is an optional lower
cost per processor core activation for Linux-only workloads on IBM
Power Systems. Power IFL processor cores can be activated that
are restricted to running Red Hat Linux or SUSE Linux. In
contrast, processor cores that are activated for general-purpose
workloads can run any supported operating system. Power IFL
processor cores are enabled by feature code ELJ1 using Capacity Upgrade
on Demand (CUoD). Linux partitions can use IFL processors and the
other processor cores but AIX and IBM i5/OS cannot use the IFL
processors. The IFL monitored compliance process will send
customer alert messages to the management console if the system is out
of compliance for the number of IFL processors and general-purpose
workload processors that are in active use compared to the number that
have been licensed.
Power IFL and monitored compliance are not supported on the IBM Power ESE
(8412-EAD) system because it has the AIX operating system only.
- System recovery for interrupted AC power and Voltage
Regulator Module (VRM) failures has been enhanced for systems with
multiple CEC enclosures such that an AC power or VRM fault on one CEC
drawer will no longer block the other CEC drawers from powering
on. Previously, all CEC enclosures in a system needed valid AC
power before the power on of the system could proceed.
This system recovery feature does not pertain to the IBM Power ESE
(8412-EAD) system because it is a single CEC enclosure system.
- Support for IBM PCIe 3.0 x8 dual 4-port SAS RAID adapter
with 12 GB cache with feature code EJ0L and CCIN 57CE.
This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power
780 (9179-MHB) systems.
- Support was added to the Management Console and the Virtual
I/O Server (VIOS) to provide the capability to enable and disable
individual virtual ethernet adapters from the management console.
This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power
780 (9179-MHB) systems.
- Support was added for the IBM Flash Adapter 90
(#ES09) PCIe 2.0 x8 with 0.9TB of usable enterprise multi-level
cell (eMLC) flash memory. The system recognizes the PCI device
as a high power device needing additional cooling and increases the fan
speeds accordingly. This flash feature also provides:
- Up to 325K read IOPS and less than 100 microsecond latency.
- Four independent flash controllers.
- Capacitive emergency power loss protection.
- Half-length, full-height PCIe card form
factor.
The IBM Flash Adapter 90 is not included in base AIX installation
media. AIX feature support can be acquired at IBM Fix Central: http://www-933.ibm.com/support/fixcentral/
by selecting the Product
Group System Storage. This feature is not supported on IBM Power
770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
- Support for Management Console logical partition
Universally Unique IDs (UUIDs) so that the HMC preserves the UUID for
logical partitions on backup/restore and migration.
This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power
780 (9179-MHB) systems.
- Support for IBM PCIe 3.0 x8 non-caching 2-port SAS RAID
adapter with feature code EJ0J and CCIN 57B4.
This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power
780 (9179-MHB) systems.
- Support for Power Enterprise System Pools allows for the
aggregation of Capacity on Demand (CoD) resources, including processors
and memory, to be moved from one pool server to any other pool server
as needed.
This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power
780 (9179-MHB) systems.
- Support for a Management Console Performance and Capacity
Monitor (PCM) function to monitor and manage both physical and virtual
resources.
This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power
780 (9179-MHB) systems.
- Support for virtual server network (VSN) Phase 2 that
delivers IEEE standard 802.1Qbg based on Virtual Ethernet Port
Aggregator (VEPA) switching. This supports the Management Console
assignment of the VEPA switching mode to virtual Ethernet switches used
by the virtual Ethernet adapters of the logical partitions. The
server properties in the Management Console will show the capability
"Virtual Server Network Phase 2 Capable" as "True" for the system.
This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power
780 (9179-MHB) systems.
System firmware changes that affect all systems
- A problem was fixed that prevented an HMC-managed system
from being converted to manufacturing default configuration (MDC) mode
when the management console command "lpcfgop -m <server> -o
clear" failed to create the default partition. The management
console went to the incomplete state for this error.
- A problem was fixed that logged an incorrect call home
B7006956 NVRAM error during a power off of the system. This error
log indicates that the NVRAM of the system is in error and will be
cleared on the next IPL of the system. However, there is no NVRAM
error and the error log was created because a reset/reload of the
service processor occurred during the power off.
- Help text for the Advanced System Management Interface
(ASMI) "System Configuration/Hardware Deconfiguration/Clear All
Deconfiguration Errors" menu option was enhanced to clarify that when
selecting "Hardware Resources" value of "All hardware resources", the
service processor deconfiguration data is not cleared. The
"Service processor" must be explicitly selected for that to be cleared.
- A firmware code update problem was fixed that caused the
Hardware Management Console (HMC) to go to "Incomplete State" for the
system with SRC E302F880 when assignment of a partition universal
unique identifier (UUID) failed for a partition that was already
running. This problem happens for disruptive code updates from
pre-770 levels to 770 or later levels.
- A problem was fixed that caused frequent SRC B1A38B24 error
logs with a call home every 15 seconds when service processor network
interfaces were incorrectly configured on the same subnet. The
frequency of the notification of the network subnet error has been
reduced to once every 24 hours.
- A problem was fixed that caused a memory clock failure to
be called out as a failure in the processor clock FRU.
- A problem was fixed where a 12V DC power-good (pGood) input
fault was reported as a SRC 11002620 with the wrong FRU callout of
Un-P1 for system backplane. The FRU callout for SRC 11002620 has
been corrected to Un-P2 for I/O card.
- A problem was fixed that prevented guard error logs from
being reported for FRUs that were guarded during the system power
on. This could happen if the same FRU had been previously
reported as guarded on a different power on of the system. The
requirement is now met that guarded FRUs are logged on every power on
of the system.
- A problem was fixed for the Advanced System Management
Interface (ASMI) "Login Profile/Change Password" menu where ASMI would
fail with "Console Internal Error, status code 500" displayed on the
web browser when an incorrect current password was entered.
- A problem was fixed for a system with pool resources for a
resource remove operation that caused the number of unreturned
resources to become incorrect. This problem occurred if the
system first became out of compliance with overdue unreturned resources
and then another remove of pool resources from the server was
attempted.
- A problem was fixed for the Advanced System Management
Interface (ASMI) "System Information/Firmware Maintenance
History" menu option on the service processor to display the firmware
maintenance history instead of the message "No code update
history log was found".
- A problem was fixed for a Live Partition Mobility (LPM)
suspend and transfer of a partition that caused the time of day to skip
ahead to an incorrect value on the target system. The problem
only occurred when a suspended partition was migrated to a target CEC
that had a hypervisor time that was later than the source CEC.
- A problem was fixed for IBM Power Enterprise System Pools
that prevented the management console from changing from the backup to
the master role for the enterprise pool. The following error
message was displayed on the management console: "HSCL90F7 An
internal error occurred trying to set a new master management console
for the Power enterprise pool. Try the operation again. If this
error persists, contact your service representative."
This defect does not pertain to the IBM Power 770 (9117-MMB) and IBM
Power 780 (9179-MHB) systems.
- A problem was fixed for Live Partition Mobility (LPM) where
a 2x performance decrease occurs during the resume phase of the
migration when migrating from a system with 780 or later firmware back
to a system with a pre-780 level of firmware.
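The SRC B1A38B24 fix above concerns service processor network interfaces that were incorrectly configured on the same subnet. A minimal sketch of how such a misconfiguration could be detected from a list of interface addresses follows. The interface names and addresses are hypothetical examples, and this is not how the service processor itself performs the check.

```python
import ipaddress
from itertools import combinations


def same_subnet_pairs(interfaces: dict) -> list:
    """Return pairs of interface names whose configured networks overlap.

    `interfaces` maps an interface name to an address in CIDR form,
    for example {"eth0": "192.0.2.10/24"}.
    """
    parsed = {name: ipaddress.ip_interface(addr) for name, addr in interfaces.items()}
    conflicts = []
    for (name_a, if_a), (name_b, if_b) in combinations(sorted(parsed.items()), 2):
        # Only compare like with like: IPv4 against IPv4, IPv6 against IPv6.
        if if_a.version == if_b.version and if_a.network.overlaps(if_b.network):
            conflicts.append((name_a, name_b))
    return conflicts
```

For example, `same_subnet_pairs({"eth0": "192.0.2.10/24", "eth1": "192.0.2.20/24"})` reports the pair, while moving `eth1` to `198.51.100.20/24` clears it.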
System firmware changes that affect certain systems
- On systems with multiple CEC drawers or nodes, a problem
was fixed in the service processor Advanced System Management Interface
(ASMI) performance dump collection that only allowed performance data
to be collected for the first node of the system. The
"System Service Aids/Performance Dump" menu of the ASMI is used to work
with the performance dump.
- On systems involved in a series of consecutive Live
Partition Mobility (LPM) operations, a memory leak problem was fixed in
the run time abstraction service (RTAS) that caused a partition run
time AIX crash with SRC 0c20. Other possible symptoms include
error logs with SRC BA330002 (RTAS memory allocation failure).
- On systems running Dynamic Platform Optimizer (DPO) with
one or more unlicensed processors, a problem was fixed where the system
performance was significantly degraded during the DPO operation.
The amount of performance degradation was greater for systems with larger
numbers of unlicensed processors.
- On systems with a redundant service processor, a problem
was fixed where the service processor allowed a clock failover to occur
without a SRC B158CC62 error log and without a hardware deconfiguration
record for the failed clock source. This resulted in the system
running with only one clock source and without any alerts to warn that
clock redundancy had been lost.
- DEFERRED: On
systems with a redundant service processor, a problem was fixed that
caused a system termination with SRC B158CC62 during a clock failover
initiated by certain types of clock card failures. This deferred
fix addresses a problem that has a very low probability of
occurrence. As such, customers may wait for the next planned
service window to activate the deferred fix via a system reboot.
This problem does not pertain to IBM Power 770 (9117-MMB) and IBM Power
780 (9179-MHB) systems.
- On systems with a management console and service processors
configured with Internet Protocol version 6 (IPv6) addresses, a
problem was fixed that prevented the management console from
discovering the service processor. The Service Location Protocol
(SLP) on the service processor was not being enabled for IPv6, so it
was unable to respond to IPv6 queries.
- On systems with a F/C 5802 or 5877 I/O drawer installed, a
problem was fixed that occurred during Offline Converter Assembly (OCA)
replacement operations. The fix prevents a false Voltage
Regulator Module (VRM) fault and the logging of SRCs 10001511 or
10001521 from occurring. Previously, this problem resulted in the OCA LED
getting stuck in an on or "fault" state and the OCA not powering on.
- On systems with one memory clock deconfigured, a problem
was fixed where the system failed to IPL using the second memory clock
with SRCs B158CC62 and B181C041 logged.
- On systems that require in-band flash to update system
firmware, a problem was fixed so in-band update would not fail if the
Permanent (P) or the Temporary (T) side of the service processor was
marked invalid. Attempting to in-band flash from the AIX or
Linux command line failed with a BA280000 log reported.
Attempting to in-band flash from the AIX diagnostics menus also failed
because the flash menu options did not appear in this case.
- On a system with a partition with an AIX and Linux boot
source to support dual booting, a problem was fixed that caused the
Host Ethernet Adapter (HEA) to be disabled when rebooting from Linux to
AIX. Linux had disabled interrupts for the HEA on power down,
causing an error for AIX when it tried to use the HEA to access the
network.
- On a system with a disk device with multiple boot
partitions, a problem was fixed that caused System Management Services
(SMS) to list only one boot partition. Even though only one boot
partition was listed in SMS, the AIX bootlist command could still be
used to boot from any boot partition.
- On systems with a redundant service processor with AC power
missing to the node containing the anchor card, a problem was fixed
that caused an IPL failure with SRC B181C062 when the anchor card could
not be found in the vital product data (VPD) for the system. With
the fix, the system is able to find the anchor card and IPL since the
anchor card gets its power from the service processor cable, not from
the node where it resides.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
- On a system with sixteen or more logical partitions, a
problem was fixed for a memory relocation error during concurrent hot
node repair that caused a hang or a failure. The problem can also
be triggered by mirrored memory defragmentation on a system with
selective memory mirroring.
|
AM780_054_040 / FW780.02
04/18/14 |
Impact: Security
Severity: HIPER
System firmware changes that affect all systems
- HIPER/Pervasive:
A security problem was fixed in the OpenSSL Montgomery ladder
implementation for the ECDSA (Elliptic Curve Digital Signature
Algorithm) to protect sensitive information from being obtained with a
flush and reload cache side-channel attack to recover ECDSA nonces from
the service processor. The Common Vulnerabilities and Exposures
issue number is CVE-2014-0076. The stolen ECDSA nonces could be
used to decrypt the SSL sessions and compromise the Hardware Management
Console (HMC) access password to the service processor.
Therefore, the HMC access password for the managed system should be
changed after applying this fix.
- HIPER/Pervasive:
A security problem was fixed in the OpenSSL Transport Layer
Security (TLS) and Datagram Transport Layer Security (DTLS) to not
allow Heartbeat Extension packets to trigger a buffer over-read to
steal private keys for the encrypted sessions on the service
processor. The Common Vulnerabilities and Exposures issue number
is CVE-2014-0160 and it is also known as the Heartbleed
vulnerability. The stolen private keys could be used to decrypt
the SSL sessions and compromise the Hardware Management Console
(HMC) access password to the service processor. Therefore, the
HMC access password for the managed system should be changed after
applying this fix.
- A security problem was fixed for the Lighttpd web
server that allowed arbitrary SQL commands to be run on the service
processor. The Common Vulnerabilities and Exposures issue number
is CVE-2014-2323.
- A security problem was fixed for the Lighttpd web server
where improperly-structured URLs could be used to view arbitrary files
on the service processor. The Common Vulnerabilities and
Exposures issue number is CVE-2014-2324.
|
AM780_050_040 / FW780.01
03/10/14 |
Impact: Data
Severity: HIPER
System firmware changes that affect all systems
- HIPER/Non-Pervasive:
A problem was fixed for a potential silent data corruption issue that
may occur when a Live Partition Mobility (LPM) operation is performed
from a system (source system) running a firmware level earlier than
AH780_040 or AM780_040 to a system (target system) running AH780_040 or
AM780_040.
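The exposure condition in the fix above (source below AH780_040 or AM780_040, target exactly at that level) can be written as a small check. This is a sketch under stated assumptions: the helper names are hypothetical, and it assumes that firmware levels order numerically by release and then by service pack level, which the text implies but does not define.

```python
import re
from typing import Tuple

# Matches levels such as "AM780_040" or the longer "AM780_066_040" form.
_LEVEL_RE = re.compile(r"^A[HM](\d{3})_(\d{3})(?:_\d{3})?$")

# The target level named in the fix description (release 780, level 040).
EXPOSED_TARGET = (780, 40)


def parse_level(level: str) -> Tuple[int, int]:
    """Split a firmware level string into (release, service pack level)."""
    match = _LEVEL_RE.match(level)
    if match is None:
        raise ValueError("unrecognized firmware level: %r" % level)
    return int(match.group(1)), int(match.group(2))


def lpm_pair_is_exposed(source_level: str, target_level: str) -> bool:
    """Return True when an LPM from source to target matches the described case."""
    source = parse_level(source_level)
    target = parse_level(target_level)
    # Assumption: tuple comparison reflects "earlier than" for firmware levels.
    return target == EXPOSED_TARGET and source < EXPOSED_TARGET
```

Under these assumptions, a migration from AM770_063 to AM780_040 is flagged, while the same migration to a target at AM780_050 or later is not.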
|
AM780_040_040 / FW780.00
12/06/13 |
Impact: New
Severity: New
New Features and Functions
- Support was added to upgrade the service processor to
OpenSSL version 1.0.1 and for compliance with National Institute of
Standards and Technology (NIST) Special Publication 800-131A.
SP 800-131A compliance requires the use of stronger cryptographic keys
and more robust cryptographic algorithms.
- Support was added to the Virtual I/O Server (VIOS) for
Universal Serial Bus (USB) removable hard-disk drive (HDD) devices.
- Support was added in Advanced System Management Interface
(ASMI) to facilitate capture and reporting of debug data for system
performance problems. The "System Service Aids/Performance
Dump" menu was added to ASMI to perform this function.
- Support was added to the Management Console for group-based
LDAP authentication.
- Partition Firmware was enhanced to be able to recognize
and boot from disks formatted with the GUID Partition Table (GPT)
format that are capable of being greater than 2TB in size. GPT is
a standard for the layout of the partition table on a physical hard
disk, using globally unique identifiers (GUID), that does not have the
2TB limit that is imposed by the DOS partition format.
- The call home data for every serviceable event of the
system was enhanced to include information on every guarded element
(processor, memory, I/O chip, etc.), including the part number and
location codes of the FRUs and the service processor de-configuration
policy settings.
- Support for Dynamic Platform Optimizer (DPO) enhancements
to show the logical partition current and potential affinity
scores. The Management Console has also been enhanced to show the
partition scoring. The operating system (OS) levels that support
DPO:
◦ AIX 6.1 TL8 or later
◦ AIX 7.1 TL2 or later
◦ VIOS 2.2.2.0
◦ IBM i 7.1 PTF MF56058
◦ Linux RHEL7
◦ Linux SLES12
Note: If DPO is used with an older version of the OS that predates the
above levels, either:
- The partition needs to be rebooted after DPO completes to optimize
placement, or
- The partition is excluded from participating in the DPO operation
(through a command line option on the "optmem" command that is used to
initiate a DPO operation).
- Support for Dynamic Platform Optimizer (DPO) on 9117-MMB
and 9179-MHB systems.
- Support for Management Console command line to configure
the ECC call home path for SSL proxy support.
- Support for Management Console to minimize recovery state
problems by using the hypervisor and VIOS configuration data to
recreate partition data when needed.
- Support for Management Console to provide scheduled
operations to check if the partition affinity falls below a threshold
and alert the user that Dynamic Platform Optimizer (DPO) is needed.
- Support for enhanced platform serviceability to extend call
home to include hardware in need of repair and to issue periodic
service events to remind of failed hardware.
- Support for Virtual I/O Server (VIOS) to support 4K block
size DASD as a virtual device.
- Support for performance improvements for concurrent Live
Partition Mobility (LPM) migrations.
- Support for Management Console to handle all Virtual I/O
Server (VIOS) configuration tasks and provide assistance in configuring
partitions to use redundant VIOS.
- Support for Management Console to maintain a profile that
is synchronized with the current configuration of the system, including
Dynamic Logical Partitioning (DLPAR) changes.
- Support for Virtual I/O Server (VIOS) for an IBM i client
data connection to a SIS64 device driver backed by VSCSI physical
volumes.
- Support was dropped for Secure Sockets Layer (SSL) protocol
version 2 and SSL weak and medium cipher suites in the service
processor web server (Lighttpd). Unsupported web browser
connections to the Advanced System Management Interface (ASMI) secured
port 443 (using https://) will now be rejected if those browsers do not
support SSL version 3. Supported web browsers for Power7 ASMI are
Netscape (version 9.0.0.4), Microsoft Internet Explorer (version 7.0),
Mozilla Firefox (version 2.0.0.11), and Opera (version 9.24).
- Support was added in Advanced System Management Interface
(ASMI) "System Configuration/Firmware Update Policy" menu to detect and
display the appropriate Firmware Update Policy (depending on whether
system is HMC managed) instead of requiring the user to select the
Firmware Update Policy. The menu also displays the "Minimum Code
Level Supported" value.
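As background for the GPT item above, the 2TB ceiling of the DOS
partition format can be reproduced with simple arithmetic. The sketch
below is illustrative only and rests on assumptions that are not stated
in these release notes: 512-byte logical sectors, 32-bit sector
addresses in DOS partition entries, and 64-bit sector addresses in GPT.
The function and variable names are hypothetical.

```python
# Sketch: where the 2TB ceiling of the DOS (MBR) partition format comes from.
# Assumptions (not from the release notes): 512-byte logical sectors,
# 32-bit sector addresses in DOS partition entries, 64-bit addresses in GPT.

SECTOR_BYTES = 512
TIB = 1024 ** 4


def max_addressable_bytes(lba_bits: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Largest capacity addressable with lba_bits-wide sector addresses."""
    return (2 ** lba_bits) * sector_bytes


dos_limit = max_addressable_bytes(32)
gpt_limit = max_addressable_bytes(64)

# Report the DOS limit in TiB and how much larger the GPT address space is.
print(f"DOS partition format limit: {dos_limit / TIB:.0f} TiB")
print(f"GPT address space is 2**{(gpt_limit // dos_limit).bit_length() - 1} times larger")
```

With larger sector sizes the DOS limit scales accordingly, which is why
the sector size is kept as a parameter rather than hard-coded.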
System firmware changes that affect all systems
- A problem was fixed that caused a service processor OmniOrb
core dump with SRC B181EF88 logged.
- A problem was fixed that caused the system attention LED to
stay lit when a bad FRU was replaced.
- A problem was fixed that caused a memory leak of 50 bytes
of service processor memory for every call home operation. This
could potentially cause an out of memory condition for the service
processor when running over an extended period of time without a reset.
- A problem was fixed that caused an L2 cache error to not
guard out the faulty processor, allowing the system to checkstop again
on an error to the same faulty processor.
- A problem was fixed that caused an HMC code update failure
for the FSP on the accept operation with SRC B1811402, or caused the
FSP to be unable to boot on the updated side.
- A problem was fixed that caused a system checkstop during
hypervisor time keeping services.
- A problem was fixed that caused a built-in self test (BIST)
for GX slots to create corrupt error log values that core dumped the
service processor with SRC B18187DA. The corruption was caused by a
failure to initialize the BIST array to 0 before starting the tests.
- The Hypervisor was enhanced to allow the system to continue
to boot using the redundant Anchor (VPD) card, instead of stopping the
Hypervisor boot and logging SRC B7004715, when the primary Anchor
card has been corrupted.
- A problem was fixed with the Dynamic Platform Optimizer
(DPO) that caused memory affinity to be incorrectly reported to the
partitions before the memory was optimized. When this
occurs, performance is lower than what would have been achieved
with the optimized memory values.
- A problem was fixed that caused a migrated partition to
reboot during transfer to a target system running VIOS 2.2.2.0 or
later. A manual reboot would be required if transferred to a target
system running an earlier VIOS release. Migration recovery may also be
necessary.
- A problem was fixed that can cause Anchor (VPD) card
corruption and A70047xx SRCs to be logged. Note: If a
serviceable event with SRC A7004715 is present or was logged
previously, damage to the VPD card may have occurred. After the fix is
applied, replacement of the Anchor VPD card is recommended in
order to restore full redundancy.
- The firmware was enhanced to display on the management
console the correct number of concurrent Live Partition Mobility (LPM)
operations that are supported.
- A problem was fixed that caused a 1000911E platform event
log (PEL) to be marked as not call home. The PEL is now a call
home to allow for correction. This PEL is logged when the
hypervisor has changed the Machine Type Model Serial Number (MTMS) of
an external enclosure to UTMP.xxx.xxxx because it cannot read the vital
product data (VPD), or the VPD has invalid characters, or the MTMS
is a duplicate of another enclosure.
- A problem was fixed that caused the state of the Host
Ethernet Adapter (HEA) port to be reported as down when the physical
port is actually up.
- When powering on a system partition, a problem was fixed
that caused the partition universal unique identifier (UUID) to not get
assigned, causing a B2006010 SRC in the error log.
- For the sequence of a reboot of a system partition followed
immediately by a power off of the partition, a problem was fixed where
the hypervisor virtual service processor (VSP) incorrectly retained
locks for the powered off partition, causing the CEC to go into
recovery state during the next power on attempt.
- A problem was fixed that caused an error log generated by
the partition firmware to show conflicting firmware levels. This
problem occurs after a firmware update or a Live Partition Mobility
(LPM) operation on the system.
- A problem was fixed that caused the system attention LED
to be lit without a corresponding SRC and error log for the
event. This problem typically occurs when an operating system on
a partition terminates abnormally.
- A problem was fixed that caused the slot index to be
missing for virtual slot number 0 for the dynamic reconfiguration
connector (DRC) name for virtual devices. This error was visible
from the management console when using commands such as "lshwres -r
virtualio --rsubtype slot -m machine" to show the hardware resources
for virtual devices.
- A problem was fixed that caused a system checkstop with SRC
B113E504 for a recoverable hardware fault.
- A problem was fixed during resource dump processing that
caused a read of an invalid system memory address and an SRC
B181C141. The invalid memory reference resulted from the service
processor incorrectly referencing memory that had been relocated by the
hypervisor.
System firmware changes that affect certain systems
- A problem was fixed that caused fans to increase to maximum
speeds with SRC B130B8AF logged as a result of thermal sensors with
calibration errors.
- On systems with an I/O tower attached, a problem was fixed
that caused multiple service processor reset/reloads if the tower was
continuously sending invalid System Power Control Network (SPCN) status
data.
- On systems with a redundant service processor, a problem
was fixed that caused fans to run at high speed after a failover to
the sibling service processor.
- On systems with a F/C 5802 or 5877 I/O drawer installed,
the firmware was enhanced to guarantee that an SRC will be generated
when there is a power supply voltage fault. If no SRC is
generated, a loss of power redundancy may not be detected, which can
lead to a drawer crash if the other power supply goes down. This
also fixes a problem that causes an 8 Gb Fibre Channel adapter in
the drawer to fail if the 12V level fails in one Offline Converter
Assembly (OCA).
- On systems managed by an HMC with a F/C 5802 or 5877 I/O
drawer installed, a problem was fixed that caused the hardware topology
on the management console for the managed system to show "null" instead
of "operational" for the affected I/O drawers.
- On systems with a redundant service processor, a problem
was fixed that prevented the deconfiguration details of a guarded
sibling service processor from being shown in the Advanced
System Management Interface (ASMI).
- On systems with a redundant service processor, a problem
was fixed that caused an SRC B150D15E to be erroneously logged after a
failover to the sibling service processor.
- On systems with a F/C 5802 or 5877 I/O drawer installed, a
problem was fixed where an Offline Converter Assembly (OCA) fault
would appear to persist after an OCA micro-reset or OCA
replacement. The fault bit reported to the OS may not be cleared,
indicating a fault still exists in the I/O drawer after it has been
repaired.
- When switching between turbocore and maxcore mode, a
problem was fixed that caused the number of supported partitions to be
reduced by 50%.
- On systems in turbocore mode with unlicensed processors, a
problem was fixed that caused an incorrect processor count. The
AIX command lparstat gave too high a value for "Active Physical CPUs in
system" when it included unlicensed turbocore processors in the count
instead of just counting the licensed processors.
- A problem was fixed that was caused by an attempt to modify
a virtual adapter from the management console command line when the
command specifies it is an Ethernet adapter, but the virtual ID
specified is for an adapter type other than Ethernet. The managed
system has to be rebooted to restore communications with the management
console when this problem occurs; SRC B7000602 is also logged.
- On systems running AIX or Linux, a problem was fixed that
caused the operating system to halt when an InfiniBand Host Channel
Adapter (HCA) fails or malfunctions.
- On systems running AIX or Linux, a hang in a Live Partition
Mobility (LPM) migration for remote restart-capable partitions was
fixed by adding a time-out for the required paging space to become
available. If after five minutes the required paging space is not
available, the start migration command returns an error code of
0x40000042 (PagingSpaceNotReady) to the management console.
- On systems running Dynamic Platform Optimizer (DPO) with no
free memory, a problem was fixed that caused the Hardware
Management Console (HMC) lsmemopt command to report the wrong status of
completed with no partitions affected. It should have indicated
that DPO failed due to insufficient free memory. DPO can only run
when there is free memory in the system.
- On systems with partitions using physical shared processor
pools, a problem was fixed that caused partition hangs if the shared
processor pool was reduced to a single processor.
- On a system running a Live Partition Mobility (LPM)
operation, a problem was fixed that caused the partition to
successfully appear on the target system, but hang with a 2005 SRC.
- A problem was fixed that caused SRC BA330000 to be logged
after the successful migration of a partition running Ax740_xxx or
Ax730_xxx firmware to a system running Ax760, or a later release, of
firmware. This problem can also cause SRCs BA330002, BA330003,
and BA330004 to be erroneously logged over time when a partition is
migrated from a system running Ax760, or a later release, to a system
running Ax740_xxx or Ax730_xxx firmware.
- On systems using IPv6 addresses, the firmware was enhanced
to reduce the time it takes to install an operating system using the
Network Installation Manager (NIM).
- On systems managed by a management console, a problem was
fixed that caused a partition to become unresponsive when the AIX
command "update_flash -s" is run.
- On systems with turbo-core enabled that are a target of
Live Partition Mobility (LPM), a problem was fixed where
cache properties were not recognized and SRCs BA280000 and BA250010
were reported.
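The paging-space time-out described in the LPM hang fix above can be
pictured as a bounded polling loop. The following is a minimal sketch
under stated assumptions, not the firmware's implementation: only the
five-minute limit and the 0x40000042 (PagingSpaceNotReady) code come
from the fix description, while the function name, the polling
interval, the readiness callback, and the 0 success value are
hypothetical.

```python
import time

# From the fix description: code returned when paging space is not ready.
PAGING_SPACE_NOT_READY = 0x40000042
# From the fix description: five-minute limit, expressed in seconds.
PAGING_SPACE_TIMEOUT_S = 5 * 60


def wait_for_paging_space(is_ready, timeout_s=PAGING_SPACE_TIMEOUT_S,
                          poll_s=1.0, clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True or the time-out expires.

    Returns 0 on success (hypothetical convention) or
    PAGING_SPACE_NOT_READY once timeout_s seconds have elapsed.
    The clock and sleep parameters exist only to make the loop testable.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return 0
        if clock() >= deadline:
            return PAGING_SPACE_NOT_READY
        sleep(poll_s)
```

Bounding the wait is what turns the former hang into an error that the
management console can report and the operator can act on.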
Concurrent hot add/repair maintenance (CHARM) firmware fixes
- A problem was fixed that caused a concurrent hot add/repair
maintenance operation to fail on an erroneously logged error for the
service processor battery with SRCs B15A3303, B15A3305, and
B181EA35 reported.
- A problem was fixed that caused a concurrent hot add/repair
maintenance operation to fail if a memory channel failure on the CEC
was followed by a service processor reset/reload.
- A problem was fixed that caused SRC B15A3303 to be
erroneously logged as a predictive error on the service processor
sibling after a successful concurrent repair maintenance operation for
the real-time clock (RTC) battery.
- A problem was fixed that prevented the I/O slot information
from being presented on the management console after a concurrent node
repair.
- A problem was fixed that caused Capacity on Demand (CoD)
"Out of Compliance" messages during concurrent maintenance operations
when the system was actually in compliance for the licensed amount of
resources in use.
|