PCIe3 2 PORT 25/10 Gb NIC&ROCE SR/Cu ADAPTER (FC: EC2U, EC2T)
******* PLEASE READ THIS ENTIRE NOTICE *********
DATE: July 18th, 2022
1.0 Microcode and Document Revision History
7.0 Determine the Current Microcode Level for AIX
8.0 Downloading the RPM Format File to the Target Server for AIX
9.0 Discovery Tool Microcode CD-ROM creation and download instructions
10.0 Verifying microcode before download and notes for AIX
11.0 Microcode Download Procedure for AIX
11.1 Setting up for Microcode download
11.2 Downloading Microcode to the Adapter
11.3 Verify and Re-configure adapters
Firmware Level | Description |
14.31.1014 / 001400311014 | Impact: Availability  Severity: ATT
1. Fixed classification issues for "Passive" cables, making classification more robust.
2. Fixed an issue that caused a fatal error, and eventually resulted in the HCA hanging, when a packet was larger than a strided receive WQE that was being scattered.
3. Fixed an issue that caused Tx to hang when a duplicate packet rollback occurred.
4. Fixed an issue that prevented events from being sent when only the DCBX oper version was changed.
5. Fixed an issue that prevented an SFP28 cable from linking up at 25GbE speed. |
The Firmware Levels Below Are No Longer Supported By IBM Once They Have Been Removed From The Microcode Download Website. It is best practice to update to the latest FW level, not only for IBM support of these products but also for optimal performance and to ensure that all of the required HW/FW fixes are installed. Once new FW has been released to the field, we will provide a 6-month grace period for customers to update these products to the currently supported FW level. Please Update To The Latest Level At Your Earliest Convenience. |
14.29.1017 / 001400291017 | Impact: Availability  Severity: ATT
1. Update to Fcode - Fixed an issue with the TX buffer index not properly wrapping, causing iSCSI boot to fail.
2. Fixed an issue that caused packets to drop due to header size issues and/or failing checks. The cause was a Linux issue that made VFs set the wrong header size value in the wqe_inline_header_mode input.
3. In rare cases, a fatal error related to errors from the PCI transport layer might be reported during FLR.
4. Fixed an issue that caused the device to go to dead IRISC because one of the firmware semaphores could not be released when a speed change or port state change was triggered.
5. Fixed a rare race condition that caused an erroneous write to the firmware image during certain power-down scenarios. This resulted in the firmware being recognized as invalid and prevented the adapter card from being recognized by the system due to missing valid Flash images.
6. Fixed an issue that caused the sent packet to hang while the device entered FLR mode.
7. Enabled bytewise BAR configuration.
8. Fixed a rare case where the device hung while running the sw reset flow under heavy stress and with many open resources.
9. Fixed an error that prevented the completions (CQ) from being completed due to a race condition in the firmware transport error handlers.
10. Fixed an issue that caused fragmented IP packets to drop.
11. Fixed a firmware fatal assert that showed an IRISC HANG due to init_hca.
12. Fixed a performance degradation issue and high packet drop when SR-IOV was enabled and packets went through the FDB default behavior.
13. Fixed a rare issue that caused other active functions to receive a malformed CQE during driver (PF or VF) unload or FLR flows.
14. Fixed an issue that caused the firmware to hang when an FLR occurred at the same time as the teardown.
Please review the comments about SRIOV and AIX APARs in the General Information section of these release notes. |
14.24.8000 / 001400248000 | Impact: Availability  Severity: ATT
1. Adds Secure Fcode support.
2. Fixes Health Syndrome: HW_FATAL_ERR due to halt in health counter, sometimes seen after a NIM install.
(The following are from the vendor Bug Fixes History in the fw release notes and could potentially affect POWER products.)
3. Fixes a standby deadlock that was caused when a PCIe reset arrived during traffic.
4. Fixes an issue that occasionally caused the link to drop after running the 'reboot' command due to I2C being stuck. |
14.21.3002 (for AIX and Linux) | Impact: NEW  Severity: NEW
Original Release for EC2U and EC2T adapter |
Document Revision History | Description |
V1.0 – 03/13/2018 | Original Release |
V2.0 – 04/25/2019 | Updated instructions for new fw 14.24.8000 / 001400248000 release |
V3.0 – 05/11/2021 | Updated instructions for new fw 14.29.1017 / 001400291017 release |
V4.0 – 07/18/2022 | Updated instructions for new fw 14.31.1014 / 001400311014 release |
This Readme file is intended to give directions on how to update the microcode found on the PCIe3 2 PORT 25/10 Gb NIC&ROCE SR/Cu ADAPTER.
1. Non-Concurrent Download (Linux Only)
The microcode installation does NOT support concurrent download in Linux. The device can be used during and after the download, but the update will not go into effect until a reboot is performed.
2. It is recommended that the installation be scheduled during a maintenance window or during non-peak production periods.
3. It is best practice to update to the latest FW level, not only for IBM support of these products but also for optimal performance and to ensure that all the required HW/FW fixes are installed.
4. Once new FW has been released to the field, we will provide a 6-month grace period for customers to update these products to the currently supported FW level.
5. Adapter in PowerVM SRIOV shared mode
- These adapter firmware release notes apply to adapters configured in dedicated mode.
- When the adapter is transitioned to SRIOV mode, the system firmware updates the adapter firmware, which may differ from the firmware used in dedicated mode.
- When the adapter is moved back to dedicated mode, the user will need to update the adapter firmware to the level mentioned in these release notes.
6. Release Notes for adapter firmware in PowerVM SRIOV shared mode.
Please visit Fix Central (http://www.ibm.com/support/fixcentral/) and review the release notes pertaining to your system MTM and installed system firmware.
7. For more information about adapters running in PowerVM SRIOV shared mode, visit: SRIOV FAQs and vNIC FAQs
8. To avoid the AIX driver issues listed below with the new adapter microcode, either install the APARs corresponding to your AIX/VIOS version, OR update your AIX/VIOS to the latest level available on Fix Central.
a) Fix to resolve an issue running adapter diagnostics with newer adapter firmware.
AIX Level APAR
7300-00 (73) = IJ37742
7200-05 (72 TL5) = IJ37699
7100-05 (71 TL5) = IJ37754
b) Driver fix to resolve a failure when downgrading adapter microcode from 001400291017 to a prior level.
AIX Level APAR
7200-05 (72 TL5) = IJ30441
7100-05 (71 TL5) = IJ30442
c) Driver update to support the transition from 4 MB to 8 MB microcode file size.
AIX Level APAR
7200-04 (72 TL4) = IJ14175
6100-09 (61 TL9) = IJ14204
7100-05 (71 TL5) = IJ14334
7200-03 (72 TL3) = IJ14355
7200-02 (72 TL2) = IJ14640
7200-01 (72 TL1) = IJ14691
7100-04 (71 TL4) = IJ14807
d) If upgrading from 14.21.3002 to a newer fw, install the following APARs; otherwise a hotplug may be needed to apply the adapter fw after a microcode update task.
AIX Level APAR
6100-09 (61 TL9) = IJ04287
7100-05 (71 TL5) = IJ04286
7200-02 (72 TL2) = IJ04285
7100-04 (71 TL4) = IV89962
7200-01 (72 TL1) = IV89963
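As an illustration only, the lookup in table (a) can be sketched as a small shell function. The function name diag_fix_apar is hypothetical and not part of any IBM tool; it covers table (a) only and assumes the AIX level is passed in the "7200-05" form shown above.

```shell
# diag_fix_apar LEVEL
# Print the APAR from table (a) for an AIX level such as "7200-05".
# Returns 1 when the level does not appear in that table.
# (Hypothetical helper; it only encodes the three rows of table (a).)
diag_fix_apar() {
    case $1 in
        7300-00) echo IJ37742 ;;
        7200-05) echo IJ37699 ;;
        7100-05) echo IJ37754 ;;
        *) return 1 ;;
    esac
}

# Possible use on an AIX system (not run here; assumes "oslevel -r"
# reports the level in the same form and "instfix -ik" is used to
# check whether the APAR is installed):
#   apar=$(diag_fix_apar "$(oslevel -r)") && instfix -ik "$apar"
```

A level that is missing from the table returns a non-zero status rather than a guess, so the caller can fall back to updating AIX/VIOS to the latest level.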
Approximately 20 minutes.
Feature Code: EC2U
· 9008-22L
· 9009-22A
· 9009-22G
· 9009-41A
· 9009-41G
· 9009-42A
· 9009-42G
· 9223-22H
· 9223-22S
· 9223-42H
· 9223-42S
· 9040-MR9
· 9043-MRX
· 9080-HEX
· 9080-M9S
· 9105-22A
· 9105-22B
· 9105-41B
· 9105-42A
· 9786-22H
· 9786-42H
Feature Code: EC2T
· 9008-22L
· 9009-22A
· 9009-22G
· 9223-22H
· 9223-22S
· 8335-GTC
· 8335-GTH
· 8335-GTW
· 8335-GTX
· 9080-HEX
· 9080-M9S
· 9043-MRX
· 9105-22A
· 9105-22B
· 9786-22H
For Linux operating systems, use the following procedure:
1. Find the PCI slot the ConnectX-4 Lx adapter is plugged into. Issue the following command:
lspci -nn | grep "1015"
For example:
# lspci -nn | grep 1015
0001:01:00.0 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
0001:01:00.1 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
Note:
a. This tells us that the adapter is in PCIe slot 0001:01:00.0, which will be needed in the next steps.
b. Each port is listed under its own PCIe ID:
0001:01:00.0 – Port 1 of the adapter
0001:01:00.1 – Port 2 of the adapter
c. Either of the port PCIe IDs can be used to perform the microcode install.
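The lookup in step 1 can also be scripted. The following is a minimal sketch, assuming each device appears on a single line of "lspci -nn" output with the [15b3:1015] ID shown in the example; the function name find_cx4lx_slots is hypothetical.

```shell
# find_cx4lx_slots
# Read "lspci -nn" output on stdin and print the PCI address (first
# field) of every line that carries the [15b3:1015] vendor:device ID.
find_cx4lx_slots() {
    awk '/\[15b3:1015\]/ { print $1 }'
}

# Possible use on a live system (not run here):
#   lspci -nn | find_cx4lx_slots
```

Matching on the full bracketed ID rather than the bare string "1015" avoids picking up unrelated lines that happen to contain those digits.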
2. Start MFT tools by running "mst start"
# mst start
Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
Loading MST PCI configuration module - Success
Create devices
If the system does not have the mst command available, please install the Mellanox Firmware Tools (MFT), available here: http://www.mellanox.com/page/management_tools
3. Verify the current Firmware level by running "flint -d <pci bus ID> q". Replace <pci bus ID> with the ID found by the lspci command in step 1.
flint -d 0001:01:00.0 q
Image type: FS3
FW Version: 14.31.1014
FW Release Date: 30.6.2021
Rom Info:
type=FCODE version=0.0.390 proto=ETH
Description:
UID GuidsNumber
Base GUID: 248A070300B6844E 4
Base MAC: 0000248A07B6844E 4
Image VSD:
Device VSD:
PSID: IBM0000000008
This command reveals the current version of the FW, which is 14.31.1014 in this particular case. If the version is lower than 14.31.1014, please update.
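The "lower than 14.31.1014" check can be sketched in shell as follows. This is an illustration under stated assumptions: the function names flint_fw_version and fw_is_older are hypothetical, and the query output is assumed to contain a "FW Version:" line as in the example above.

```shell
# flint_fw_version
# Read flint query output on stdin and print the value of the
# "FW Version:" line (third whitespace-separated field).
flint_fw_version() {
    awk '/^FW Version:/ { print $3 }'
}

# fw_is_older A B
# Exit status 0 if dotted version A is lower than dotted version B,
# comparing the parts numerically from left to right.
fw_is_older() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "[.]"); nb = split(b, y, "[.]")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            if ((x[i] + 0) < (y[i] + 0)) exit 0
            if ((x[i] + 0) > (y[i] + 0)) exit 1
        }
        exit 1   # equal versions are not "older"
    }'
}

# Possible use on a live system (not run here):
#   cur=$(flint -d 0001:01:00.0 q | flint_fw_version)
#   fw_is_older "$cur" 14.31.1014 && echo "update needed"
```

Comparing part by part avoids the mistake a plain string comparison would make on versions such as 14.9.x versus 14.31.x.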
4. Download the firmware RPM file to the destination machine.
Use this method to download the new microcode to a Linux system:
NOTE: The instructions that follow are specific Linux commands. Linux commands are CASE (lower and upper) SENSITIVE, and must be entered exactly as shown, including filenames.
rpm -ivh /tmp/b315151014101e06.001400311014.Linux.rpm
The microcode package will install the firmware images in the /lib/firmware directory. If a message is displayed saying the "package <package_name> is already installed", you will need to uninstall the listed rpm package. On the command line type:
rpm -e <package_name>
where <package_name> is the name of the package that was returned in the message. Then run the rpm -ivh command again to install the file.
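Picking the package name out of the "already installed" message can be sketched as below. The function name installed_pkg_from_msg is hypothetical, and the message is assumed to have exactly the form quoted above.

```shell
# installed_pkg_from_msg
# Read rpm's message on stdin and print NAME from a line of the form
# "package NAME is already installed". Prints nothing if no such line.
installed_pkg_from_msg() {
    sed -n 's/^.*package \(.*\) is already installed.*$/\1/p'
}

# Possible use (not run here): capture the message, remove the package
# it names, then install again.
#   msg=$(rpm -ivh /tmp/b315151014101e06.001400311014.Linux.rpm 2>&1)
#   pkg=$(printf '%s\n' "$msg" | installed_pkg_from_msg)
#   [ -n "$pkg" ] && rpm -e "$pkg"
```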
5. Verify the contents of the image before flashing:
ls -l /lib/firmware/b315151014101e06.001400311014 to verify file size:
· b315151014101e06.001400311014 = 16744768
sum /lib/firmware/b315151014101e06.001400311014 to verify Checksum:
· b315151014101e06.001400311014 = 63317
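The two checks in step 5 can be combined into one helper. This is a sketch, not an IBM-supplied tool: the function name verify_image is hypothetical, and it assumes the checksum is the first field printed by the system's default "sum" command, as in the step above.

```shell
# verify_image FILE EXPECTED_SIZE EXPECTED_SUM
# Succeeds only if both the byte count and the "sum" checksum match.
verify_image() {
    file=$1; want_size=$2; want_sum=$3
    [ -f "$file" ] || { echo "missing: $file" >&2; return 2; }
    # wc -c gives the size in bytes; tr strips padding some platforms add.
    got_size=$(wc -c < "$file" | tr -d ' ')
    # First field of "sum" output is the checksum; "+ 0" drops leading zeros.
    got_sum=$(sum "$file" | awk '{ print $1 + 0 }')
    if [ "$got_size" -ne "$want_size" ]; then
        echo "size mismatch: got $got_size, expected $want_size" >&2
        return 1
    fi
    if [ "$got_sum" -ne "$want_sum" ]; then
        echo "checksum mismatch: got $got_sum, expected $want_sum" >&2
        return 1
    fi
    echo "OK: $file"
}

# Possible use with the values listed in step 5 (not run here):
#   verify_image /lib/firmware/b315151014101e06.001400311014 16744768 63317
```

A non-zero return means the image should not be flashed until the mismatch is explained.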
6. Update the FW with "flint -d <pci bus id> -i <image filename> burn"
# flint -d 0001:01:00.0 -i b315151014101e06.001400311014 burn
Current FW version on flash: 14.29.1017
New FW version: 14.31.1014
Burning FW image without signatures - OK
Restoring signature - OK
7. At this point the new FW has been burned onto the Mellanox adapter. However, this FW will not take effect until the system is rebooted. Reloading the drivers is not enough; the Linux partition must be rebooted.
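For reference, the command sequence from steps 2, 3, 6 and 7 can be printed for a given adapter before anything is run. This is a dry-run sketch: the function name print_linux_update_plan is hypothetical, nothing is executed, and "reboot" stands in for whatever reboot method the site uses.

```shell
# print_linux_update_plan PCI_ID IMAGE
# Print, without running them, the commands for one adapter.
print_linux_update_plan() {
    dev=$1; image=$2
    echo "mst start"
    echo "flint -d $dev q"
    echo "flint -d $dev -i $image burn"
    echo "reboot"    # assumption: any method of rebooting the partition will do
}

# Example:
#   print_linux_update_plan 0001:01:00.0 /lib/firmware/b315151014101e06.001400311014
```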
· AIX 7.2 with Technology Level 2 and Service Pack 2 and above
· AIX 7.1 with Technology Level 5 and Service Pack 2 and above
· AIX 7.1 with Technology Level 4 and Service Pack 6 and above
· VIOS 2.2.6.21 and above
If you are using another release of AIX, ensure that the adapter is supported on that release before you install the adapter. Contact service and support for assistance.
Before you install the microcode, it is important to determine the microcode level of the Adapter installed in the target system. Use the following instructions to read the ROM level stored in the Adapter's VPD.
A. List all PCIe3 2 PORT 25/10 Gb NIC&ROCE SR/Cu ADAPTERs installed in the system by typing:
lsdev | grep b315151014101e06
B. To check the current microcode level for the adapter or controller, enter the following command:
lsmcode -cd entX
Where "X" is the instance of the adapter. The command will produce output similar to:
# lsmcode -cd ent1
The current microcode level for ent1 is 001400311014
If the ROM Level is less than 001400311014 you should update the microcode.
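The comparison against 001400311014 can be sketched in shell as follows. All three function names are hypothetical. The zero-padding rule in dotted_to_aix_level (each of the three parts padded to four digits) is inferred from the level pairs listed in the revision history, not taken from an IBM specification.

```shell
# dotted_to_aix_level VERSION
# "14.31.1014" -> "001400311014" (pattern inferred from this document).
dotted_to_aix_level() {
    echo "$1" | awk -F'[.]' '{ printf "%04d%04d%04d\n", $1, $2, $3 }'
}

# level_from_lsmcode
# Read the "current microcode level" message on stdin and print its
# last word; newlines are folded so wrapped output is handled too.
level_from_lsmcode() {
    tr '\n' ' ' | awk '{ print $NF }'
}

# needs_update CURRENT TARGET
# Exit status 0 if CURRENT is numerically lower than TARGET.
needs_update() {
    awk -v c="$1" -v t="$2" 'BEGIN { exit !((c + 0) < (t + 0)) }'
}

# Possible use on an AIX system (not run here):
#   cur=$(lsmcode -cd ent1 | level_from_lsmcode)
#   needs_update "$cur" 001400311014 && echo "ent1 should be updated"
```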
Use this method to download to an AIX system:
NOTE: The instructions that follow are specific AIX commands. AIX commands are CASE (lower and upper) SENSITIVE, and must be entered exactly as shown, including filenames.
A. Make two directories on your AIX system to receive the RPM format file.
Enter: "mkdir /tmp/microcode"
and then create this directory
Enter: "mkdir /tmp/microcode/RPM"
B. Transfer the RPM format file to the /tmp/microcode/RPM directory (using "Save as ..."). Change to that directory, "cd /tmp/microcode/RPM". You'll see the filename for the RPM file. Unpack the file by entering:
"rpm -ihv --ignoreos b315151014101e06.001400311014.aix.rpm"
C. For AIX: The microcode files will be added to /etc/microcode/.
D. The microcode file will be copied to "/etc/microcode". The file size and checksum of the microcode image will be verified in Section 10.0.
File Names: b315151014101e06.001400311014
NOTE:
- "/etc/microcode" is a symbolic link to "/usr/lib/microcode".
- If permissions do not allow the copy to the above stated directory or file, then the user will be prompted for a new location.
For information on how to burn a CD-ROM and run the Discovery Tool for an AIX or Linux System, please go to:
http://www-304.ibm.com/webapp/set2/firmware/lgjsn?mode=10&page=cdrom.html
A. After the Discovery Tool has run successfully, the "/tmp/microcode/RPM" directory is created and your rpm files are copied from the CD-ROM.
B. Change to that directory, "cd /tmp/microcode/RPM".
C. Unpack the file by executing the instructions below:
Enter the command:
"rpm -ihv --ignoreos b315151014101e06.001400311014.aix.rpm"
D. The microcode file will be copied to "/etc/microcode". The file size and checksum of the microcode image will be verified in Section 10.0.
File Names: b315151014101e06.001400311014
NOTE:
- "/etc/microcode" is a symbolic link to "/usr/lib/microcode".
- If permissions do not allow the copy to the above stated directory or file, then the user will be prompted for a new location.
- For customers using the AIX Diagnostics CD, please refer to the IBM System Hardware Information Center for instructions.
Please verify that the file size and checksum of the raw microcode files match what is listed below.
ls -l /etc/microcode/b315151014101e06.001400311014 to verify file size:
· b315151014101e06.001400311014 = 16744768
sum /etc/microcode/b315151014101e06.001400311014 to verify Checksum:
· b315151014101e06.001400311014 = 63317
A. Stop all applications that use this interface/adapter.
B. Remove the interface/IP address from all ports identified in section 7.0 for the adapters that will be upgraded.
a. Before detaching the interface, record the IP address and any other pertinent information that was configured on the Adapter. This information may be needed if the microcode update overwrites this section on the Adapter.
C. If the interfaces are members of an SEA, the SEA devices must be moved to a defined state.
a. "rmdev -l enX" - where "X" is the interface number for the Shared Ethernet Adapter.
b. "rmdev -l entX" - where "X" is the interface number for the Shared Ethernet Adapter.
D. If the interfaces are members of an EtherChannel, the EtherChannel device must be moved to a defined state.
a. "rmdev -l enX" - where "X" is the interface number for the EtherChannel adapter.
b. "rmdev -l entX" - where "X" is the interface number for the EtherChannel adapter.
E. For every port associated with the adapter, the enX interfaces must be changed to a defined state.
a. "rmdev -l enX" - where "X" is the interface number for the adapter port.
b. This command will be run 2 times, once for each port on the adapter.
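Because step E is repeated for each port, it can help to print the commands for review before running them. This is a dry-run sketch: the function name print_detach_plan is hypothetical, nothing is executed, and "lsattr -El enX" is suggested here only as one way to capture the settings that step B.a asks you to record.

```shell
# print_detach_plan N...
# For each interface number N, print a command to record the current
# settings of enN followed by the command that moves enN to Defined.
print_detach_plan() {
    for n in "$@"; do
        echo "lsattr -El en$n"   # record IP address and other attributes first
        echo "rmdev -l en$n"
    done
}

# Example for a two-port adapter whose ports are en0 and en1:
#   print_detach_plan 0 1
```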
A. At the command line type "diag"
B. Select "Task Selection" from the diagnostics menu.
C. Select "Microcode Tasks", then select "Download Microcode" from the menu.
D. Select all of the entX PCIe3 2 PORT 25/10 Gb NIC&ROCE SR/Cu Adapters that need to be updated from the list of devices by using the arrow keys to highlight the entry and pressing "Enter" to mark it. Press "F7" or "ESC+7" when you are done marking all the adapters you want to flash.
E. If a source selection menu is displayed, select "/etc/microcode".
F. A dialogue box may be displayed on screen. It will state that the current microcode level on the adapter is not in the /etc/microcode directory. This is acceptable because the adapter will reject any incorrect code. Press "Enter" to continue.
G. Select the 001400311014 level and press "Enter" to flash the adapter.
H. The following message will appear on the screen when the download is completed: "Microcode download complete successfully. The current microcode level for the device is ... Please run diagnostics on the adapter to ensure that it is functioning properly."
I. If you selected more than one adapter to update, then steps F through I will repeat until all adapters are updated.
J. Exit diagnostics.
A. Run "cfgmgr" to reconfigure the adapters that were moved to defined before the update.
B. Verify the code level is 001400311014 by typing "lsmcode -cd entX" for each adapter updated, where "X" is the instance of the PCIe3 2 PORT 25/10 Gb NIC&ROCE SR/Cu ADAPTER.