Readme file for IBM® Spectrum LSF 10.1 Fix 601357
Abstract
P104765. This fix supports NVIDIA Multi-Instance GPU (MIG) device
isolation and corrects GPU device enforcement when the GPU ID is not the same
as the GPU minor ID.
Description
Readme documentation for IBM Spectrum LSF 10.1 Fix 601357, including
installation-related instructions, prerequisites and co-requisites, and a list
of fixes.
This fix provides the following solutions:
1. Support for NVIDIA Multi-Instance GPU (MIG) device enforcement by cgroups
when LSB_RESOURCE_ENFORCE="gpu" is set in the lsf.conf configuration file.
2. When the GPU ID is not the same as the GPU minor ID, jobs could use GPUs
that were not allocated to them. After this fix, jobs are restricted to the
GPUs that LSF allocates.
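The first solution only applies when GPU enforcement is switched on in
lsf.conf. The following is a minimal sketch for checking that setting,
assuming lsf.conf uses plain NAME=value lines and that the value can be a
space-separated list of resources. The function name gpu_enforcement_enabled
is hypothetical and is not part of LSF.

```shell
# Hypothetical helper: succeeds when the last uncommented
# LSB_RESOURCE_ENFORCE assignment in the given lsf.conf lists "gpu".
gpu_enforcement_enabled() {
    conf_file=$1
    # Strip the "NAME=" prefix from matching lines, keep the last one,
    # and drop the surrounding double quotes.
    value=$(sed -n 's/^[[:space:]]*LSB_RESOURCE_ENFORCE[[:space:]]*=[[:space:]]*//p' "$conf_file" \
        | tail -n 1 | tr -d '"')
    # Pad with spaces so that "gpu" is matched as a whole word.
    case " $value " in
        *" gpu "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, after the LSF environment is set, run
gpu_enforcement_enabled "$LSF_ENVDIR/lsf.conf" && echo "GPU enforcement is on"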
Readme file for: IBM® Spectrum LSF
Product or component release: 10.1
Update name: Fix 601357
Fix ID: LSF-10.1-build601357
Publication date: 17 November 2022
Contents
1. List of fixes
2. Download location
3. Product or components affected
4. System requirements
5. Installation and configuration
6. List of files
7. Product notifications
8. Copyright and trademark information
1. List of fixes
P104765
2. Download location
Download Fix 601357 from the following location: https://www.ibm.com/support/fixcentral
3. Product or components affected
Affected product or components include:
LSF/sbatchd
LSF/res
4. System requirements
linux2.6-glibc2.3-x86_64
linux3.10-glibc2.17-x86_64
5. Installation and configuration
Before you install
LSF_TOP is the full path to the top-level installation
directory of LSF.
1. Before you apply this fix, ensure that LSF 10.1 Fix Pack 12 or later is
installed. You can download Fix Pack 12 from
https://www.ibm.com/support/fixcentral by searching for build600488. Contact
IBM LSF Support if you have any questions or problems with installing Fix
Pack 12.
2. Starting in LSF 10.1 Fix Pack 13, the default values of the following
three GPU parameters are changed to:
LSF_GPU_AUTOCONFIG=Y
LSB_GPU_NEW_SYNTAX=extend
LSF_GPU_RESOURCE_IGNORE=Y
If you have Fix Pack 13 installed and these GPU parameters are not configured
in the lsf.conf configuration file, they take the new default values. If the
parameters are already configured in the lsf.conf file with other values, they
are not affected.
If you want to keep the former GPU behavior and any of the three parameters is
missing from the lsf.conf configuration file, you must explicitly configure
the following default settings that are defined in Fix Pack 12 or earlier:
LSF_GPU_AUTOCONFIG=N
LSB_GPU_NEW_SYNTAX=N
LSF_GPU_RESOURCE_IGNORE=N
3. Log on to the LSF management host as the LSF
primary administrator.
4. Set your environment:
- For csh or tcsh: % source LSF_TOP/conf/cshrc.lsf
- For sh, ksh, or bash: $ . LSF_TOP/conf/profile.lsf
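Step 2 above depends on whether each of the three GPU parameters already has
a value in lsf.conf. The following is a minimal sketch for listing the ones
that do not, assuming lsf.conf uses plain NAME=value lines and that commented
lines start with #. The function name missing_gpu_params is hypothetical and
is not part of LSF.

```shell
# Hypothetical helper: print the name of each GPU parameter that has no
# uncommented assignment in the given lsf.conf file.
missing_gpu_params() {
    conf_file=$1
    for param in LSF_GPU_AUTOCONFIG LSB_GPU_NEW_SYNTAX LSF_GPU_RESOURCE_IGNORE; do
        # A parameter counts as configured only if a line begins with
        # NAME= (optionally preceded by white space).
        if ! grep -Eq "^[[:space:]]*${param}[[:space:]]*=" "$conf_file"; then
            echo "$param"
        fi
    done
}
```

For example, after the environment is set in step 4, run
missing_gpu_params "$LSF_ENVDIR/lsf.conf"
Any parameter that is printed takes the Fix Pack 13 default unless you add it
to lsf.conf explicitly.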
Installation steps
1. Run badmin hclose all
2. Run badmin qinact all
3. Log on to the LSF management host as root and set the LSF cluster
environment.
4. Go to the install directory: cd $LSF_ENVDIR/../10.1/install/
5. Copy the fix file to the install directory: $LSF_ENVDIR/../10.1/install/
6. Run patchinstall: ./patchinstall <fix>
After you install
1. Log on to the LSF management host as the LSF primary administrator and
set your environment:
- For csh or tcsh: % source LSF_TOP/conf/cshrc.lsf
- For sh, ksh, or bash: $ . LSF_TOP/conf/profile.lsf
2. Run lsadmin resrestart all
3. Run badmin hrestart all
4. Run badmin hopen all
5. Run badmin qact all
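The commands that the primary administrator runs before and after patching
can be collected into two small shell functions. This is only a sketch: the
function names and the DRY_RUN variable are hypothetical, the commands are the
ones listed in this readme, and the patchinstall step that runs as root is
left out on purpose. With DRY_RUN unset or set to 1, the commands are only
printed. Some of these commands might ask for confirmation when they actually
run.

```shell
# Hypothetical wrapper: print the command when DRY_RUN is 1 (the default),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Before patching: close all hosts and inactivate all queues.
close_cluster_for_patch() {
    run badmin hclose all &&
    run badmin qinact all
}

# After patching: restart RES and sbatchd, then reopen hosts and queues.
reopen_cluster_after_patch() {
    run lsadmin resrestart all &&
    run badmin hrestart all &&
    run badmin hopen all &&
    run badmin qact all
}
```

Call close_cluster_for_patch first to review the printed commands, then set
DRY_RUN=0 to execute them. The && chaining stops the sequence at the first
command that fails, so hosts are not reopened after a failed restart.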
Uninstallation
1. Log on to the LSF management host as the LSF cluster primary administrator
and set the LSF cluster environment.
2. Run badmin hclose all
3. Run badmin qinact all
4. Log on to the LSF management host as root and set the LSF cluster
environment.
5. Go to the fix install directory: cd $LSF_ENVDIR/../10.1/install/
6. Run ./patchinstall -r <fix>
7. Log on to the LSF management host as the LSF cluster primary administrator
and set the LSF cluster environment.
8. Run lsadmin resrestart all
9. Run badmin hrestart all
10. Run badmin hopen all
11. Run badmin qact all
6. List of files
This fix includes the following components in all Linux and UNIX packages:
LSF/sbatchd
LSF/res
7. Product notifications
To receive information about product solution and fix updates
automatically, subscribe to product notifications on the My notifications page
(www.ibm.com/support/mynotifications) on the
IBM Support website (support.ibm.com). You can edit your
subscription settings to choose the types of information you want to get
notification about, for example, security bulletins, fixes, troubleshooting,
and product enhancements or documentation changes.
8. Copyright and trademark information
©Copyright IBM Corporation 2022
U.S. Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corp.
IBM®, the IBM logo, and ibm.com® are trademarks of International
Business Machines Corp., registered in many jurisdictions worldwide. Other
product and service names might be trademarks of IBM or other companies. A
current list of IBM trademarks is available on the web at "Copyright and
trademark information" at www.ibm.com/legal/copytrade.shtml.