IBM Spectrum LSF 10.1 Fix Pack 3 for LSF Suite HPC Edition (463462) Readme File

Abstract

IBM Spectrum LSF Version 10.1 Fix Pack 3 for LSF Suite HPC Edition. This Fix Pack addresses new issues and includes new solutions resolved between 16 February 2017 and 06 July 2017.

Description

Readme documentation for IBM Spectrum LSF 10.1 Fix Pack 3 for LSF Suite HPC Edition (463462), including installation-related instructions, prerequisites and co-requisites, and a list of fixes.

The following new issues are addressed in LSF Version 10.1 Fix Pack 3 for LSF Suite HPC Edition:

ID

Fixed Date

Description

P102288

2017/06/30

This fix addresses an issue with parallel or multiple-host jobs. When the bresize command is used to release one host's slots, LSF might abort the tasks on other hosts because of improper signals from the job RES.

P102282

2017/06/05

This fix allows jobs to be scheduled even if the LSF_LOGDIR parameter is not defined in the lsf.conf file.

P102278

2017/07/06

This fix addresses an issue where if a load schedule and stop policy has been configured on hosts (to control swap resources), running jobs on those hosts incorrectly enter SSUSP status after lsadmin reconfig is run.

P102277

2017/07/06

This fix addresses an issue when using the bsub -pack option. If an esub is used to modify the host options, the bsub command dumps core.

P102273

2017/06/29

This fix addresses an issue where if a job submission has compound resource requirements that include compute units, LSF might over-schedule jobs that exceed the specified limit definitions.

P102272

2017/06/29

This fix addresses an issue where if a job's dependencies contain the wildcard character (*) on both sides of the job name (for example, "*jobname*") and the mbatchd daemon restarts, parsing the job's JOB_NEW event takes significantly longer.

P102267

2017/06/30

This fix addresses an issue where if a user runs an invalid bmod command where the argument for an option is a multi-line command, the content of the command is recorded literally in the lsb.events file, causing the bhist command to report a "Bad event format" error for each line of that record.

P102266

2017/06/23

This fix addresses an issue where if JOB_INCLUDE_POSTPROC=Y is set, a child job will be started when the brequeue command is used for the parent job in the post-execution phase.
This fix prevents a child job from being triggered in this instance.

P102253

2017/06/17

This fix addresses an issue where if the LSF_DAEMON_WRAP parameter is enabled in the lsf.conf file and the DJOB_ENV_SCRIPT parameter is not configured in the lsb.applications file, the user-defined script is incorrectly invoked as if the DJOB_ENV_SCRIPT parameter is defined.

P102252

2017/06/13

This fix adds the DAEMON_WRAP_ENABLE_BPOST environment variable to control the bpost call in daemons.wrap. Set DAEMON_WRAP_ENABLE_BPOST=y to enable a bpost call when cleartool setview fails.

P102241

2017/06/21

This fix addresses an issue that occurs in multicluster environments where the version of the submission cluster is lower than LSF 10.1 and the version of the execution cluster is LSF 10.1.
When the runtime resource usage limit of a forwarded job is modified in the submission cluster and the mbatchd daemon of the submission cluster can recognize the name of the execution host, the execution host enters an unreach status.

P102236

2017/06/06

This fix addresses an issue where the bhist command shows the incorrect pending time when there are two exit JOB_STATUS events in the lsb.events file.

P102234

2017/06/21

This fix addresses an issue where if the scheduler binary is built with compiler optimization, the child scheduler might crash. The child scheduler uses a function that can dump core in a specific case, and it normally checks for this case before calling the function to prevent the core dump. The compiler optimization removes this check, which causes the child scheduler to call the function regardless of the circumstances and potentially dump core. This fix restores the check to prevent the function from running.

P102232

2017/05/26

This fix addresses an issue where if a queue has either the SLOT_RESERVE parameter or RESOURCE_RESERVE parameter enabled in the lsb.queues file (to enable resource reservation), and there are many pending jobs in the queue, the scheduling performance drops significantly and takes a long time to make reservations.

P102231

2017/06/14

Prior to this fix, the mbatchd daemon did not distinguish between expected and unexpected error cases during recovery of the jobinfo cache. This fix properly distinguishes these cases and treats unexpected errors as fatal. When a fatal error occurs, mbatchd exits, and cache recovery is retried when the sbatchd daemon restarts mbatchd. This fix also improves the logging of file system access errors during recovery of the lsb.jobinfo.events file.

P102230

2017/06/05

This fix addresses the following issue: A pending interactive job cannot be modified by the bmod command if automatic job rerun is enabled in the queue (that is, the RERUNNABLE parameter is set to "yes" in the lsb.queues file).

P102229

2017/06/05

This fix addresses an issue where a pending interactive job cannot be modified by the bmod command if automatic job rerun is enabled in the queue (that is, the RERUNNABLE parameter is set to "yes" in the lsb.queues file).

P102224

2017/05/18

This fix addresses the issue where, if the length of a resource limit name in the lsb.resources file is 40 characters, its limit usage cannot be shown with the blimits command.

P102217

2017/05/15

This fix adds new regions to the list of regions that are supported by the AWS host provider used by the resource connector: us-east-2, eu-west-2, ap-south-1, and ca-central-1.

P102208

2017/05/12

This fix addresses the following issue: When the "PREEMPT_JOBTYPE" parameter is configured as "EXCLUSIVE" in the lsb.params file, the mbschd daemon crashes when resuming a suspended cross-host preemptive parallel job.

P102199

2017/05/02

This fix addresses the issue where a migrated job has an incorrect run time after the mbatchd daemon restarts, which is caused when part of the job's events are switched.

P102198

2017/05/02

This fix addresses the issue where if a host group contains hosts that are unknown, some good status hosts might be excluded from this host group after the mbatchd daemon restarts.
In addition, this might result in the host group containing no hosts, which results in the entire host group being unavailable.

P102196

2017/04/26

This fix addresses the issue where if an LSF host group contains some hosts that are not static servers, some good status hosts might be excluded from this host group after a reconfiguration. In addition, this might result in the host group containing no hosts, which results in the entire host group being unavailable.

P102188

2017/04/04

This fix addresses the issue where jobs fail to submit if identical run limits are defined at both the queue and application levels and the ABS_RUNLIMIT parameter is set to Y in the lsb.params file.

P102186

2017/04/21

This fix addresses the issue where bhist might not be able to display the complete job history because of a race condition that occurs while LSF is rotating event files.

P102178

2017/04/04

This fix addresses the issue where if a parallel restart happens at the same time as the lsb.jobinfo.events file is rewritten, records in that event file might be corrupted. This fix prevents the corruption by properly synchronizing the two operations.

P102171

2017/04/10

This fix addresses the issue where ENABLE_HOST_INTERSECTION is defined as Y in the lsb.params file and jobs are submitted with a specified host list to a queue and some specified hosts are removed from the queue's host list later.
After mbatchd is reconfigured and the jobs are requeued, they are incorrectly dispatched to the removed hosts.

P102170

2017/04/13

This fix addresses the issue where if LSB_KRB_TGT_FWD is set to Y in the lsf.conf file and JOB_INFO_MEMORY_CACHE_SIZE is set to a non-zero value in the lsb.params file, the job's Kerberos TGT file in the LSB_JOBINFO_DIR directory cannot be deleted.

P102169

2017/03/22

This fix addresses the issue where the liblsbstream.a and libfairshareadjust.a static libraries are missing from the LSF lib directory.

P102164

2017/04/11

This fix addresses the issue where if cgroup enforcement is enabled for GPUs, jobs that require most or all of the available GPUs (2 of 2 available GPUs, or 3 to 4 of 4 available GPUs) often get terminated. The failure rate is close to 100%.

P102158

2017/04/11

This fix addresses the issue where rerunnable jobs get requeued when execution hosts are unavailable. This means that IBM Spectrum LSF Analytics does not correctly calculate the job pending time.
This fix adds a job signal event before the job finish event in lsb.stream, allowing Analytics to correctly calculate pending time for these jobs.
A new parameter, LOG_JOB_SIGNAL_FOR_HOST_UNAVAIL, is added to lsb.params to control whether to log an additional job signal event.
Syntax: LOG_JOB_SIGNAL_FOR_HOST_UNAVAIL = Y|y|N|n
Description: If enabled (set to Y|y), when slave hosts become unavailable, LSF logs a job signal event before the first job finish event in the lsb.stream file for rerunnable jobs that run on those slave hosts.
Default: N
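
For example, a minimal lsb.params sketch to enable the additional event logging:

Begin Parameters
LOG_JOB_SIGNAL_FOR_HOST_UNAVAIL = Y
End Parameters

After changing the lsb.params file, run badmin reconfig to apply the setting.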

P102152

2017/03/30

This fix addresses the issue where after running the badmin mbdrestart command, if the mbatchd daemon exits abnormally when starting up, the sbatchd daemon will wait for a very long time before trying to restart the mbatchd again.

P102147

2017/03/24

This fix addresses the issue where if cgroup is enabled, after the JOB_POSTPROC_TIMEOUT period expires, LSF does not kill all processes launched by post-execution commands, even though cgroup is still tracking those processes.

P102141

2017/03/22

This fix addresses the issue where the bmgroup command does not display host groups with names that contain the text "others".

P102139

2017/03/24

This fix addresses the issue where jobs are not being scheduled in the specified time window even when the bqueues command shows that the queue is open and active. This issue happens only when the RUN_WINDOW parameter in the lsb.queues file is configured to always be open.
For example, if you set the RUN_WINDOW parameter as follows:
RUN_WINDOW = 0:0:0-4:09:00 4:13:00-5:13:00 5:14:00-1:13:00 01:14:00-0:0:0
The bqueues command shows that the queue is open and active, but jobs in the queue cannot be scheduled and show the pending reason "New job is waiting for scheduling;".
This is because the logic for calculating the run window close time in LSF does not correctly handle the case where the configured run window is always open.

P102136

2017/03/17

This fix addresses the issue where the RUNLIMIT cannot be enforced for jobs with pre-execution programs if the jobs were submitted to an LSF cluster that only has Fix Packs older than 390354 applied, and the pre-execution programs finish before a newer Fix Pack is applied (and sbatchd is restarted to apply the changes).

P102119

2017/03/10

This fix addresses the issue where if the LSF_COLLECT_ENERGY_USAGE parameter is configured as Y in the lsf.conf file, the sbatchd daemon that is running on the host is unresponsive after multiple jobs are dispatched to the same host. When attempting to restart sbatchd on the host with the "badmin hrestart" command, the command fails with one of the following error messages:
1. "Host control failed: Failed in an LSF library call: Failed in sending/receiving a message: Connection reset by peer"
2. "Host control failed: Failed in an LSF library call: Communication time out"

P102115

2017/03/07

This fix addresses the issue where if a user defines the JOB_SPOOL_DIR parameter in the lsb.params file with variable substitutions, the bpeek command cannot read the job output.
After applying this fix, the bpeek command supports the following user-defined variables in the JOB_SPOOL_DIR value:
%U: username
%H: (first) execution host name
%P: project name
%JG: job group name
%C: execution cluster name
NOTE: This fix is not supported on Windows operating systems.
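
For example, a hedged lsb.params sketch (the directory path is illustrative):

Begin Parameters
JOB_SPOOL_DIR = /share/spool/%U/%P
End Parameters

With this setting, bpeek resolves %U to the job owner's user name and %P to the job's project name when locating the spooled job output.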

P102105

2017/03/10

This fix addresses the issue where if a job submission is rejected on a non-LSF host, the bsub command does not call the epsub executable.

P102103

2017/03/02

This fix addresses the issue where if bpeek is run in a pseudo terminal that is started by an interactive job, the error "Job is not found" is always returned.

P102100

2017/02/17

This fix addresses the issue where if the file attributes of the lsb.lease.state file have been changed and the cluster manager cannot open the file, the mbatchd daemon cannot start in an LSF multicluster environment.

P102071

2017/02/22

This fix is required if your cluster uses a third-party scheduler plugin that uses the external scheduler API extsched_modifyJob(). When extsched_modifyJob() specifies a job array with arbitrary elements, the job array might remain unaffected by the parameter setting.

158094

2017/07/05

This fix addresses the issue where the sched_setaffinity() API repeatedly shows error messages when attempting CPU binding. The error messages resemble the following:
setCPUAffinity: sched_setaffinity() failed to bind to all 1024 logical CPU with error number <22>.
This issue occurs on RHEL 5 (kernel 2.6.18) platforms.

157354

2017/07/04

This fix allows users with a restrictive user mask (0077, for instance) to start a Docker-based LSF job successfully.

156718

2017/07/06

This fix addresses the issue where LSF uses submission environment variables to overwrite environment variables in the Docker container, even though users need particular environment variables from the image (such as LD_LIBRARY_PATH). This solution merges PATH, LD_LIBRARY_PATH, and LD_PRELOAD from the job environment with the image settings.
The merging rule puts the value from the user's job environment first, as shown below:
PATH=job_PATH:image_PATH
LD_LIBRARY_PATH=job_LD_LIBRARY_PATH:image_LD_LIBRARY_PATH
LD_PRELOAD=job_LD_PRELOAD:image_LD_PRELOAD

156008

2017/06/30

This fix prevents mbschd from generating large numbers of core files when using the LSF Express version entitlement file.

155022

2017/07/06

This fix addresses the issue where MPS fails to start in a multi-host job if LSF_TMPDIR points to a shared file system.

152237

2017/05/15

This fix allows the sbatchd daemon to correctly log cgroup-related error messages.

151340

2017/05/09

This fix addresses the issue where if you create an advance reservation and submit an exclusive job (one without a run limit, or one that cannot finish before the current end time of the advance reservation) to the advance reservation after it becomes active, the brsvmod command cannot extend the advance reservation while the exclusive job is running.

149041

2017/05/23

This fix addresses the issue where fixed Parallel Environment (PE) jobs remain pending when a user disables the fair share plugin in all LSF queues.

148457

2017/04/20

This fix prevents the mbatchd daemon from corrupting files in the LSB_LOCALDIR/LSB_SHAREDIR directory when LSB_LOCALDIR and LSB_SHAREDIR are set to the same value.

147765

2017/06/30

This fix addresses the issue where the CUDA_VISIBLE_DEVICES environment variable is not visible to host-based pre-execution and post-execution commands.

147039

2017/04/17

This fix allows the mbatchd daemon to release the memory resource usage on leased-in hosts.

141493

2017/02/17

This fix prevents the mbatchd daemon from crashing when LSF_LOGDIR and DIAGNOSE_LOGDIR are not configured.


The following new solutions are included in LSF Version 10.1 Fix Pack 3 for LSF Suite HPC Edition:

ID

Fixed Date

Description

RFE#100139
RFE#99873
RFE#100140
RFE#104480
150716

2017/07/06

This fix addresses the following issues:
- Simplify the GPU resource requirement by adding a new bsub -gpu option, and support GPU_REQ at the application and queue levels (see the sketch after this list).
- Add a new parameter, LSB_GPU_NEW_SYNTAX, to control whether the new syntax is used.
- Support dynamic mode switching for GPU compute mode.
- Refine the GPU resource release mechanism.
- Add the CUDA_VISIBLE_DEVICES environment variable to host-based pre-execution processing.
- Support setting CUDA_MPS_PIPE_DIRECTORY to a shared directory.
- Remove the CUDA call in LSF.
- Separate the physical GPUs allocated to different MPI tasks by using CUDA_VISIBLE_DEVICES environment variables.
- Support CPU-GPU affinity, choosing GPUs close to the allocated CPUs for the same task using a best-effort policy.
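
A minimal sketch of the new GPU syntax (the resource values are illustrative, and the available GPU_REQ keywords depend on your cluster configuration):

In lsf.conf:
LSB_GPU_NEW_SYNTAX = Y

At submission time:
bsub -gpu "num=2:mode=shared:j_exclusive=yes" ./my_gpu_app

The same GPU requirement string can be set as GPU_REQ at the application or queue level.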

RFE#105189

2017/07/06

This integration allows LSF users to use nvidia-docker to run jobs that use GPU resources in Docker containers through LSF.

142210

2017/07/05

The RC_DEMAND_POLICY parameter in the lsb.queues file defines threshold conditions that determine whether demand is triggered to borrow resources through the resource connector for all jobs in the queue. As long as pending jobs in the queue meet at least one threshold condition, LSF expresses the demand to the resource connector to trigger borrowing.
The demand policy defined by the RC_DEMAND_POLICY parameter can contain multiple conditions, in an OR relationship. A condition is defined as [num_pend_jobs[,duration]]: the queue has more than the specified number of eligible pending jobs that are expected to run for at least the specified duration in minutes. The num_pend_jobs option is required, and the duration is optional. The default duration is 0 minutes.
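
For example, a hedged lsb.queues sketch (the threshold values are illustrative):

Begin Queue
QUEUE_NAME = cloud_queue
RC_DEMAND_POLICY = THRESHOLD[[2,10] [50]]
End Queue

With this policy, LSF expresses demand to the resource connector when more than 2 jobs have been pending for at least 10 minutes, or when more than 50 jobs are pending for any duration.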

148682

2017/07/05

This feature enables LSF clusters to borrow virtual compute hosts from the IBM SoftLayer (Bluemix) service if the workload demand exceeds cluster capacity. The Resource Connector generates requests for additional hosts from SoftLayer and dispatches jobs to dynamic hosts that join the LSF cluster. When the demand reduces, the Resource Connector shuts down slave LSF daemons and cancels allocated SoftLayer virtual servers.

152022

2017/06/29

The new bresize request subcommand option allows you to request additional tasks to be allocated to a running resizable job, which grows the resizable job. This means that you can both grow and shrink a resizable job by using the bresize command.
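
A hedged sketch of growing a resizable job (the argument form is an assumption based on existing bresize conventions; the task count and job ID are illustrative):

bresize request 4 1234

This requests 4 additional tasks for resizable job 1234; see the bresize command reference for the exact syntax.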

142185

2017/06/28

The instanceTags attribute in the awsprov_templates.json file can tag EBS volumes with the same tag as the instance. EBS volumes in AWS are persistent block storage volumes used with an EC2 instance. Because EBS volumes are expensive, you can use the instance ID tag on the volumes for accounting purposes.
Note: The tags cannot start with the string aws:. This prefix is reserved for internal AWS tags. AWS gives an error if an instance or EBS volume is tagged with a keyword starting with aws:. Resource connector removes and ignores user-defined tags that start with aws:.
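
For example, a hedged awsprov_templates.json fragment (the tag names and values are illustrative):

"instanceTags": "Project=lsf;Owner=hpcadmin"

Resource connector applies these tags to the EC2 instance and its EBS volumes.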

142216

2017/06/28

Resource connector supports federated accounts for LSF resource connector as an option instead of requiring permanent AWS IAM account credentials. Federated users are external identities that are granted temporary credentials with secure access to resources in AWS without requiring creation of IAM users. Users are authenticated outside of AWS (for example, through Windows Active Directory).
Use the AWS_CREDENTIAL_SCRIPT parameter in the awsprov_config.json file to specify a path to the script that generates temporary credentials for federated accounts. For example,
AWS_CREDENTIAL_SCRIPT=/shared/dir/generateCredentials.py
LSF executes the script as the primary LSF administrator to generate temporary credentials before it creates the EC2 instance.
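
In the JSON file itself, the entry would resemble the following sketch (other attributes in awsprov_config.json are omitted):

{
  "AWS_CREDENTIAL_SCRIPT": "/shared/dir/generateCredentials.py"
}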

148740

2017/06/28

This feature enables LSF clusters to launch instances from Azure to satisfy pending workload. The instances will join the LSF cluster. If instances become idle, LSF resource connector terminates them.

RFE#98501

2017/06/21

This solution allows the end user to submit jobs to remote hosts by using the "-m remote_host@remote_cluster ..." option with send-jobs queues in the job forwarding model when using the LSF multicluster capability.

139905

2017/06/15

LSF provides a feature to log profiling information for the mbatchd and mbschd daemons to track the time that the daemons spend on key functions. This can assist IBM Support with diagnosing daemon performance problems.
To enable daemon profiling with the default settings, edit the lsf.conf file, then specify LSB_PROFILE_MBD=Y for the mbatchd daemon or specify LSB_PROFILE_SCH=Y for the mbschd daemon. You can also add keywords within these parameters to further customize the daemon profilers.

RFE#101255

2017/06/09

This solution provides a way to directly get a list of jobs using a particular reservation ID by using the new bjobs -U option.
This solution also adds the "rsvid" field to the bjobs -o option to display individual advance reservation IDs.
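
For example (the reservation ID is illustrative; advance reservation IDs take the form user_name#sequence):

bjobs -U user1#0
bjobs -o "jobid stat rsvid" -u all

The first command lists jobs that use reservation user1#0; the second displays the reservation ID for each job in the customized output.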

RFE#92852

2017/06/07

This enhancement increases the maximum length of the bsub -u email address from 63 to 511 characters.

127032

2017/05/30

IAM roles group AWS access control privileges together. A role can be assigned to an IAM user or an IAM instance profile. IAM Instance Profiles are containers for IAM roles that allow you to associate an EC2 instance with a role through the profile. The EC2 runtime environment contains temporary credentials that have the access control permissions of the profile role.
To make the roles available for resource connector to create instances, use the instanceProfile attribute in the awsprov_templates.json file to specify an AWS IAM instance profile to assign to the requested instance. Jobs running in that instance can use the instance profile credentials to access other AWS resources. Resource connector uses that information to request EC2 compute instances with particular instance profiles. Jobs that run on those hosts use temporary credentials provided by AWS to access the AWS resources that the specified role has privileges for.
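
For example, a hedged awsprov_templates.json fragment (the profile name is illustrative):

"instanceProfile": "arn:aws:iam::123456789012:instance-profile/lsf-rc-profile"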

RFE#75418

2017/05/26

This fix enables the bpeek -f command to exit when the peeked job is completed.
If the peeked job is requeued or migrated, the bpeek command only exits if the job is completed again. In addition, the bpeek command cannot get the new output of the job. To avoid these issues, abort the previous bpeek -f command and rerun the bpeek -f command after the job is requeued or migrated.

141430

2017/05/20

Use Spot instances to bid on spare Amazon EC2 computing capacity. Because Spot instances are often available at a 50-90% discount compared to On-Demand instance pricing, you can significantly reduce the cost of running your applications, increase your compute capacity by 2-10 times within the same budget, and enable new types of cloud computing applications.
Spot instances are supported on any Linux x86 system that is supported by LSF.

RFE#94879
RFE#82487

2017/05/15

This enhancement modifies the MAX_PEND_JOBS parameter to limit the maximum number of pending jobs, and adds the new MAX_PEND_SLOTS parameter to take over the previous behavior of the MAX_PEND_JOBS parameter (limiting the maximum number of pending slots). Both of these parameters provide ways to protect cluster service, either globally across the cluster or at the user level.
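
For example, a hedged lsb.params sketch (the values are illustrative):

Begin Parameters
MAX_PEND_JOBS = 1000
MAX_PEND_SLOTS = 5000
End Parameters

This caps the cluster at 1000 pending jobs and 5000 pending slots; per-user limits can be set at the user level as described above.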

RFE#79255
RFE#92576
RFE#98712

2017/05/15

This solution improves the performance of daemons.wrap by not checking the ClearCase view, and reports the failure reason of the "cleartool setview" call made by daemons.wrap in the bjobs -l, bhist -l, bstatus, and bread output. This solution also adds the PID to daemons.wrap.log.
After applying this solution, the ClearCase view set by CLEARCASE_ROOT is not checked under any condition, which makes NOCHECKVIEW_POSTEXEC obsolete.
The ClearCase environment is set directly by "cleartool setview" in daemons.wrap, and the failure reason is passed to mbatchd by the bpost command. The bjobs -l, bhist -l, bstatus, and bread commands display the message.

146129

2017/05/10

This enhancement enables users to create and schedule advance reservations in the same way as a job. Once the reservation is active, the jobs submitted to the reservation can run within it.
This fix provides three scripts to create schedulable advance reservations and to query advance reservation jobs. The scripts are installed under $LSF_BINDIR.
- brsvsub: create a schedulable advance reservation.
- lsfrsv: used by brsvsub, which submits a job that runs lsfrsv to update the time window and hosts in an advance reservation.
- brsvjob: query information about jobs that were submitted with an advance reservation name.
To write your own script to create a schedulable advance reservation, you need to know the following information (a command sketch follows this list):
- This fix adds a new option -p to brsvadd to create an advance reservation without a time window or hosts. This is referred to as a placeholder advance reservation.
- Use brsvmod to add a time window and hosts to the placeholder advance reservation.
- Submit jobs to the placeholder advance reservation. After the advance reservation is filled and active, jobs can run within it.
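
A hedged sketch of the flow (the option forms are assumptions; see the brsvadd and brsvmod command references):

brsvadd -p -n 4 -u user1             (creates a placeholder reservation, for example user1#10)
brsvmod -b 10:00 -e 12:00 user1#10   (adds a time window to the placeholder)
brsvmod addhost -m "hostA" user1#10  (adds hosts to the placeholder)
bsub -U user1#10 ./myjob             (submits a job to the reservation)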

146128

2017/04/20

This solution allows you to configure a script in the options field of the CONTAINER parameter in the lsb.applications file. Before the container job runs, LSF first runs the script with LSF administrator privileges. While the script is running, all of the job's environment variables are passed to the script at run time. When the script finishes running, its output is used in the container startup options.
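
A hedged lsb.applications sketch (the image name and script path are illustrative):

Begin Application
NAME = docker_app
CONTAINER = docker[image(ubuntu:16.04) options(--rm @/share/scripts/gen_docker_options.sh)]
End Application

Here the @ prefix is assumed to mark the configured script whose output is appended to the container startup options.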

P102156

2017/04/20

This enhancement adds the following new functionality to LSF (see the example after this list):
- Adds the new option "-o" to the lsload command to set the customized output format.
- Adds the new LSF_LSLOAD_FORMAT parameter to the lsf.conf file and the new LSF_LSLOAD_FORMAT runtime environment variable to define the default lsload output format.
- Adds the new option "-json" to the lsload command to display the customized output in JSON format. This option must be used together with "-o" or LSF_LSLOAD_FORMAT.
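
For example (the field list is illustrative; see the lsload command reference for the valid field names):

lsload -o "HOST_NAME status r15s r1m mem" -json

To make a customized format the default, set it in the lsf.conf file:

LSF_LSLOAD_FORMAT="HOST_NAME status mem"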

142500

2017/04/14

This fix adds additional resources to elim.gpu.ext, including gpu_mut, gpu_mtotal, gpu_mused, gpu_pstate and gpu_shared_avg_mut.

RFE#98823

2017/04/13

For jobs that are pending because there are not enough licenses available, the bjobs -p, -p1, -p2, and -p3 options only show which licenses do not have enough availability, and do not display the project or cluster to which the licenses belong. This might be confusing because the license server can appear to have free licenses while jobs cannot run, without showing that the project or cluster has limits on the licenses that are available.
After this enhancement, the bjobs -p, -p1, -p2, -p3 options also show the project name for jobs that request project mode or fast dispatch mode features, and the bjobs -p options also show the cluster name for jobs that request cluster mode features. The bjobs -p0 and bjobs -l output are not affected by this enhancement.


For further details on these solutions, refer to https://www.ibm.com/support/knowledgecenter/SSWRJV_10.1.0/lsf_release_notes/lsf_relnotes_whatsnew10.1.0.3.html

Readme file for: IBM® Spectrum LSF

Product/Component Release: 10.1

Update Name: Fix 463462

Fix ID: lsfshpc-10.1.1-lsf-10.1-build463462

Publication date: 16 August 2017

Last modified date: 16 August 2017

Contents:

1.     List of fixes

2.     Download location

3.     Products or components affected

4.     System requirements

5.     Installation and configuration

6.     List of files

7.     Product notifications

8.     Copyright and trademark information

 

1.   List of fixes

P102288,P102282,P102278,P102277,P102273,P102272,P102267,P102266,P102253,P102252,P102241,P102236,
P102234,P102232,P102231,P102230,P102229,P102224,P102217,P102208,P102199,P102198,P102196,P102188,
P102186,P102178,P102171,P102170,P102169,P102164,P102158,P102152,P102147,P102141,P102139,P102136,
P102119,P102115,P102105,P102103,P102100,P102071,158094(No APAR),157354(No APAR),156718(No APAR),
156008(No APAR),155022(No APAR),152237(No APAR),151340(No APAR),149041(No APAR),148457(No APAR),
147765(No APAR),147039(No APAR),141493(No APAR),RFE#100139,RFE#99873,RFE#100140,RFE#104480,
150716(No APAR),RFE#105189,142210(No APAR),148682(No APAR),152022(No APAR),142185(No APAR),
142216(No APAR),148740(No APAR),RFE#98501,139905(No APAR),RFE#101255,RFE#92852,127032(No APAR),
RFE#75418,141430(No APAR),RFE#94879,RFE#82487,RFE#79255,RFE#92576,RFE#98712,146129(No APAR),
146128(No APAR),P102156,142500(No APAR),RFE#98823

2.   Download location

Download Fix 463462 from the following location: http://www.ibm.com/eserver/support/fixes/

3.   Products or components affected

Components affected by the new issues addressed in LSF Version 10.1 Fix Pack 3 for LSF Suite HPC Edition include:
LSF/lsbatch.h
LSF/lsf.h
LSF/lssched.h
LSF/aws_enable.sh
LSF/bacct
LSF/badmin
LSF/bapp
LSF/bconf
LSF/bhist
LSF/bhosts
LSF/bjobs
LSF/bmod
LSF/bparams
LSF/bpeek
LSF/bqueues
LSF/bresize
LSF/brestart
LSF/brsvadd
LSF/brsvjob
LSF/brsvs
LSF/brsvsub
LSF/bsub
LSF/busers
LSF/lsfrsv
LSF/lsid
LSF/lsload
LSF/bclusters
LSF/bgadd
LSF/bgmod
LSF/blaunch
LSF/blimits
LSF/bresources
LSF/bresume
LSF/brsvmod
LSF/bslots
LSF/lsadmin
LSF/lsgrun
LSF/lshosts
LSF/lsinfo
LSF/lsloadadj
LSF/lslogin
LSF/lsmake
LSF/lsmake4
LSF/lsmon
LSF/lsplace
LSF/lsreghost
LSF/lsrun
LSF/bwait
LSF/daemons.wrap
LSF/ebrokerd
LSF/elim.gpu
LSF/elim.gpu.ext
LSF/elim.gpu.topology
LSF/lim
LSF/mbatchd
LSF/mbschd
LSF/res
LSF/sbatchd
LSF/eauth.krb5
LSF/ego_client
LSF/egosc
LSF/krbrenewd
LSF/mesub
LSF/nios
LSF/pim
LSF/rla
LSF/eauth.cve
LSF/libbat.a
LSF/libbat.so
LSF/liblsf.a
LSF/liblsf.so
LSF/schmod_demand.so
LSF/libptmalloc3.so
LSF/cal_jobweight.so
LSF/liblsbstream.so
LSF/schmod_advrsv.so
LSF/schmod_affinity.so
LSF/schmod_aps.so
LSF/schmod_bluegene.so
LSF/schmod_cpuset.so
LSF/schmod_craylinux.so
LSF/schmod_crayx1.so
LSF/schmod_dc.so
LSF/schmod_default.so
LSF/schmod_dist.so
LSF/schmod_fairshare.so
LSF/schmod_fcfs.so
LSF/schmod_jobweight.so
LSF/schmod_limit.so
LSF/schmod_mc.so
LSF/schmod_parallel.so
LSF/schmod_preemption.so
LSF/schmod_ps.so
LSF/schmod_pset.so
LSF/schmod_reserve.so
LSF/schmod_rms.so
LSF/schmod_xl.so
LSF/libfairshareadjust.a
LSF/liblsbstream.a
LSF/libsec_ego_default.so
LSF/lsf_release
LSF/misc/examples/external_plugin/allocexample.c
LSF/misc/examples/external_plugin/Makefile
LSF/misc/examples/external_plugin/matchexample.c
LSF/misc/examples/external_plugin/myplugin.c
LSF/misc/examples/external_plugin/README
LSF/misc/examples/external_plugin/sch.mod.fcfs.c
LSF/misc/examples/clearcase/daemons.wrap.c
LSF/resource_connector/azure/lib/AzureTool.jar
LSF/resource_connector/azure/conf/azureprov_config.json
LSF/resource_connector/azure/conf/azureprov_templates.json
LSF/resource_connector/azure/scripts/getAvailableTemplates.sh
LSF/resource_connector/azure/scripts/getRequestStatus.sh
LSF/resource_connector/azure/scripts/requestMachines.sh
LSF/resource_connector/azure/scripts/requestReturnMachines.sh
LSF/resource_connector/azure/scripts/user_data.sh
LSF/resource_connector/ego/scripts/Main.py
LSF/resource_connector/openstack/scripts/Main.py
LSF/resource_connector/openstack/scripts/OpenStackClient.py
LSF/resource_connector/openstack/scripts/userscript.sh
LSF/resource_connector/openstack/scripts/MachineFile.py
LSF/resource_connector/openstack/scripts/user_data.sh
LSF/resource_connector/openstack/conf/osprov_templates.json
LSF/resource_connector/openstack/conf/osprov_config.json
LSF/resource_connector/aws/conf/awsprov_templates.json
LSF/resource_connector/aws/conf/awsprov_config.json
LSF/resource_connector/aws/conf/credentials
LSF/resource_connector/aws/lib/AwsTool.jar
LSF/resource_connector/aws/scripts/user_data.sh
LSF/resource_connector/aws/scripts/getAvailableMachines.sh
LSF/resource_connector/aws/scripts/getAvailableTemplates.sh
LSF/resource_connector/aws/scripts/getRequestStatus.sh
LSF/resource_connector/aws/scripts/getReturnRequests.sh
LSF/resource_connector/aws/scripts/requestMachines.sh
LSF/resource_connector/aws/scripts/requestReturnMachines.sh
LSF/resource_connector/policy/Main.py
LSF/resource_connector/policy/Log.py
LSF/resource_connector/policy/PolicyFile.py
LSF/resource_connector/softlayer/lib/SoftLayerTool.jar
LSF/resource_connector/softlayer/scripts/getRequestStatus.sh
LSF/resource_connector/softlayer/scripts/requestReturnMachines.sh
LSF/resource_connector/softlayer/scripts/getAvailableTemplates.sh
LSF/resource_connector/softlayer/scripts/requestMachines.sh
LSF/resource_connector/softlayer/scripts/postprovision.sh
LSF/resource_connector/softlayer/conf/credentials
LSF/resource_connector/softlayer/conf/softlayerprov_config.json
LSF/resource_connector/softlayer/conf/softlayerprov_templates.json
LSF/resource_connector/hostProviders.json
LSF/util/elim.mic.ext/README
LSF/esub.p8aff (only needed on Linux3.10-glibc2.17-ppc64le)

 

4.   System requirements

Linux2.6-glibc2.3-x86_64
Linux3.10-glibc2.17-x86_64
Linux3.10-glibc2.17-ppc64le

 

5.   Installation and configuration

 

5.1          Before installation

 

 LSF_TOP=Full path to the top-level installation directory of LSF.

1)    Log on to the LSF master host as root

2)    Set your environment:

-      For csh or tcsh: % source LSF_TOP/conf/cshrc.lsf

-      For sh, ksh, or bash: $ . LSF_TOP/conf/profile.lsf

 

5.2          Installation steps

 

1)    Go to the patch install directory: cd $LSF_ENVDIR/../10.1/install/

2)    Copy the patch file to the install directory $LSF_ENVDIR/../10.1/install/

3)    Run
badmin hclose all
badmin qinact all

4)    Run patchinstall: ./patchinstall <patch>
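
For example (the patch file name is illustrative, derived from the Fix ID above):

cd $LSF_ENVDIR/../10.1/install/
cp /path/to/lsfshpc-10.1.1-lsf-10.1-build463462.tar.Z .
badmin hclose all
badmin qinact all
./patchinstall lsfshpc-10.1.1-lsf-10.1-build463462.tar.Z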

 

5.3          After installation

 

1)    Run
badmin hshutdown all
lsadmin resshutdown all
lsadmin limshutdown all

2)    Run
lsadmin limstartup all
lsadmin resstartup all
badmin hstartup all

3)    Run
badmin hopen all
badmin qact all

 

5.4          Uninstallation

 

To roll back a patch:

1)    Log on to the LSF master host as root

2)    Set your environment:

-      For csh or tcsh: % source LSF_TOP/conf/cshrc.lsf

-      For sh, ksh, or bash: $ . LSF_TOP/conf/profile.lsf

3)    Run
badmin hclose all
badmin qinact all

4)    Run ./patchinstall -r <patch>

5)    Run
badmin hshutdown all
lsadmin resshutdown all
lsadmin limshutdown all

6)    Run
lsadmin limstartup all
lsadmin resstartup all
badmin hstartup all

7)    Run
badmin hopen all
badmin qact all

6.   List of files in package

 

filelist.txt
fixlist.txt
lsbatch.h
lsf.h
lssched.h
aws_enable.sh
bacct
badmin
bapp
bconf
bhist
bhosts
bjobs
bmod
bparams
bpeek
bqueues
bresize
brestart
brsvadd
brsvjob
brsvs
brsvsub
bsub
busers
lsfrsv
lsid
lsload
bclusters
bgadd
bgmod
blaunch
blimits
bresources
bresume
brsvmod
bslots
lsadmin
lsgrun
lshosts
lsinfo
lsloadadj
lslogin
lsmake
lsmake4
lsmon
lsplace
lsreghost
lsrun
bwait
daemons.wrap
ebrokerd
elim.gpu
elim.gpu.ext
elim.gpu.topology
lim
mbatchd
mbschd
res
sbatchd
eauth.krb5
ego_client
egosc
krbrenewd
mesub
nios
pim
rla
eauth.cve
libbat.a
libbat.so
liblsf.a
liblsf.so
schmod_demand.so
libptmalloc3.so
cal_jobweight.so
liblsbstream.so
schmod_advrsv.so
schmod_affinity.so
schmod_aps.so
schmod_bluegene.so
schmod_cpuset.so
schmod_craylinux.so
schmod_crayx1.so
schmod_dc.so
schmod_default.so
schmod_dist.so
schmod_fairshare.so
schmod_fcfs.so
schmod_jobweight.so
schmod_limit.so
schmod_mc.so
schmod_parallel.so
schmod_preemption.so
schmod_ps.so
schmod_pset.so
schmod_reserve.so
schmod_rms.so
schmod_xl.so
libfairshareadjust.a
liblsbstream.a
libsec_ego_default.so
lsf_release
misc/examples/external_plugin/allocexample.c
misc/examples/external_plugin/Makefile
misc/examples/external_plugin/matchexample.c
misc/examples/external_plugin/myplugin.c
misc/examples/external_plugin/README
misc/examples/external_plugin/sch.mod.fcfs.c
misc/examples/clearcase/daemons.wrap.c
packagedef.txt
resource_connector/azure/lib/AzureTool.jar
resource_connector/azure/conf/azureprov_config.json
resource_connector/azure/conf/azureprov_templates.json
resource_connector/azure/scripts/getAvailableTemplates.sh
resource_connector/azure/scripts/getRequestStatus.sh
resource_connector/azure/scripts/requestMachines.sh
resource_connector/azure/scripts/requestReturnMachines.sh
resource_connector/azure/scripts/user_data.sh
resource_connector/ego/scripts/Main.py
resource_connector/openstack/scripts/Main.py
resource_connector/openstack/scripts/OpenStackClient.py
resource_connector/openstack/scripts/userscript.sh
resource_connector/openstack/scripts/MachineFile.py
resource_connector/openstack/scripts/user_data.sh
resource_connector/openstack/conf/osprov_templates.json
resource_connector/openstack/conf/osprov_config.json
resource_connector/aws/conf/awsprov_templates.json
resource_connector/aws/conf/awsprov_config.json
resource_connector/aws/conf/credentials
resource_connector/aws/lib/AwsTool.jar
resource_connector/aws/scripts/user_data.sh
resource_connector/aws/scripts/getAvailableMachines.sh
resource_connector/aws/scripts/getAvailableTemplates.sh
resource_connector/aws/scripts/getRequestStatus.sh
resource_connector/aws/scripts/getReturnRequests.sh
resource_connector/aws/scripts/requestMachines.sh
resource_connector/aws/scripts/requestReturnMachines.sh
resource_connector/policy/Main.py
resource_connector/policy/Log.py
resource_connector/policy/PolicyFile.py
resource_connector/softlayer/lib/SoftLayerTool.jar
resource_connector/softlayer/scripts/getRequestStatus.sh
resource_connector/softlayer/scripts/requestReturnMachines.sh
resource_connector/softlayer/scripts/getAvailableTemplates.sh
resource_connector/softlayer/scripts/requestMachines.sh
resource_connector/softlayer/scripts/postprovision.sh
resource_connector/softlayer/conf/credentials
resource_connector/softlayer/conf/softlayerprov_config.json
resource_connector/softlayer/conf/softlayerprov_templates.json
resource_connector/hostProviders.json
util/elim.mic.ext/README
esub.p8aff (only needed on Linux3.10-glibc2.17-ppc64le)

 

7.   Product notifications

To receive information about product solution and patch updates automatically, subscribe to product notifications on the My notifications page (www.ibm.com/support/mynotifications) on the IBM Support website (support.ibm.com). You can edit your subscription settings to choose the types of information you want to get notification about, for example, security bulletins, fixes, troubleshooting, and product enhancements or documentation changes.


8.   Copyright and trademark information

© Copyright IBM Corporation 2017

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

IBM®, the IBM logo and ibm.com® are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.