IBM Spectrum LSF Data Manager 10.1 Fix Pack 4 (476296) Readme File

Abstract

IBM Spectrum LSF Data Manager 10.1 Fix Pack 4. This Fix Pack includes all fixes and solutions delivered between the 10.1 GA release and 23 November 2017.

Description

Readme documentation for IBM Spectrum LSF Data Manager 10.1 Fix Pack 4 (476296) including installation-related instructions, prerequisites and co-requisites, and list of fixes.

The issues addressed in LSF Data Manager 10.1 Fix Pack 4:

ID: P102062
Fixed Date: 2017/04/14
Description:

This fix addresses the following issue:
When a job with a data requirement is first submitted to an LSF cluster, the data manager attempts to create a metadata file for the job within its staging area. If the data manager cannot access the staging area, the metadata file might not be created. This failure is reported back to mbatchd, which interprets it to mean that the job's data requirement cannot be satisfied, and the job is terminated. The behavior of the data manager has been changed so that when metadata file creation fails, the data manager retries periodically until it succeeds. During this period, the job with the data requirement remains pending.
Also, data transfer queues can now be configured with REQUEUE_EXIT_VALUES in the lsb.queues file. If a transfer job exits with one of the configured values, it is requeued. An enhanced mbatchd is part of LSF 10.1 Fix Pack 4.
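
As a minimal sketch of the requeue configuration described above, a data transfer queue in the lsb.queues file might look like the following. The queue name, the host names, and the exit values 99 and 100 are placeholders chosen for illustration; use the exit codes that your transfer command actually returns for failures that are worth retrying.

```text
Begin Queue
QUEUE_NAME = data_transfer
DATA_TRANSFER = Y
REQUEUE_EXIT_VALUES = 99 100   # placeholder exit codes; a transfer job that exits with either value is requeued
HOSTS = host3 host4            # placeholder transfer nodes
End Queue
```

After editing the lsb.queues file, run badmin reconfig so that the change takes effect.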

ID: P102401
Fixed Date: 2017/11/09
Description:

This fix addresses the following issue:
An LSF job can use the bstage command to stage out data (a file or a folder) from its working directory to the data manager's staging cache. If the -tag option is not used, the data manager submits a transfer job to stage the data out from the staging area to the remote destination. When 'bstage out' runs, the bstage command notifies the data manager that it wants to stage out data, and the data manager creates a path in its stage-out cache. In the specific case where -src specifies a folder and the execution host cannot access the data manager's cache directly through the file system, the path in the data manager's cache was not set up correctly, which caused 'bstage out' to fail.
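
The folder case described above can be illustrated with a small helper for job scripts. This is a sketch only: the function name stage_out_dir and the results path are hypothetical, and appending a trailing slash to mark the source as a folder is an assumption that should be checked against the bstage reference for your release.

```shell
#!/bin/sh
# Stage out a folder from the job's working directory with 'bstage out -src'.
# stage_out_dir is a hypothetical helper, not part of LSF.
stage_out_dir() {
    src=$1
    # Refuse anything that is not an existing directory.
    if [ ! -d "$src" ]; then
        echo "stage_out_dir: not a directory: $src" >&2
        return 2
    fi
    # Assumption: a trailing slash marks the source as a folder.
    bstage out -src "${src%/}/"
}

# Inside a job script, for example:
#   stage_out_dir results || exit 1
```

The directory check runs before bstage is called, so a mistyped path fails quickly on the execution host instead of surfacing later as a transfer error.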


The solutions in LSF Data Manager 10.1 Fix Pack 4:

ID: P102275
Fixed Date: 2017/08/31
Description:

The previous release of IBM® Spectrum LSF Data Manager allowed users to specify a folder as the data requirement when submitting a job. LSF would attempt to discover the individual files contained within the folder and generate a signature for each file. As a result, each file was an individual data requirement for the job and potentially required its own transfer job. IBM® Spectrum LSF Data Manager has been enhanced so that when a folder is specified as a data requirement for a job, LSF generates a single signature for the folder as a whole, and only a single transfer job is required. Linked files can now be used when specifying a job data requirement, and the colon (:) character can now be used in the path of a job data requirement.
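
As an illustration of a folder data requirement, the sketch below writes a small job script and shows, in a comment, how it might be submitted. The paths and the script name folder_job.sh are hypothetical, and the convention that a path ending in a slash denotes a folder should be confirmed against the bsub -data reference.

```shell
#!/bin/sh
# Write a hypothetical job script that stages in its data requirement.
cat > folder_job.sh <<'EOF'
#!/bin/sh
# Copy all required data from the staging area into the job's
# current working directory, then list what arrived.
bstage in -all || exit 1
ls -R .
EOF
chmod +x folder_job.sh

# Submit with the whole folder as a single data requirement
# (hypothetical path; the trailing slash is assumed to mark a folder):
#   bsub -data "/proj/user1/input/" ./folder_job.sh
```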

ID: 159307
Fixed Date: 2017/09/29
Description:

Enhances the LSF data manager implementation for hybrid cloud environments that use job forwarding with the LSF multicluster capability. The submission cluster can now push a forwarded job’s data requirement to the execution cluster and receive the forwarded job’s output back.

This fix enables the SNDJOBS_TO parameter for the data transfer queue in the execution cluster.
The path of the FILE_TRANSFER_COMMAND for the data manager host must exist in the submission cluster.
For example, the following definitions in the lsb.queues file of each cluster configure a submission cluster C1 and an execution cluster C2:

In submission cluster C1
Set a queue to receive the data transfer job and push the local data file to the remote execution cluster.

Begin Queue
QUEUE_NAME = receive
PRIORITY = 40
RCVJOBS_FROM = C2
HOSTS=host1 host2 # Transfer nodes of the submission cluster
RES_REQ = select[type==any]
End Queue

In execution cluster C2
Set a data transfer queue to forward the data transfer job to the submission cluster.

Begin Queue
QUEUE_NAME = data_transfer
DATA_TRANSFER=y
SNDJOBS_TO = receive@C1
HOSTS=host3 host4 # Transfer nodes of the execution cluster
End Queue

Notes:
1. Configure password-less SSH and permanently add the following hosts to the .ssh/known_hosts file:
1.1 Transfer nodes that are needed to access all hosts in the local cluster
1.2 Transfer nodes in the submission cluster
1.3 Transfer nodes in the job execution cluster
1.4 Compute nodes in the job execution cluster
1.5 NFS server for the staging area if the compute node does not mount the staging area

2. To prevent overusing transfer nodes for handling forwarded data transfer jobs, configure the load balance policy between the transfer queue and the receive-jobs queue, such as queue-based fairshare (slot_pool and slot_share) and slot limit for each queue.

3. If the submission cluster for the remote batch job is not specified in the send-jobs queue, the data transfer job runs in the main job execution cluster instead of forwarding back to the submission cluster.

4. If the job with the data requirement is submitted to the execution cluster C2, the data transfer logic is the same as for a single cluster.
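
Note 2 can be sketched as follows for the submission cluster C1, assuming that its transfer nodes host1 and host2 serve both a local data transfer queue and the receive-jobs queue. The pool name transferPool and the 50/50 split are illustrative values, not recommendations, and the constraints on queues that share a slot pool should be checked in the lsb.queues reference.

```text
Begin Queue
QUEUE_NAME = data_transfer
DATA_TRANSFER = Y
HOSTS = host1 host2
SLOT_POOL = transferPool   # illustrative pool name
SLOT_SHARE = 50            # share of the pool's slots, as a percentage
End Queue

Begin Queue
QUEUE_NAME = receive
PRIORITY = 40
RCVJOBS_FROM = C2
HOSTS = host1 host2
RES_REQ = select[type==any]
SLOT_POOL = transferPool
SLOT_SHARE = 50
End Queue
```

With both queues in one pool, forwarded transfer jobs and local transfer jobs each receive a bounded share of the slots on the transfer nodes.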

Readme file for: IBM® Spectrum LSF Data Manager

Product/Component Release: 10.1

Update Name: Fix 476296

Fix ID: lsf-dm-10.1.0.4-spk-2017-Dec-build476296

Publication date: 11 December 2017

Last modified date: 11 December 2017

Contents:

1.     List of fixes

2.     Download location

3.     Products or components affected

4.     System requirements

5.     Installation and configuration

6.     List of files

7.     Product notifications

8.     Copyright and trademark information

 

1.   List of fixes

P102062, P102275, P102401

2.   Download location

Download Fix 476296 from the following location: http://www.ibm.com/eserver/support/fixes/

3.   Products or components affected

Components affected by the issues and solutions addressed in LSF Data Manager Version 10.1 Fix Pack 4 include:
LSF/bmod
LSF/bsub

Data Manager/bdata
Data Manager/bstage
Data Manager/dmd
 

4.   System requirements

lnx26-x64
lnx310-x64
lnx310-ppc64le

 

5.   Installation and configuration

 

5.1          Before installation

 

The prerequisite for applying this patch is that LSF 10.1 Fix Pack 4 has been installed. It can be downloaded from http://www.ibm.com/eserver/support/fixes/. Search for "build476021".


In the following steps, LSF_TOP is the full path to the top-level installation directory of LSF.

1)    Log on to the LSF master host as root

2)    Set your environment:

-      For csh or tcsh: % source LSF_TOP/conf/cshrc.lsf

-      For sh, ksh, or bash: $ . LSF_TOP/conf/profile.lsf

 

5.2          Installation steps

 

1)    Go to the patch install directory: cd $LSF_ENVDIR/../10.1/install/

2)    Copy the patch file to the install directory $LSF_ENVDIR/../10.1/install/


3)    Run patchinstall: ./patchinstall <patch>
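
The installation steps above can be sketched as a single shell function. The name install_lsf_patch is hypothetical, and the function assumes that it is run as root on the LSF master host after profile.lsf has been sourced, so that LSF_ENVDIR is set.

```shell
#!/bin/sh
# Sketch of the installation steps; install_lsf_patch is a hypothetical helper.
install_lsf_patch() {
    patch_file=$1
    if [ -z "${LSF_ENVDIR:-}" ]; then
        echo "LSF_ENVDIR is not set; source LSF_TOP/conf/profile.lsf first" >&2
        return 1
    fi
    if [ ! -f "$patch_file" ]; then
        echo "patch file not found: $patch_file" >&2
        return 1
    fi
    install_dir="$LSF_ENVDIR/../10.1/install"
    # Step 2: copy the patch file into the install directory.
    cp "$patch_file" "$install_dir/" || return 1
    # Steps 1 and 3: change to the install directory and run patchinstall.
    ( cd "$install_dir" && ./patchinstall "$(basename "$patch_file")" )
}

# Example, with <patch> standing for the downloaded patch file:
#   install_lsf_patch /tmp/<patch>
```

The two checks at the top stop the function before anything is copied when the environment has not been set or the patch file path is wrong.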

 

5.3          After installation

 

1)    Restart mbatchd and reconfigure the data manager:
badmin mbdrestart
bdata admin reconfig



 

5.4          Uninstallation

 

To roll back a patch:

1)    Log on to the LSF master host as root

2)    Set your environment:

-      For csh or tcsh: % source LSF_TOP/conf/cshrc.lsf

-      For sh, ksh, or bash: $ . LSF_TOP/conf/profile.lsf


3)    From the patch install directory ($LSF_ENVDIR/../10.1/install/), run ./patchinstall -r <patch>

4)    Restart mbatchd and reconfigure the data manager:
badmin mbdrestart
bdata admin reconfig



6.   List of files in package

 

bmod
bsub
bdata
bstage
dmd

 

7.   Product notifications

To receive information about product solution and patch updates automatically, subscribe to product notifications on the My notifications page (www.ibm.com/support/mynotifications) on the IBM Support website (support.ibm.com). You can edit your subscription settings to choose the types of information that you want to be notified about, for example, security bulletins, fixes, troubleshooting, and product enhancements or documentation changes.


8.   Copyright and trademark information

© Copyright IBM Corporation 2017

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

IBM®, the IBM logo and ibm.com® are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.