IBM Spectrum LSF 10.1 Solution LSF Docker Job Support
Readme File
Abstract
RFE#86007.
The solution allows LSF to run jobs in Docker containers on demand. LSF
manages the entire life cycle of jobs that run in a container in the
same way as ordinary jobs.
Description
Readme documentation for IBM Spectrum LSF 10.1 Solution build 423991,
including installation-related instructions, prerequisites and
co-requisites, and list of fixes.
This solution enables LSF to run and manage jobs in Docker containers.
Prerequisites
- Docker Engine version 1.12 or later must be installed on the LSF
server hosts. The Docker daemon must be running on these hosts and must
be able to start containers successfully.
- In the lsf.conf file, configure the following parameters:
LSF_PROCESS_TRACKING=Y
LSF_LINUX_CGROUP_ACCT=Y
LSB_RESOURCE_ENFORCE="cpu memory"
- In the lsf.shared file, configure the Boolean resource "docker":
Begin Resource
RESOURCENAME  TYPE     INTERVAL  INCREASING  DESCRIPTION
docker        Boolean  ()        ()          (docker host)
- In the lsf.cluster file, attach the Boolean resource "docker" to the
LSF server hosts on which Docker Engine is running. This enables LSF to
automatically dispatch Docker jobs to the "docker" hosts.
Begin Host
HOSTNAME  model  type  server  r1m  mem  swp  RESOURCES
host1     !      !     1       3.5  ()   ()   (docker)
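The lsf.conf settings listed in the prerequisites can be verified before Docker jobs are submitted. The following is a minimal sketch, not part of the solution: check_lsf_conf is a hypothetical helper name, and it assumes that each parameter appears on its own line exactly as written above, with no extra whitespace or trailing comments.

```shell
#!/bin/sh
# check_lsf_conf is a hypothetical helper, not an LSF command.
# $1 = path to lsf.conf. Prints every required setting that is not
# present as an exact line, and returns non-zero if any is missing.
check_lsf_conf() {
    conf=$1
    rc=0
    for want in \
        'LSF_PROCESS_TRACKING=Y' \
        'LSF_LINUX_CGROUP_ACCT=Y' \
        'LSB_RESOURCE_ENFORCE="cpu memory"'
    do
        # -F: fixed string, -x: match the whole line, -q: no output
        if ! grep -Fxq -- "$want" "$conf"; then
            echo "missing: $want"
            rc=1
        fi
    done
    return $rc
}
```

With the LSF environment set, a typical call might be check_lsf_conf "$LSF_ENVDIR/lsf.conf". The exact-line match is deliberately strict. A site that formats lsf.conf differently would need a looser pattern.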
This solution introduces a new parameter, CONTAINER, in the
lsb.applications file for configuring the Docker job application
profile.
Syntax
CONTAINER=docker[image(image-name) options(docker-run-options) starter(user-name)]
Description
image: Required. This keyword specifies the name of the Docker image
that is used to run jobs.
options: Optional. This keyword specifies the Docker run options, which
LSF passes to the job container through "docker run".
starter: Optional. This keyword specifies the name of the user that
runs "docker run" to launch containers for jobs. The default user is
the LSF primary administrator.
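To illustrate how the three keywords combine, the following sketch assembles a CONTAINER line from its parts. container_line is a hypothetical helper, not an LSF tool. It only formats text according to the syntax above and does not check that the image or options are valid.

```shell
#!/bin/sh
# container_line is a hypothetical helper, not an LSF command.
# $1 = image name (required)
# $2 = docker run options (optional, may be empty)
# $3 = starter user name (optional, may be empty)
container_line() {
    image=$1
    options=${2-}
    starter=${3-}
    if [ -z "$image" ]; then
        echo "container_line: image name is required" >&2
        return 1
    fi
    line="CONTAINER=docker[image($image)"
    if [ -n "$options" ]; then
        line="$line options($options)"
    fi
    if [ -n "$starter" ]; then
        line="$line starter($starter)"
    fi
    printf '%s]\n' "$line"
}
```

For example, container_line image-name "--rm" prints a line with the image and options keywords only, and the starter then defaults to the LSF primary administrator as described above.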
Note
1. Before you specify Docker run options, make sure that these options
work on the "docker run" command line.
The "--cgroup-parent", "--user" ("-u"), and "--name" options are
reserved for LSF internal use. Do not use these options in the options
keyword configuration.
The "-w" and "--ulimit" options are set automatically by LSF. Do not
use these options in the options keyword configuration because values
that are specified there override the LSF settings.
The "-v" option is automatically used to mount the directories that LSF
needs: the current working directory, the job spool directory, the
destination file for the "bsub -f" command, the tmp directory, the
top-level LSF directory, and the checkpoint directory on demand.
2. It is recommended to configure "--rm" in the options keyword so that
containers are removed automatically after the job is done.
3. The starter account must be root or a user that belongs to the
"docker" user group. To add a user to the "docker" user group, run the
following command:
sudo usermod -aG docker starter_username
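The option restrictions in note 1 can be checked mechanically before a profile is edited. The following is a minimal sketch under stated assumptions: check_docker_options is a hypothetical helper, it recognizes only the option spellings named in the note (alone or in the flag=value form), and it does not cover other aliases or abbreviated forms that "docker run" might accept.

```shell
#!/bin/sh
# check_docker_options is a hypothetical helper, not an LSF command.
# Pass the intended docker run options as separate arguments. Prints one
# line per option that the note above says to avoid, and returns
# non-zero if any such option is found.
check_docker_options() {
    rc=0
    for opt in "$@"; do
        case $opt in
            --cgroup-parent|--cgroup-parent=*|--user|--user=*|-u|-u=*|--name|--name=*)
                echo "reserved for LSF: $opt"
                rc=1
                ;;
            -w|-w=*|--ulimit|--ulimit=*)
                echo "set by LSF: $opt"
                rc=1
                ;;
        esac
    done
    return $rc
}
```

For example, check_docker_options --rm --network=host --ipc=host prints nothing and returns zero, while check_docker_options --rm --name=job1 reports the reserved "--name" option.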
Examples
CONTAINER=docker[image(image-name) options(--rm)]
For blaunch to work, the network and IPC must work across containers,
and a file that maps execution user IDs to user names must be mounted
into the container for blaunch authentication:
CONTAINER=docker[image(image-name) options(--rm --network=host --ipc=host -v /path/to/my/passwd:/etc/passwd)]
The passwd file is in the following format:
user1:x:10001:10001:::
user2:x:10002:10002:::
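Entries in this format can be generated rather than typed by hand. The sketch below is an illustration only: passwd_entry and passwd_entry_for_user are hypothetical helper names, and the second helper assumes that the user is resolvable on the local host through the standard id command.

```shell
#!/bin/sh
# Hypothetical helpers, not part of LSF or Docker.

# Print one mapping line in the format shown above.
# $1 = user name, $2 = numeric user ID, $3 = numeric group ID
passwd_entry() {
    printf '%s:x:%s:%s:::\n' "$1" "$2" "$3"
}

# Look up the IDs of an existing local user and print its mapping line.
# $1 = user name
passwd_entry_for_user() {
    passwd_entry "$1" "$(id -u "$1")" "$(id -g "$1")"
}
```

For example, passwd_entry user1 10001 10001 produces the first line of the sample file. Redirecting the output of several calls into /path/to/my/passwd would build the file that the CONTAINER example mounts.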
Jobs submitted to the docker application profile are started in the
container, for example:
bsub -app docker ./myjob.sh
Known issues
1. For parallel Docker jobs, LSF reports the incorrect status DONE
instead of EXIT under the following conditions:
- Ctrl+C is used to cancel an interactive parallel Docker job.
- One task crashes when RTASK_GONE_ACTION=KILLJOB_TASKEXIT is
configured in the lsb.applications file.
2. Do not use "--sig-proxy=false" in the options keyword of the
CONTAINER parameter. This setting might trigger a kernel bug that hangs
the whole system when you run bkill on a Docker job that was started
with it.
Readme file for: IBM® Spectrum LSF
Product/Component Release: 10.1
Update Name: Solution LSF Docker job support
Fix ID: LSF-10.1-build 423991
Publication date: 28 September 2016
Last modified date: 28 September 2016
Contents:
1. List of fixes
2. Download location
3. Products or components affected
4. System requirements
5. Installation and configuration
6. List of files
7. Product notifications
8. Copyright and trademark information
1. List of fixes
RFE#86007. LSF Docker job support
2. Download location
Download Fix build 423991 from the following location:
http://www.ibm.com/eserver/support/fixes/
3. Products or components affected
Affected components include: mbschd, mbatchd, sbatchd, res, bapp,
blaunch, lsadmin, badmin
4. System requirements
Linux2.6-glibc2.3-x86_64
Linux3.10-glibc2.17-x86_64
5. Installation and configuration
5.1 Before installation
(LSF_TOP=Full path to the top-level installation directory of LSF.)
1) Log on to the LSF master host as root
2) Set your environment:
- For csh or tcsh: % source LSF_TOP/conf/cshrc.lsf
- For sh, ksh, or bash: $ . LSF_TOP/conf/profile.lsf
5.2 Installation steps
1) Go to the patch install directory: cd $LSF_ENVDIR/../10.1/install/
2) Copy the patch file to the install directory
$LSF_ENVDIR/../10.1/install/
3) Run patchinstall: ./patchinstall <patch>
5.3 After installation
1) Log on to the LSF master host as root
2) Run lsfrestart
5.4 Uninstallation
To roll back a patch:
1) Log on to the LSF master host as root
2) Run ./patchinstall -r <patch>
3) Run lsfrestart
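The install and rollback sequences in sections 5.2 to 5.4 can be previewed as a dry run before anything is changed on the master host. The following sketch only prints commands and never executes them. plan_patch is a hypothetical helper name, and it assumes that patchinstall is given the patch file name from inside the install directory and that the rollback is also run from that directory, neither of which this readme states explicitly.

```shell
#!/bin/sh
# plan_patch is a hypothetical helper, not an LSF command. It prints the
# commands for an install or a rollback without running them.
# $1 = "install" or "rollback"
# $2 = value of LSF_ENVDIR
# $3 = path to the patch file
plan_patch() {
    action=$1
    install_dir="$2/../10.1/install"
    patch=$3
    case $action in
        install)
            echo "cd $install_dir"
            echo "cp $patch $install_dir/"
            echo "./patchinstall $(basename "$patch")"
            ;;
        rollback)
            echo "cd $install_dir"
            echo "./patchinstall -r $(basename "$patch")"
            ;;
        *)
            echo "usage: plan_patch install|rollback LSF_ENVDIR PATCH" >&2
            return 2
            ;;
    esac
    # Both sequences end with a restart of the LSF daemons.
    echo "lsfrestart"
}
```

For example, plan_patch install "$LSF_ENVDIR" /tmp/patchfile prints the four commands in order, where /tmp/patchfile is a placeholder path. The printed commands are meant to be run as root on the LSF master host after the environment is set as in section 5.1.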
6. List of files
mbschd
mbatchd
sbatchd
res
bapp
blaunch
badmin
lsadmin
7. Product notifications
To receive information about product solution and patch updates
automatically, subscribe to product notifications on the My
notifications page (www.ibm.com/support/mynotifications) on the IBM
Support website (support.ibm.com).
You can edit your subscription settings to choose the types of
information you want to get notification about, for example, security
bulletins, fixes, troubleshooting, and product enhancements or
documentation changes.
8. Copyright and trademark information
© Copyright IBM Corporation 2016
U.S. Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corp.
IBM®, the IBM logo and ibm.com® are trademarks of International Business
Machines Corp., registered in many jurisdictions worldwide. Other
product and service names might be trademarks of IBM or other companies.
A current list of IBM trademarks is available on the Web at "Copyright
and trademark information" at www.ibm.com/legal/copytrade.shtml.