IBM Platform LSF 9.1.3 Fix 392106 Readme File

Abstract

RFE 73159. This fix introduces the lsreghost command to register the IP and host name of your local host with LSF servers so that LSF does not need to use the DNS to resolve your local host.

Description

Readme documentation for IBM Platform LSF 9.1.3 Fix 392106 including installation-related instructions, prerequisites and co-requisites, and list of fixes.
This fix addresses the issue of resolving the host name and IP address for LSF client hosts with non-static IP addresses in environments where the DNS is not able to properly resolve these hosts after their IP addresses change.

The following describes how lsreghost works and how to properly configure and run lsreghost to directly register LSF client hosts with LSF servers so that LSF servers can internally resolve these client hosts without needing the DNS.

Configuring registration handling on the LSF Server

The following parameter in the lsf.conf file enables registration handling:

LSF_REG_FLOAT_HOSTS=Y|y|N|n

On the LSF server, LIM receives the registration request from clients running the lsreghost service. From this request, LIM receives the host name, IP address, and (on Windows only) the computer SID of the client host. This information is stored in the reghostscache file.

The LSF_REG_FLOAT_HOSTS parameter must be set on the LSF server. This parameter enables LSF daemons to look for records in the reghostscache file when attempting to look up host names or IP addresses.
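
Before relying on registration, it can help to confirm that the parameter is actually set in the lsf.conf file that the server reads. The following is a minimal sketch, not part of LSF: the function names get_lsf_param and reg_float_hosts_enabled are hypothetical helpers, and the sketch assumes simple NAME=value lines where the last definition of a parameter wins.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one NAME=value parameter from an
# lsf.conf-style file.
# Usage: get_lsf_param <file> <parameter_name>
get_lsf_param() {
    conf_file=$1
    param=$2
    # Skip comment lines and lines without "=", strip optional double quotes,
    # and keep the last definition if the parameter appears more than once
    # (an assumption of this sketch).
    awk -F= -v name="$param" '
        /^[ \t]*#/ { next }
        index($0, "=") == 0 { next }
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            if (key == name) {
                value = substr($0, index($0, "=") + 1)
                gsub(/^[ \t"]+|[ \t"]+$/, "", value)
                found = value
            }
        }
        END { if (found != "") print found }
    ' "$conf_file"
}

# Hypothetical helper: return success (0) if the file sets
# LSF_REG_FLOAT_HOSTS to Y or y.
reg_float_hosts_enabled() {
    case "$(get_lsf_param "$1" LSF_REG_FLOAT_HOSTS)" in
        Y|y) return 0 ;;
        *)   return 1 ;;
    esac
}
```

For example, after sourcing the LSF environment, reg_float_hosts_enabled "$LSF_ENVDIR/lsf.conf" returns success only when registration handling is enabled in that file.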

By default, the reghostscache file is stored in the directory defined by the LSB_SHAREDIR parameter in the lsf.conf file. Using a shared directory defined by LSB_SHAREDIR allows the reghostscache file to be shared with as many LSF servers as possible. If LSB_SHAREDIR is a shared directory that is accessible to all LSF servers, hosts that use this registration mechanism must register with the master host only. Otherwise, register with the master host and with any individual server hosts that have a local LSB_SHAREDIR.

 

If LSB_SHAREDIR is not defined, the reghostscache file is placed in the LSF_TOP directory.

A typical record in the reghostscache file might look like the following:
MyHost1    192.168.1.2    S-1-5-21-5615612300-9789239785-9879786971

Windows clients that register have their computer SID included as part of the record. If a registration request is received from an already registered host but its SID does not match the SID in the corresponding record in the reghostscache file, the new registration request is rejected. This prevents a malicious client from using another host's name to register itself as that host.
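
The acceptance rule described above can be illustrated with a short script. This is only a sketch of the rule, not how LIM implements it: check_registration is a hypothetical function, the cache file is assumed to use the whitespace-separated three-column layout of the sample record (host name, IP address, SID), and host names are compared exactly.

```shell
#!/bin/sh
# Hypothetical helper: decide whether a registration request would be
# accepted under the SID rule, given a reghostscache-style file.
# Usage: check_registration <cache_file> <host_name> <sid>
# Prints "accept" or "reject"; returns 0 for accept and 1 for reject.
check_registration() {
    cache_file=$1
    host=$2
    sid=$3
    # Look up the SID already recorded for this host name, if any.
    recorded_sid=$(awk -v h="$host" '$1 == h { print $3; exit }' "$cache_file")
    # Reject only when a SID is on record and it differs from the new one.
    if [ -n "$recorded_sid" ] && [ "$recorded_sid" != "$sid" ]; then
        echo "reject"
        return 1
    fi
    echo "accept"
    return 0
}
```

With the sample record in place, a request for MyHost1 that carries a different SID is rejected, while a request for a host name that has no record yet is accepted.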

lsreghost client for Windows
Usage: lsreghost [-h | -V]
           lsreghost (-i <file_path>/hostregsetup | -r | -s | -e)

-h: Displays usage information
-V: Displays product version
-i <file_path>/hostregsetup: Installs lsreghost as a Windows service. This option requires the full file path of the hostregsetup file.
-r: Uninstalls and removes the lsreghost Windows service
-s: Starts the lsreghost service
-e: Stops the lsreghost service

Note: The lsreghost command modifies Windows services, which means that it must be run with administrator privileges. Be sure to select “Run as administrator” when launching a command prompt to run this command.


Optional client side parameters in the lsf.conf file:
LSF_CONNECTION_CHANGE=Y|y|N|n
LSF_REG_HOST_INTERVAL=integer (minutes)

hostregsetup file format
The hostregsetup file is used to specify the LSF servers to which the local host must register itself. Define the LSF servers as follows:

serverName1
serverName2
...

Each line in the hostregsetup file contains the host name of one LSF server. Empty lines and comment text that begins with # are allowed.
Note that if LSB_SHAREDIR can be accessed by all the servers in a cluster, only the master LIM host should be defined in the hostregsetup file, as explained in the "Configuring registration handling on the LSF server" section of this README.
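
To preview which servers a hostregsetup file names, the file can be filtered with standard tools. The following is a minimal sketch: list_reg_servers is a hypothetical helper, and it assumes that a comment runs from # to the end of the line, including when it follows a server name.

```shell
#!/bin/sh
# Hypothetical helper: list the server names in a hostregsetup-style file,
# one per line.
# Usage: list_reg_servers <file>
list_reg_servers() {
    # Drop "#" comments, trim surrounding whitespace, skip empty lines.
    sed -e 's/#.*$//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' "$1"
}
```

For example, list_reg_servers "$HOME/hostregsetup" prints only the server names, which makes it easy to confirm that a shared-LSB_SHAREDIR cluster lists just the master host.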

The full path to the hostregsetup file must be specified with the install (-i) option. This file must be accessible every time the service starts up, and remain accessible when the service is running.

For example,

            lsreghost -i C:\LSF_9.1\hostregsetup

How to use the lsreghost Windows service
Once the hostregsetup file is ready, install the lsreghost service by using the command:

            lsreghost -i <file_path>/hostregsetup

If it is installed correctly, the lsreghost service can be seen in Windows Task Manager's Services tab. The service's startup type is set to automatic, which means that the lsreghost service automatically starts up every time the local machine starts up. You can also start up the lsreghost service by using the following command:

            lsreghost -s

Note: lsreghost.exe must be in the LSF_BINDIR directory for the service to work.

When it starts up, lsreghost registers the local host with LSF servers defined in the hostregsetup file. The service then periodically checks the primary IP address of the local machine, and if it detects a change in the primary IP address, the service sends another registration message to LSF servers to register the new IP address.

If the LSF_CONNECTION_CHANGE parameter in the lsf.conf file is set on the client side, lsreghost also registers with LSF servers whenever it detects a change in the total number of connections (IP addresses) that are associated with the local host.

Another lsf.conf parameter, LSF_REG_HOST_INTERVAL, can be set on the client side. This parameter takes an integer value X in minutes. If it is defined, lsreghost resends a registration message to LSF servers every X minutes.


The LSF_CONNECTION_CHANGE and LSF_REG_HOST_INTERVAL parameters are read during service startup, so if the lsreghost service is already running when these parameters are set, the service must be restarted. This is not the case with the hostregsetup file: servers can be added or deleted at any time, because lsreghost rereads the file when it resends a registration message.

To stop the lsreghost service, run the following command:
            lsreghost -e

To remove lsreghost from Windows Services, run the following command:
            lsreghost -r

lsreghost client for Linux
Usage: lsreghost [-h | -V]
           lsreghost -s <file_path>/hostregsetup

-h: Displays usage information
-V: Displays product version
-s <file_path>/hostregsetup: Sends a registration request to LSF servers

lsreghost on Linux is a command that registers with LSF servers when run.
            lsreghost -s <file_path>/hostregsetup

When run with the -s option, lsreghost sends a single registration message to all LSF servers defined in the hostregsetup file and then exits.

For security purposes, lsreghost can only be run with root privileges. If you want to run lsreghost at set intervals, set up a Linux cron job.

The following is a sample cron job script:
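#!/bin/sh
# Re-register this host with LSF whenever the IP address of eth0 changes.
previous_ip=""
ip_rec_file=/tmp/ip.record
# Extract the "addr:x.x.x.x" field (for example, addr:9.21.14.28) from the
# net-tools ifconfig output. Adjust the interface name for your host.
current_ip=$(ifconfig eth0 | grep "inet addr:" | awk '{ print $2 }')
# Load the address recorded by the previous run, if any.
if [ -f "${ip_rec_file}" ]; then
    previous_ip=$(cat "${ip_rec_file}")
fi
# On the first run, or after an address change, record the address and register.
if [ "$current_ip" != "$previous_ip" ]; then
    echo "${current_ip}" > "${ip_rec_file}"
    echo "ip is changed, register this host by calling lsreghost -s $HOME/hostregsetup"
    lsreghost -s "$HOME/hostregsetup"
fi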

A cron job can be configured to run the sample script every minute, or any other time interval.
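
As an illustration of the scheduling step, the following sketch builds the crontab entry for a chosen interval. The function reg_cron_entry and the script path /usr/local/sbin/lsf_reregister.sh used in the usage line are hypothetical names, and the */N step syntax assumes a cron implementation that supports it, as the cron daemons commonly shipped with Linux do.

```shell
#!/bin/sh
# Hypothetical helper: print a crontab entry that runs a script every
# N minutes (1 <= N <= 59).
# Usage: reg_cron_entry <minutes> <script_path>
reg_cron_entry() {
    minutes=$1
    script=$2
    if [ "$minutes" -eq 1 ]; then
        # Every minute needs no step value.
        printf '* * * * * %s\n' "$script"
    else
        printf '*/%s * * * * %s\n' "$minutes" "$script"
    fi
}
```

Because lsreghost requires root privileges, the entry belongs in the crontab of root. One way to append it, run as root, is:

    ( crontab -l 2>/dev/null; reg_cron_entry 5 /usr/local/sbin/lsf_reregister.sh ) | crontab -

Cron runs jobs with a minimal environment, so the script may need to source profile.lsf itself so that lsreghost can be found.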

Recommendations and limitations
For environments where LSF servers have non-shared file systems, the local host registration request may need to be sent to several LSF servers. This might not scale well if client hosts send registration messages too frequently in clusters with many servers.

 

For environments where all LSF servers have access to a shared file system location defined by LSB_SHAREDIR, all users of lsreghost must register with the master host. If different lsreghost users register with different LSF servers in a shared environment, the internal caches of these servers become inconsistent, which causes the shared reghostscache file to be inaccurate.

The LSF_REG_HOST_INTERVAL parameter is intended to help in situations where certain LSF servers are down and therefore miss the registration request. Re-registration on a periodic timer defined with LSF_REG_HOST_INTERVAL should help keep the reghostscache up to date when servers come back online after being down. Note that setting a low interval for this parameter may cause unnecessary network traffic that might not scale well in large clusters.
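
A rough estimate of the periodic traffic can guide the choice of interval. The sketch below uses a hypothetical function, reg_messages_per_hour, and assumes that every client sends exactly one message to each server in its hostregsetup file per interval. It ignores registrations triggered by IP address changes and uses integer division.

```shell
#!/bin/sh
# Hypothetical helper: estimate periodic registration messages per hour
# across the cluster.
# Usage: reg_messages_per_hour <clients> <servers_per_client> <interval_minutes>
reg_messages_per_hour() {
    # clients x servers x (intervals per hour), multiplied first so that
    # integer division loses as little as possible.
    echo $(( $1 * $2 * 60 / $3 ))
}
```

For example, reg_messages_per_hour 1000 3 10 prints 18000, while the same clients registering with only the master host once an hour (reg_messages_per_hour 1000 1 60) produce 1000 messages per hour.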

Platform MultiCluster and multiple clusters:
LSF servers in a MultiCluster environment can simply be listed in the hostregsetup file and lsreghost registers with them. This also works for multiple independent LSF clusters if the LIM ports in these clusters are the same. If the LIM ports in the clusters differ, you must manually register with the servers of another cluster by changing lsf.conf to point to the new cluster, including changing LSF_LIM_PORT to the port used in the new cluster, and then restarting the lsreghost service.

Windows race condition:
A race condition might be encountered by the lsreghost service with Windows network mechanisms. When the primary IP address of a Windows host changes because Windows switches to a new network interface, lsreghost might detect the IP change before Windows actually starts using the new interface to forward packets. In this case, lsreghost sends a registration message, but the message still contains the old IP address because the new interface has not yet switched to the new IP address. If this issue is encountered, you must restart lsreghost (by running lsreghost -e, then lsreghost -s) after Windows completes switching its primary network interface. Defining LSF_CONNECTION_CHANGE might help with this race condition because, in the common case where the total number of addresses also changes when the primary IP address changes, it enables lsreghost to register again 30 seconds after the initial primary IP address change.

Firewall:
If a firewall exists between LSF servers and the local host running the lsreghost service, you may need to manually re-register after a primary IP address change by restarting lsreghost. After the primary IP address of the local host changes, lsreghost sends a registration request to LSF servers, but the firewall might block this request (for example, because you have not yet authenticated with the new IP address). In this case, you must re-register manually after authenticating past the firewall (by running lsreghost -e, then lsreghost -s).

Windows client host gets a new SID:
In rare cases, a registered Windows LSF client host gets a new SID. This might happen if the client host is replaced by a new host, and this new host is given the exact same host name as the old LSF client host. In this case, the new host cannot register because its SID does not match the SID that is already recorded for that host name in the reghostscache file. To resolve this issue, the LSF administrator must manually edit the reghostscache file on the LSF server side: find the record for the given host name and remove that record (line) from the file. After editing the reghostscache file, restart the LIM so that the new host can register with its new SID.
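
The manual edit can also be scripted. The following is a minimal sketch, assuming the whitespace-separated record layout shown earlier in this Readme. The function remove_reg_record is a hypothetical helper, host names are matched exactly, and a backup copy with a .bak suffix is kept next to the original file.

```shell
#!/bin/sh
# Hypothetical helper: remove every record for one host name from a
# reghostscache-style file, keeping a backup copy of the original.
# Usage: remove_reg_record <cache_file> <host_name>
remove_reg_record() {
    cache_file=$1
    host=$2
    # Stop if the backup cannot be written, so the original is never lost.
    cp -p "$cache_file" "$cache_file.bak" || return 1
    # Keep only the lines whose first column is not the given host name.
    awk -v h="$host" '$1 != h' "$cache_file.bak" > "$cache_file"
}
```

Run the function as the LSF administrator against the reghostscache file on the server side, then restart the LIM as described above.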

 

Readme file for: IBM® Platform LSF

Product/Component Release: 9.1.3

Update Name: Fix 392106

Fix ID: LSF-9.1.3-build392106

Publication date: 11th March 2016

Last modified date: 11th March 2016

Contents:

 

1.     List of fixes

2.     Download location

3.     Products or components affected

4.     System requirements

5.     Installation and configuration

6.     List of files

7.     Product notifications

8.     Copyright and trademark information

 

1.   List of fixes

RFE 73159

2.   Download Location

Download Fix 392106 from the following location: http://www.ibm.com/eserver/support/fixes/

3.   Products or components affected

Affected components include: LSF/lim, LSF/mbatchd, LSF/sbatchd, LSF/res, LSF/nios, and LSF/lsreghost

 

4.   System requirements

Linux2.6-glibc2.3-x86_64

Windows Server 2008 R2 SP1 (equivalent or greater)

5.   Installation and configuration

 

5.1          Before installation

 

On Windows hosts:

No action required.

 

On Linux hosts:

(LSF_TOP=Full path to the top-level installation directory of LSF)

1)    Log on to the LSF master host as root.

2)    Set your environment:

-      For csh or tcsh: % source LSF_TOP/conf/cshrc.lsf

-      For sh, ksh, or bash: $ . LSF_TOP/conf/profile.lsf

 

5.2          Installation steps

 

On Windows hosts:

1)     Log on to the Windows host as LSF administrator.

2)     Download the lsf9.1.3_win-x64-392106.msp file to each Windows host in your cluster.

3)     Double-click the .msp icon and follow the installation instructions.

 

On Linux hosts:

1)    Go to the patch install directory: cd $LSF_ENVDIR/../9.1/install/

2)    Copy the patch file to the install directory $LSF_ENVDIR/../9.1/install/

3)    Run patchinstall: ./patchinstall <patch>

 

5.3          After installation

 

For Windows:

1)     Log on to the Windows LSF master host as LSF administrator.

2)     Restart LSF services. (lsfrestart)

3)     Follow instructions in the Description section of this Readme on how to enable the lsreghost mechanism.

 

For Linux:

1)    Log on to the LSF master host as root.

2)    Restart LSF daemons. (lsfrestart)

3)    Follow instructions in the Description section of this Readme on how to enable the lsreghost mechanism.

 

5.4          Uninstallation

 

For Windows:

To roll back a patch:

1)     Select Control Panel -> Programs -> Programs and Features.

2)     Click “View installed updates”, select this specific patch (Update for LSF 9.1.3 build “392106”) and click Uninstall.

3)     Restart LSF services. (lsfrestart)

 

For Linux:

To roll back a patch:

1)    Log on to the LSF master host as root.

2)    Run ./patchinstall -r <patch>

3)    Restart LSF daemons. (lsfrestart)

 

6.   List of files

 

lsreghost
lim
mbatchd
sbatchd
res
nios

(These have “.exe” extension on Windows)

7.   Product notifications

To receive information about product solution and patch updates automatically, subscribe to product notifications on the My notifications page (www.ibm.com/support/mynotifications) on the IBM Support website (support.ibm.com). You can edit your subscription settings to choose the types of information you want to get notification about, for example, security bulletins, fixes, troubleshooting, and product enhancements or documentation changes.

 

8.   Copyright and trademark information

© Copyright IBM Corporation 2016

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

IBM®, the IBM logo and ibm.com® are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.