- LoadLeveler 5.1.0.x for Red Hat Enterprise Linux 6 (RHEL6) on servers with 64-bit Opteron or EM64T processors
- LoadLeveler 5.1.0.x for SUSE LINUX Enterprise Server 11 (SLES11) on servers with 64-bit Opteron or EM64T processors
- LoadLeveler 5.1.0.x for Red Hat Enterprise Linux 6 (RHEL6) on POWER servers
- If the scheduler and resource manager components on the same machine are not at the same level, the daemons will not start up.
- Please refer to the "Known Limitations" section under the fix pack README for more information about the limitations of this release.
- Please refer to the "Setting up control groups on a diskless (or stateless) cluster for preemption, process tracking, workload management (WLM), and checkpoint/restart" under the "Installation Information" section for more information on how to set up control groups.
Problems fixed in LoadLeveler 5.1.0.12 [October 12, 2012]
- LoadLeveler now shows the correct value of ConsumableCpus when a machine group is configured.
- The LoadLeveler job query commands will now return the correct "Step Cpus" value for the running job that requires ConsumableCpus in the node_resources keyword.
- The central manager daemon will not core dump when attempting to use the VerifyJobs transaction to contact thousands of LoadLeveler stard daemons.
- The LoadL_configurator daemon will not crash when the node tries to get the configuration data from the config hosts.
- A core dump that occurred when running the command llstatus -L machine has been fixed.
- Resource Manager only:
- The handling of hierarchical communication errors is restored to the prior release behavior.
- The LoadLeveler startd daemon will ensure that the cpu map files are created before terminating a checkpointing job.
- The LOADL_HOSTFILE environment variable will be set in the environment of the job prolog and the user environment prolog.
- Obsolete code that attempted to terminate leftover job processes has been removed.
- LoadLeveler enables the use of mdcr 5 for checkpoint/restart on AIX. The name of the mdcr library will be changed to libmdcr5.so and the binary ll_mdcr-checkpoint will be built as a 64-bit binary since libmdcr5 is 64-bit.
- Blue Gene:
- When LoadLeveler detects an error on a Blue Gene I/O node or compute node, it puts the node into the drain state. If a block fails to boot three times, the block is destroyed.
- Scheduler only:
- The scheduler will ignore any floating resource requirement with a 0 value.
- A deadlock problem in the resource manager daemon has been fixed.
Problems fixed in LoadLeveler 5.1.0.11 [August 27, 2012]
- The core dump problem when fetching step adapter usage information has been fixed.
- A problem in which the command "llstatus -l -L" showed a submit-only node as down has been fixed.
- The negotiator daemon now correctly frees memory so that the core dump will not occur.
- Resource Manager only:
- The Region Manager has been modified to ignore all adapters on the same subnet as the adapter that was filtered out with adapter_list. Instead of the Region Manager marking those adapters down, those adapters will remain in an HB_UNKNOWN state.
- Blue Gene:
- If a Blue Gene job terminates due to a kill timeout, the node used by the job is available for future jobs after the block in use has been freed.
- Scheduler only:
- Only the messages from the last iteration of topdog scheduling are printed by the command "llq -s". The intermediate messages are not printed.
- The accounting record which has a negative wall clock value is now skipped by the llsummary command.
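The llsummary behavior described in the last item can be pictured with a minimal Python sketch. This is not LoadLeveler code: the AccountingRecord type, its field names, and the helper functions are hypothetical and only illustrate the idea of skipping records with a negative wall clock value before totals are computed.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AccountingRecord:
    """Hypothetical stand-in for one accounting record."""
    job_id: str
    wall_clock_seconds: int


def usable_records(records: Iterable[AccountingRecord]) -> List[AccountingRecord]:
    """Return only the records whose wall clock value is not negative."""
    return [r for r in records if r.wall_clock_seconds >= 0]


def total_wall_clock(records: Iterable[AccountingRecord]) -> int:
    """Sum wall clock time over the usable records only."""
    return sum(r.wall_clock_seconds for r in usable_records(records))


records = [
    AccountingRecord("job.1", 120),
    AccountingRecord("job.2", -5),   # corrupt record, skipped
    AccountingRecord("job.3", 30),
]
kept = usable_records(records)
total = total_wall_clock(records)
```

Filtering before summing keeps a single corrupt record from distorting the totals for every other job in the report.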
Problems fixed in LoadLeveler 5.1.0.10 [July 20, 2012]
- The region manager failover and recovery code has been changed to ensure that the resource manager is notified when a region manager becomes active, which makes all active nodes and adapters available for scheduling.
- Resource Manager only:
- The resource manager daemon will no longer crash when LoadLeveler starts with D_FULLDEBUG set for RESOURCE_MGR_DEBUG in the LoadL_config file.
- Blue Gene:
- LoadLeveler now uses the new checkIO() call for V1R1M1 Blue Gene software.
- The dependency check for the libbgsched shared object is removed from the LoadLeveler Blue Gene rpm so that the rpm nodeps option is no longer required.
- The LoadLeveler llqres command now displays the information for a Blue Gene reservation that specifies bg_block.
- A check that was preventing Blue Gene reservations from being modified has been fixed so the change request can be processed.
- When some nodeboards in a midplane are down, a Blue Gene small block job can still run in that midplane if the remaining resources meet the job requirements.
- The nodeboard list that is returned from the BGQ scheduler API may not always be in order. LoadLeveler will sort this list to ensure it is in order before indexing on it.
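The sort-before-index approach in the last item can be sketched in Python as follows. This is a minimal illustration, not the LoadLeveler implementation: the location strings in the form R00-M0-N07 and the helper names are assumptions made for the example.

```python
import re
from typing import Iterable, List


def nodeboard_number(location: str) -> int:
    """Extract the trailing node board number from an assumed 'R00-M0-N07' style string."""
    match = re.search(r"N(\d+)$", location)
    if match is None:
        raise ValueError(f"unrecognized node board location: {location!r}")
    return int(match.group(1))


def sort_nodeboards(locations: Iterable[str]) -> List[str]:
    """Return the node boards ordered by number so that position i holds board i."""
    return sorted(locations, key=nodeboard_number)


# A list as it might come back from an API that does not guarantee order.
unordered = ["R00-M0-N02", "R00-M0-N00", "R00-M0-N03", "R00-M0-N01"]
ordered = sort_nodeboards(unordered)
third_board = ordered[2]  # safe to index only after sorting
```

Sorting on the numeric suffix rather than on the raw string avoids depending on zero padding in the names.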
Problems fixed in LoadLeveler 5.1.0.9 [June 19, 2012]
- Update for LINUX on 64-bit Opteron or EM64T processors ONLY
- Implemented internal LoadLeveler data contention improvements.
- Jobs were rejected when the schedd daemon was unable to determine the protocol versions for the nodes allocated to a job step it was trying to dispatch. The correct protocol version is being called now so that the jobs will be started correctly.
- Fixed Negotiator daemon memory leaks.
- Incorrect error messages for the user prolog/epilog during the llctl ckconfig command have been fixed by correcting the internal user variable names.
- Corrected an inefficiency when reading configuration data from the database and added protection against performance issues of this kind, which had prevented LoadLeveler from starting when large systems were configured.
- Corrected lldbupdate so that it can update from 5.1.0.6 to 5.1.0.9.
Problems fixed in LoadLeveler 5.1.0.8 [June 15, 2012]
- Update for LINUX on POWER ONLY
- Implemented internal LoadLeveler data contention improvements.
- Jobs were rejected when the schedd daemon was unable to determine the protocol versions for the nodes allocated to a job step it was trying to dispatch. The correct protocol version is being called now so that the jobs will be started correctly.
- Fixed Negotiator daemon memory leaks.
- Incorrect error messages for the user prolog/epilog during the llctl ckconfig command have been fixed by correcting the internal user variable names.
- Corrected an inefficiency when reading configuration data from the database and added protection against performance issues of this kind, which had prevented LoadLeveler from starting when large systems were configured.
- Corrected lldbupdate so that it can update from 5.1.0.6 to 5.1.0.8.
Problems fixed in LoadLeveler 5.1.0.7 [June 8, 2012]
- Do not install the LL 5.1.0.7 service update if you are using or planning to use a database for the LoadLeveler configuration.
- The llstatus command showed the startds as up even though the llrstatus command showed that the startd and the region manager they report to were actually down. The central manager is now notified by the resource manager when the resource manager marks a startd as down, so the llstatus command now shows the same state as the llrstatus command.
- Fixed a core dump in llconfig when adding a new machine_group or region to a cluster that has more than 128 machines.
- Fixed llconfig to correctly set the island in the machine_group.
- Blue Gene:
- LoadLeveler will correctly calculate the I/O ratio per midplane based on hardware state to support a mixed I/O environment on Blue Gene/Q.
Problems fixed in LoadLeveler 5.1.0.6 [April 27, 2012]
- Mandatory service pack for Red Hat Enterprise Linux 6 (RHEL6) on POWER servers.
- The CAU value is now allocated correctly on all the nodes on which the job is run.
- Resource Manager only:
- Fixed a deadlock in the region manager daemon when determining heartbeat status. The llstatus information will now show the correct status after a reconfig.
- Fixed a startd daemon core dump when preempting a running job via the suspend method.
- Fixed checking of process tracking during job termination so jobs will be able to terminate correctly in an environment that does not have process tracking set.
- Blue Gene:
- Enhanced the support for Blue Gene block booting failures by draining problem hardware from the LoadLeveler cluster.
- Fixed problems with LoadLeveler scheduling blocks using pass through.
- Updated llq -h command output to reflect changes in Blue Gene terminology (Partitions are now referred to as Blocks).
- Corrected display of connectivity for large blocks in llsummary output.
- Fixed a problem calculating the minimum block size for LoadLeveler jobs when midplanes contain iolink errors.
Problems fixed in LoadLeveler 5.1.0.5 [April 4, 2012]
- Fixed some memory leaks in Startd and Schedd daemons.
- If there is no network statement in the job command file, then the default network is used, which assumes ethernet. If the cluster does not have ethernet configured, then the job will stay in the "ST" state and not run. The default network support will now use the adapter associated with the hostname with which the machine is configured in the administration file.
- Fixed a LoadL_master core dump during llctl stop in a database environment due to timing locks.
- Fixed a LoadL_negotiator core dump by no longer sending corrupted job step data to the central manager.
- Fixed the lldbupdate command so that it parses the database information correctly and no longer issues the 2544-019 error message, allowing LoadLeveler to start up.
- Resource Manager only:
- A problem in pe_rm_connect() that caused read() to be called on a socket that was not ready to be read has been corrected, allowing pe_rm_connect() to continue to retry the connection for the specified rm_timeout amount of time.
- Scheduler only:
- The list of reserved resources was not being updated properly when a reservation requesting a 0 count ended, leading to a core dump. That reservation list is now being updated correctly in all cases.
Problems fixed in LoadLeveler 5.1.0.4 [March 16, 2012]
- Mandatory service pack for Red Hat Enterprise Linux 6 (RHEL6) and SUSE LINUX Enterprise Server 11 (SLES11) on servers with 64-bit Opteron or EM64T processors.
- LoadLeveler can now display the host name correctly based on the name_server configuration. The previous limitation of the name_server keyword being ignored is now lifted.
- On SLES11, a failure of lldbupdate to connect to the database because of an incorrect odbc.ini location has been corrected.
- Fixed a Linux schedd daemon core dump in a mixed AIX and Linux cluster when submitting a job from the AIX cluster.
- Fixed potential central manager deadlock.
Copyright and trademark information
http://www.ibm.com/legal/copytrade.shtml
Notices
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS
PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR
PURPOSE. Some jurisdictions do not allow disclaimer of express or
implied warranties in certain transactions, therefore, this
statement may not apply to you.
This information could include technical inaccuracies or
typographical errors. Changes are periodically made to the
information herein; these changes will be incorporated in new
editions of the publication. IBM may make improvements and/or
changes in the product(s) and/or the program(s) described in this
publication at any time without notice.
Microsoft, Windows, and Windows Server are trademarks of Microsoft
Corporation in the United States, other countries, or both.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino,
Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and
Pentium are trademarks or registered trademarks of Intel Corporation or its
subsidiaries in the United States and other countries.
Other company, product, or service names may be trademarks or
service marks of others.
THIRD-PARTY LICENSE TERMS AND CONDITIONS, NOTICES AND INFORMATION
The license agreement for this product refers you to this file for
details concerning terms and conditions applicable to third party
software code included in this product, and for certain notices
and other information IBM must provide to you under its license
to certain software code. The relevant terms and conditions,
notices and other information are provided or referenced below.
Please note that any non-English version of the licenses below is
unofficial and is provided to you for your convenience only. The
English version of the licenses below, provided as part of the
English version of this file, is the official version.
Notwithstanding the terms and conditions of any other agreement
you may have with IBM or any of its related or affiliated entities
(collectively "IBM"), the third party software code identified
below are "Excluded Components" and are subject to the following
terms and conditions:
- the Excluded Components are provided on an "AS IS" basis
- IBM DISCLAIMS ANY AND ALL EXPRESS AND IMPLIED WARRANTIES AND CONDITIONS WITH RESPECT TO THE EXCLUDED COMPONENTS, INCLUDING, BUT NOT LIMITED TO, THE WARRANTY OF NON-INFRINGEMENT OR INTERFERENCE AND THE IMPLIED WARRANTIES AND CONDITIONS OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
- IBM will not be liable to you or indemnify you for any claims related to the Excluded Components
- IBM will not be liable for any direct, indirect, incidental, special, exemplary, punitive or consequential damages with respect to the Excluded Components.