===============================================================================
Readme file for: IBM Platform Conductor for Spark
Product/Component Release: 1.1.0
Update name: Spark 1.5.2 Package
Fix ID: pcs-1.1-build398394
Publication date: 8 April 2016

Updated Spark 1.5.2 package for IBM® Platform Conductor for Spark v1.1.0
===============================================================================
=========================
CONTENTS
=========================
1. About the Spark 1.5.2 package
2. Prerequisites
3. Installation and Configuration
4. Copyright
=========================
1. About the Spark 1.5.2 package
=========================
The Spark version package (Spark1.5.2-Conductor1.1.tgz) supports Spark version 1.5.2 for Platform Conductor for Spark v1.1.0 and includes the following updates:
· Applications & Notebooks page did not display details if the Spark instance group execution user was different from the Cluster Admin
Previously, if the execution user specified when creating a Spark instance group was not the Cluster Admin specified at installation time, the Applications & Notebooks page did not display any applications. This issue is now fixed.
· Spark applications could not be killed by spark-submit
Previously, Spark applications could not be killed when multiple applications were running at the same time. This issue is now fixed.
· Service-to-slot ratio
You can now configure each task to run with multiple slots at the Spark instance group level. To configure this setting, edit the Spark instance group configuration and define SPARK_EGO_SLOTS_PER_TASK under Session Scheduler settings. Valid values are integers of 1 or greater; the default is 1.
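For example, a minimal sketch (the value 2 is illustrative):

   SPARK_EGO_SLOTS_PER_TASK=2

With this setting, each task in the Spark instance group runs with 2 slots.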
· Support for exclusive slot allocation
You can now schedule Spark applications using the exclusive slot allocation policy. When slot allocation is exclusive, all of a host's free slots are allocated to one consumer at a time. Use this allocation policy to resolve resource fragmentation.
· Support for hybrid policy with exclusive slot allocation
When using the exclusive slot allocation policy with a hybrid policy, you can now configure the reclaim grace period, which specifies the duration (in seconds) that the Spark master waits to reclaim resources from applications. To configure this setting, edit the Spark instance group configuration and define SPARK_EGO_RECLAIM_GRACE_PERIOD under Session Scheduler settings. Valid values are in the range 0 - 8640000. The default is 0, in which case the Spark driver kills any running tasks and returns resources to the Spark master at once.
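For example, a minimal sketch (the 120-second value is illustrative):

   SPARK_EGO_RECLAIM_GRACE_PERIOD=120

With this setting, the Spark master waits up to 120 seconds for an application to release reclaimed resources before its running tasks are killed.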
· Spark application crashes
Previously, a JNI library mismatch with the EGO version caused some Spark applications to crash. This issue is now fixed.
· SPARK_EGO_EXECUTOR_SLOTS_RESERVE behavior
In some cases, setting SPARK_EGO_EXECUTOR_SLOTS_RESERVE (under Spark on EGO settings) to more than 1 increased the resource demand of Spark applications far more than expected. This behavior is now fixed.
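For reference, a hedged sketch of how this parameter is defined under Spark on EGO settings (the value 2 is illustrative only):

   SPARK_EGO_EXECUTOR_SLOTS_RESERVE=2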
· Kill action prevented from the master UI when EGO_AUTH mode is set
You can no longer kill (stop) Spark applications from the Spark master UI when EGO_AUTH mode is configured, even if spark.ui.killEnabled is set to true.
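For context, a minimal sketch of the standard Spark property involved, as it would appear in the Spark configuration for the instance group:

   spark.ui.killEnabled=true

With this update, this property no longer enables the kill action in the master UI when EGO_AUTH mode is configured.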
· Support for uname and passwd to stop Spark applications using RESTful APIs
You can now stop (kill) Spark applications using RESTful APIs with spark.ego.uname and spark.ego.passwd set.
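A minimal sketch of setting these properties at submission time, using the standard spark-submit --conf option (the user name, password, and application are placeholders):

   spark-submit --conf spark.ego.uname=<user> \
                --conf spark.ego.passwd=<password> \
                <application JAR or script>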
· Master UI now displays Spark applications
Previously, the Spark master UI did not render Spark applications correctly. This issue is now fixed, and the master UI displays Spark applications correctly.
· Rebuilding the historical Spark web UI no longer blocks application registration
Previously, when a long-running application finished, rebuilding the Spark web UI could take a while (sometimes minutes), and no other applications could register with the master in the meantime. This issue is now fixed (https://issues.apache.org/jira/browse/SPARK-12062).
=========================
2. Prerequisites
=========================
2.1 Platform Conductor for Spark v1.1.0 must be installed on a supported operating system. For more information, see http://www.ibm.com/support/knowledgecenter/SSVH2B_1.1.0/install/install.dita.
2.2 Interim fix 398394 (conductor1.1.0_x86_64-build398394.tar.gz) must be installed. Refer to readme_build398394.html for instructions.
=========================
3. Installation and Configuration
=========================
3.1 Download the Spark1.5.2-Conductor1.1.tgz package to a local directory on your computer.
3.2 Log in to the Platform Management Console (PMC) as the cluster administrator.
3.3 From the PMC, click Workload > Spark > Version Management.
3.4 If you have previously added Spark 1.5.2, select the Spark 1.5.2 package and click Remove.
3.5 Click Add.
3.6 Click Browse and select the Spark1.5.2-Conductor1.1.tgz package from this interim fix.
3.7 Click Add.
The updated Spark 1.5.2 package is added to Platform Conductor for Spark.
NOTE: If you do not want your existing Spark instance groups to pick up the new Spark 1.5.2 version, do not edit the Spark configuration for those Spark instance groups. Editing the configuration causes the new Spark 1.5.2 package to be deployed for that Spark instance group.
3.8 To modify an existing Spark instance group that uses a previous Spark 1.5.2 package so that it uses the new Spark 1.5.2 package, perform the following steps:
1) Stop the existing Spark instance group.
2) Back up the configuration files for this Spark instance group (see the sketch after this list):
   $SPARK_HOME/conf
   $SPARK_HOME/../master_conf
   $SPARK_HOME/../shuffle_conf
   $SPARK_HOME/../history_conf (if history was enabled)
   Also back up any other manual configuration changes you have made.
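A minimal backup sketch, assuming $SPARK_HOME points to the deployed Spark directory for this Spark instance group and that /tmp has sufficient space (adjust both for your cluster; omit history_conf if history was not enabled):

   tar -czf /tmp/sig_conf_backup.tar.gz \
       $SPARK_HOME/conf \
       $SPARK_HOME/../master_conf \
       $SPARK_HOME/../shuffle_conf \
       $SPARK_HOME/../history_conf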
3) Follow Option 1 or Option 2 to deploy the new Spark package to the existing Spark instance group. The Spark package referred to below is the Spark1.5.2.tgz file that you extract from the Spark1.5.2-Conductor1.1.tgz file.
OPTION 1: Modify the Spark configuration in the existing Spark instance group. Modifying the configuration triggers a redeployment of the Spark package, which then picks up the latest version.
OPTION 2: Manually update the Spark package for the existing Spark instance group:
a) From the command line interface, source the profile (see the sketch below).
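A minimal sketch, assuming a top-level installation directory of /opt/ibm/platformconductor (the path is an assumption; substitute your cluster's actual installation directory):

   . /opt/ibm/platformconductor/profile.platform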
Find the package name for the existing Spark instance group that you want to update. The package name is in the form SPARKINSTANCEGROUPNAME_Spark1.5.2. For example, if the name of a Spark instance group is LOB, the package name is LOB_Spark1.5.2. Also, find the consumer path for this package; this is the top-level consumer for the Spark instance group.
Run the command:
   soamdeploy add <packagename> -p <path to Spark1.5.2.tgz file> -c <consumerPath>
For example:
   soamdeploy add LOB_Spark1.5.2 -p /tmp/Spark1.5.2.tgz -c /LOB
This command creates a new version of the package for your Spark instance group.
b) From the Spark Instance Groups page, navigate to the Hosts tab for the group and click Deploy Spark to Hosts.
4) Start the Spark instance group.
=========================
4. Copyright
=========================
© IBM Corporation 2016

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

IBM®, the IBM logo, and ibm.com® are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.