IBM® Spectrum Conductor with Spark 2.2.1 Readme for Interim Fix 481699

Readme file for: IBM Spectrum Conductor with Spark
Product/Component Release: 2.2.1
Update Name: Zeppelin 0.7.3 notebook package
Fix ID: cws-2.2.1.0-build481699
Publication date: Apr 12, 2018

This update provides the Zeppelin 0.7.3 notebook package for IBM Spectrum Conductor with Spark 2.2.1. Zeppelin 0.7.3 supports all Spark versions that are supported in IBM Spectrum Conductor with Spark 2.2.1, with one exception: Spark versions 1.5.2, 2.0.1, and 2.1.0 are not supported when SSL is enabled. To use Zeppelin 0.7.3 with those Spark versions, disable SSL.

Installation and configuration

Follow the instructions in this section to install and configure this notebook package:

System requirements

Linux 64-bit or Linux on POWER 64-bit LE.

For details, see https://www.ibm.com/support/knowledgecenter/SSZU2E_2.2.1/supported_system_config/supported_os.html.
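A host can be checked against this requirement from the command line. The following is a minimal sketch, not part of the fix package. It assumes that "Linux 64-bit" corresponds to the machine name x86_64 and that "Linux on POWER 64-bit LE" corresponds to ppc64le, as typically reported by uname -m. The function name is_supported_arch is hypothetical. The linked page remains the authoritative list of supported operating systems.

```shell
# is_supported_arch ARCH
# Returns 0 for the machine names that "uname -m" typically reports on
# 64-bit x86 Linux (x86_64) and on Linux on POWER little endian (ppc64le).
# This mapping is an assumption made for illustration.
is_supported_arch() {
    case "$1" in
        x86_64|ppc64le) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether the current host matches the assumed platform names.
if [ "$(uname -s)" = "Linux" ] && is_supported_arch "$(uname -m)"; then
    echo "Platform looks supported: $(uname -s) $(uname -m)"
else
    echo "Platform might not be supported: $(uname -s) $(uname -m)"
fi
```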

Prerequisites

o   IBM Spectrum Conductor with Spark 2.2.1 must be installed on a supported operating system. For details, see https://www.ibm.com/support/knowledgecenter/SSZU2E_2.2.1/installing/install_roadmap_s.html.

o   cURL 7.28.0 or higher must be installed on all hosts that will run the Zeppelin notebook. You can download cURL from https://curl.haxx.se/download.html.
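The cURL prerequisite can be verified on each host before installation. The following is a minimal sketch, assuming that GNU sort with the -V (version sort) option is available. The function name version_at_least and the variable REQUIRED_CURL_VERSION are hypothetical names used for illustration only.

```shell
# Minimum cURL version stated in the prerequisites.
REQUIRED_CURL_VERSION="7.28.0"

# version_at_least INSTALLED REQUIRED
# Returns 0 when INSTALLED is greater than or equal to REQUIRED.
# Uses "sort -V" so that, for example, 7.9.0 sorts before 7.28.0.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# The second field on the first line of "curl --version" is the version.
if command -v curl >/dev/null 2>&1; then
    installed=$(curl --version | head -n 1 | awk '{print $2}')
    if version_at_least "$installed" "$REQUIRED_CURL_VERSION"; then
        echo "cURL $installed meets the $REQUIRED_CURL_VERSION requirement"
    else
        echo "cURL $installed is older than $REQUIRED_CURL_VERSION"
    fi
else
    echo "cURL is not installed"
fi
```

Run the check on every host that will run the Zeppelin notebook, not only on the management host.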

Installation

1.     On the client machine where you have a browser, decompress the cws-2.2.1.0_build481699.tgz package. For example, on Linux:

> mkdir -p /tmp/build481699

> tar zoxf cws-2.2.1.0_build481699.tgz -C /tmp/build481699

2.     Launch a browser and log in to the cluster management console as the cluster administrator.

3.     Navigate to Workload > Spark > Notebook Management.

4.     On the Notebook Management page, click Add.

5.     In the Deployment Settings tab, complete the required fields as follows:

o   Name: Zeppelin

o   Version: 0.7.3

o   Package: Click Browse and select /tmp/build481699/Zeppelin-0.7.3.tar.gz.

o   Start command: ./scripts/start_zeppelin.sh

o   Stop command: ./scripts/stop_zeppelin.sh

o   Job monitor command: ./scripts/jobMonitor.sh

IMPORTANT: The base port setting (the port from which the system tries to find available ports for the Zeppelin notebook) is ignored, even if you specify a value. Notebook port numbers are generated dynamically, so the port that the notebook actually uses might differ from the specified port.

6.     Click Add. The Zeppelin notebook is added to your cluster.

7.     Create a Spark instance group and select the Zeppelin notebook that you added. Edit the notebook configuration to specify the execution user for this notebook in the Deployment Settings tab. Make other changes as required. For more information, see https://www.ibm.com/support/knowledgecenter/SSZU2E_2.2.1/developing_instances/instance_create_about.html.

8.     Assign users for the Zeppelin notebook. For more information, see https://www.ibm.com/support/knowledgecenter/SSZU2E_2.2.1/managing_instances/mapping_notebooks.html.

9.     Launch the Zeppelin notebook. For more information, see https://www.ibm.com/support/knowledgecenter/SSZU2E_2.2.1/managing_applications/notebooks_launching.html.

After the notebook launches in a new window, enter the notebook user name and password. The user name should be the name of the user that is assigned to this notebook. After successful authentication, you can take advantage of Zeppelin notebook capabilities in your IBM Spectrum Conductor with Spark cluster. For more information about how to use Zeppelin, see https://zeppelin.apache.org.
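Step 1 of the installation can also be scripted so that the extracted notebook archive is verified before you browse for it in step 5. The following is a minimal sketch that reuses the commands and paths from step 1 and the file name from step 5. The function name extract_fix_package and the variables FIX_ARCHIVE and DEST_DIR are hypothetical, and the sketch assumes that the fix package is in the current directory.

```shell
# Names taken from steps 1 and 5 of the installation procedure.
FIX_ARCHIVE="cws-2.2.1.0_build481699.tgz"
DEST_DIR="/tmp/build481699"

# extract_fix_package ARCHIVE DEST
# Extracts ARCHIVE into DEST and returns 0 only if the notebook
# archive Zeppelin-0.7.3.tar.gz is present in DEST afterward.
extract_fix_package() {
    mkdir -p "$2" || return 1
    tar zoxf "$1" -C "$2" || return 1
    [ -f "$2/Zeppelin-0.7.3.tar.gz" ]
}

if [ -f "$FIX_ARCHIVE" ]; then
    if extract_fix_package "$FIX_ARCHIVE" "$DEST_DIR"; then
        echo "Found $DEST_DIR/Zeppelin-0.7.3.tar.gz"
    else
        echo "Extraction failed or Zeppelin-0.7.3.tar.gz is missing"
    fi
else
    echo "$FIX_ARCHIVE not found in the current directory"
fi
```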

Known issues with Helium visualization

With Zeppelin 0.7.3, you can load or unload a pluggable Zeppelin package at runtime through the Helium framework. At the time of this release, the following known issues exist with the Zeppelin 0.7.3 source:

o   On hosts without direct access to the Internet, Zeppelin notebook services take a long time to start up and Zeppelin logs produce the following error: “Could not download npm”. The Zeppelin notebook eventually starts, but you cannot use Helium visualization features. No workaround is available.

o   On Docker hosts, Zeppelin might fail to start because of timeout issues. No workaround is available.

o   On Docker hosts, Zeppelin logs produce an error indicating that an error occurred during the npm install. As a result, you cannot use Helium visualization features. To work around this issue, edit the Spark instance group configuration and add your notebook execution user's home directory as a writable data volume for the notebook.
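To confirm whether a slow notebook startup matches the first issue above, you can search a Zeppelin log file for the reported message. The following is a minimal sketch. The location of the Zeppelin logs is not stated in this readme, so the ZEPPELIN_LOG variable is a hypothetical placeholder that you must set to a log file on your own host. The function name npm_download_failed is also hypothetical.

```shell
# npm_download_failed LOGFILE
# Returns 0 when LOGFILE contains the message quoted in the known issue.
npm_download_failed() {
    grep -qF "Could not download npm" "$1"
}

# ZEPPELIN_LOG is a placeholder; no default log path is assumed.
ZEPPELIN_LOG="${ZEPPELIN_LOG:-}"
if [ -n "$ZEPPELIN_LOG" ] && [ -r "$ZEPPELIN_LOG" ]; then
    if npm_download_failed "$ZEPPELIN_LOG"; then
        echo "npm download error found: Helium visualization is unavailable"
    else
        echo "No npm download error found in $ZEPPELIN_LOG"
    fi
else
    echo "Set ZEPPELIN_LOG to a readable Zeppelin log file to run this check"
fi
```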

Copyright and trademark information

© Copyright IBM Corporation 2018

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

IBM®, the IBM logo, and ibm.com® are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml