Readme for Zeppelin 0.7.0 Notebook in IBM Spectrum Conductor with Spark 2.2.0

Readme for IBM® Spectrum Conductor with Spark 2.2 Interim Fix 441457

Readme file for: IBM Spectrum Conductor with Spark
Product/Component Release: 2.2.0
Update Name: Zeppelin 0.7.0 notebook package
Fix ID: cws-2.2-build441457
Publication date: March 17, 2017

This update provides the Zeppelin 0.7.0 notebook package for IBM Spectrum Conductor with Spark V2.2.0. Zeppelin 0.7.0 supports all Spark versions supported in V2.2.0: Spark 1.5.2, Spark 1.6.1, Spark 2.0.1, and Spark 2.1.0. Spark 2.1.0 requires interim fix cws-2.2-build437098 to be installed.

Installation and configuration

Follow the instructions in this section to install and configure this notebook package:

System requirements

Linux 64-bit or Linux on POWER 64-bit LE. For details, see https://www.ibm.com/support/knowledgecenter/SSZU2E_2.2.0/supported_system_config/supported_os.html.

Prerequisites

o IBM Spectrum Conductor with Spark V2.2.0 must be installed on a supported operating system. For details, see https://www.ibm.com/support/knowledgecenter/SSZU2E_2.2.0/installing/install_roadmap_s.html.

o Install cURL 7.28.0 or higher on all hosts that will run the Zeppelin notebook. You can download cURL from https://curl.haxx.se/download.html.
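The cURL prerequisite can be checked on each host before you deploy the notebook. The following Python sketch shows one way to do so. It assumes that curl is on the PATH and that the output of `curl --version` begins with "curl" followed by a three-part version number; the function names are illustrative and are not part of the product.

```python
import re
import subprocess

MIN_CURL = (7, 28, 0)  # minimum version stated in the prerequisites


def parse_curl_version(version_output):
    """Extract (major, minor, patch) from the start of `curl --version` output."""
    match = re.match(r"curl (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("unrecognized curl version output")
    return tuple(int(part) for part in match.groups())


def curl_meets_minimum(version_output, minimum=MIN_CURL):
    """Return True if the reported version is at least the minimum."""
    return parse_curl_version(version_output) >= minimum


def installed_curl_output():
    """Run `curl --version` on this host and return its text output."""
    result = subprocess.run(
        ["curl", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a host, `curl_meets_minimum(installed_curl_output())` returns True when the installed cURL satisfies the prerequisite. Versions are compared as integer tuples so that, for example, 7.9.0 is correctly treated as older than 7.28.0.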

Installation

1. Download the Zeppelin-0.7.0.tar.gz package to a local directory on your computer.

2. Launch a browser and log in to the cluster management console as cluster administrator.

3. Go to Workload > Spark > Notebook Management.

4. In the Spark Notebook Management page, click Add.

5. In the Deployment Settings tab, complete the required fields as follows:

o Name: Zeppelin

o Version: 0.7.0

o Package: Click Browse and select the Zeppelin-0.7.0.tar.gz package that you downloaded previously.

o Prestart command: ./scripts/prestart_zeppelin.sh

o Start command: ./scripts/start_zeppelin.sh

o Stop command: ./scripts/stop_zeppelin.sh

o Job monitor command: ./scripts/jobMonitor.sh

IMPORTANT: Any base port that you specify (the port from which the system starts searching for available ports for the Zeppelin notebook) is ignored. Notebook port numbers are generated dynamically, so the port that the notebook actually uses might differ from the port that you specified.

6. Click Add.

The Zeppelin notebook is added to your cluster.

7. Create a Spark instance group and select the Zeppelin notebook that you added. Edit the notebook configuration to specify the execution user for this notebook in the Deployment Settings tab. Make other changes as required. For more information, see https://www.ibm.com/support/knowledgecenter/SSZU2E_2.2.0/developing_instances/instance_create_about.html.

8. Assign users for the Zeppelin notebook. For more information, see https://www.ibm.com/support/knowledgecenter/SSZU2E_2.2.0/managing_instances/mapping_notebooks.html.

9. Launch the Zeppelin notebook. For more information, see https://www.ibm.com/support/knowledgecenter/SSZU2E_2.2.0/managing_applications/notebooks_launching.html.

After the notebook opens in a new window, enter the notebook user name and password. The user name should be that of a user assigned to this notebook in step 8. After successful authentication, you can take advantage of Zeppelin notebook capabilities in your IBM Spectrum Conductor with Spark cluster. For more information about how to use Zeppelin, see https://zeppelin.apache.org.
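Before you upload the package in step 5, it can be useful to confirm that the archive contains the scripts named in the Deployment Settings fields. The following Python sketch is one way to do that. It assumes that the scripts are stored in the package under a scripts directory, as the command paths in step 5 suggest; this readme does not describe the package layout, and the function names are illustrative.

```python
import tarfile

# Script paths taken from the Deployment Settings fields in step 5.
EXPECTED_SCRIPTS = [
    "scripts/prestart_zeppelin.sh",
    "scripts/start_zeppelin.sh",
    "scripts/stop_zeppelin.sh",
    "scripts/jobMonitor.sh",
]


def missing_scripts(member_names, expected=EXPECTED_SCRIPTS):
    """Return the expected scripts that no archive member path matches.

    A member matches if it equals the script path or ends with it after a
    directory separator, so "./scripts/x.sh" and "top/scripts/x.sh" both count.
    """
    return [
        script
        for script in expected
        if not any(
            name == script or name.endswith("/" + script)
            for name in member_names
        )
    ]


def missing_scripts_in_package(package_path):
    """Open a gzip-compressed tar package and list any missing scripts."""
    with tarfile.open(package_path, "r:gz") as archive:
        return missing_scripts(archive.getnames())
```

For example, `missing_scripts_in_package("Zeppelin-0.7.0.tar.gz")` returns an empty list when all four scripts are present under the assumed layout.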

Known issues with Helium visualization

Starting with Zeppelin 0.7.0, you can load or unload a pluggable Zeppelin package at runtime through the Helium framework. At the time of this release, the following known issues exist with the Zeppelin 0.7.0 source:

o On hosts without direct access to the Internet, Zeppelin notebook services take a long time to start up and Zeppelin logs include the following error: “Could not download npm”. The Zeppelin notebook eventually starts, but you cannot use Helium visualization features. No workaround is available.

o On Docker hosts, Zeppelin might fail to start because of timeout issues. No workaround is available.

o On Docker hosts, Zeppelin logs report an error during npm install. As a result, you cannot use Helium visualization features. To work around this issue, edit the Spark instance group configuration and add your notebook execution user's home directory as a writable data volume for the notebook.
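To confirm whether a slow startup is caused by the first issue above, you can search the Zeppelin logs for the quoted error message. The following Python sketch is one way to do that. The message text comes from this readme; the log file location is not specified here, so the path must be supplied by the caller, and the function names are illustrative.

```python
NPM_DOWNLOAD_ERROR = "Could not download npm"  # message quoted in the known issue


def find_npm_download_errors(log_lines):
    """Return (line_number, text) pairs for lines containing the error message."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(log_lines, start=1)
        if NPM_DOWNLOAD_ERROR in line
    ]


def scan_log_file(log_path):
    """Scan one log file; the caller supplies the path to the Zeppelin log."""
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        return find_npm_download_errors(handle)
```

An empty result means that the message does not appear in the scanned file; it does not rule out the Docker-related issues, whose exact log text is not given in this readme.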

Copyright and trademark information

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

IBM®, the IBM logo, and ibm.com® are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.