=====================================================================
Abstract: This fix pack is for Platform Analytics for Symphony. It includes
issue resolutions, enhancements, performance tuning, and all previously
released patches that are already available on Fix Central.

Readme file for: IBM Platform Analytics 9.1.1.0
Product/Component Release: 9.1.1
Update name: Fix Pack 340779
Publication date: 04 June 2015
Last modified: 04 June 2015
APAR: P100989

Description: This fix pack resolves several issues, such as RPT data loss
and time zone correction. It also includes enhancements, such as usability
improvements for workbooks, and performance tuning.
=====================================================================

=========================
CONTENTS
=========================
1. About the fix pack
2. Products or components affected
3. Supported operating systems
4. QA performance reference
5. Installation and Configuration
6. File List
7. Copyright

=========================
1. About the fix pack
=========================
This fix pack includes three major improvements: data correction fixes,
performance tuning, and usability improvements. For node and server, it
also includes all previously released patches that are already available
on Fix Central.

1.1 Data correction

- Resolve RPT data loss issue due to data latency.
  Add a new ETL for collecting cluster and application data:
    * symclusterandapp
  Related ETL:
    * symresourceallocation
    * symsessioninformation
    * symsessionprofilingexpress
    * symtaskruntimeCount
    * symtaskruntimerank
    * symworkloadaccounting
  Note: The symtaskruntimeCount and symtaskruntimerank ETLs may run a bit
  longer than before because application information is now checked for
  data latency.
- Enhanced alc to get the time zone of each cluster, instead of hard-coding
  the time zone in the ETL.
  Related schema:
    Two new columns added to table REGISTER_INFO:
    * TZ_ID VARCHAR(128)
    * LAST_UPDATE_TIME TIMESTAMP
  Related ETL:
    * symsessionprofilingexpress

- Others
    * Corrected task pending time in the Workload Accounting workbook
    * Corrected task pending time in the resource allocation ETL
    * Resolved data loss issue that occurred when disabling the
      ResAllocEvent ETL and enabling it at a later time
    * Corrected the Run time measure for long-running tasks and the
      Finished Task Qty measure for long-running sessions in the Session
      Profiling workbook

1.2 Performance tuning

- Optimize data loader PK violation checking performance and lighten the
  load on the database.

- Optimize report performance.
    * Reduced the number of measures that caused performance bottlenecks
    * Redesigned some workbook layouts to ensure display efficiency
    * Removed unwanted filters
    * Enhanced calculation logic
  Related workbooks:
    * Symphony Workload Accounting
    * Symphony Session Profiling Express

- Optimize ETL performance.
  Related ETL:
    * symresourceallocation
    * symsessionprofilingexpress

1.3 Usability Improvements

    * Symphony Workload Accounting
    * Symphony Session Profiling Express

1.4 Previously released patches that are included in this solution:

- Daylight Saving Time (DST) handling
  APAR: P100399
- Resolve VEMKD FD leaking issue when alc service is enabled
  APAR: P100310
- Resolve DataLatencyCheck false alarm issue when alc is failed over by EGO
  APAR: P100459
- PA data and logs show regular time instead of daylight saving time
  APAR: P100607
- Improve PKViolationClean task performance
  APAR: P100329

1.5 New APAR included in this solution:

- Resolve missing data issue in the Resource Allocation workbook
  APAR: P101165

=========================
2. Products or components affected
=========================
IBM Platform Analytics Database Schema
IBM Platform Analytics Nodes
IBM Platform Analytics Server

=========================
3. Supported operating systems
=========================
Linux x86_64 (Platform Analytics Node)
Linux x86_64 (Platform Analytics Vertica Database Node)
Windows 2008 x86_64 (Platform Analytics Server)

=========================
4. QA performance reference
=========================
Performance test results compared with Platform Analytics without
optimization.

* Report performance results:

  Note: Times shown are for the first opening of each dashboard. The RPT
  table holds 150 days of data (30M tasks/day, 10M sessions/day) and the
  data range is 1 week.

------------------------------------+------------------------------+-------------+-----------
Report name                         | Dashboards                   | Before      | After
------------------------------------+------------------------------+-------------+-----------
Symphony Session Profiling Express  | Session Profiling Express    | 1 min       | 25 sec
------------------------------------+------------------------------+-------------+-----------
Symphony Workload Accounting        | Workload Accounting          | 1 min       | 40 sec
------------------------------------+------------------------------+-------------+-----------
Symphony Workload Accounting        | Workload Distribution Count  | 1 min 10 sec| 42 sec
------------------------------------+------------------------------+-------------+-----------
Symphony Workload Accounting        | Workload Distribution Rank   | 2 min       | 1 min
------------------------------------+------------------------------+-------------+-----------
Symphony Workload Accounting        | Session Information          | 1 min       | 40 sec
------------------------------------+------------------------------+-------------+-----------

* ETL performance result:

---------------------------+------------+----------------------+------------+-----------
ETL name                   | Step       | # of records written | Before     | After
---------------------------+------------+----------------------+------------+-----------
Resource Allocation        | 1 day data | 300,000              | 40 min     | 25 min
---------------------------+------------+----------------------+------------------------

=========================
5. Installation and Configuration
=========================
In this document, ANALYTICS_TOP describes the top-level IBM Platform
Analytics Server and Data Collection Node installation directory.
For example, this directory may refer to the following:
    * Linux
      /opt/IBM/PlatformAnalytics
    * Windows
      C:\IBM\Platform Analytics

----------------------------------------------------------------
IMPORTANT:
----------------------------------------------------------------
Each step can only be started when the previous step is complete.
----------------------------------------------------------------

5.1 Download the fix pack files for Platform Analytics from Fix Central.
    * build340779_schema.zip
    * build340779_node_all.zip
    * build340779_server_all.zip

5.2 Update IBM Platform Analytics Schema.

    1) Log in to the Analytics database host as the Vertica DB admin, copy
       the schema patch to the Analytics database host, and extract all
       the files.
         unzip build340779_schema.zip

    2) Navigate to the schema directory containing your database schema
       scripts, and run SQL_Patch.sql to update the database schema.
       Command:
         /bin/vsql -d -p -U -w -f build340779_schema/schema/SQL_Patch.sql
       Where:
         - is the Vertica installation directory. By default: /opt/vertica
         - is the name of the Analytics database.
         - is the port of the Analytics database.
         - is the name of the Analytics database user, instead of the
           default user (the database administrator).
         - is the password for the Analytics database user.

       Note: Initializing table APP_NAME_TASK_ATTRIBUTES (for the RPT data
       loss issue in section 1.1) may take some time.
       In the SQL_Patch.sql file, k_safe is defined as 1, which is suitable
       for 2-5 DB nodes. If you have only 1 DB node, change it to 0.
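The k_safe note above can be turned into a quick pre-check before
SQL_Patch.sql is run. The following is a minimal shell sketch and is not part
of the fix pack: the function name suggested_ksafe and the PATCH_SQL variable
are illustrative assumptions, and because this readme does not show how the
k_safe setting is spelled inside SQL_Patch.sql, the search pattern is kept
deliberately loose. Node counts outside the 1-5 range covered by the note are
rejected rather than guessed.

```shell
#!/bin/sh
# Sketch only: suggested_ksafe and PATCH_SQL are hypothetical names used for
# illustration; they are not shipped with the fix pack.

# Map the number of Vertica database nodes to the k_safe value recommended
# by the readme note: 0 for a single node, 1 for 2-5 nodes.
suggested_ksafe() {
    case "$1" in
        1) echo 0 ;;
        2|3|4|5) echo 1 ;;
        *) echo "node count '$1' is not covered by the readme note" >&2
           return 1 ;;
    esac
}

# List the lines of the patch script that mention k_safe (or ksafe) so they
# can be reviewed, and edited by hand if needed, before running vsql.
PATCH_SQL="${PATCH_SQL:-build340779_schema/schema/SQL_Patch.sql}"
if [ -f "$PATCH_SQL" ]; then
    grep -n -i 'k_*safe' "$PATCH_SQL" \
        || echo "no k_safe setting found in $PATCH_SQL"
fi
```

For example, "suggested_ksafe 1" prints 0 and "suggested_ksafe 3" prints 1.
The sketch only reports what it finds; any change to SQL_Patch.sql is still
made manually as described in the note.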
5.3 Apply the IBM Platform Analytics data collection node patch to all
    Analytics data collection nodes in all Symphony clusters

------------------------------------------------------------------------
Note:
- Ensure that step 5.2 is complete for the Analytics database before
  applying the data collection node patch.
- Before installing this patch on an Analytics data collection node,
  install the Symphony 6.1.1 "Task's First Start Time" patch
  symphony_build238010_linux2.6-glibc2.3-x86_64.tar.gz, following
  sym_RFC4725_README_ms.pdf. If it is already installed, skip this step.
- You must apply this patch to all the Analytics data collection nodes in
  all Symphony clusters. If a new node is added in the future, apply this
  fix pack before starting data collection on that node.
- Impact to Analytics: data for 1~2 sampling intervals might be lost
  between steps 5.3.2 and 5.3.7 (expect a couple of minutes).
- Impact to Symphony clusters: none.
------------------------------------------------------------------------

5.3.1 Log in to the Platform Symphony master host with an account such as
      root, and source the Platform Symphony environment variables:
        * bsh: source SYMPHONY_TOP/profile.platform
        * csh: source SYMPHONY_TOP/cshrc.platform
      Log on as the Symphony cluster admin by using the command:
        egosh user logon -u -x
        sym_admin: the admin of the Symphony cluster
        password: the password of sym_admin

5.3.2 Manually stop the alc service by using the command:
        egosh service stop alc
      Verify that the alc service is stopped:
        egosh service list

5.3.3 Log in to the Platform Analytics data collection node host with an
      account such as root, copy the node patch to the node host, and
      extract all the files:
        unzip build340779_node_all.zip

5.3.4 Apply the node patch.
      Run the following command and specify the patch installation
      directory on this host to apply the patch:
        ANALYTICS_TOP/patch_tools/pinstall.sh
      Note: pinstall.sh automatically backs up the original files from
      ANALYTICS_TOP and copies the files from the patch directory to this
      directory.

      After the patch is applied, perform the following manual steps:
      - Rename ANALYTICS_TOP/conf/profile.tz.perf.bak to
        ANALYTICS_TOP/conf/profile.tz.perf
        If profile.tz.perf already exists, skip this step.
      - Modify ANALYTICS_TOP/conf/plc/plc_pa_ego.xml, set Interval to 60 as
        follows:

5.3.5 (Optional) If the Platform Analytics data collection node is
      installed in a shared location, skip this step.
      If the Platform Analytics data collection node is installed in a
      local location and Analytics nodes are installed on more than one
      management host in the cluster, repeat steps 5.3.3 and 5.3.4 to apply
      this node patch to all Platform Analytics data collection nodes, one
      by one.

5.3.6 If EGO failover mode is enabled for the alc service, after applying
      the Analytics node patch, copy the following files as the Symphony
      admin to the Symphony shared directory.
      Copy the following files to Symphony_ShareDir_TOP/alc/conf from
      ANALYTICS_TOP/conf, overwriting the existing ones:
        * Symphony_ShareDir_TOP/alc/conf/dataloader/symsessionhistloader.xml
        * Symphony_ShareDir_TOP/alc/conf/dataloader/symtaskhistloader.xml
        * Symphony_ShareDir_TOP/alc/conf/profile.tz.perf
        * Symphony_ShareDir_TOP/alc/conf/tz.properties

5.3.7 Restart the alc service as the Symphony cluster admin by using the
      command:
        egosh service start alc

5.3.8 Data collection node health check
      Perform the following steps to check whether the Analytics data
      collection node has started normally:

      1) Run the command 'egosh service list -s alc' to check whether alc
         has started. In the output, STATE should be 'STARTED' and
         INST_STATE should be 'RUN'.
         For example,
           SERVICE  STATE    ALLOC CONSUMER RGROUP RESOURCE SLOTS SEQ_NO INST_STATE ACTI
           alc      STARTED  87    /Manage* Manag* perfdb01 1     1      RUN        448

      2) After several minutes, check the following log files for errors:
           * ANALYTICS_TOP/log/plc.log.
           * ANALYTICS_TOP/log/dataloader/symsessionloader.log.
           * ANALYTICS_TOP/log/dataloader/symtaskhistloader.log.
           * ANALYTICS_TOP/log/dataloader/egoeventsloader.log.

      3) (Optional) Log in to the Analytics database host and run the
         command below to verify the newly loaded data column values:
         Command:
           /bin/vsql -d -p -U -w -c "select end_time, end_time_gmt from task_attributes where end_time >= date_trunc('hour',sysdate) order by end_time desc limit 1;"
         Where:
           - is the Vertica installation directory. By default: /opt/vertica
           - is the name of the Analytics database.
           - is the port of the Analytics database.
           - is the name of the Analytics database user, instead of the
             default user (the database administrator).
           - is the password for the Analytics database user.

5.4 IBM Platform Analytics Server Patch Installation

5.4.1 Log in to the Platform Analytics Server host as Administrator. Copy
      the server patch to the server host and extract the files.

5.4.2 Manually apply the server patch

      1) Stop the IBM Platform Analytics (Task Scheduler) service.

      2) Create directory build340779 under ANALYTICS_TOP/.patch_backup/,
         and back up the existing directory
         ANALYTICS_TOP/conf/datatransformer to directory
         ANALYTICS_TOP/.patch_backup/build340779.

      3) Back up the existing directory ANALYTICS_TOP/report/reports/sym to
         directory ANALYTICS_TOP/.patch_backup/build340779.

      4) Back up the existing file vertica_pkclean.js from
         ANALYTICS_TOP/bin to directory
         ANALYTICS_TOP/.patch_backup/build340779.

      5) Back up the existing files PIAdmin.jar and PIUtil.jar from
         ANALYTICS_TOP/lib to directory
         ANALYTICS_TOP/.patch_backup/build340779.

      6) Copy the subdirectories bin, lib, and conf from
         build340779_server_all/ to ANALYTICS_TOP/, overwriting the
         existing ones.
           ANALYTICS_TOP/bin/vertica_pkclean.js
           ANALYTICS_TOP/lib/PIUtil.jar
           ANALYTICS_TOP/lib/PIAdmin.jar
           ANALYTICS_TOP/conf/datatransformer/flow/symtaskruntimerank/*.*
           ANALYTICS_TOP/conf/datatransformer/flow/symresallocEvent/*.*
           ANALYTICS_TOP/conf/datatransformer/flow/symworkloadaccounting/*.*
           ANALYTICS_TOP/conf/datatransformer/flow/symsessioninformation/*.*
           ANALYTICS_TOP/conf/datatransformer/flow/symcapacityvsallocation/*.*
           ANALYTICS_TOP/conf/datatransformer/flow/symtaskruntimeCount/*.*
           ANALYTICS_TOP/conf/datatransformer/flow/symsessionprofiling/*.*
           ANALYTICS_TOP/conf/datatransformer/flow/symresourceallocation/*.*
           ANALYTICS_TOP/conf/datatransformer/flow/symclusterandapp/*.*
           ANALYTICS_TOP/conf/datatransformer/flow/symhardware/*.*
           ANALYTICS_TOP/conf/datatransformer/flow/symsessionprofilingexpress/*.*

         Copy the following two files from
         build340779_server_all/report/reports/ to
         ANALYTICS_TOP/report/reports/sym
           Symphony Session Profiling Express.twb
           Symphony Workload Accounting.twb

      7) Edit ANALYTICS_TOP/tasks/TaskRankETLHourly/TaskRankETLHourly.tsk
         and add the following line:
           datatransformer/flow/symclusterandapp/main_cluster_app.xml
         Note: Because most of the ETLs depend on this task, be careful if
         you plan to disable it.

      8) Edit
         ANALYTICS_TOP/tasks/SymSessionProfileETL/SymSessionProfileETL.tsk
         and add the following line:
           datatransformer/flow/symsessionprofilingexpress/main_session_profiling_express.xml

      9) Start the IBM Platform Analytics (Task Scheduler) service.

5.4.3 Server health check
      Perform the following steps to check whether the Analytics Server has
      started normally:
      After several minutes, check the following log files for errors:
        * ANALYTICS_TOP/log/automationmanager.log..txt
        * ANALYTICS_TOP/log/task/*.log..txt

5.5 (Optional) Historical report data reaggregation

    Note: Reaggregation cannot cover all historical data because the raw
    tables keep data for a limited time only.
    * This step is optional. The steps below use TaskCountDaily as an
      example; apply the same procedure to any other ETL you need to
      reaggregate.

    1. Disable the scheduling of the tasks that need reaggregation.
       Log in to the Analytics Server, open the Analytics Console, navigate
       to the 'Scheduled Tasks' panel, right-click the scheduled tasks
       below, and temporarily disable their scheduled intervals.

    2. Clean the target RPT table. Backing it up first is recommended.
       Log in to the database by using vsql:
         /bin/vsql -d -p -U -w
       Where:
         - is the Vertica installation directory. By default: /opt/vertica
         - is the name of the Analytics database.
         - is the port of the Analytics database.
         - is the name of the Analytics database user, instead of the
           default user (the database administrator).
         - is the password for the Analytics database user.

         vsql=> create table RPT_TASK_MEASURE_COUNT_BY_DAY_bak_340779 as (select * from RPT_TASK_MEASURE_COUNT_BY_DAY) ;
         vsql=> truncate table RPT_TASK_MEASURE_COUNT_BY_DAY;

    3. Delete the related records from AGGREGATE_INFO.
         vsql=> delete from AGGREGATE_INFO where RAW_TABLE_NAME='RPT.SYM.TASKRUNTIMECOUNT';
         vsql=> commit;

    4. Reschedule the tasks that need reaggregation.
       Log in to the Analytics Server, open the Analytics Console, and
       navigate to the 'Scheduled Tasks' panel. Right-click the scheduled
       tasks you want to reaggregate and temporarily change the scheduled
       intervals as per the following table:

       ---------------------------------------------------
       Scheduled Tasks    | Schedule Interval (suggested)
       -------------------+-------------------------------
       TaskCountDaily     | 1 hour
       ---------------------------------------------------

       Note:
       - The scheduled tasks automatically reaggregate the historical data
         for all the related ETLs. The temporary scheduled interval is
         intended to speed up reaggregation.
       - It takes about 0.5 hours to finish reaggregation for one day's
         data. Actual performance depends on both the historical data
         volume and the number of concurrently running Analytics Nodes.
       - After reaggregation is done, change the interval for this ETL
         back to 6 hours (recommended).

    5. Change the schedule intervals for the tasks you re-scheduled.
       Use the original values or the suggested values below.

       Command to check whether all the ETLs have caught up (expect no
       output from the SQL):
       Log in to the database by using vsql, and run the SQL below on the
       vsql command line:
         select next_begin_agg_time, cluster_name, raw_table_name from aggregate_info;
       Note:
       - The 'next_begin_agg_time' is the GMT time from which the ETL will
         start in the next round. Check this column to ensure that all the
         previous data has been caught up.

5.6 Schedule interval

    Open the Analytics Console, navigate to the 'Scheduled Tasks' panel,
    right-click the scheduled tasks, and temporarily change the scheduled
    intervals as per the following table:

    ---------------------------------------------------
    Scheduled Tasks               | Schedule Interval
    ------------------------------+--------------------
    SymHardwareETL                | 10 minutes
    SymSessionProfileETL          | 10 minutes
    SymSessionInformationETL      | 10 minutes
    SymWorkloadAccountingETL      | 10 minutes
    SymResAllocCapacityETL        | 15 minutes
    TaskRankETLHourly             | 1 hour
    TaskCountDaily                | 6 hours
    ---------------------------------------------------

5.7 Publish workbook

    1) Open the Symphony Session Profiling Express and Workload Accounting
       workbooks in directory ANALYTICS_TOP\report\reports\ with IBM
       Platform Analytics (Designer).
    2) Click "OK" for the default data source connection, and then
       configure the correct data source and connect.
    3) From the menu, go to Server > Publish Workbook.
    4) Enter the Tableau Server URL in the Server field.
    5) Sign in to the Tableau Server with the user name and password, and
       click OK.
    6) Select Default or another project from the drop-down list as the
       Project.
    7) Click "Authentication", change authentication to "Embedded
       password", and click "OK".
    8) Select all the sheets under 'Views to Share'.
    9) Click Publish on "Publish Workbook to Tableau Server".
    10) Repeat steps 1) to 9) to publish other workbooks.

=========================
6. File List
=========================
.
build340779_server_all
|-- patch_install
|   |-- fixlist.db
|   |-- filelist.db
|   `-- patchinfo.txt
|-- conf
|   `-- datatransformer
|   |   `-- flow
|   |   |   |-- symtaskruntimerank
|   |   |   |   |-- TaskRuntimeRankSource.xml
|   |   |   |   |-- SplitClusterApp.xml
|   |   |   |   |-- TaskRuntimeRankDepParams.xml
|   |   |   |   |-- EndTimeSource.xml
|   |   |   |   |-- main_TaskRuntimeRank.xml
|   |   |   |   |-- RptTaskRuntimeRankLoader.xml
|   |   |   |   |-- StartTimeSource.xml
|   |   |   |   `-- TaskRuntimeRankExtract.xml
|   |   |   |-- symresallocEvent
|   |   |   |   |-- ResAllocEventExtract.xml
|   |   |   |   |-- RptResAllocEventLoader.xml
|   |   |   |   |-- StartTimeSource.xml
|   |   |   |   |-- ResAllocEventDepParams.xml
|   |   |   |   |-- ResAllocEventSource.xml
|   |   |   |   |-- main_res_alloc_Event.xml
|   |   |   |   |-- EndTimeSource.xml
|   |   |   |   |-- TruncEndTime.xml
|   |   |   |   `-- NextBeginAggTime.xml
|   |   |   |-- symworkloadaccounting
|   |   |   |   |-- SplitClusterApp.xml
|   |   |   |   |-- WorkloadAccountingLoader.xml
|   |   |   |   |-- main_workload_accounting.xml
|   |   |   |   |-- StartTimeSource.xml
|   |   |   |   |-- EndTimeSource.xml
|   |   |   |   |-- WorkloadAccountingSource.xml
|   |   |   |   |-- WorkloadAccountingExtract.xml
|   |   |   |   `-- WorkloadAccountingDepParams.xml
|   |   |   |-- symsessioninformation
|   |   |   |   |-- SessionInformationSource.xml
|   |   |   |   |-- main_session_information.xml
|   |   |   |   |-- SplitClusterApp.xml
|   |   |   |   |-- EndTimeSource.xml
|   |   |   |   |-- SessionInformationLoader.xml
|   |   |   |   |-- SessionInformationDepParams.xml
|   |   |   |   |-- StartTimeSource.xml
|   |   |   |   `-- SessionInformationExtract.xml
|   |   |   |-- symcapacityvsallocation
|   |   |   |   |-- main_capacity_vs_allocation.xml
|   |   |   |   |-- CapacityDepParams.xml
|   |   |   |   |-- StartTimeSource.xml
|   |   |   |   |-- NextBeginAggTime.xml
|   |   |   |   |-- TruncEndTime.xml
|   |   |   |   |-- CapacityExtract.xml
|   |   |   |   |-- RptCapacityLoader.xml
|   |   |   |   |-- CapacitySource.xml
|   |   |   |   `-- EndTimeSource.xml
|   |   |   |-- symtaskruntimeCount
|   |   |   |   |-- main_TaskRuntimeCount.xml
|   |   |   |   |-- TaskRuntimeCountSource.xml
|   |   |   |   |-- TaskRuntimeCountExtract.xml
|   |   |   |   |-- StartTimeSource.xml
|   |   |   |   |-- RptTaskRuntimeCountLoader.xml
|   |   |   |   |-- SplitClusterApp.xml
|   |   |   |   |-- TimeDiff.xml
|   |   |   |   |-- TaskRuntimeCountDepParams.xml
|   |   |   |   `-- EndTimeSource.xml
|   |   |   |-- symsessionprofiling
|   |   |   |   |-- NextBeginAggTime.xml
|   |   |   |   |-- StartTimeSource.xml
|   |   |   |   |-- EndTimeSource.xml
|   |   |   |   |-- TruncEndTime.xml
|   |   |   |   |-- SplitClusterApp.xml
|   |   |   |   |-- main_session_profiling.xml
|   |   |   |   |-- TimezoneExtract.xml
|   |   |   |   |-- SessionProfilingDepParams.xml
|   |   |   |   |-- MinCreateTime.xml
|   |   |   |   |-- SessionProfilingSource.xml
|   |   |   |   |-- RptSessionProfilingLoader.xml
|   |   |   |   `-- SessionProfilingExtract.xml
|   |   |   |-- symresourceallocation
|   |   |   |   |-- main_resource_allocation.xml
|   |   |   |   |-- ResourceAllocationDepParams.xml
|   |   |   |   |-- MaxLoaderTime.xml
|   |   |   |   |-- RptResourceAllocationLoader.xml
|   |   |   |   |-- ResourceAllocationSource.xml
|   |   |   |   |-- NextBeginAggTime.xml
|   |   |   |   |-- StartTimeSource.xml
|   |   |   |   |-- TimeDiff.xml
|   |   |   |   |-- ResourceAllocationExtract.xml
|   |   |   |   |-- EndTimeSource.xml
|   |   |   |   `-- TruncEndTime.xml
|   |   |   |-- symclusterandapp
|   |   |   |   |-- ClusterAndAppSource.xml
|   |   |   |   |-- ClusterAndAppDepParams.xml
|   |   |   |   |-- main_cluster_app.xml
|   |   |   |   |-- StartTimeSource.xml
|   |   |   |   |-- EndTimeSource.xml
|   |   |   |   |-- ClusterAndAppLoader.xml
|   |   |   |   `-- ClusterAndAppExtract.xml
|   |   |   |-- symhardware
|   |   |   |   |-- HardwareSource.xml
|   |   |   |   |-- EndTimeSource.xml
|   |   |   |   |-- main_hardware.xml
|   |   |   |   |-- StartTimeSource.xml
|   |   |   |   |-- RptHardwareLoader.xml
|   |   |   |   |-- HardwareExtract.xml
|   |   |   |   |-- HardwareDepParams.xml
|   |   |   |   `-- TruncEndTime.xml
|   |   |   `-- symsessionprofilingexpress
|   |   |   |   |-- SessionProfilingDepParams.xml
|   |   |   |   |-- MinCreateTime.xml
|   |   |   |   |-- TimezoneExtract.xml
|   |   |   |   |-- SessionProfilingExtract.xml
|   |   |   |   |-- SplitClusterApp.xml
|   |   |   |   |-- SessionProfilingSource.xml
|   |   |   |   |-- EndTimeSource.xml
|   |   |   |   |-- StartTimeSource.xml
|   |   |   |   |-- TruncEndTime.xml
|   |   |   |   |-- main_session_profiling_express.xml
|   |   |   |   `-- RptSessionProfilingExpressLoader.xml
|-- report
|   `-- reports
|   |   |-- Symphony Session Profiling Express.twb
|   |   `-- Symphony Workload Accounting.twb
|-- lib
|   |-- PIUtil.jar
|   `-- PIAdmin.jar
`-- bin
|   `-- vertica_pkclean.js

build340779_schema
`-- schema
|   `-- SQL_Patch.sql

build340779_node_all
|-- patch_install
|   |-- patchinfo.txt
|   |-- filelist.db
|   `-- fixlist.db
|-- soam
|   `-- 6.1.1
|   |   |-- linux-x86_64
|   |   |   `-- lib
|   |   |   |   `-- libegoresource.so
|   |   `-- lib
|   |   |   |-- perf_ego_loader.jar
|   |   |   `-- perf_soam_loader.jar
|-- conf
|   |-- dataloader
|   |   |-- symtaskhistloader.xml
|   |   `-- symsessionhistloader.xml
|   |-- profile.tz.perf.bak
|   `-- tz.properties
`-- 1.2
    `-- lib
        |-- perf_dataloader.jar
        `-- perf_commons.jar

=========================
7. Copyright
=========================
© Copyright IBM Corporation 2015

U.S. Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corp.

IBM®, the IBM logo and ibm.com® are trademarks of International Business
Machines Corp., registered in many jurisdictions worldwide. Other product
and service names might be trademarks of IBM or other companies. A current
list of IBM trademarks is available on the Web at "Copyright and trademark
information" at www.ibm.com/legal/copytrade.shtml.