Unless specifically noted otherwise, this history of problems fixed for IBM Spectrum Scale 4.2.x applies for all supported platforms.
Changes to this release of the IBM Spectrum Scale licensed program and the IBM Spectrum Scale library include the following:
- Auditing configuration changes
A syslog entry is automatically written whenever a GPFS command makes a configuration change. Adding the information to the syslog gives flexibility in mining, processing, and redirecting these events. Entries can also be written to the standard GPFS log. The commandAudit parameter of the mmchconfig command controls this option.
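As an illustrative sketch (exact parameter values may vary by release), command auditing is toggled with mmchconfig:

```shell
# Enable auditing of configuration-changing GPFS commands; entries go to syslog.
# Other documented values are assumed to include "syslogOnly" and "no".
mmchconfig commandAudit=yes

# Verify the current setting.
mmlsconfig commandAudit
```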
- Automated configuration of sensors for performance monitoring
IBM Spectrum Scale now supports automated configuration of sensors for its performance monitoring tool.
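A hypothetical sketch of the automated sensor setup, assuming node1 as the collector node (the node name is a placeholder):

```shell
# Generate an initial performance monitoring configuration with node1 as collector.
mmperfmon config generate --collectors node1

# Review the resulting sensor configuration.
mmperfmon config show
```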
- Callback event for file system structure errors
A new user callback event fsstruct (file system structure error) is triggered when the file system detects an error in the metadata. Immediate notification enables the callback program to act to mitigate further errors.
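A minimal sketch of registering such a callback; the script path and callback identifier are placeholders:

```shell
# Run a notification script whenever a file system structure error is detected.
# %eventName and %fsName are standard mmaddcallback substitution variables.
mmaddcallback fsstructAlert --command /usr/local/bin/fsstruct-notify.sh \
    --event fsstruct --parms "%eventName %fsName"
```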
- CES, NFS, and SMB protocols: Support for SLES V12 on x86 systems
Cluster Export Services (CES) partially supports SUSE Linux Enterprise Server (SLES) V12 on x86 systems. The SMB and NFS protocols are now supported via a manual installation process.
- Compression support for FPO environments
File compression is expanded to support the File Placement Optimizer (FPO) environment. For the FPO environment, you must set the block group factor to a multiple of 10 to avoid degrading file system performance.
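A brief illustrative example of compressing an existing file with mmchattr (the file path is a placeholder); compression can also be driven by ILM policy rules:

```shell
# Compress a file immediately (-I yes makes the change synchronous).
mmchattr -I yes --compression yes /gpfs/fpofs/colddata/bigfile
```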
- Deadlock management and debug data control
Deadlock management is extended with the following features: * The detection thresholds for deadlocks are automatically adjusted according to waiter length and cluster overloaded status. * New defaults more suitable for customer environments are established for the configuration variables deadlockDataCollectionDailyLimit, deadlockDataCollectionMinInterval, and others. A new configuration variable debugDataControl controls the amount of debug data that is collected. The default setting is a minimal amount of debug information that is the most important for debugging issues.
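The amount of collected debug data can be adjusted with the new variable, as sketched below; the value names are assumed from the 4.2.x documentation:

```shell
# Reduce debug data collection to the essentials ("none", "light", "medium",
# and "heavy" are the assumed documented levels).
mmchconfig debugDataControl=light
mmlsconfig debugDataControl
```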
- /dev/ device for a file system on Linux
On Linux, GPFS no longer creates the /dev/ device for a file system. Applications that relied on the file system device under /dev to be present, or that relied on "/dev" to be displayed in the output of the mount command, must find other ways to obtain the information. As a substitute, consider the information provided by the /etc/fstab file and /proc/mounts entries.
- Encryption: Simplified setup and Vormetric DSM support
* A new console command mmkeyserv greatly simplifies the setup of encryption both on the key server and the client node. IBM Security Key Lifecycle Manager (SKLM) V2.5.0.4 or later (including V2.6) is required. * Encryption support is added for key servers that run Vormetric Data Security Manager (DSM) V5.2.3 or later.
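A hypothetical setup sketch; the exact subcommand syntax may differ by release, and the server name, tenant, client, and RKM ID are placeholders:

```shell
# Register the SKLM key server and create a client for this cluster.
mmkeyserv server add sklm1.example.com
mmkeyserv tenant add devTenant --server sklm1.example.com
mmkeyserv client create c1Client1 --server sklm1.example.com
mmkeyserv client register c1Client1 --tenant devTenant --rkm-id rkm1
```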
- Federation in the performance monitoring tool
A performance monitoring tool installation with multiple collectors is called a federation. Federation is introduced in performance monitoring to increase the scalability or the fault-tolerance of the performance monitoring system.
- Guided installation
The spectrumscale installation toolkit now provides next-step hints that guide customers who are new to IBM Spectrum Scale through an easy workflow for installing and configuring an IBM Spectrum Scale cluster.
- Hadoop Support for IBM Spectrum Scale
* HDFS transparency now supports running the Hadoop Map/Reduce workload inside Docker containers. * Federation is introduced in HDFS to solve the HDFS NameNode scaling problem. * Hadoop distcp is used for data migration from HDFS to the IBM Spectrum Scale file system and between two IBM Spectrum Scale file systems. * HDFS transparency security has been introduced for the simple security mode and the Kerberos mode. * User authentication and authorization are weak in the simple mode: the data transfers and RPCs from the clients to the NameNode and DataNode are not encrypted. The Kerberos mode introduced in the Hadoop ecosystem provides a secure Hadoop environment.
- InfiniBand and RDMA performance
Performance is improved for clusters that use InfiniBand and RDMA for their internode communications network.
- Linux on z Systems: Expanded features
* Quality of Service (QoS) support. * Improved extended count key data (ECKD™) device handling: On different nodes, different bus IDs can refer to the same device. * IBM Spectrum Scale GUI now supported on Linux for z Systems.
- --metadata-only parameter for mmrestripefs
A --metadata-only option for the mmrestripefs command allows the restripe to complete in less time than a full restripe of metadata and data. The savings in time is useful in situations where there is a concern about file system operations and you want to restripe. This operation is supported for migrating data off disks, rebalancing, restoring replication, and comparing replicas.
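An illustrative invocation (the file system name is a placeholder):

```shell
# Restore replication of metadata only; -r restores replication,
# while -b (rebalance) and -c (compare replicas) also accept --metadata-only.
mmrestripefs fs1 -r --metadata-only
```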
- mmhealth: Monitoring services hosted on cluster nodes
A new command, mmhealth, is added to monitor the health status of nodes and the different services hosted on them. The mmhealth command also displays the event logs responsible for the unhealthy status of nodes and services, which helps you analyze and determine the cause of a service failure.
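A short usage sketch; the component name NFS is one example of the services that can be queried:

```shell
# Show the health state of all monitored components on the current node.
mmhealth node show

# Show more detail, including active events, for a single component.
mmhealth node show NFS -v
```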
- Object storage improvements
* Added support for starting and stopping the ibmobjectizer service. * For problem determination, added potential problem scenarios with proposed solutions. * Added support for object encryption. * Added new constraints for unified file and object access. * Added support for simplified enablement of S3. * Added support for multi-region object deployment with a highly available keystone service. * Added support for OpenStack Liberty packages. * Added support to run mmobj commands from any IBM Spectrum Scale client node. * Added monitoring support for the external AD and LDAP servers used for object authentication and for the main object services.
- Quality of Service for I/O operations (QoS) improvements
Quality of Service for I/O operations is expanded to support the File Placement Optimizer (FPO) environment.
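An illustrative sketch of enabling QoS and capping maintenance-class I/O (the file system name and IOPS value are placeholders):

```shell
# Limit maintenance tasks (such as restripe) to 300 IOPS in every pool of fs1.
mmchqos fs1 --enable pool=*,maintenance=300IOPS

# Display the current QoS settings for fs1.
mmlsqos fs1
```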
- Re-create and restore options for protocols cluster failover
The failover procedure can choose between re-create and restore options.
- Re-create and restore options for failing back to an old primary for protocols cluster
When failing back to an old primary, the file protocol configuration can either be re-created or restored.
- Re-create and restore options for failing back to a new primary for protocols cluster
When failing back to a new primary, the file protocol configuration can either be re-created or restored.
- Support for Transparent Cloud Tiering
* The Transparent Cloud Tiering feature leverages the existing ILM policy available in IBM Spectrum Scale, and administrators can define policies to migrate cold data to a cloud storage tier or recall data from the cloud storage tier on reaching certain threshold levels. * A new command, mmcloudgateway, is added to manage and configure the cloud storage tier.
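A hypothetical sketch of manual migrate/recall operations; the subcommands are assumed from the 4.2.x documentation and the file path is a placeholder:

```shell
# Push cold data to the configured cloud storage tier.
mmcloudgateway files migrate /gpfs/fs1/cold/archive.dat

# Recall the file from the cloud tier on demand.
mmcloudgateway files recall /gpfs/fs1/cold/archive.dat
```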
- workerThreads tunes file system performance
The workerThreads parameter of the mmchconfig command controls an integrated group of variables that tune file system performance. Use this variable to tune file systems in environments that are capable of high sequential or random read/write workloads or small-file activity. This variable can be used in any installation and is preferred over worker1Threads and prefetchThreads in new installations.
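A minimal sketch of tuning this parameter; 512 is an illustrative value, and the change takes effect after the GPFS daemon is restarted on each node:

```shell
# Raise workerThreads cluster-wide.
mmchconfig workerThreads=512

# Confirm the configured value.
mmlsconfig workerThreads
```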
- IBM Spectrum Scale GUI changes
* Renamed Monitoring > Topology page to NSDs. The NSDs page facilitates monitoring the status of Network Shared Disks (NSD) and node-to-NSD mapping in the system. * Added new Monitoring > Nodes page in the GUI. The Nodes page provides an easy way to monitor the performance, health status, and configuration aspects of all available nodes in the IBM Spectrum Scale cluster. The properties of a node display the status of various CES services such as Object, NFS, and SMB, as well as the authentication status of these services if they are enabled. It also displays other details such as network status, information on attached NSDs and file systems, and so on. * Monitoring performance of transparent cloud tiering services through Performance and Dashboard pages. * Renamed Monitoring > Performance page to Statistics. * Added capacity monitoring options in the Statistics page. * Added monitoring options for GPFS waiters in the Monitoring > Statistics panel. * You can assign names to dashboards and switch between them. * Dashboards are now stored on the server instead of in the browser, so they can be shared among users and browsers. * Default dashboards are shipped with the GUI. When you open the IBM Spectrum Scale GUI after installation or upgrade, you can see the default dashboards. You can modify or delete the default dashboards to suit your requirements. * Renamed the Download Logs page to Diagnostic Data. The GUI can now be used instead of the gpfs.snap command to collect the details of an issue. * The Files > Information Lifecycle page facilitates defining compression and deletion rules. * The new Settings > Object Service page facilitates starting and stopping object services. * Up to 1000 nodes are supported. * The GUI can now be used in an IBM Spectrum Scale cluster where sudo wrappers are used. * IBM Spectrum Scale GUI support for the z Systems platform is available on RHEL 7.2 and SLES 12.
* By default, GUI commands that change the configuration of the cluster cause an audit message to be sent to syslog. Optionally, an audit message can also be sent to the GPFS log.
- NFS and SMB protocol troubleshooting information added
* New AD Discovery tool to query and validate several AD settings. * New troubleshooting information for NFS mount issues, NFS error events, and NFS error scenarios. * New troubleshooting information for SMB client on Linux failures, SMB mount errors, SMB error events, and SMB access issues.
- Documented commands, structures, and subroutines
The following lists the modifications to the documented commands, structures, and subroutines: New commands mmadquery, mmcloudgateway, mmhealth, mmkeyserv New structures There are no new structures. New subroutines There are no new subroutines. Changed commands gpfs.snap, mmafmlocal, mmcallhome, mmchconfig, mmchnode, mmcesdr, mmcrsnapshot, mmdelsnapshot, mmlscluster, mmnfs, mmobj, mmrestripefs, mmsmb, mmprotocoltrace, mmuserauth Changed structures There are no changed structures. Changed subroutines gpfs_iopen() gpfs_iopen64() Deleted commands There are no deleted commands. Deleted structures There are no deleted structures. Deleted subroutines There are no deleted subroutines. New messages 6027-1826, 6027-2363, 6027-2364, 6027-2365, 6027-2366, 6027-2367, 6027-2368, 6027-2369, 6027-2370, 6027-2371, 6027-2372, 6027-2373, 6027-2374, 6027-2375, 6027-2376, 6027-2377, 6027-2378, 6027-3108, 6027-3720, 6027-3721, 6027-3722, 6027-3723, 6027-3724, 6027-3725, 6027-3726, 6027-3727, 6027-3728, 6027-3915, 6027-3916, 6027-3594, 6027-3595, 6027-3596 Changed messages 6027-1368, 6027-1235, 6027-1545, 6027-2271, 6027-2272, 6027-2273, 6027-2274, 6027-2951 Deleted messages 6027-1997
Problems fixed in IBM Spectrum Scale 4.2.0.4 [June 29, 2016]
- Noticeable performance improvement for Spectrum Scale customers that have NFS enabled, use the GPFS GUI to monitor NFS exports, and have a large number of exports (on the order of thousands).
- Fix a problem where mounting a large number of file systems on a node with a small pagepool may fail or hang due to running out of space for log buffers.
- Fix a kernel panic due to write after free in mmfs26 mmkproc.
- Fix an issue in the AFM environment where recovery, resync, and prefetch operations can fail because of a large number of files to be queued.
- This fix provides improvements for mmap read SMP scalability.
- If asynchronous NFS/NLM locking is used, this fix prevents a potential kernel crash.
- Fix a problem in which policy skips a file that has an 's' in its mode field.
- This fix will try to force through log recovery even when all stripes of a log home vdisk are marked stale (logically unreadable) in the metadata. This will only occur when run under debug control. This applies to GSS & ESS installations.
- This fix will avoid random node reboots when files are updated from different nodes.
- Fix a problem in which CES gets shutdown on the local cluster manager nodes when the remote cluster has quorum loss.
- Fix an assert that was hit when running offline fsck for a file system that has one or more files with multiple trailing duplicate blocks.
- Fix a segmentation fault issue when running the mmsetquota command. This issue would only happen when GPFS overwrite tracing is enabled on Linux.
- Fix a problem in which too much data is dumped when collecting data for deadlocks and expels. This was causing performance issues.
- Fix a problem where, during a stress workload, appends to a small file could cause a kernel panic.
- Fix a daemon crash that can occur while trying to execute a pcache command with maxThrottle set.
- Fix network communication problems that can occur when mmdiag --iohist and overload detection happen at the same time.
- Fix a problem in which no server will try to activate the recoverygroup after a mmchrecoverygroup failure. This fix applies to GSS/ESS customers.
- Fix an alloc segment steal problem that can lead to an assert after more than 22 minutes of searching for a free buffer.
- Fix random memory corruption and kernel crashes in the AFM environment which are likely to happen while deleting a non-empty directory at the home or secondary clusters.
- Fix the readdir performance issue of independent writer mode filesets in the AFM environment. Introduce a new configuration option afmDIO at the fileset level to replicate data from cache to home using direct IO.
- Fix QOS performance issues.
- Fix an mmdiscovercomp failure that can occur if the cluster is configured to use different admin and daemon node names.
- Update log recovery code to better handle certain invalid data that could cause GPFS daemon to die with Signal 11. This change will allow offline mmfsck to run and fix the problem.
- This update addresses the following APARs: IV84196 IV85084 IV86147 IV86154 IV86156 IV86158 IV86159 IV86160 IV86161 IV86162 IV86163 IV86164 IV86175 IV86176 IV86177 IV86178 IV86181 IV86182 IV86183.