Unless specifically noted otherwise, this history of problems fixed for IBM Spectrum Scale 4.2.x applies to all supported platforms.
Problems fixed in IBM Spectrum Scale 4.2.0.1 [January 15, 2016]
- Fix an issue that could cause the GPFS daemon to terminate abnormally or could cause incorrect performance data to be reported when the GPFS SNMP subagent, mmpmon, or Zimon is in use.
- Fix GNR AU log long waiters seen in SSD replacement.
- Fix a snapshot restore issue in which some files in a live file system are not restored.
- Fix a problem where mmchfs -z, -Q or --perfileset-quota prematurely releases an sdr lock, which can cause the command to fail.
- Fix signal 11 in verbsDisconnect_i when "large" fabnum value is used.
- Fix a problem with GPFS logging code that could cause the GPFS daemon to die with signal 11. This problem can only occur on nodes with LROC enabled.
- On a GSS / ESS / GNR system that uses NVRAM for the log tip, short outages of one of the nodes can cause overly alarming error messages in the log, which state "[E] Insufficient spare space; unable to complete rebalance of DA ...". Those messages have been reworded to better reflect the actual condition.
- Fix a problem that could cause an FSSTRUCT error to be incorrectly logged when reading from a disk. This could only occur when LROC is enabled.
- Fix a signal 11 daemon crash that can occur while running the mmchcarrier or mmchpdisk commands while a disk enclosure is in a failed state. This fix is recommended for all GNR (ESS/GSS) customers.
- Fix a problem that can occur during a network failure, in which all I/O stops and all nodes go into the arbitrating state.
- Fix logAssertFailed: !(_ownedByCaller((lockWordCopy), (lockWord_t *)&(lockWordCopy))) that can occur during high-stress workloads.
- Fix failures that can occur when trying to resume pdisks, including mmchcarrier command failures. These errors occur on a GSS/ESS system.
- Fix AioWorkerThread so that it cannot steal a dirty buffer, which could cause a deadlock.
- Fix a "Constraint error" that can occur during the mmdiscovercomp command when trying to add servers to the component database. This fix applies to GSS/ESS customers.
- Improve GNR write performance by using more threads to flush internal GNR metadata.
- Fix code to avoid quorum loss declaration of the current cluster manager when the network is broken between two nodes.
- Fix a deadlock that can occur when queue memory crosses the AFM hard memory limit.
- Fix an mmfsd daemon crash that is possible when Zimon is used to monitor the node and a file system is forcibly unmounted due to an unrecoverable file system error.
- This fix removes the restriction that daemon interface changes are not allowed on CCR-enabled clusters. You can now make daemon interface changes, but only from non-quorum nodes.
- Fix logAssertFailed: (_ownedByCaller((lockWordCopy), (lockWord_t *)&(lockWordCopy))) that can occur when a fileset goes to disconnected mode.
- On a system built on GNR/GSS/ESS servers, I/O errors on GPFS file systems could be reported all the way to the end-user application (as opposed to internal disk I/O errors on individual physical disks) at exactly the time when some pdisks were unreachable, for example due to cabling or connectivity issues, even though those pdisks would have been reachable from the backup node of the GNR server. This fix prevents those I/O errors by failing the recovery group containing the affected vdisk over to the backup node.
- This fix restricts the mmchcluster command from disabling CCR in a cluster that has a CES node.
- Fix a problem in which orphans (files with inode allocated but not initialized) that have been moved to .ptrash cannot be deleted.
- Fix a GNR server node crash that can occur when a network fails to connect a GNR server pair.
- Fix a GPFS daemon assert that can occur during restripe file operations. If a storage pool gets deleted by mmdeldisk -p or mmdeldisk -c, the GPFS daemon assert could occur during either an mmrestripefile command or an mmchattr -I yes command.
- Fix an assert that can occur when adding pdisks with the --replace option.
- Fix a logAssert that can occur during a snapshot restore process that is scanning a sparse file whose size is close to the GPFS maximum file size limitation.
- Fix the path to the Linux modprobe command that the mmchfirmware command uses when --type adapter is specified. This fix applies to GSS/ESS customers.
- Starting with 4.1.1, GPFS changed the contents of the Linux NFS filehandle. This means that if the AFM home is upgraded to 4.1.1 or later, existing AFM filesets detect a change in the export, because the filehandle was changed, and suspend future synchronization with home. Similarly, a change from knfsd to Ganesha at home also causes a filehandle change even though the export is the same. The only solution is to resync the cache using failover, which is expensive. This fix handles upgrades if home is running GPFS by detecting and upgrading cached filehandles when the filehandle changes for an inode.
- Fix an mmbackup failure that can occur when the command-line argument list is too long.
- Fix a problem in which a fileset can become stuck in an unmounted state. This can occur if a remote fileset becomes stale and then comes back, and the application node and the gateway node are the same node.
- Fix a node crash that can occur during a rolling upgrade.
- On GNR, ESS, and GSS systems, error messages are printed when an I/O to a physical disk does not succeed. These messages were printed even when the I/O operation was never attempted. In those cases, the I/O error messages are now suppressed.
- Fix an mmdiscovercomp command failure that can occur when adding storage servers in GSS/ESS.
- Fix a problem in which mmaddnode fails to copy the committed key file to the new node. This occurs only on a CCR-disabled cluster when there are two key files.
- Fix a problem in which a hard memory limit is not honored when a fileset is in disconnected mode.
- This update addresses the following APARs: IV79381 IV79745 IV79747 IV79749 IV79750 IV79752 IV79753 IV79754 IV79757 IV79759 IV79762 IV79763 IV79764 IV79766 IV79768.
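When tracking a specific APAR, it can be handy to check mechanically whether it appears in the list above. The following is a minimal sketch, not an IBM tool: the helper names `parse_apars` and `is_addressed` are hypothetical, and the pattern assumes APAR identifiers of the form `IV` followed by five digits, as they appear in this update.

```python
import re

# APAR line copied verbatim from the 4.2.0.1 fix list above.
APAR_LINE = (
    "This update addresses the following APARs: IV79381 IV79745 IV79747 "
    "IV79749 IV79750 IV79752 IV79753 IV79754 IV79757 IV79759 IV79762 "
    "IV79763 IV79764 IV79766 IV79768."
)


def parse_apars(text):
    """Return all APAR identifiers (assumed form: IV + 5 digits) found in text."""
    return re.findall(r"\bIV\d{5}\b", text)


def is_addressed(apar_id, text=APAR_LINE):
    """Return True if apar_id is listed in text (case-insensitive on input)."""
    return apar_id.strip().upper() in set(parse_apars(text))


if __name__ == "__main__":
    print(len(parse_apars(APAR_LINE)), "APARs listed")
    print("IV79745 addressed:", is_addressed("IV79745"))
```

The same helpers can be pointed at the APAR line of any other update by passing its text as the `text` argument.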