Summary of changes for IBM Spectrum Scale version 4 release 1.1 as updated, June 2015

Changes to this release of the IBM Spectrum Scale licensed program and the IBM Spectrum Scale library include the following:

Active file management asynchronous fileset-level data replication for disaster recovery (DR)
Asynchronous replication of data at the file level enables you to create a primary (active)/secondary (passive) relationship at the fileset level. Data is asynchronously replicated to the secondary on a periodic basis. To enable this function, ensure that you run the following commands:
* If you are migrating from a previous release, run the mmchconfig release=LATEST command.
* Run the mmchfs -V full command.

Cluster Configuration Repository (CCR)
Enhancements were made to restore broken configuration and files to bring a cluster back online or a broken node to a working state. In the case of a disaster recovery setup, steps are provided to downgrade the quorum assignments when half or more of the quorum nodes are no longer available at one of the sites. Consult the "Establishing disaster recovery for your GPFS cluster" topic in the IBM Spectrum Scale: Advanced Administration Guide.

Cygwin 64-bit version requirement for Windows nodes
The 32-bit version of Cygwin is no longer supported for Windows nodes running GPFS. Users that are running GPFS 4.1 with the 32-bit version of Cygwin installed must upgrade to the 64-bit version of Cygwin before installing IBM Spectrum Scale 4.1.1. Users with SUA on GPFS releases prior to 4.1 should upgrade directly to the 64-bit version of Cygwin.

Data collection for expelled nodes
When a node is about to be expelled for unknown reasons, debug data is collected automatically to help find the root cause.

Deadlock amelioration
Deadlock breakup requests can be issued on demand at a time that is chosen by a system administrator. A user callback for the deadlockOverload event can be added to notify a system administrator to check the system and workload for an overload condition.

File Placement Optimizer (FPO)
FPO enhancements make it possible to change the block allocation of an existing file with the mmrestripefile and mmchattr commands, and to remove disks efficiently when they have already been emptied by the auto recovery process. Auto recovery has been optimized to handle multiple failure and recovery events more efficiently.

Fileset-level integrated archive manager (IAM) modes
Fileset-level integrated archive manager (IAM) modes give users the ability to set four different IAM modes at the fileset level, including the root fileset, so that users can modify the file-operation restrictions that normally apply to immutable files. For more information, see the following:
* topic about immutability and appendOnly restrictions in the Information Lifecycle Management chapter of the IBM Spectrum Scale: Advanced Administration Guide
* mmchfileset and mmlsfileset command descriptions in the IBM Spectrum Scale: Administration and Programming Reference
To enable this function, ensure that you run the following commands:
* If you are migrating from a previous release, run the mmchconfig release=LATEST command.
* Run the mmchfs -V full command.

GPFS Native RAID (GNR) and Elastic Storage Server (ESS) documentation
The documentation for GNR and ESS was removed from the information units in the IBM Spectrum Scale library. This includes GNR commands, GNR callbacks available to the mmaddcallback command, vdisk performance monitoring with the mmpmon command, messages in the ranges 6027-1850 - 6027-1899 and 6027-3000 - 6027-3099, and the chapter in the IBM Spectrum Scale: Advanced Administration Guide titled GPFS Native RAID (GNR). For more information about GNR, see GPFS Native RAID: Administration. For more information about ESS, see Deploying the Elastic Storage Server.
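
As an illustration of the two message ranges named above, the following minimal Python sketch checks whether a message identifier falls into one of the ranges whose documentation moved out of the library. The function and constant names are hypothetical and exist only for this example; the ranges themselves are the ones stated in the text.

```python
# Message number ranges whose documentation moved to the GNR/ESS publications,
# as listed in the paragraph above (inclusive bounds).
GNR_MESSAGE_RANGES = [(1850, 1899), (3000, 3099)]


def is_gnr_message(message_id):
    """Return True if an ID such as '6027-3050' lies in a GNR range."""
    prefix, _, number = message_id.partition("-")
    if prefix != "6027" or not number.isdigit():
        return False
    value = int(number)
    return any(low <= value <= high for low, high in GNR_MESSAGE_RANGES)


for candidate in ("6027-1850", "6027-3099", "6027-3100", "6027-962"):
    print(candidate, is_gnr_message(candidate))
```

A check like this can help decide which publication to search when a message number appears in a log.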

Hadoop support
Hadoop support was expanded from FPO storage to shared storage. This allows data stored in current GPFS clusters using shared storage to be accessible to Hadoop applications. IBM Spectrum Scale Hadoop Connector has been enhanced to transparently support both FPO-based storage pools, to leverage data locality, and shared storage, where locality information is not applicable. This allows FPO and shared storage pools to be used within the same file system, which allows Hadoop applications to access data in the entire file system transparently. IBM Spectrum Scale Hadoop Connector fully supports Hadoop version 2.5, and it can also be used with Hadoop version 2.6 in compatibility mode (Hadoop file system APIs in 2.6 are not yet implemented). The mmhadoopctl command was introduced to simplify IBM Spectrum Scale Hadoop Connector configuration and management.

Inode expansion optimization
In this release, inode expansion, which allows dynamic growth of inodes, is optimized to reduce the contention that can flare up during bursts of file creates. To enable this function, ensure that you run the following commands:
* If you are migrating from a previous release, run the mmchconfig release=LATEST command.
* Run the mmchfs -V full command to enable all of the new functionality that requires different on-disk data structures.
For more information, see the topics on completing migration and use of disk storage and file structure in file systems in the IBM Spectrum Scale: Concepts, Planning, and Installation Guide.
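
The two enablement steps recur for several features in this summary, and their order matters: the cluster release level is raised first, then the file system format. The following Python sketch only assembles the command lines in that order; it does not run them. The device name fs1 and the helper name are assumptions made for the example, and the device is assumed to be passed to mmchfs as its first argument.

```python
import shlex


def enablement_commands(device, migrating=True):
    """Return the enablement commands, in order, as argument lists."""
    commands = []
    if migrating:
        # Only needed when migrating from a previous release.
        commands.append(["mmchconfig", "release=LATEST"])
    # Upgrade the on-disk format of the given file system.
    commands.append(["mmchfs", device, "-V", "full"])
    return commands


# 'fs1' is a hypothetical file system device name.
for argv in enablement_commands("fs1"):
    print(shlex.join(argv))
```

Review the migration topics referenced above before running either command on a production cluster.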

Installation toolkit
The installation toolkit can be used to do the following:
* Install and configure GPFS.
* Add GPFS nodes to an existing cluster.
* Deploy and configure SMB, NFS, OpenStack Swift, and performance monitoring tools on top of GPFS.
* Configure authentication services for protocols.
* Upgrade GPFS and protocols.
For details, see the spectrumscale command description in the IBM Spectrum Scale: Administration and Programming Reference.

Multi protocol data access
This release provides data access to a shared storage infrastructure through enhanced protocol support for NFS, SMB, and Swift Object. For more information, see the IBM Spectrum Scale: Advanced Administration Guide and the IBM Spectrum Scale: Administration and Programming Reference.

Performance improvements for mmfsck
The mmfsck command can now store information that is found during a scan of the file system into a patch file. The information in the patch file can then be used as input to repairing the file system. Using a patch file to repair the file system prevents an additional scan before starting the repair actions. For more information, see the mmfsck command description in the IBM Spectrum Scale: Administration and Programming Reference.

PIT inode list
The parallel inode traversal (PIT) scan used for the mmchdisk, mmdeldisk, mmrestripefs, and mmrpldisk commands has been updated to produce a list of inodes with interesting attributes, for example, those that have broken disk addresses or those that are ill placed. While the mmfileid command can be used to list files with broken disk addresses, this can be a slow process. Two new optional parameters, --inode-criteria CriteriaFile and -o InodeResultFile, have been added to the more commonly used mmchdisk, mmdeldisk, mmrestripefs, and mmrpldisk commands. These parameters allow you to find files matching certain criteria without a separate invocation of mmfileid. With this new feature, you can easily find the interesting files and their inode numbers. The output file contains a list of inode numbers that match the specified flags, along with the name of the flag and the file type. For more information about these commands and for a description of the optional parameters and flags, see the commands in the IBM Spectrum Scale: Administration and Programming Reference. To enable this function, ensure that you run the following commands:
* If you are migrating from a previous release, run the mmchconfig release=LATEST command.
* Run the mmchfs -V full command to enable all of the new functionality that requires different on-disk data structures.
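
To show where the two new parameters fit, the following Python sketch appends them to an existing command line for one of the four PIT-based commands. It only builds the argument list and does not run anything. The helper name, the device name fs1, and the file paths are hypothetical; the contents of the criteria file are not shown because the available flags are described in the command reference, not here.

```python
import shlex

# Commands that, per the text above, accept the two new optional parameters.
PIT_COMMANDS = {"mmchdisk", "mmdeldisk", "mmrestripefs", "mmrpldisk"}


def with_inode_criteria(argv, criteria_file, result_file):
    """Append --inode-criteria and -o to an existing PIT command line."""
    if not argv or argv[0] not in PIT_COMMANDS:
        raise ValueError("not a PIT-based command: %r" % (argv[:1],))
    return list(argv) + ["--inode-criteria", criteria_file, "-o", result_file]


# Hypothetical device name and file paths, for illustration only.
base = ["mmrestripefs", "fs1", "-r"]
print(shlex.join(with_inode_criteria(base, "/tmp/criteria.txt", "/tmp/inodes.out")))
```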

Policy improvements
This release includes the following policy improvements:

mmapplypolicy --sort-command SortCommand
The mmapplypolicy --sort-command parameter allows you to specify an alternative sort command to be used, rather than the default sort command provided with the operating system.

Implicit SET POOL 'first-data-pool' rule
For file systems that are at or have been upgraded to 4.1.1, the system recognizes that, even if no policy rules have been installed to a file system by mmchpolicy, data files should be stored in a non-system pool if available (rather than in the system pool, which is the default for earlier releases).

For more information, see the following:
* Information Lifecycle Management chapter in the IBM Spectrum Scale: Advanced Administration Guide
* mmchpolicy command description in the IBM Spectrum Scale: Administration and Programming Reference

Quota management
Quota management improvements for file system format 4.1.1 and higher include:
* Allowing quota management to be enabled and disabled without unmounting the file system.
To enable this function, ensure that you run the following commands:
* If you are migrating from a previous release, run the mmchconfig release=LATEST command.
* Run the mmchfs -V full command.

Read replica policy
In a file system with replication, the replicas of each data block are stored on different disks in different failure groups. Using the readReplicaPolicy attribute of the mmchconfig command, you can now specify the location from which replicas are read: the first replica, the local or closest replica, or the fastest replica. For more information, see the mmchconfig command in the IBM Spectrum Scale: Administration and Programming Reference.
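
The three choices can be sketched as a small lookup that validates a policy name before building the mmchconfig command line. The value spellings DEFAULT, local, and fastest are an assumption of this example and should be verified against the mmchconfig command description; the helper name is hypothetical and nothing is executed.

```python
# Assumed value spellings for readReplicaPolicy, mapped to the behavior
# described in the text above. Verify them in the mmchconfig reference.
READ_REPLICA_POLICIES = {
    "DEFAULT": "read the first replica",
    "local": "read the local or closest replica",
    "fastest": "read the fastest replica",
}


def read_replica_command(policy):
    """Build the mmchconfig argument list for a known policy value."""
    if policy not in READ_REPLICA_POLICIES:
        raise ValueError("unknown readReplicaPolicy value: %r" % policy)
    return ["mmchconfig", "readReplicaPolicy=" + policy]


print(" ".join(read_replica_command("local")))
```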

Performance Monitoring Tool
The Performance Monitoring tool aims to provide performance information after collecting the metrics from GPFS and protocol nodes using the mmperfmon query command with an appropriate query. The tool helps in detecting performance issues and problems. The predefined queries and metrics help in investigating every node or any particular node that is collecting metrics. For more information, see the following:
* "Performance Monitoring tool overview" topic in the IBM Spectrum Scale: Advanced Administration Guide
* mmperfmon command description in the IBM Spectrum Scale: Administration and Programming Reference

Documented commands, structures, and subroutines
The following lists the modifications to the documented commands, structures, and subroutines:

New commands
The following commands are new:
* mmces
* mmdumpperfdata
* mmhadoopctl
* mmnfs
* mmobj
* mmperfmon
* mmprotocoltrace
* mmsmb
* mmuserauth
* spectrumscale

New structures
There are no new structures.

New subroutines
There are no new subroutines.

Changed commands
The following commands were changed:
* gpfs.snap
* mmaddcallback
* mmafmctl
* mmafmlocal
* mmapplypolicy
* mmbackup
* mmbuildgpl
* mmchconfig
* mmchdisk
* mmcheckquota
* mmchfileset
* mmchnode
* mmchpool
* mmchpolicy
* mmcrcluster
* mmcrfileset
* mmdeldisk
* mmedquota
* mmfsck
* mmlscluster
* mmlsfileset
* mmlsfs
* mmlspolicy
* mmlsquota
* mmpsnap
* mmrepquota
* mmrestorefs
* mmrestripefile
* mmrestripefs
* mmrpldisk

Changed structures
There are no changed structures.

Changed subroutines
There are no changed subroutines.

Deleted commands
There are no deleted commands.

Deleted structures
There are no deleted structures.

Deleted subroutines
There are no deleted subroutines.

Messages
The following lists the new, changed, and deleted messages:

New messages
6027-962, 6027-2145, 6027-2230, 6027-2234, 6027-2235, 6027-2238, 6027-2240, 6027-2241, 6027-2242, 6027-2245, 6027-2246, 6027-2247, 6027-2248, 6027-2249, 6027-2250, 6027-2251, 6027-2252, 6027-2253, 6027-2254, 6027-2255, 6027-2256, 6027-2257, 6027-2258, 6027-2259, 6027-2260, 6027-2261, 6027-2262, 6027-2263, 6027-2264, 6027-2265, 6027-2266, 6027-2267, 6027-2268, 6027-2269, 6027-2270, 6027-2271, 6027-2272, 6027-2273, 6027-2274, 6027-2281, 6027-2282, 6027-2283, 6027-2284, 6027-2285, 6027-2286, 6027-2287, 6027-2288, 6027-2289, 6027-2290, 6027-2291, 6027-2292, 6027-2293, 6027-2294, 6027-2295, 6027-2296, 6027-2297, 6027-2298, 6027-2299, 6027-2300, 6027-2301, 6027-2302, 6027-2303, 6027-2304, 6027-2305, 6027-2306, 6027-2307, 6027-2308, 6027-2309, 6027-2310, 6027-2311, 6027-2312, 6027-2313, 6027-2314, 6027-2315, 6027-2316, 6027-2317, 6027-2318, 6027-2319, 6027-2320, 6027-2321, 6027-2322, 6027-2323, 6027-2324, 6027-2325, 6027-2326, 6027-2327, 6027-2329, 6027-2330, 6027-2331, 6027-2332, 6027-2333, 6027-2334, 6027-2335, 6027-2336, 6027-2337, 6027-2338, 6027-2339, 6027-2340, 6027-2341, 6027-2342, 6027-2343, 6027-2344, 6027-2345, 6027-2346, 6027-2347, 6027-2348, 6027-2349, 6027-2350, 6027-2351, 6027-2352, 6027-3255, 6027-3256, 6027-3257, 6027-3306, 6027-3307, 6027-3308, 6027-3309, 6027-3310, 6027-3311, 6027-3312, 6027-3313, 6027-3314, 6027-3315, 6027-3316, 6027-3551, 6027-3552, 6027-3553, 6027-3554, 6027-3579, 6027-3580, 6027-3581, 6027-3708, 6027-3709, 6027-3710, 6027-3711, 6027-3712, 6027-3713, 6027-3714, 6027-3715, 6027-3716, 6027-3717, 6027-3718, 6027-3719, 6027-3900, 6027-3901, 6027-3902, 6027-3903, 6027-3904, 6027-3905, 6027-3906, 6027-3907, 6027-3908, 6027-3909, 6027-3910, 6027-3911, 6027-3912, 6027-4000, 6027-4001, 6027-4002, 6027-4003, 6027-4004, 6027-4005, 6027-4006, 6027-4007, 6027-4008, 6027-4009, 6027-4010, 6027-4011, 6027-4012, 6027-4013, 6027-4014, 6027-4015

Changed messages
6027-625, 6027-872, 6027-1305, 6027-2181, 6027-2183, 6027-2229, 6027-2714, 6027-2715, 6027-2758, 6027-3248, 6027-3249

Deleted messages
6027-2622, 6027-2632, 6027-3511, 6027-3514, 6027-3515, 6027-3516, 6027-3536, 6027-3544

Problems fixed in GPFS 4.1.0.8 [May 26, 2015]

Correct a small vulnerability in takeover after file system manager failure during a snapshot command.

The code change ensures that the online replica compare tool does not report false positive mismatches when the file system has suspended disks.

Fix an AFM recovery issue during the fileset unlink.

Fix a problem when determining whether copy-on-write is needed or not in the presence of snapshots. Sometimes this problem may result in spurious write operation failures (especially, but not limited to file/directory creation).

Fix a hang in mmrestripefs, which may also result in waiters for "PIT_Start_MultiJob". The problem may happen if the set of nodes specified in the '-N' option to the command includes nodes which are still in the process of being started (or restarted).

The mmcrsnapshot, mmdelsnapshot and mmfileset commands quiesce the file system before they start their actual work. If, during that quiesce, a thread that is deleting an HSM-migrated file is stuck waiting for a recall (which can take a long time, for example because of slow tapes), the mm commands could time out. This fix allows those commands to proceed while a deletion is waiting for recall.

Close a very small deadlock window caused by releasing the kssLock and calling cxiWaitEventWakeupOne: a thread that is not waiting for the exclusive lock could be woken up, leaving the thread that is actually waiting for the lock asleep.

Avoid a GPFS crash when running mmrestorefs or mmbackup where there are deleted filesets.

Enable offline fsck to validate extended attribute file

Fix a problem with directory lookup code that can cause FSErrInodeCorrupted error to be incorrectly issued. This could occur when lookup on '..' entry of a directory occurs at the same time as its parent is being deleted.

Ensure that EA migration to enable FastEA support for a file system does not assert for the 'Data-in-Inode' case under certain conditions.

Enable online fsck to fix AFM pre-destroyed inodes. Use PIT to clean up unlinked inodes in an AFM-disabled fileset.

Update allocation code to close a small timing window that could lead to file system corruption. The problem could only occur when a GPFS client has a file system panic at the same time as the new file system manager is performing a take over after the old manager resigned.

Fix a signal 11 problem in a multi-cluster environment that occurs when the GPFS daemon relays an fsync request through the metanode but the OpenFile is stolen on the metanode in the meantime.

Remove confusing trace stop failed error messages on Windows.

The privateSubnetOverride configuration parameter may be used to allow multiple clusters on the same private subnet to communicate even when cluster names are not specified in the 'subnets' configuration parameter.

With this fix, the mmfileid command indicates that it will not work if only GPFS Express Edition is installed.

Fix a workload counter used for NVRAM log tip I/O processing queues. Recommended if NVRAM log tip is in use.

Potentially avoid crash on normal OS shutdown of CNFS node.

Fix issue where file create performance optimization was sometimes disabled unnecessarily.

In a cluster configured with node quorum, fix a problem where, if the cluster manager fails and the cluster is left with only the bare-minimum number of nodes to maintain node quorum, the cluster may still lose quorum.
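
The "bare-minimum number of nodes" in the item above can be made concrete with a small worked example. This Python sketch assumes the usual definition of node quorum as one plus half of the defined quorum nodes; the function names are hypothetical and the code illustrates only the arithmetic, not the fix itself.

```python
def node_quorum(quorum_nodes):
    """Minimum number of quorum nodes that must be up (half plus one)."""
    return quorum_nodes // 2 + 1


def has_quorum(quorum_nodes, reachable):
    """Return True if the reachable quorum nodes still satisfy node quorum."""
    return reachable >= node_quorum(quorum_nodes)


# With 5 quorum nodes, 3 must remain; losing a third node loses quorum.
print(node_quorum(5), has_quorum(5, 3), has_quorum(5, 2))
```

In the scenario described by the fix, the cluster is exactly at this minimum after the cluster manager fails, so quorum should still hold.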

Enable offline fsck to fix AFM orphan directory entries in a single run.

Fix a problem where the number of nodes allowed in a cluster is reset from 16384 to 8192.

This affects GSS/ESS customers who are using chdrawer to prepare to replace a failed storage enclosure drawer on an active system.

Correct a problem in the v4.1 release with directory listings in file systems created prior to v3.2.

Fix a problem where a race between the log wrap and repair threads caused a checksum mismatch in indirect blocks.

Fix a daemon crash in AFM by ensuring that the setInFlight() method has a positive 'numExecuted' value when calculating the average wait time of the messages.

Fix a problem on a GPFS CCR cluster where GPFS commands may not work on inactive configuration servers after a new security key is generated.

Fix poor command performance on a cluster that has no security key.

Fix a problem with DIRECT_IO writes that can cause data loss when the file system panics or a node fails after a DIRECT_IO write passes the end of the file and increases the file size. The file size increase could be lost.

File cache filled-up with deleted objects (Linux NFS)

Fix a hardlink creation issue by handling the E_NEEDS_COPIED error in the SFSLinkFile function for AFM files.

Fix handling of policy rules like ... MIGRATE ... TO some-group-pool THRESHOLD (hi,lo) ...

The /var/mmfs/etc/RKM.conf configuration file used to configure file encryption now supports a wider set of characters.

Trigger a back-off when 90% of the configured hard memory limit is hit during queuing of AFM recovery operations.

ESS customers, using zimon, may see GPFS daemon crashes in the performance monitoring code.

Add support for multiple RDMA completion threads and completion queues

Fix signal 11 in verbs::verbsCheckConn_i

Fix signal 11 in runTSPcache caused by an uninitialized variable in error paths.

Fix a problem where mmauth inadvertently changes cipherList to an invalid string.

Changed Externals:

New messages:
GPFS: 6027-3708 [E] Incorrect passphrase for backend '%s'.
GPFS: 6027-3709 [E] Error encountered when parsing line %d: expected a new RKM backend stanza.
GPFS: 6027-3710 [E] Error encountered when parsing line %d: invalid key '%s'.
GPFS: 6027-3711 [E] Error encountered when parsing line %d: invalid key-value pair.
GPFS: 6027-3712 [E] Error encountered when parsing line %d: incomplete RKM backend stanza '%s'.
GPFS: 6027-3713 [E] An error was encountered when parsing line %d: duplicate key '%s'.
GPFS: 6027-3714 [E] Incorrect permissions for the /var/mmfs/etc/RKM.conf configuration file.

Deleted messages:
GPFS: 6027-3536 [E] Incorrect passphrase '%s' for backend '%s'.
GPFS: 6027-3511 [E] Error encountered when parsing '%s': expected a new RKM backend stanza.
GPFS: 6027-3515 [E] Error encountered when parsing '%s': invalid key-value pair.
GPFS: 6027-3514 [E] Error encountered when parsing '%s': invalid key '%s'.
GPFS: 6027-3516 [E] Error encountered when parsing '%s': incomplete RKM backend stanza '%s'.
GPFS: 6027-3544 [E] An error was encountered when parsing '%s': duplicate key '%s'.

This update addresses the following APARs: IV71419 IV71569 IV71601 IV71607 IV71613 IV71616 IV71628 IV71633 IV71634 IV71636 IV71648 IV71692 IV71815 IV72029 IV72033 IV72039 IV72042 IV72048 IV72684 IV72687 IV72688 IV72694 IV72695 IV72698 IV72700 IV72890.

Problems fixed in GPFS 4.1.0.7 [March 12, 2015]

Fix a problem with block allocation code, where E_NOSPC error could be incorrectly returned after running out of disk space in one failure group. This problem only affects file systems with data replication.

When a GSS logTipBackup pdisk fails, mmlsrecoverygroup output will now display offline (as opposed to error) for the affected logTipBackup vdisk.

Fix a problem in an FPO environment which may cause auto recovery to fail to suspend some down disks.

Fix a problem when deleting files from an independent fileset that causes unnecessary recalls when there are no snapshots.

Check for a null pointer when memory allocation fails while Ganesha is active.

Enforce the same declustered array (DA) name for the old pdisk and the corresponding new one when replacing a pdisk with mmaddpdisk --replace.

Fix a problem that may cause assertion '!ofP->destroyOnLastClose' when the file system is mounted in RO mode on some nodes and in RW mode on others.

Enhance the handling of TSM summary output in mmbackup.

Fix a deadlock occurring in a heavy DIO and mmap workload on Linux

Fix fsck handling of ToBeDeleted inode map bits

Fix a problem where, when migrating a block, mmrestripefile/mmrestripefs -r does not comply with the file's WADFG attribute, as it should, if that file belongs to an FPO pool that has the WAD=0, BGF=1 attributes.

Fix a bug when changing the license in a CCR cluster.

Fix a problem that might cause auto recovery failure in an FPO cluster.

Fix code to calculate the correct payload for a remote procedure call during file updates to the GPFS internal configuration repository. The problem may cause assertion 'bufP - msgP == len'.

Fix a problem that might cause a no space error even if there is free disk space.

Protect fcntl kernel calls against non-privileged callers.

Exclude sensitive files from gpfs.snap collection.

GPFS command hardening.

Enable dynamically switching from cipherList=EMPTY to cipherList=AUTHONLY without bringing down the entire cluster.

This update addresses the following APARs: IV68842 IV69086 IV69619 IV69657 IV69702 IV69704 IV69705 IV69706 IV69707 IV70610.

Problems fixed in GPFS 4.1.0.6 [February 5, 2015]

Customers can change the spares of an existing DA when adding a new pdisk.

Update code to prevent a deadlock that could occur when multiple threads try to create the same file in a directory at the same time.

When E_NOMEM occurs on a token manager during token transfer, try to throw out half of all OpenFile objects, regardless of which file system they belong to. Also limit the implicitly set maxStatCache (MSC) value to 10,000.

Reduce memory utilization for GPFS RDMA QPs and fix a problem with Connect-IB when verbsRdmaSend is enabled.

Fixed the assert by mapping the error to a permission error, as expected.

Fixed a problem where creating a clone copy file in an encrypted file system may result in threads being blocked or in an abnormal daemon termination.

Correct the tsbuhelper updateshadow command to recognize several circumstances in which gencount changes even for files not yet backed up or changed.

Fixed the problem by changing the implementation of flushpending while operations get requeued.

In FPO, use disks evenly when doing mmrestripefs -r or mmdeldisk.

Fix is especially recommended for GSS systems on the Power platform.

Fixed the problem by allowing prefetch recovery to run only if all the nodes in the cluster are at least at minReleaselevel 1410.

Fixed the problem by clearing CA_STATE|CA_CREATE bits after local rm is done.

Fix a defect that may cause a data consistency problem if mmrestripefile is run after reducing the replica level.

Fix is recommended for all GSS systems using NVRAM log tip devices.

When creating an FPO file system, always use the 'cluster' layoutMap if allowWriteAffinity is yes and layoutMap is unspecified.

Fixed a problem where enabling file system quota would hit an assert.

"Waiting for nn user(s) of shared segment" messages on shutdown.

Fixed the problem by reading the in-memory attributes for the dump command instead of reading them from disk.

Fsck reports false corruption in inode alloc map due to rounding on wrong sector size.

Fix code that could cause mmclone copy to fail incorrectly with EINVAL. This problem could only occur when the source is a file in a GPFS snapshot.

Fix a GNR bug related to a race condition that causes recovery failure during startup or failover.

Fix a mmdeldisk problem caused by disabled quota files placed in the system pool.

Fixes a problem where fsck hits signal 8 during inode validation

Make sure that the inodes pointed to by different dentries are not the same inode, to prevent a possible deadlock in which two different dentries point to the same inode and that inode is locked twice.

Fix potential loss of an IO error in the Linux io_getevents() call when enableLinuxReplicatedAio is enabled (3.5.0.14+). Fix a problem that improperly returns 'error 217' when doing Direct IO on a replicated file which has partial replicas on unavailable disks (4.1.0+).

Fix a stack corruption issue.

Fix a Linux lookup crash issue.

Apply if secrecy of file metadata (pathnames, attributes and extended attributes) is a concern.

Revised disk selection algorithm to ensure no reuse can exist in a map entry being modified by rebuild or rebalance. Prior algorithm allowed this type of reuse and could lead to lessening the failure tolerance, trailer validation errors, and assert crashes in various GSS subsystems.

Ensure that GPFS uses only secure CBC padding schemes when exchanging data over TLS. This affects customers who have set cipherList to a real cipher (i.e. not AUTHONLY) in order to authenticate and encrypt data passed between nodes/clusters. This also affects customers who use file-level encryption.

Fix the GPFS_FCNTL_GET_DATABLKLOC API so that it returns the correct disk ID for data-in-inode files.

This fix is required for GSS server nodes in which multiple names assigned to the node differ only in the domain portion of the name.

Correct a problem leading to misleading tsgskkm system clock complaints.

Fix is recommended in all GNR configurations.

This update addresses the following APARs: IV67901 IV68007 IV68059 IV68060 IV68062 IV68064 IV68065 IV68096 IV68491 IV68493.

Problems fixed in GPFS 4.1.0.5 [December 5, 2014]

Fix an alloc cursor issue in block allocation code that may lead to spurious no space error in FPO file system.

Fixed code to consider flags during a disk's challenge of the current cluster manager.

This fix applies to GNR/GSS customers who are adding server nodes and whose cluster was created using the default block size.

Reduce number of nsdMsgRdmaPrepare messages sent.

Fix GSS bug related to concurrent overlapping read operations during failover/error scenarios.

Fixed a problem where the range size was initialized to the wrong value, using the metadata block size instead of the correct data block size.

Fix a problem where, for certain FPO cluster configurations, replicas failed to involve all LGs.

Redirect automatic recovery's tsrestripefs output to /var/adm/ras/restripefsOnDiskFailure.log

Fix a problem with verbsRdmasPer[Node Connection] set to a value of 1.

Reduce CNFS failover time on systems with large list of exports.

Allow disk addresses in inode 5 (Extended Attribute File) to be found by the mmfileid command.

Fix GSS bug related to mixed read-write workload with a working set size that matches GSS cache memory size.

Fixed a problem where the range number was initialized to the wrong value in an environment with a different metadata block size and a full block write after lseek without placement installed.

fcntl revokes referencing a completed/freed request.

Hadoop File System API open() in connector throws exception as hdfs when user has no permission to access a file.

Fix a bug where mmsetquota set the inode limit to unlimited if only a block quota change is requested, and vice versa.

If the user of a GSS system had previously changed the slow disk detection parameters manually to the following values: nsdRAIDDiskPerformanceMinLimitPct=50 and nsdRAIDDiskPerformanceShortTimeConstant=25000, then they can now remove the manual setting; but they don't have to remove it.

Fixed a problem where the GPFS daemon gets signal 11 when an application calls the GPFS_FCNTL_GET_DATABLKLOC API in a mixed PPC64/X64 cluster.

This update addresses the following APARs: IV66617 IV66620 IV66621 IV67005 IV67006.

Problems fixed in GPFS 4.1.0.4 [October 31, 2014]

Fix code to prevent a GPFS daemon assert that could occur after automatic remount of file systems. The problem only occurs on a Linux node when a user issues GPFS commands to access the file system before the automatic remount has completed.

Fix "subscript out of range" error in recoverFailedDisk.

Correct the --single-instance option for mmapplypolicy runs against a directory.

The fix avoids a bug where pdisks become "missing" due to descriptors being overwritten. The fix is recommended if system firmware upgrades are applied.

Fix code that checks a user-provided buffer for NULL and that could cause writev() to incorrectly fail with EINVAL. Both readv() and writev() could be affected.

Fixed a problem where O_DSYNC was not honored on data writes to small files.
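
For readers unfamiliar with the flag, the following Python sketch shows the kind of small-file write that the fix concerns: the file is opened with O_DSYNC, so each write should return only after the data has reached stable storage. This is a generic POSIX illustration, not GPFS-specific code; the helper name and file name are hypothetical, and os.O_DSYNC is assumed to be available (it is on Linux and most Unix systems).

```python
import os
import tempfile


def write_dsync(path, payload):
    """Write payload to path through a descriptor opened with O_DSYNC."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_DSYNC
    fd = os.open(path, flags, 0o600)
    try:
        # With O_DSYNC, write() completes only once the data is durable.
        return os.write(fd, payload)
    finally:
        os.close(fd)


# Hypothetical small file in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "small.dat")
    written = write_dsync(target, b"small file payload")
    with open(target, "rb") as handle:
        stored = handle.read()
    print(written, stored)
```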

Fixed a Ganesha thread deadlock caused by the Ganesha up-call thread getting a byte range lock that was just released.

Fix ibv_modify_qp error 22 when RDMA connecting client mlx4_0 port 2 HCA to server single port qib0 HCA.

Fix a rare race condition that could result in a deadlock when moving data from the inode into a regular data block.

Too many RDMA connections created between nodes.

Fix Signal 11 on Connect-IB.

Make sure that FHs are still pointing to directories after we get the lock for rename.

1) Make sure to stop Ganesha as soon as GPFS daemon cleanup starts. 2) Make sure that FHs are still pointing to directories after we get the lock for rename.

Fix a problem where AFM does not correctly replay Asynchronous I/O (AIO) writes, such as with aio_write(), when the file is opened with the O_DIRECT flag. The problem may cause files to exhibit inconsistency between cache and home.

Fixed a timing issue which may cause AFM to miss replaying updates to home, if the update happens after a period of long inactivity. Disable a three-phase protocol during which requests are dropped while the gateway nodes are temporarily made inactive so they can be marked clean.

This update addresses the following APARs: IV65497.

Problems fixed in GPFS 4.1.0.3 [September 18, 2014]

Fixed an mmfsd crash that occurred when restriping a file system and adding disks at the same time.

Fixes a problem that may cause mmchdisk start to hang if there are too many pools or diskless nodes in an FPO cluster.

Fix a bug in mmdeldisk/mmadddisk that may cause a file system to become unknown to GPFS if the name of that file system contains special characters.

Fixed the daemon assert by allowing the async reads as dependents on lookup.

Fix local read only cache code that could cause a GPFS daemon assert during file system unmount when local read only cache is in use.

Fixed a possible cause of deadlock when mmpmon or the GPFS SNMP subagent does node registration.

Improve handling and reporting of certain types of corrupted directory blocks.

Fix a defect in the fileset snapshot restore tool where it may fail to restore a file properly when the file's link count is zero but there is still an open instance of the file.

The mmlsfileset command used to display expiration timeouts for non-RO filesets; any scripts that look for this output will break.

Fix code that can cause the GPFS daemon to assert when running mkdir on an AFM/SECVM enabled file system with metadata replication. The problem will only occur when there is no AIX node and GPFS can't allocate disk space for both replicas.

Fix a truncate() issue. When truncate to a position which is beyond end of file on a FPO file system, the filesize could be set wrongly.
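
The expected truncate() behavior can be checked with a short script. This is a minimal sketch that assumes POSIX semantics, where truncating to a position beyond end of file extends the file with zero bytes. The helper name, the 4096-byte target, and the temporary file (standing in for a file on an FPO file system) are illustrative assumptions, not part of the fix.

```python
import os
import tempfile


def extend_with_truncate(path: str, new_size: int) -> int:
    """Truncate `path` to `new_size` and return the size reported by stat()."""
    os.truncate(path, new_size)
    return os.stat(path).st_size


# Create a 10-byte file; a temporary file stands in for a file on GPFS.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    demo_path = tmp.name

# Truncate to a position beyond end of file (illustrative target size).
reported_size = extend_with_truncate(demo_path, 4096)

# Read the file back to confirm the original data and the zero-filled tail.
with open(demo_path, "rb") as f:
    contents = f.read()
os.unlink(demo_path)
```

On a correctly behaving file system, the reported size equals the requested position and the extended region reads back as zero bytes.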

Apply if you are troubled by mmapplypolicy not terminating promptly and cleanly when ENOSPC is encountered.

Fix an assertion during mmrestoreconfig caused by quota file migration to reserved file.

Fixed a rare problem in background deletion code due to uninitialized list pointers.

Callback/user exit support is added for new event "afmFilesetUnmounted" which gets called when fileset is unmounted.

Prevent the GPFS daemon from running into an assertion when a GSS vdisk client I/O fails.

Fix race condition when renaming files.

Reduce the kernel stack usage in the write code path to avoid potential kernel stack overflow.

This fix improves usability by allowing the mmaddcallback command to be used to continuously monitor a desired file system status.

Avoid possible core dump by protecting pclose() call if broken pipe is known. Also, add LCL_ESTALE to list of fatal conditions.

This fix will detect the failure and report that the mmchfs command has failed. The file system will retain its old name and still be usable.

Fixed a memory overwrite problem caused by copying an uninitialized string in mmfs_dm_query_session().

Update restripe code to better handle both replicas being in the same failure group after disk usage and failure group were changed via mmchdisk.

Fix is recommended on all systems in which the disk enclosure supports slot power control. Currently, this includes only the P7IH.

Prevent mmimgbackup from accepting the --image option with a value that begins with an absolute path name such as /gpfs/, because this option is meant to permit only a file base name or prefix, not to specify the output directory.

Fixed a daemon failure on a cluster that is not configured with the Cluster Configuration Repository (CCR) but that includes tiebreaker disks that have a physical sector size of 4096 bytes.

Fix AIO code path to properly manage the vfs user count. An incorrect vfs user count could prevent file system quiesce and cause some GPFS commands (for example, snapshot commands) to fail.
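
The kind of check described above might look like the following. This is a hypothetical sketch, not the actual mmimgbackup code: the function name is invented for illustration, and rejecting relative directory components as well as absolute paths is an assumption that goes slightly beyond what the fix description states.

```python
import os


def is_valid_image_prefix(value: str) -> bool:
    """Return True if `value` is a bare file base name or prefix.

    Hypothetical validation for illustration only; the real
    mmimgbackup check may differ.
    """
    if not value or value in (".", ".."):
        return False
    # Reject values that begin with an absolute path name such as /gpfs/.
    if os.path.isabs(value):
        return False
    # Assumption: also reject any relative directory component, since the
    # option is not meant to specify an output directory.
    return os.path.basename(value) == value
```

For example, a value such as nightly_image would pass this check, while /gpfs/nightly_image would be rejected.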

Improve handling of unexpectedly corrupted directory blocks.

This fix is recommended for GPFS Native RAID systems running on the AIX operating system. It has no effect on Linux.

Fix inode dealloc issue in AFM LU mode.

Fix node failure recovery code to prevent a possible GPFS daemon assert. This problem could occur if file system manager node fails in the middle of restripe.

Fixed the startup code to better handle locks left over from a previous unclean mmshutdown.

Fix a problem where, during mmcrfs, the client does not correctly parse the reply for 'nsdMsgConvertFormat', which causes the client to fail to move on to the next backup NSD server.

Avoid daemon assert when rejecting illegal mmcrsnapshot commands.

Fix mmfsck assert "More than 22 minutes searching for a free buffer in the pagepool"

Fix a problem where fcntl revokes may block unmount under certain loads.

Fix a problem introduced by the fix for APAR IV56220 that will cause NSD server hangs/failures or possibly data corruption.

Do not log fsstruct errors when offline fsck is running.

When performing automatic disk recovery, take into account the value of defaultHelperNodes when initiating file system restripe operations.

Add additional checks to ensure the uniqueness of certain node attributes.

Fixed an assert in setDataValid that occurred because blockOffset was larger than the start offset.

Fix repair code to always perform a rebalance for non-FPO file systems when requested by the user.

Prevented I/O from being started after a node has lost contact with the cluster.

Fix bug introduced in GPFS 3.4 PTF30 and GPFS 3.5 PTF20 where mmlsnsd -X doesn't display persistent reserve information of the disk.

Fix code used to check user provided buffer for NULL where it can cause writev() to incorrectly fail with EINVAL. Both readv()/writev() could be affected.

This update addresses the following APARs: IV63359 IV63464 IV63467 IV63469 IV63471 IV63516 IV63518 IV63824 IV63878 IV63882 IV63887 IV63889 IV63894 IV63901 IV63903 IV63911 IV63917 IV63927 IV64598 IV64862.

Problems fixed in GPFS 4.1.0.2 [August 4, 2014]

Add TSM server selection option and improve messages.

write(fd,NULL,bs) gives rc -1 and inconsistent behavior. Added a check to validate whether the user-provided buffer is NULL. If the user-provided buffer for a read/write system call is NULL, an error is now returned much earlier in the code.

Fix various problems with RDMA reconnection.

Fix a rare livelock that can happen when an FPO file system is low on space.

Fix two integer overflow problems in the GPFS block map allocation module that are caused by adding a larger disk to an existing file system. The problems can lead to lost blocks and data corruption.

Avoid very rare log recovery failure after restripe of snapshot files.

Prevent the GPFS daemon (mmfsd) from crashing on a GNR/GSS system while deleting a log vdisk.

Fix a problem in locking code that can cause a GPFS daemon assert under a certain rare race condition. The chance is slightly higher under 4.1.

Prevent file system errors in the face of too many disk errors.

Fix an offline fsck false positive for the fileset 0 record on v3.3 and older file systems.

Fix a defect in the fileset snapshot restore tool that occurs when the tool tries to restore attributes of directories that were deleted after the fileset snapshot was created.

Apply if you see tsapolicy failing immediately after a helper node goes down.

Eliminate FSSTRUCT errors from occurring during the image restore process. Prevent the gpfsLookup() common function from proceeding if the stripe group is being image restored.

Fix a node crash defect in the gpfs_ireaddirx GPFS API when it is used to list changed directory entries for a directory that has data in the inode.

Improved stability of mmfsadm dump tscomm.

Install this fix if you have non-standard enclosure / drawer hardware not found in GSS systems.

Fix a defect in the fileset snapshot restore tool that occurs when it tries to restore a clone file that was deleted after the snapshot was created.

Ignore Grace msg on nodes that do not support Ganesha.

Fixed a hang due to lock overflow.

Fix a problem where a user-registered callback is unexpectedly invoked when using mount instead of mmmount.

Fix a generation number mismatch defect that occurs when a fileset is created in a GPFS secvm file system.

Fixed a race condition that could lead to an assertion failure when mmpmon is used.

Fixed Assert 'filesetP->hasInodeSpace == 0' in offline fsck.

Fixed problem in multi acquire and release with FGDL.

When there is a GPFS failure, return EUNATCH to Ganesha.

Fileset snapshot restore tool restores dir attributes more effectively.

Without this fix, a setup with 4 or more drawers in an enclosure may not be able to survive the loss of the enclosure even though mmlsrecoverygroup -L states that disk group fault tolerance can survive an enclosure loss.

Fix a defect where the restore tool cannot sync the restoring fileset when the file system manager node of the restoring fileset is running GPFS 4.1.0.0 and the restore command is run on a node that runs a later version.

Fixed online fsck assert after block allocation map export.

Must make sure that all the interfaces are enabled.

Fixed Ganesha not using right interface in RHEL6.5.

Fix GPFS_FCNTL_GET_DATABLKDISKIDX fcntl API to return location info of pre allocated block correctly.

Clear sector 0 and the last 4k bytes of the disk before it is created as an NSD to prevent accidental GPT table recovery by the UEFI driver.
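
The clearing step can be illustrated on a regular file that stands in for a disk. This is only a sketch under stated assumptions: the 512-byte sector size, the function name, and the use of a temporary file are illustrative, and it does not reflect how GPFS itself performs the operation on a block device.

```python
import os
import tempfile

SECTOR_SIZE = 512   # assumed sector size for this illustration
TAIL_SIZE = 4096    # the "last 4k bytes" named in the fix description


def clear_head_and_tail(path: str) -> None:
    """Zero the first sector and the last 4 KiB of the file at `path`."""
    size = os.path.getsize(path)
    with open(path, "r+b") as dev:
        # Overwrite sector 0 (the file is opened at offset 0).
        dev.write(b"\x00" * min(SECTOR_SIZE, size))
        # Overwrite the final 4 KiB.
        tail = min(TAIL_SIZE, size)
        dev.seek(size - tail)
        dev.write(b"\x00" * tail)


# A 16 KiB file filled with 0xFF stands in for a disk with old contents.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\xff" * 16384)
    image_path = tmp.name

clear_head_and_tail(image_path)

with open(image_path, "rb") as f:
    image = f.read()
os.unlink(image_path)
```

Only the first sector and the final 4 KiB are overwritten; everything in between is left untouched.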

Fix a race condition problem in fileset snapshot restore tool when it tries to restore extended attributes for a directory.

When GPFS kernel module is loaded on Linux, look up dependent symbols on demand.

Fix stale mount handling in case of home mount issues.

Fixed problem in scanForFileset when sgmgr resigns while the scan is in progress.

Fixed problem where the GPFS daemon may terminate abnormally while performing encryption key rewrap (mmapplypolicy command with "CHANGE ENCRYPTION KEYS" rule). Fixed problem where mmrestorefs -j on an encrypted file system may result in the file system being unmounted.

Prevent multiple eviction processes from running simultaneously.

Fix an assert in Deque after gracePeriodThread runs.

Update mmchmgr to pick the best candidate as new filesystem manager when user did not specify a new manager node.

Fix a memory leak in the GPFS daemon associated with Events Exporter, mmpmon, and SNMP support.

In GPFS systems employing GPFS Native RAID, there was a situation in which failover and failback, and disk replacement operations could hang, requiring GPFS to be restarted on the node to clear the hang. Fix is extremely low risk and highly recommended.

Fix an mmsetquota bug that returns invalid argument if a stanza contains the fileset attribute along with type=FILESET.

Fix deadlock if fs panics during E_IO err processing.

mmdeldisk is blocked while phase3 recovery is doing deferred deletions. It is enough to wait until log recovery is done.

Ensure SQL migration is done on GSS nodes only.

Use maxLogReplicas instead of defaultMaxMetadataReplicas to calculate the number of new log items when a new striped log is added.

Limit PORTMAP inactive failure due to DNS busy.

Ensure that vdisks are automatically scrubbed periodically.

Initialize the fromFsP to NULL in openArch() to guard against ever calling gpfs_free_fssnaphandle() with a bad argument. Add an informative message to look for an error log in /tmp when the file writer pipeline is broken.

Correct the multi-release table to avoid releasing fcntl tokens prematurely.

Fixed race condition between two threads trying to become metanode at the same time.

Ensure that a file is not created for NFS if it already exists when Ganesha is running.

Fixed a typo in the removeOpenFileFrombgdList function that caused sig 11.

Fix code to prevent a potential GPFS daemon assert during log recovery. This problem only occurs when the file system is in 4.1 format with 4K alignment enabled (4K inode size, etc). Data replication must be enabled, and direct I/O must be used for writes with a buffer size that is not 4K aligned.

This update addresses the following APARs: IV61626 IV61628 IV61630 IV61655 IV61988 IV61991 IV61995 IV62043 IV62091 IV62215 IV62243 IV62418.

Problems fixed in GPFS 4.1.0.1 [June 06, 2014]

Fix a thread-safety problem in dumping GPFS daemon thread backtraces.

Fixes a problem with fsck repair of deleted root directory inode of independent filesets.

Fixed a problem in clusters configured for secure communications (cipherList configuration variable containing a cipher other than AUTHONLY) which may cause communications between nodes to become blocked.

After a file system is panicked, new lock range requests will not be accepted.

This fix only affects customers running GNR/GSS on Linux, and who have in the past misconfigured their GNR servers by turning the config parameter "numaMemoryInterleave" off, and who experienced IO errors on Vdisks as a result of that misconfiguration. These IO errors can potentially corrupt in-memory metadata of the GNR/GSS server, which can lead to data loss later on. This fix provides a tool that can be used to locate and repair such corruption.

Remove mmchconfig -N restrictions for aioWorkerThreads and enableLinuxReplicatedAio.

Fixed a problem when reading a clone child from a snapshot.

Fixed a rare race condition causing an assert when two threads are attempting to do a metanode operation at the same time while the node is in the process of becoming a metanode.

Fixed a deadlock in a complicated scenario involving restripe, token revoke, and exceeding the file cache limit.

Fixed race between log recovery and mnodeResign thread

Fix E_VALIDATE errors in the aclFile after node failure.

Deal with a stress condition where mmfsd was running out of threads.

Fix a problem in log recovery that would cause it to fail when replaying a directory insert record. The error only occurs for filesystems in version 4.1 format, where the hash value of the file name being inserted is the same as an existing file in the directory. The problem is also dependent on the length of the file name, and only happens if the system crashes after the log record is committed, but before the directory contents are flushed.

Fixed the problem that was caused by a hole in the cleanBuffer after the file system panicked.

Close a hole where the fileset snapshot restore tool (mmrestorefs -j) may fail to restore changed data for a clone child file.

Fix a rare assert that happens in low disk space situations.

Fixed deadlock during mmap pagein

Fixed the problem of excessive RPCs to get indirect blocks and the problem of metanode lock starvation involving a huge sized sparse file.

A problem has been fixed where the GPFS daemon terminates abnormally when direct I/O and vector I/O (readv/writev) are used on encrypted files, and the data is replicated, or the data must go through an NSD server.

Fix a potential deadlock when selinux is enabled and FS is dmapi managed.

Close a hole where the fileset snapshot restore tool (mmrestorefs -j) may fail to restore a snapshot because of a race condition in which one restore thread is deleting a file while another restore thread is trying to get file attributes for the same file.

Fixed a kernel oops caused by a race among multiple NFS reads on the same large file.

mmchfirmware command will avoid accessing non-existent disk path.

Fix a directory generation mismatch problem in an encrypted secvm file system.

Fix a shutdown hang in the kernel while trying to acquire revokeLock.

Apply at your convenience. Even if you hit this bug, an equivalent cleanup is completed later in the command execution.

Improved stability of daemon-to-daemon communications when cipherList is set to a real cipher (i.e. not AUTHONLY).

The serial number of physical disks is now recorded in the GNR event log, and displayed in the mmlspdisk command.

GNR on AIX allow only 32K segment.

Fixes a problem with fsck repair of corrupt root directory inode

mmbackup tricked by false TSM success messages. mmbackup can be fooled by TSM output when dsmc decides to roll back a transaction of multiple files being backed up. When the TSM server runs out of data storage space, the current transaction, which may hold many files, is rolled back and retried with each file separately. The failure of a file to be backed up in this case was not detected because the earlier message from dsmc contained "Normal File --> [Sent]" even though the transaction was later rolled back. Fixes in tsbuhelper now detect the failure signature string "** Unsuccessfull **" and, instead of simply ignoring it, revert the changes in the shadow DB for the matching record(s). The hash table already keeps track of the last change in each record, so reverting is now a legal state transition for hashed records. Reorganized some debug messages and streamlined some common code to work better. The 'failed' string is now also detected in order to issue reversion updates. Fixed pattern matching in tsbackup33.pl to properly display all "ANS####" messages.
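
The state tracking described above can be sketched as a small hash table keyed by path, where a later failure line reverts an earlier "sent" record. This is a simplified illustration, not the tsbuhelper implementation: the tab-separated "path, message" line format and the function name are hypothetical, and only the marker strings are taken from the fix description.

```python
SENT_MARKER = "[Sent]"
FAILURE_MARKER = "** Unsuccessfull **"  # spelled as quoted in the fix description
FAILED_WORD = "failed"


def final_states(lines):
    """Return a dict mapping each path to its last recorded state.

    Assumes a hypothetical, simplified input format of
    "<path>\t<message>" per line; real dsmc output differs.
    """
    states = {}
    for line in lines:
        path, _, message = line.partition("\t")
        if FAILURE_MARKER in message or FAILED_WORD in message:
            # A failure after an earlier success reverts the record.
            states[path] = "reverted"
        elif SENT_MARKER in message:
            states[path] = "sent"
    return states


# Hypothetical sample: file b is reported as sent, then rolled back.
sample = [
    "/gpfs/fs1/a\tNormal File --> [Sent]",
    "/gpfs/fs1/b\tNormal File --> [Sent]",
    "/gpfs/fs1/b\t** Unsuccessfull **",
]
states = final_states(sample)
```

Because the table keeps only the last change per record, a retried file that is later sent successfully would move back to the "sent" state.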

Fix RO cache i/o error if mounting fs in ro mode.

Don't release the mutex on daemon death.

Fix the path buffer length calculation to return the correct length for dm_handle_to_path() functions.

Fix a bug in mmauth that may cause duplicate configuration entries and a node number mismatch in the configuration file.

Fix a problem with creating a directory if the parent directory has a default POSIX ACL.

mmbackup fails to read hex env values. mmbackup debug values, progress reporting, and possibly other user settings may be presented in decimal or hex, especially the bit-mapped progress and debugging settings. Perl doesn't always interpret the hex values correctly unless they are converted with the oct() function.
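
The same decimal-or-hex parsing concern can be shown outside Perl. The following minimal sketch uses Python purely as an illustration of the idea; the helper name and the DEMO_DEBUG_MASK environment variable are hypothetical and are not mmbackup settings.

```python
import os


def read_numeric_setting(name: str, default: int = 0) -> int:
    """Read an environment value given in decimal ("31") or hex ("0x1F")."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    # Base 0 lets int() pick the base from the prefix: "0x" means hex,
    # plain digits mean decimal. Values with a leading zero such as "010"
    # raise ValueError in this mode.
    return int(raw.strip(), 0)


# Hypothetical variable name used only for this demonstration.
os.environ["DEMO_DEBUG_MASK"] = "0x1F"
from_hex = read_numeric_setting("DEMO_DEBUG_MASK")

os.environ["DEMO_DEBUG_MASK"] = "31"
from_decimal = read_numeric_setting("DEMO_DEBUG_MASK")

del os.environ["DEMO_DEBUG_MASK"]
from_default = read_numeric_setting("DEMO_DEBUG_MASK", default=7)
```

Both spellings of the same bit mask yield the same integer, which is the property the mmbackup fix restores for its bit-mapped settings.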

Correct an NLS-related problem with mmchdisk and similar commands.

This update addresses the following APARs: IV60187 IV60468 IV60469 IV60471 IV60475 IV60478 IV60543 IV60863 IV60864.

Product/Component Name:	Platform:	Fix:
IBM Spectrum Scale	Linux 64-bit,x86_64 RHEL Linux 64-bit,x86_64 SLES	SPECTRUM_SCALE_STD-4.1.1.0-x86_64-Linux

Readme and Release notes for release 4.1.1.0 IBM Spectrum Scale 4.1.1.0 SPECTRUM_SCALE_STD-4.1.1.0-x86_64-Linux Readme