NFS in Lenovo ONTAP Best Practices and Implementation Guide
Abstract
This document provides basic concepts, support information, configuration tips, and best practices for NFS in Lenovo®
ONTAP®. This guide covers the latest available ONTAP versions for currency and length. In most cases, the commands
and best practices listed in this guide apply to all ONTAP versions.
Results might vary in older ONTAP versions. Consult the product documentation for your ONTAP version when
necessary.
Second edition (October 2023)
© Copyright Lenovo 2023.
LIMITED AND RESTRICTED RIGHTS NOTICE: If data or software is delivered pursuant to a General
Services Administration (GSA) contract, use, reproduction, or disclosure is subject to restrictions set forth
in Contract No. GS-35F-05925
TABLE OF CONTENTS
Qtrees ..........................................................................................................................................................59
Qtrees and file moves ............................................................................................................................................ 60
Qtree IDs and rename behavior ............................................................................................................................. 60
File handle effect for qtree exports .........................................................................................................................60
Mounting multiple Qtrees in the same volume on the same NFS client .................................................................61
Subdirectory exports .............................................................................................................................................. 61
User and group owners .......................................................................................................................................... 62
Replay/reply cache .................................................................................................................................................62
File locking ............................................................................................................................................................. 62
Impact of NFSv4.x locks on failover scenarios ...................................................................................................... 63
NFSv4.x client IDs/NFS4ERR_CLID_INUSE .......................................................................................................132
NFS silly renames ................................................................................................................................................ 134
How ONTAP NFS handles atime updates ........................................................................................................... 134
Trademarks ...............................................................................................................................................167
LIST OF TABLES
Table 1) NFS version support in ONTAP. ....................................................................................................................... 8
Table 2) NFS security support details. ............................................................................................................................ 9
Table 3) NFS feature support. .........................................................................................................................................9
Table 4) Unsupported NFS features. ............................................................................................................................ 10
Table 5) Export examples. ............................................................................................................................................ 15
Table 6) NFSv3 mode bits versus NFSv4.x ACL granularity. ....................................................................................... 36
Table 7) NFSv4.x lock terminology. .............................................................................................................................. 42
Table 8) NFS lease and grace periods. ........................................................................................................................ 48
Table 9) Referrals versus migration versus pNFS. ....................................................................................................... 50
Table 10) NFSv4.1 delegation benefits. ........................................................................................................................54
Table 11) Limits on local users and groups in ONTAP. ................................................................................................ 58
Table 12) Replay/reply cache NDO behavior. ...............................................................................................................62
Table 13) Lock state NDO behavior. .............................................................................................................................63
Table 14) UNIX Mode-bit levels. ................................................................................................................................... 75
Table 15) Nconnect performance results. ..................................................................................................................... 94
Table 16) VM statistic masks. ....................................................................................................................................... 98
Table 17) NFS credential cache settings. ................................................................................................................... 111
Table 18) Exec contexts per node. ............................................................................................................................. 118
Table 19) Exec context throttle scale. ......................................................................................................................... 119
Table 20) Job comparisons — parallel dd with 65,536 and 128 RPC slots. ............................................................... 124
Table 21) Job comparisons — parallel dd with 65,536, 128, and 64 RPC slots. ........................................................ 124
Table 22) High file count creation (one million files) — NFSv3 — with and without nconnect — default slot tables. . 125
Table 23) High file count creation (one million files) — NFSv3 — with and without nconnect — 128 slot tables. ...... 126
Table 24) High file count creation (one million files) — NFSv3 — with and without nconnect — 16 slot tables. ........ 126
Table 25) Total clients at maximum concurrent operations (128) before node exec context exhaustion. .................. 127
Table 26) Total clients using 16 slot tables before node exec context exhaustion. .................................................... 127
Table 27) Job comparisons - parallel dd — NFSv3 and NFSv4.1 with 65536 RPC slots. .......................................... 128
Table 28) One million files using f.write — NFSv3, 65536 slots — VM dirty bytes defaults versus tuned. .................130
Table 29) 50x 500MB files using dd — NFSv3, 65536 slots — VM dirty bytes defaults versus tuned. ...................... 130
Table 30) NFSv4.x session slot performance comparison. .........................................................................................131
Table 31) NFSv4.x session slot performance — Percent change versus 180 slots. .................................................. 132
Table 32) NFSv4.1/pNFS/nconnect vs. NFSv3 — sequential reads. ..........................................................................156
Table 33) NFSv3 versus NFSv4.1 performance — High file creation workload ......................................................... 157
Table 34) NFSv3 vs. NFSv4.1 performance — High sequential writes ...................................................................... 158
Table 35) High file count test results — one million files. ............................................................................................162
Table 36) High file count test results — one million files — average CPU busy % and average latency. .................. 162
Table 37) Low file count test results — 2GB files. ...................................................................................................... 163
Table 38) Low file count test results — Average CPU busy % and average latency. .................................................163
Table 39) Default NFS Ports in ONTAP. .....................................................................................................................164
LIST OF FIGURES
Figure 1) Cluster namespace. .......................................................................................................................................13
Figure 2) Load-sharing mirror protection of vsroot volumes. ........................................................................................ 14
Figure 3) Symlink example using vsroot. ...................................................................................................................... 18
Figure 4) Lenovo FlexGroup volumes. ..........................................................................................................................19
Figure 5) Lenovo FlexCache volumes. ......................................................................................................................... 20
Figure 6) Qtree export specification — ThinkSystem Storage Manager. ......................................................................23
Figure 7) Reordering the rule index in ThinkSystem Storage Manager. ....................................................................... 25
Figure 8) pNFS data workflow. ......................................................................................................................................52
Figure 9) Gratuitous ARP during LIF migration. ............................................................................................................65
Figure 10) Example of setting NFSv4 audit ACE. ......................................................................................................... 68
Figure 11) Single LIF NAS interaction. ..........................................................................................................................69
Figure 12) Multiple LIFs NAS interaction. ..................................................................................................................... 70
Figure 13) Default actimeo latency — vdbench. ........................................................................................................... 92
Figure 14) Actimeo=600 latency — vdbench. ............................................................................................................... 93
Figure 15) Actimeo=600, nocto latency — vdbench. .................................................................................................... 93
Figure 16) NFS mounts with and without nconnect ...................................................................................................... 95
Figure 17) Filtering events in ThinkSystem Storage Manager UI. ................................................................................ 98
Figure 18) Viewing NFS client to volume mappings in ThinkSystem Storage Manager ............................................. 102
Figure 19) Impact of RPC slot tables on NFSv3 performance. ................................................................................... 122
Figure 20) Parallel dd performance — NFSv3 and RPC slot tables; 1MB rsize/wsize. .............................................. 123
Figure 21) Parallel dd performance — NFSv3 and RPC slot tables; 256K rsize/wsize. ............................................. 123
Figure 22) RPC packet with 16 GIDs. ......................................................................................................................... 136
Figure 23) Random reads, 4K, NFSv3 versus NFSv4.x — IOPS/Latency ..................................................................158
Figure 24) Random writes, 4K, NFSv3 versus NFSv4.x –—IOPS/Latency ................................................................ 159
Figure 25) Sequential reads, 32K, NFSv3 versus NFSv4.x — IOPS/Latency ............................................................ 159
Figure 26) Sequential writes, 32K, NFSv3 versus NFSv4.x — IOPS/Latency ............................................................160
Basic NFS concepts in Lenovo ONTAP
Intended audience and assumptions
This technical report is for storage administrators, system administrators, and data center managers. It
assumes basic familiarity with the following:
The Lenovo ONTAP data management software
Network file sharing protocols (NFS in particular)
This document contains some advanced and diagnostic-level commands. Exercise caution when using
these commands. If there are questions or concerns about using these commands, contact Lenovo
Support for assistance.
In-flight security refers to security for data that is being transmitted across the network.
NFS over SSH and NFS over stunnel are not supported with ONTAP.
NFS features
Each NFS version adds new features to the protocol to enhance operations, performance and business
use cases. The following table shows the supported NFS features in ONTAP, along with the associated
NFS version information for that feature.
The following table shows features that are currently unsupported for NFS.
Table 4) Unsupported NFS features.
NFS version      Unsupported features
All versions     POSIX ACLs; subdirectory exports; SSSD dynamic UIDs
NFSv3            Extended attributes (not in RFC spec)
NFSv4.0/4.1      Session trunking/multipath
NFSv4.2          Live file migration (Flex Files); sparse files; space reservation; IO_ADVISE; application data holes; server-side copy; full mode for Labeled NFS
For more details about these options and an example of when these options can help environments with
a large number of NFS clients, see “Network port exhaustion with a large number of NFS clients.”
Showmount
ONTAP clusters can potentially have thousands of export rules, so a query for all exports can be processing intensive. Additionally, exports are not stored in flat files; they are applied to volumes as rules, so the export path and the export rules live in two different places.
Example of showmount –e from client with showmount disabled in ONTAP:
[root@nfsclient /]# showmount -e x.x.x.a
Export list for x.x.x.a:
/ (everyone)
The SVM has a vsroot volume mounted to /, which is the volume returned in the showmount query. All
other volumes are mounted below that mount point and are not returned to the client when the
showmount NFS option is disabled.
The following shows output from a packet trace of the showmount command being run against a data LIF
in ONTAP with the option disabled:
x.x.x.x x.x.x.a MOUNT 170 V3 EXPORT Call (Reply In 17)
Mount Service
Program Version: 3
V3 Procedure: EXPORT (5)
The trace shows that the server returns /unix ->. However, this export path has a specific client in the
rule set:
cluster::> vol show -vserver NFS83 -junction-path /unix -fields policy
(volume show)
vserver volume policy
------- ------ --------
NFS83 unix restrict
If client match is required in the showmount functionality, the showmount utility in the toolchest provides
that functionality.
The showmount functionality is required for some applications to work properly, so showmount support
was added to properly support those applications.
This functionality is disabled by default. It can be enabled with the following command:
cluster::> nfs server modify -vserver NFS -showmount enabled
(The option accepts the values enabled and disabled.)
To use showmount in ONTAP, the parent volume (including vsroot, or /) needs to allow read or traverse access to the client/user attempting to run showmount, and vsroot (/) should use the UNIX security style.
After this functionality is enabled, clients can query data LIFs for export paths. However, the clientmatch
(access from clients, netgroups, and so on) information is not available. Instead, each path reflects
everyone as having access, even if clients are specified in export policy rule sets.
Sample output of showmount in ONTAP:
# showmount -e x.x.x.a
Export list for x.x.x.a:
/unix (everyone)
/unix/unix1 (everyone)
/unix/unix2 (everyone)
/ (everyone)
If using Windows NFS, showmount should be enabled to prevent issues with renaming files and
folders.
Showmount caching
When showmount is run from a client, it requests information from the NFS server on the cluster.
Because export lists can be large, the cluster maintains a cache of this information to reduce the number
of requests made to the NFS server.
When a volume is unmounted from the cluster namespace (see “The cluster namespace”) using the
volume unmount command or from ThinkSystem Storage Manager, the cache does not update, so the
exported path remains in cache until it expires or is flushed.
To flush the showmount cache:
cluster::> export-policy cache flush -vserver SVM -cache showmount
The cache only flushes on the node you are logged in to. For example, if you are logged in to node1’s
management LIF, then the cache on node1 flushes. This means that only clients connecting to data LIFs
local to node1 benefit from the cache flush. To flush the cache on other nodes, log in to the node management LIF on those nodes. The node being flushed is displayed when you run the command.
cluster::> export-policy cache flush -vserver SVM -cache showmount
Warning: You are about to flush the "showmount" cache for Vserver "SVM" on node "node1", which
will result in increased traffic to the name servers. Do you want to proceed with flushing the
cache? {y|n}: y
Namespace concepts
This section covers the concept of a “namespace” in NFS environments and how ONTAP offers multiple
solutions for providing a global namespace for NFS clients.
All the volumes belonging to the SVM are linked into the global namespace in that cluster using the “/”
export path. The cluster namespace is mounted at a single point in the cluster. The top directory of the
cluster namespace within a cluster (“/”) is a synthetic directory containing entries for the root directory of
each SVM namespace in the cluster. The volumes in a namespace can be Lenovo FlexVol® volumes or
Lenovo ONTAP FlexGroup volumes.
Figure 1) Cluster namespace.
For example, taking the vsroot volume offline generates the following warning:
Warning: Offlining root volume vsroot of Vserver NFS will make all volumes on that Vserver
inaccessible.
Do you want to continue? {y|n}: y
Volume "NFS:vsroot" is now offline.
If the vsroot volume is somehow unavailable, then NFS clients will have issues whenever the vsroot
volume is needed to traverse the file system.
This includes (but might not be limited to) the following behaviors:
Mount requests hang.
If / is mounted, traversal from / to another volume, running ls, and so on, hang.
Umount operations might fail because the mount is busy, even when the volume is back online.
If a volume is already mounted below “/” (such as /vol1), then reads/writes/listings still succeed.
Load-sharing mirrors in ONTAP are a way to leverage the ONTAP SnapMirror capability to increase vsroot resiliency.
Load-sharing mirrors are supported only with vsroot volumes. To share a load across data
volumes, consider using Lenovo FlexCache volumes instead.
When load-sharing mirrors are available for the vsroot volume, NFSv3 operations are able to leverage the
load-sharing mirror destination volumes to traverse the file system. When load-sharing mirrors are in use,
it is possible to access the source volume through the .admin folder within the NFS mount.
Lenovo highly recommends creating load-sharing mirror relationships for vsroot volumes in NFSv3
environments.
NFSv4.x clients are unable to use load-sharing mirror volumes to traverse file systems due to the
nature of the NFSv4.x protocol.
Figure 2 shows how load-sharing mirrors can provide access to “/” in the event the vsroot is unavailable.
Figure 2) Load-sharing mirror protection of vsroot volumes.
To create a load-sharing mirror for the vsroot volume, complete the following tasks:
Typically, the vsroot volume is 1GB in size. Verify the vsroot volume size prior to creating new
volumes and ensure the new volumes are all the same size.
Create a destination volume to mirror the vsroot on each node in the cluster. For example, in a four-
node cluster, create four new volumes with the -type DP.
Create a new SnapMirror relationship from the vsroot source to each new DP volume you created.
Specify a schedule for updates depending on the change rate of your namespace root. For example,
hourly if you create new volumes regularly; daily if you do not.
Initialize the load-sharing mirror set by using the snapmirror initialize-ls-set command (a sample command sequence follows this list).
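The following is a minimal command sketch for a two-node cluster; the SVM, volume, aggregate, and schedule names are assumptions and should be adapted to your environment:
cluster::> volume create -vserver NFS -volume vsroot_ls1 -aggregate aggr1_node1 -size 1GB -type DP
cluster::> volume create -vserver NFS -volume vsroot_ls2 -aggregate aggr1_node2 -size 1GB -type DP
cluster::> snapmirror create -source-path NFS:vsroot -destination-path NFS:vsroot_ls1 -type LS -schedule hourly
cluster::> snapmirror create -source-path NFS:vsroot -destination-path NFS:vsroot_ls2 -type LS -schedule hourly
cluster::> snapmirror initialize-ls-set -source-path NFS:vsroot
After initialization, snapmirror update-ls-set -source-path NFS:vsroot can be used to push changes outside the regular schedule.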
A pseudo-file system applies only in ONTAP if the permissions flow from more restrictive to less
restrictive. For example, if the vsroot (mounted to /) has more restrictive permissions than a data volume
(such as /volname) does, then pseudo-file system concepts apply.
Having a pseudo-file system allows storage administrators to create their own file system namespaces, if
they desire, by way of mounting volumes to other volumes using junction paths. This concept is illustrated
in Figure 1 (Cluster namespace).
One use case for -actual that is not inherently covered by the ONTAP NFS architecture is -actual for
qtrees or folders. For instance, if a storage administrator wants to export a qtree or folder to a path such
as /folder1, there is no way to do this natively using the NFS exports in the SVM. The path is instead
/volume/folder1.
In ONTAP, the path for a qtree that NFS clients mount is the same path as the qtree is mounted to in the
namespace. If this is not desirable, then the workaround is to leverage symlinks to mask the path to the
qtree.
What is a symlink?
Symlink is an abbreviation for symbolic link. A symbolic link is a special type of file that contains a
reference to another file or directory in the form of an absolute or relative path. Symbolic links operate
transparently to clients and act as actual paths to data.
# pwd
/directory/user
When mounting a folder using NFS, it is better to use a relative path with symlinks, because there is no
guarantee that every user mounts to the same mount point on every client. With relative paths, symlinks
can be created that work regardless of what the absolute path is.
For example, if a qtree exists in the cluster, the path can look like this:
cluster::> qtree show -vserver flexvol -volume unix2 -qtree nfstree
The parent volume is unix2 (/unix/unix2), which is mounted to volume unix (/unix), which is
mounted to vsroot (/).
cluster::> vol show -vserver flexvol -volume unix2 -fields junction-path
(volume show)
vserver volume junction-path
------- ------ -------------
flexvol unix2 /unix/unix2
Some storage administrators might not want to expose the entire path of /unix/unix2/nfstree,
because it can allow clients to attempt to navigate other portions of the path. To allow the masking of that
path to an NFS client, a symlink volume or folder can be created and mounted to a junction path. For
example:
cluster::> vol create -vserver flexvol -volume symlinks -aggregate aggr1 -size 20m -state online
-security-style unix -junction-path /NFS_links
The volume size can be small (minimum of 20MB), but that depends on the number of symlinks in the
volume. Each symlink is 4k in size, so you may need to create larger volume sizes to accommodate the
number of symlinks. Alternatively, create a folder under vsroot for the symlinks.
After the volume or folder is created, mount the vsroot to an NFS client to create the symlink.
# mount -o nfsvers=3 x.x.x.e:/ /symlink
# mount | grep symlink
x.x.x.e:/ on /symlink type nfs (rw,nfsvers=3,addr=x.x.x.e)
If using a directory under vsroot, mount vsroot and create the directory.
# mount -o nfsvers=3 x.x.x.e:/ /symlink
# mount | grep symlink
x.x.x.e:/ on /symlink type nfs (rw,nfsvers=3,addr=x.x.x.e)
# mkdir /symlink/symlinks
# ls -la /symlink | grep symlinks
drwxr-xr-x. 2 root root 4096 Apr 30 10:45 symlinks
To create a symlink to the qtree, use the -s option (s = symbolic). The link path needs to include a
relative path that directs the symlink to the correct location without needing to specify the exact path. If
the link is inside a folder that does not navigate to the desired path, then ../ needs to be added to the path.
For example, if a folder named NFS_links is created under / and the volume unix is also mounted under
/, then navigating to /NFS_links and creating a symlink causes the relative path to require a redirect to
the parent folder.
Example of a symlink created in a symlink volume mounted to /NFS_links:
# mount -o nfsvers=3 x.x.x.e:/ /symlink/
# mount | grep symlink
x.x.x.e:/ on /symlink type nfs (rw,nfsvers=3,addr=x.x.x.e)
# cd /symlink/NFS_links
# pwd
/symlink/NFS_links
# ln -s ../unix/unix2/nfstree LINK
# ls -la /symlink/unix/unix2/nfstree/
total 8
drwxr-xr-x. 2 root root 4096 May 15 14:34 .
drwxr-xr-x. 3 root root 4096 Apr 29 16:47 ..
-rw-r--r--. 1 root root 0 May 15 14:34 you_are_here
# cd LINK
# ls -la
total 8
drwxr-xr-x. 2 root root 4096 May 15 14:34 .
drwxr-xr-x. 3 root root 4096 Apr 29 16:47 ..
-rw-r--r--. 1 root root 0 May 15 14:34 you_are_here
# pwd
/symlink/NFS_links/LINK
Note: Despite the fact that the symlink points to the actual path of /unix/unix2/nfstree, pwd
returns the path of the symlink, which is /symlink/NFS_links/LINK. The file you_are_here
has the same date and timestamp across both paths.
Because the path includes ../, this symlink cannot be directly mounted.
Example of symlink created in vsroot:
# mount -o nfsvers=3 x.x.x.e:/ /symlink/
# mount | grep symlink
x.x.x.e:/ on /symlink type nfs (rw,nfsvers=3,addr=x.x.x.e)
# cd /symlink/
# pwd
/symlink
# ln -s unix/unix2/nfstree LINK1
# ls -la /symlink/unix/unix2/nfstree/
total 8
drwxr-xr-x. 2 root root 4096 May 15 14:34 .
drwxr-xr-x. 3 root root 4096 Apr 29 16:47 ..
-rw-r--r--. 1 root root 0 May 15 14:34 you_are_here
# cd LINK1
# ls -la
total 8
drwxr-xr-x. 2 root root 4096 May 15 14:34 .
drwxr-xr-x. 3 root root 4096 Apr 29 16:47 ..
-rw-r--r--. 1 root root 0 May 15 14:34 you_are_here
# pwd
/symlink/LINK1
Again, despite the fact that the actual path is /unix/unix2/nfstree, we see an ambiguated path of
/symlink/LINK1. The file you_are_here has the same date and timestamp across both paths.
Additionally, the symlink created can be mounted instead of the vsroot path, adding an extra level of
ambiguity to the export path:
# mount -o nfsvers=3 x.x.x.e:/LINK1 /mnt
# mount | grep mnt
x.x.x.e:/LINK1 on /mnt type nfs (rw,nfsvers=3,addr=x.x.x.e
# cd /mnt
# pwd
/mnt
One use case for this setup is with automounters. Every client can mount the same path and never
actually know where in the directory structure they are. If clients mount the SVM root volume
(/), be sure to lock down the volume to nonadministrative clients.
For more information about locking down volumes to prevent listing of files and folders, see the section in
this document called “Limiting access to the SVM root volume.”
Figure 3 shows a sample of how a namespace can be created to leverage symlinks to create ambiguation
of paths for NAS operations.
Figure 3) Symlink example using vsroot.
Export policies and rules can be applied to volumes and qtrees, but not folders or symlinks. This
fact should be taken into consideration when creating symlinks for use as mount points. Symlinks
instead inherit the export policy rules of the parent volume in which the symlink resides.
A Lenovo FlexGroup volume provides NFS clients with a true global namespace — a single large bucket
of storage that spans multiple nodes in a cluster and provides parallel metadata operations for NAS
workloads.
Figure 4) Lenovo FlexGroup volumes.
The cache system is faster than the system with the data source. This can be achieved through faster
storage in the cache (for example, solid-state drives (SSD) versus HDD), increased processing power
in the cache, and increased (or faster) memory in the cache.
The storage space for the cache is physically closer to the host, so it does not take as long to reach
the data.
Caches are implemented with different architectures, policies, and semantics so that the integrity of the
data is protected as it is stored in the cache and served to the host.
FlexCache offers the following benefits:
Improved performance by providing load distribution
Reduced latency by locating data closer to the point of client access
Enhanced availability by serving cached data in a network disconnection situation
FlexCache provides all of the above advantages while maintaining cache coherency, data consistency,
and efficient use of storage in a scalable and high-performing manner.
Figure 5) Lenovo FlexCache volumes.
A FlexCache is a sparse copy; not all files from the origin dataset can be cached, and, even then, not all
data blocks of a cached inode can be present in the cache. Storage is used efficiently by prioritizing
retention of the working dataset (recently used data).
With FlexCache, the management of disaster recovery and other corporate data strategies only needs to
be implemented at the origin. Because data management is only on the source, FlexCache enables
better and more efficient use of resources and simpler data management and disaster recovery strategies.
NFSv3 locking
NFSv3 uses ancillary protocols like Network Lock Manager (NLM) and Network Status Monitor (NSM) to
coordinate file locks between the NFS client and server. NLM helps establish and release locks, while
NSM notifies peers of server reboots. With NFSv3 locking, when a client reboots, the server has to
release the locks. When a server reboots, the client reminds the server of the locks it held. In some cases,
the lock mechanisms do not communicate properly and stale locks are left over on the server and must be
manually cleared.
NFSv4.x locking
NFSv4.x uses a lease-based locking model that is integrated within the NFS protocol. This means there
are no ancillary services to maintain or worry about; all the locking is encapsulated in the NFSv4.x
communication.
When a server or client reboots, if the lock cannot be re-established during a specified grace period, then
the lock will expire. ONTAP NFS servers control this lock timeout period with the options -v4-grace-seconds and -v4-lease-seconds.
-v4-lease-seconds refers to how long a lease is granted before the client has to renew the lease. The default is 30 seconds, with a minimum of 10 seconds and a maximum of 1 second less than the value of -v4-grace-seconds.
-v4-grace-seconds refers to how long a client can attempt to reclaim a lock from ONTAP during a reboot of a node (such as during failovers/givebacks). The default is 45 seconds, and the value can be modified within a range from 1 second more than the -v4-lease-seconds value up to a maximum of 90 seconds.
In rare cases, locks may not be freed as quickly as stated by the lease seconds value, which results in
the locks being freed over the course of two lease periods. For example, if grace seconds is set to 45
seconds, it may take 90 seconds to free the lock. For information about the impact of these values to
storage failovers, see “Impact of NFSv4.x locks on failover.”
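These values can be viewed and modified on the NFS server. The following is a minimal sketch in the same command style used elsewhere in this document; the SVM name is an assumption, and some of these options might require advanced privilege:
cluster::> nfs server show -vserver NFS -fields v4-lease-seconds,v4-grace-seconds
cluster::> nfs server modify -vserver NFS -v4-lease-seconds 30 -v4-grace-seconds 45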
Lock types
There are several types of NFS locks, which include:
Shared locks. Shared locks can be used by multiple processes at the same time and can only be
issued if there are no exclusive locks on a file. These are intended for read-only work but can be used
for writes (such as with a database).
Exclusive locks. These operate the same as exclusive locks in CIFS/SMB – only one process can
use the file when there is an exclusive lock. If any other processes have locked the file, an exclusive
lock cannot be issued, unless that process was forked.
Delegations. Delegations are used only with NFSv4.x and are assigned when the NFS server
options are enabled, and the client supports NFSv4.x delegations. Delegations provide a way to
cache operations on the client side by creating a “soft” lock to the file being used by a client. This
helps improve some aspects of performance for operations by reducing the number of calls being
made between the client and server and are similar to SMB opportunistic locks. For more information
about delegations, see “NFSv4.1 delegations.”
Byte-range locks. Rather than locking an entire file, byte-range locks only lock a portion of a file.
Note: Locking behavior is dependent on the type of lock, the client operating system version and
the NFS version being used. Be sure to test locking in your environment to gauge the expected
behavior.
Usage:
flock [options] <file|directory> <command> [command args]
flock [options] <file|directory> -c <command>
flock [options] <file descriptor number>
Options:
-s --shared get a shared lock
-x --exclusive get an exclusive lock (default)
-u --unlock remove a lock
-n --nonblock fail rather than wait
-w --timeout <secs> wait for a limited amount of time
-E --conflict-exit-code <number> exit code after conflict or timeout
-o --close close file descriptor before running command
-c --command <command> run a single command string through the shell
# flock -n 4
The resulting lock is visible from the cluster:
Vserver: DEMO
Volume   Object Path          LIF     Protocol  Lock Type   Client
-------- -------------------- ------- --------- ----------- ----------
home     /home/v4user_file    data2   nlm       byte-range  10.x.x.x
         Bytelock Offset(Length): 0 (18446744073709551615)
Unlock the file.
# flock -u -n 4
Note: Manually locking files allows you to test file open and edit interactions, as well as to see how file locks handle storage failover events.
Export concepts
Volumes in ONTAP are shared out to NFS clients by exporting a path that is accessible to a client or set
of clients. When a volume is mounted to the SVM’s namespace, a file handle is created and presented to
NFS clients when requested in a mount command. Permissions to these exports are defined by export
policies and rules, which are configurable by storage administrators.
Qtree exports
In ONTAP, it is possible to set export policies and rules for volumes, as well as underlying qtrees. This
offers a way to restrict/allow client access to storage-managed directories in ONTAP, which can help
storage administrators more easily manage workloads such as home directories.
By default, qtrees will inherit the export policy of the parent volume. You can explicitly choose or create
an export policy and rule when creating qtrees in ThinkSystem Storage Manager, or by using the -export-policy CLI option.
Figure 6) Qtree export specification — ThinkSystem Storage Manager.
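The equivalent CLI operation might look like the following sketch; the qtree, volume, and policy names are assumptions:
cluster::> qtree create -vserver NFS -volume home -qtree user1 -security-style unix -export-policy homedirs
cluster::> qtree show -vserver NFS -volume home -qtree user1 -fields export-policy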
Qtree ID considerations
After a noninherited export policy is applied to a qtree, NFS file handles change slightly when dealing with
operations between qtrees. For more information, see the section “Qtree IDs and rename behavior.”
Export policy rules: Options
The Appendix of this document lists the various options used for export policy rules and what they are
used for, as well as examples. Most export policy rule options can be viewed using the export-policy
rule show command or using ThinkSystem Storage Manager.
It is important to consider the order of the export policy rules when determining the access that is and is
not allowed for clients in ONTAP. If you use multiple export policy rules, be sure that rules that deny or
allow access to a broad range of clients do not step on rules that deny or allow access to those same
clients. Rule index ordering factors in when rules are read; rules are evaluated in index order, and lower-numbered rules take precedence over higher-numbered rules.
When you use more granular rules (for example, a rule for a specific client such as an administrative host), place them earlier in the rule index (lower index numbers). Broader access rules should be placed later. For example, an administrative host rule would be at rule index 1 and a rule for 0.0.0.0/0 would be at index 99.
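The following sketch shows a granular rule placed ahead of a broad rule, and an existing rule being renumbered with export-policy rule setindex; the addresses, policy name, and rule indexes are assumptions:
cluster::> export-policy rule create -vserver NFS -policyname default -clientmatch 10.10.10.10 -rorule sys -rwrule sys -superuser sys -ruleindex 1
cluster::> export-policy rule create -vserver NFS -policyname default -clientmatch 0.0.0.0/0 -rorule sys -rwrule sys -superuser none -ruleindex 99
cluster::> export-policy rule setindex -vserver NFS -policyname default -ruleindex 5 -newruleindex 1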
Clientmatch caching
When a clientmatch entry is cached, it is kept local to the SVM and then flushes after the cache timeout
period is reached or if the export policy rule table is modified. The default cache timeout period is
dependent on the version of ONTAP and can be verified using the export-policy access-cache config show command in admin privilege.
These are the default values:
TTL For Positive Entries (Secs): 3600
TTL For Negative Entries (Secs): 3600
Harvest Timeout (Secs): 86400
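If these defaults need to be adjusted, the access-cache configuration can be modified. The following is a sketch that reuses the default values shown above; verify the exact parameters in your ONTAP version:
cluster::*> export-policy access-cache config modify -vserver NFS -ttl-positive 3600 -ttl-negative 3600 -harvest-timeout 86400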
To view a specific client in the export policy access-cache, use the following advanced privilege
command:
cluster::*> export-policy access-cache show -node node-02 -vserver NFS -policy default -address
x.x.x.x
Node: node-02
Vserver: NFS
Policy Name: default
IP Address: x.x.x.x
Access Cache Entry Flags: has-usable-data
Result Code: 0
First Unresolved Rule Index: -
Unresolved Clientmatch: -
Number of Matched Policy Rules: 1
List of Matched Policy Rule Indexes: 2
Age of Entry: 11589s
Access Cache Entry Polarity: positive
Time Elapsed since Last Use for Access Check: 11298s
Time Elapsed since Last Update Attempt: 11589s
Result of Last Update Attempt: 0
List of Client Match Strings: 0.0.0.0/0
ONTAP also maintains a hosts cache for name lookups. To view the hosts cache settings, run the name-service cache hosts settings show command:
Vserver: NFS
Is Cache Enabled?: true
Is Negative Cache Enabled?: true
Time to Live: 24h
Negative Time to Live: 1m
Is TTL Taken from DNS: true
In some cases, if an NFS client’s IP address changes, the hosts entry might need to be flushed to correct
access issues.
To flush a hosts cache entry, run the following command:
cluster::*> name-service cache hosts forward-lookup delete -vserver NFS ?
-host -protocol -sock-type -flags -family
Netgroup caching
If using netgroups in the clientmatch field for export rules, then ONTAP does additional work to contact the netgroup name service server to unpack the netgroup information. The netgroup database in ns-switch determines the order in which ONTAP queries for netgroups. In addition, the method ONTAP uses for netgroup support depends on whether netgroup.byhost support is enabled or disabled.
If netgroup.byhost is disabled, then ONTAP queries the entire netgroup and populates the cache with
all netgroup entries. If the netgroup has thousands of clients, then that process could take some time
to complete. Netgroup.byhost is disabled by default.
If netgroup.byhost is enabled, then ONTAP queries the name service only for the host entry and the
associated netgroup mapping. This greatly reduces the amount of time needed to query for netgroups,
as we don’t need to look up potentially thousands of clients.
These entries are added to the netgroup cache, which is found in vserver services name-service
cache commands. These cache entries can be viewed or flushed, and the timeout values can be
configured.
To view the netgroups cache settings, run the following commands:
cluster::*> name-service cache netgroups settings show -vserver NFS -instance
(vserver services name-service cache netgroups settings show)
Vserver: NFS
Is Cache Enabled?: true
Is Negative Cache Enabled?: true
Time to Live: 24h
Negative Time to Live: 1m
TTL for netgroup members: 30m
A cached netgroup entry looks like the following:
Vserver: DEMO
Netgroup: netgroup1
Hosts: sles15-1,x.x.x.x
Create Time: 3/26/2020 12:40:56
Source of the Entry: ldap
When only a single netgroup entry is cached, the IP-to-netgroup and hosts reverse-lookup caches are
populated with the entry.
cluster::*> name-service cache netgroups ip-to-netgroup show -vserver DEMO -host x.x.x.y
(vserver services name-service cache netgroups ip-to-netgroup show)
Vserver IP Address Netgroup Source Create Time
---------- ----------- ------------ ------- -----------
DEMO x.x.x.y
netgroup1 ldap 3/26/2020 17:13:09
cluster::*> name-service cache hosts reverse-lookup show -vserver DEMO -ip x.x.x.y
(vserver services name-service cache hosts reverse-lookup show)
Vserver IP Address Host Source Create Time TTL(sec)
---------- --------------- -------------------- ------ --------------- --------
DEMO x.x.x.y centos8-ipa.centos-ldap.local
dns 3/26/2020 17:13:09
3600
Exportfs support
In ONTAP, exportfs is replaced by the export-policy and name-service cache commands.
When running exportfs, the following would be seen:
"exportfs" is not supported: use the "vserver export-policy" command.
AUTH types
When an NFS client authenticates, an AUTH type is sent. An AUTH type specifies how the client is
attempting to authenticate to the server and depends on client-side configuration. Supported AUTH types
include:
AUTH_NONE/AUTH_NULL. This AUTH type specifies that the request coming in has no identity
(NONE or NULL) and is mapped to the anon user. For more information, see
http://www.ietf.org/rfc/rfc1050.txt and http://www.ietf.org/rfc/rfc2623.txt.
AUTH_SYS/AUTH_UNIX. This AUTH type specifies that the user is authenticated at the client (or
system) and comes in as an identified user. For more information, see
http://www.ietf.org/rfc/rfc1050.txt and http://www.ietf.org/rfc/rfc2623.txt.
AUTH_RPCGSS. This is Kerberized NFS authentication.
There are several ways to configure root access to an NFS share. For examples, see “Examples of
controlling the root user.”
The following shows an example of the default export policy rule:
Vserver: nfs_svm
Policy Name: default
Rule Index: 1
Access Protocol: any
Client Match Hostname, IP Address, Netgroup, or Domain: 0.0.0.0/0
RO Access Rule: any
RW Access Rule: any
User ID To Which Anonymous Users Are Mapped: 65534
Superuser Security Types: none
Honor SetUID Bits in SETATTR: true
Allow Creation of Devices: true
In the preceding export policy rule, all clients have any RO and RW access. Root is squashed to anon,
which is set to 65534.
For example, if an SVM has three data volumes, all would be mounted under “/” and could be listed with a
basic ls command by any user accessing the mount.
# mount | grep /mnt
x.x.x.e:/ on /mnt type nfs (rw,nfsvers=3,addr=x.x.x.e)
# cd /mnt
# ls
nfs4 ntfs unix
In some environments, this behavior might be undesirable, because storage administrators might want to
limit visibility to data volumes to specific groups of users. Although read and write access to the volumes
themselves can be limited on a per-data-volume basis using permissions and export policy rules, users
can still see other paths using the default policy rules and volume permissions.
To limit users' ability to list SVM root volume contents (and subsequent data volume paths) while still allowing traversal of the junction paths for data access, the SVM root volume can be modified to allow only root users to list folders in SVM root. To do this, change the UNIX permissions on the SVM root volume to 0711 using the volume modify command:
cluster::> volume modify -vserver nfs_svm -volume rootvol -unix-permissions 0711
After this is done, root still has Full Control using the 7 permission, because it is the owner. Group and Others get Execute permissions only, as per the 1 mode bit, which allows them only to traverse the paths using cd.
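Broken down, the 0711 mode maps out as follows:
0 — no setuid/setgid/sticky bits
7 — owner (root): read, write, execute
1 — group: execute (traverse) only
1 — others: execute (traverse) only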
When a user who is not the root user attempts ls, that user has access denied.
sh-4.1$ ls
ls: cannot open directory .: Permission denied
In many cases, NFS clients log into their workstations as the root user. With the default export policy rule
created by Storage Manager and Vserver setup, root access is limited.
# id
uid=0(root) gid=0(root) groups=0(root) context=unconfined_u:unconfined_r:unconfined_t:s0-
s0:c0.c1023
# ls -la
ls: cannot open directory .: Permission denied
This is because the export policy rule attribute superuser is set to None. If root access is desired by
certain clients, this can be controlled by adding export policy rules to the policy and specifying the host IP,
name, netgroup, or subnet in the clientmatch field. When creating this rule, list it ahead of any rule that
might override it, such as a clientmatch of 0.0.0.0/0 or 0/0, which is all hosts.
The following is an example of adding an administrative host rule to a policy:
cluster::> export-policy rule create -vserver nfs_svm -policyname default -clientmatch x.x.x.x -
rorule any -rwrule any -superuser any -ruleindex 1
Vserver: nfs_svm
Policy Name: default
Rule Index: 1
Access Protocol: any
Client Match Hostname, IP Address, Netgroup, or Domain: x.x.x.x
RO Access Rule: any
RW Access Rule: any
User ID To Which Anonymous Users Are Mapped: 65534
Superuser Security Types: any
Honor SetUID Bits in SETATTR: true
Allow Creation of Devices: true
Now the client is able to see the directories as the root user.
# ifconfig | grep "inet addr"
inet addr:x.x.x.x Bcast:x.x.225.255 Mask:255.255.255.0
inet addr:127.0.0.1 Mask:255.0.0.0
# id
uid=0(root) gid=0(root) groups=0(root) context=unconfined_u:unconfined_r:unconfined_t:s0-
s0:c0.c1023
# ls
nfs4 ntfs unix
# ifconfig | grep "inet addr"
inet addr:x.x.x.y Bcast:x.x.225.255 Mask:255.255.255.0
# id
uid=0(root) gid=0(root) groups=0(root) context=unconfined_u:unconfined_r:unconfined_t:s0-
s0:c0.c1023
# mount | grep mnt
x.x.x.e:/ on /mnt type nfs (rw,nfsvers=3,addr=x.x.x.e)
# ls /mnt
ls: cannot open directory .: Permission denied
For more information about export policy rules and their effect on the root user, review the “Root User”
section of this document.
For more information about mode bits, see the following link: http://www.zzee.com/solutions/unix-
permissions.shtml.
# mkdir /flexgroup4TB/🏆
mkdir: cannot create directory ‘/flexgroup4TB/\360\237\217\206’: Permission denied
In the preceding example, \360\237\217\206 is hex 0xF0 0x9F 0x8F 0x86 in UTF-8, which is a
trophy symbol.
ONTAP software did not natively support UTF-8 characters larger than three bytes in NFS. To handle character sizes that exceeded three bytes, ONTAP placed the extra bytes into an area in the operating system known as bagofbits. These bits were stored until the client requested them; the client then interpreted the character from the raw bits.
Also, ONTAP has an event management system message for issues with bagofbits handling.
Message Name: wafl.bagofbits.name
Severity: ERROR
Corrective Action: Use the "volume file show-inode" command with the file ID and volume name
information to find the file path. Access the parent directory from an NFSv3 client and rename
the entry using Unicode characters.
Description: This message occurs when a read directory request from an NFSv4 client is made to a
Unicode-based directory in which directory entries with no NFS alternate name contain non-Unicode
characters.
NFS version considerations
NFSv3 considerations
The following section covers functionality, known issues, and considerations with NFSv3 in ONTAP.
Because ONTAP can join multiple file systems across multiple nodes by way of junction paths, this FSID can change depending on where data lives. Some older Linux clients can have problems differentiating these FSID changes, resulting in failures during basic attribute operations, such as chown and chmod.
NFSv3 permissions
NFSv3 offers basic file and folder permissions to control access to end users. These permissions follow
the NFS mode bits definitions in RFC 1813 starting on page 22.
Table 6 compares the NFSv3 mode bits against the NFSv4.x ACL granularity.
Finally, NFS group membership (in both NFSv3 and NFSV4.x) is limited to a maximum of 16 as per the
RPC packet limits. Options exist in ONTAP to help extend beyond those limits in “Auxiliary GIDs —
addressing the 16 GID limitation for NFS.”
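When a name service such as LDAP or NIS is configured, ONTAP can be set to fetch the full group membership list from the name service instead of trusting only the 16 GIDs in the RPC header. A minimal sketch follows; the SVM name and group limit are assumptions, and the options are set at advanced privilege:
cluster::*> nfs server modify -vserver NFS -auth-sys-extended-groups enabled -extended-groups-limit 512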
Advanced NFSv3 security concepts
The following section covers two advanced security concepts for NFSv3. The term Advanced refers to the
concepts being either nonstandard or containing multiple steps for configuration.
Considerations:
On upgrade, ONTAP adds the portmap service to all existing firewall policies, default or custom.
When you create a new cluster or new IPspace, ONTAP adds the portmap service only to the default
data policy, not to the default management or intercluster policies.
You can add the portmap service to default or custom policies as needed, and remove the service as needed (a sample command sketch follows this list).
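In releases that use firewall policies, adding portmap to a custom data policy and applying it to a data LIF might look like the following sketch; the policy name, SVM, LIF, and allow-list are assumptions:
cluster::> system services firewall policy create -vserver NFS -policy data-nfs -service portmap -allow-list 10.10.10.0/24
cluster::> network interface modify -vserver NFS -lif data1 -firewall-policy data-nfs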
The folder now has an “s” present in the execute portion of the group mode bits.
# ls -la | grep setguid
drwxrwsr-x 2 prof1 ProfGroup 4096 Oct 6 16:10 setguid
When a new file is created, it inherits the group from the parent directory. In this case, root is the user.
# cd setguid/
# id
uid=0(root) gid=0(root) groups=0(root)
# touch newfile
# ls -la
total 8
drwxrwsr-x 2 prof1 ProfGroup 4096 Oct 6 2020 .
drwxrwxrwx 20 root root 4096 Oct 6 16:10 ..
-rw-r--r-- 1 root ProfGroup 0 Oct 6 2020 newfile
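For reference, the setgid bit shown in this example is set on the directory with chmod, run from the parent directory; the directory name is taken from the example above:
# chmod g+s setguid
Equivalently, a numeric mode such as 2775 sets the setgid bit along with the rwx bits.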
NFSv4.x considerations
The following section covers functionality, known issues, and considerations with NFSv4.x in ONTAP.
Enabling NFSv4.x
To start using NFSv4.x with ONTAP in your environment, there is a list of steps to perform/review:
Set the NFS ID domain string in /etc/idmapd.conf to the same value as the -v4-id-domain option on the NFS SVM.
Ensure that the users and groups accessing the NFSv4.x client also exist (or can be queried from)
the ONTAP SVM. These users and groups need to have the same names and case sensitivity.
For example, [email protected] on the NFS client needs to match [email protected] on
the ONTAP NFS server.
If using LDAP or NIS for UNIX identities, ensure the user and group lookups return the expected IDs
and group members.
Export policy rules for volumes should be set to allow NFSv4 as a protocol.
Data LIFs in the environment should have NFS as an allowed protocol (net int show -fields allowed-protocols).
The desired NFSv4.x versions should be enabled. If only version 4.1 is desired, enable only that
version and disable version 4.0.
If NFSv4.x ACLs are desired, you must enable them for the specific NFSv4.x version you wish to use
them with. (-v4.0-acl, -v4.1-acl)
Clients will negotiate the highest NFS version the NFS server supports. If some clients require NFSv3,
they will need to change how they mount.
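A command sketch that ties these steps together follows; the SVM name, ID domain, and version choices are assumptions:
cluster::> nfs server modify -vserver NFS -v4.1 enabled -v4.0 disabled -v4-id-domain lenovo.local -v4.1-acl enabled
On the client, set the matching domain in /etc/idmapd.conf:
Domain = lenovo.local
Then mount with the desired version:
# mount -t nfs -o vers=4.1 x.x.x.e:/unix /mnt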
Advantages of using NFSv4.x
The following are some advantages to using NFSv4.x in your environment:
Firewall-friendly because NFSv4 uses only a single port (2049) for its operations
Advanced and aggressive cache management, such as delegations in NFSv4.x
Strong RPC security choices that employ cryptography
Internationalization
Compound operations
Works only with TCP
Stateful protocol (not stateless like NFSv3)
Kerberos configuration for efficient authentication mechanisms:
Support for DES and 3DES for encryption in clustered ONTAP 8.2.x and earlier
AES support in 8.3 and later
Migration (for dNFS) using referrals
Support of access control that is compatible with UNIX and Windows
String-based user and group identifiers
Parallel access to data through pNFS (does not apply for NFSv4.0)
It is important that you treat each specific use case differently. NFSv4.x is not ideal for all workload types.
Be sure to test for desired functionality and performance before rolling out NFSv4.x en masse.
ONTAP currently does not support NFSv4.x session trunking.
NFSv4.x fastpath
NFS fastpath was introduced to potentially improve NFSv4 performance for reads and writes. This
improvement was made by bypassing the internal processing of NFSv4 packets into ONTAP-centric
packets when the data request is made on a LIF that is local to the node hosting the volume. When
combined with other features such as pNFS or referrals, localized data can be guaranteed for each read
and write request, thus allowing consistent use of the NFSv4 fastpath. NFSv3 has always had an NFS
fastpath concept. NFS fastpath is enabled by default.
ONTAP introduced an improvement to how streaming workload types such as VMware, Oracle, and SAP
HANA performed with NFSv4.1 by adding large I/O support. This allowed NFS (both v3 and 4.x) to use up
to 1MB for both reads and writes.
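To take advantage of 1MB transfers, the server-side maximum transfer size must allow it and the client must mount with matching rsize/wsize values. A sketch follows; the option is set at advanced privilege, and the SVM, export path, and mount point are assumptions:
cluster::*> nfs server modify -vserver NFS -tcp-max-xfer-size 1048576
# mount -t nfs -o vers=4.1,rsize=1048576,wsize=1048576 x.x.x.e:/unix /mnt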
NFSv4.0
The Lenovo ONTAP NFSv4.x implementation provides the following capabilities:
Write order. The implementation provides the capability to write data blocks to shared storage in the
same order as they occur in the data buffer.
Synchronous write persistence. Upon return from a synchronous write call, ONTAP (clustered)
guarantees that all the data has been written to durable, persistent storage.
Distributed file locking. The implementation provides the capability to request and obtain an
exclusive lock on the shared storage, without assigning the locks to two servers simultaneously.
Unique write ownership. ONTAP (clustered) guarantees that the file lock is the only server process
that can write to the file. After ONTAP transfers the lock to another server, pending writes queued by
the previous owner fail.
v2/v3 UID or GID having the corresponding numeric value. If the client does have a name string that matches, then the client uses the name string rather than the numeric ID. If the client and server have matching user names but mismatched domain strings, then numeric IDs are not used; instead, the user name/group name reverts to nobody. This is a common scenario with the root user, because that user always exists on both client and server, while the ID domain string for NFSv4 in ONTAP defaults to defaultv4iddomain.com. Because NFS clients default to no domain string setting in the idmapd.conf file (and instead fall back to the DNS domain for the NFSv4 domain), mismatches often occur in that scenario.
Essentially, this option makes NFSv4.x behave more like NFSv3. The default value of this option is
Enabled. For considerations regarding the use of extended group support, see “Considerations for
numeric ID authentication (NFSv3 and NFSv4.x).”
For more information about NFSv4.x locking, see the section in this document on “NFSv4 locking.”
Because of this new locking methodology, as well as the statefulness of the NFSv4.x protocol, storage
failover operates differently as compared to NFSv3. For more information, see the section in this
document called “Nondisruptive operations with NFS.”
Name services
When deciding to use NFSv4.x, it is a Lenovo best practice to centralize the NFSv4.x users in name
services such as LDAP or NIS. Doing so allows all clients and ONTAP NFS servers to leverage the same
resources and guarantees that all names, UIDs, and GIDs are consistent across the implementation.
Firewall considerations
NFSv3 required several ports to be opened for ancillary protocols such as NLM, NSM, and so on in
addition to port 2049. NFSv4.x requires only port 2049. If you want to use NFSv3 and NFSv4.x in the
same environment, open all relevant NFS ports. These ports are referenced in “Default NFS ports in
ONTAP.” For more information and guidance on firewalls, see “NFS security best practices.”
Japanese, Chinese, German, and so on. When using NFSv4.x, RFC 3530 states that UTF-8 is
recommended.
11. Internationalization
When changing a volume's language, every file in the volume must be accessed after the change to make sure that they all reflect the language change. Use a simple ls -lR to recursively list (and therefore access) the files. In high-file-count environments, consider using XCP to scan the files quickly.
For more information, consult the product documentation for your specific version of ONTAP.
Client considerations
When you use NFSv4.x, clients are as important to consider as the NFS server. For specific questions
about the NFSv4.x configuration, contact the operating system vendor.
Follow these client considerations when implementing NFSv4.x (other considerations might be necessary):
NFSv4.x is supported.
The fstab file and NFS configuration files are configured properly. When mounting, the client
negotiates the highest NFS version available with the NFS server. If NFSv4.x is not allowed by the
client or fstab specifies NFSv3, then NFSv4.x is not used at mount.
The idmapd.conf file is configured with the proper settings, including the correct NFSv4.x ID domain.
The client either contains identical users/groups and UID/GID (including case sensitivity) in local
passwd and group files or uses the same name service server as the NFS server/ONTAP SVM.
If using name services on the client, ensure the client is configured properly for name services
(nsswitch.conf, ldap.conf, sssd.conf, and so on) and the appropriate services are started, running,
and configured to start at boot.
The NFSv4.x service is started, running, and configured to start at boot.
Ensure that the NFSv4.x ID domains match on the SVM (-v4-id-domain) and in the client's idmapd.conf file (or that the domain matches the DNS domain name of the client).
Set /sys/module/nfsd/parameters/nfs4_disable_idmapping (or its equivalent) to Y on the NFS client.
When these steps are followed, the clients bypass ID mapping and instead fall back to numeric IDs when
a username is not present on both the server and client.
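For example, on a typical Linux client and SVM, the relevant settings might look like the following (the module path can vary by distribution, and the SVM and domain names are illustrative):
# echo Y > /sys/module/nfsd/parameters/nfs4_disable_idmapping
# grep -i '^Domain' /etc/idmapd.conf
Domain = ntap.local
cluster::> vserver nfs modify -vserver DEMO -v4-id-domain ntap.local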
NFSv4.x ACLs
The NFSv4.x protocol can provide access control in the form of ACLs, which are similar in concept to
those found in CIFS. An NFSv4 ACL consists of individual Access Control Entries (ACEs), each of which
provides an access control directive to the server. ONTAP defaults to 400 ACEs and supports a
maximum of 1,024 ACEs with a configurable NFS option (-v4-acl-max-aces).
Access is denied if a DENY ACE is present in the ACL; access is granted if an ALLOW ACE exists.
However, access is also denied if neither of the ACEs is present in the ACL.
A security descriptor consists of a Security ACL (SACL) and a Discretionary ACL (DACL). When NFSv4
interoperates with CIFS, the DACL is one-to-one mapped with NFSv4 and CIFS. The DACL consists of
the ALLOW and the DENY ACEs.
If a basic chmod is run on a file or folder with NFSv4.x ACLs set, the ACLs will be removed unless the
NFS option v4-acl-preserve is enabled.
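A hedged example of enabling ACL preservation and raising the ACE limit mentioned above (the SVM name is illustrative; verify the exact option names for your ONTAP version):
cluster::> vserver nfs modify -vserver DEMO -v4-acl-preserve enabled -v4-acl-max-aces 1024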
A client using NFSv4 ACLs can set and view ACLs for files and directories on the system. When a new
file or subdirectory is created in a directory that has an ACL, the new file or subdirectory inherits all ACEs
in the ACL that have been tagged with the appropriate inheritance flags. For access checking, CIFS users
are mapped to UNIX users. The mapped UNIX user and that user’s group membership are checked
against the ACL.
If a file or directory has an ACL, that ACL is used to control access no matter which protocol—NFSv3,
NFSv4, or CIFS—is used to access the file or directory. The ACL is also used even if NFSv4 is no longer
enabled on the system.
Files and directories inherit ACEs from NFSv4 ACLs on parent directories (possibly with appropriate
modifications) as long as the ACEs have been tagged with the correct inheritance flags.
When a file or directory is created as the result of an NFSv4 request, the ACL on the resulting file or
directory depends on whether the file creation request includes an ACL or only standard UNIX file access
permissions. The ACL also depends on whether the parent directory has an ACL.
If the request includes an ACL, that ACL is used.
If the request includes only standard UNIX file access permissions and the parent directory does not
have an ACL, the client file mode is used to set standard UNIX file access permissions.
If the request includes only standard UNIX file access permissions and the parent directory has a
noninheritable ACL, a default ACL based on the mode bits passed into the request is set on the new
object.
If the request includes only standard UNIX file access permissions but the parent directory has an
ACL, the ACEs in the parent directory's ACL are inherited by the new file or directory as long as the
ACEs have been tagged with the appropriate inheritance flags.
Note: A parent ACL is inherited even if -v4.0-acl is set to off.
ACL formatting
NFSv4.x ACLs have specific formatting. The following example is an ACE set on a file:
A::[email protected]:rwatTnNcCy
The ACE follows the format type:flags:principal:permissions.
A type of A means allow. The flags are not set in this case, because the principal is not a group and does
not include inheritance. Also, because the ACE is not an AUDIT entry, there is no need to set the audit
flags. For more information about NFSv4.x ACLs, see http://linux.die.net/man/5/nfs4_acl.
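As a sketch, an ACE in that format could be set and verified from a Linux client with the nfs4-acl-tools package (the principal, domain, and path are illustrative):
# nfs4_setfacl -a A::user@ntap.local:rwatTnNcCy /mnt/unix/file
# nfs4_getfacl /mnt/unix/file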
If an NFSv4.x ACL is not set properly, the ACL might not behave as expected, or the ACL change may
fail to apply and throw an error.
Sample errors include:
Failed setxattr operation: Invalid argument
Scanning ACE string 'A::user@rwaDxtTnNcCy' failed.
Explicit DENY
NFSv4 permissions can include explicit DENY attributes for OWNER, GROUP, and EVERYONE. That is because NFSv4 ACLs are default-deny: if access is not explicitly granted by an ACE, it is denied. Explicit DENY attributes override any ALLOW (access) ACEs, explicit or not.
DENY ACEs are set with an attribute tag of D.
For example:
sh-4.1$ nfs4_getfacl /mixed
A::[email protected]:ratTnNcCy
A::OWNER@:rwaDxtTnNcCy
D::OWNER@:
A:g:GROUP@:rxtncy
D:g:GROUP@:waDTC
A::EVERYONE@:rxtncy
D::EVERYONE@:waDTC
DENY ACEs should be avoided whenever possible, because they can be confusing and complicated.
When DENY ACEs are set, users might be denied access when they expect to be granted access. This is
because the ordering of NFSv4 ACLs affects how they are evaluated.
The preceding set of ACEs is equivalent to 755 in mode bits, which means:
The owner has full rights.
The group has read and execute access.
Everyone else has read and execute access.
However, even if permissions are adjusted to the 775 equivalent, access can be denied because of the
explicit DENY set on EVERYONE.
For an example of explicit DENY, see “NFSv4.x ACL Explicit DENY example.”
Mode bit display behavior from Windows created files with NFSv4 ACLs
In multiprotocol NAS environments (where both SMB and NFS are used to access the same data),
NFSv4 ACLs can create some undesirable behavior. In one such case, ONTAP 9.7 and earlier versions
would show “----------” for files created via SMB on NFS clients when NFSv4 ACLs are in use. ONTAP 9.8
and later introduce the is-inherit-modebits-with-nfsv4acl-enabled CIFS server option (disabled by default) to resolve this issue. Set that option to enabled for proper mode bit display.
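A hedged example of enabling the option (the SVM name is illustrative; confirm the exact syntax in your ONTAP release):
cluster::> vserver cifs options modify -vserver DEMO -is-inherit-modebits-with-nfsv4acl-enabled true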
NFSv4 delegations
NFSv4 introduces the concept of delegations that provide an aggressive local client cache, which is
different from the ad hoc caching that NFSv3 provides. There are two forms of delegations: read and
write. Delegations favor cache correctness over raw performance gains. For delegations to work, a
supported UNIX client is required along with the NFS 4.x version-specific delegation options enabled on
the Lenovo controller. These options are disabled by default.
When a server determines to delegate a complete file or part of the file to the client, the client caches it
locally and avoids additional RPC calls to the server. This reduces GETATTR calls in the case of read
delegations because there are fewer requests to the server to obtain the file’s information. However,
delegations do not cache metadata, which means that high file count workloads will not see as much of a
benefit with delegations as a streaming file workload might see.
Read delegations can be granted to numerous clients but write delegations can be granted only to one
client at a time, as any new write to a file invalidates the delegation. The server reserves the right to recall
the delegation for any valid reason. The server decides to delegate a file when two conditions are met: there is a confirmed callback path from the client (which the server uses to recall a delegation if needed), and the client sends an OPEN request for the file.
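A hedged example of enabling read and write delegations for NFSv4.1 (the SVM name is illustrative; equivalent -v4.0-* options exist for NFSv4.0), followed by viewing granted delegations with the vserver locks show command, which produces output similar to the following:
cluster::> vserver nfs modify -vserver DEMO -v4.1-read-delegation enabled -v4.1-write-delegation enabled
cluster::> vserver locks show -vserver DEMO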
Vserver: DEMO
Volume Object Path LIF Protocol Lock Type Client
-------- ------------------------- ----------- --------- ----------- ----------
flexgroup_16
/flexgroup_16/files/topdir_82/subdir_268/file3
data nfsv4.1 delegation -
Delegation Type: write
/flexgroup_16/files/topdir_27/subdir_249/file2
data nfsv4.1 delegation -
Delegation Type: write
Object: nfsv4_1
Instance: DEMO
Start-time: 4/9/2020 15:05:44
End-time: 4/9/2020 15:31:13
Elapsed-time: 1529s
Scope: cluster
Number of Constituents: 2 (complete_aggregation)
Counter Value
-------------------------------- --------------------------------
delegreturn_avg_latency 1366us
delegreturn_percent 4%
delegreturn_success 311465
delegreturn_total 311465
NFSv4 locking
For NFSv4 clients, ONTAP supports the NFSv4 file-locking mechanism, maintaining the state of all file
locks under a lease-based model. In accordance with RFC 3530, ONTAP "defines a single lease period
for all state held by an NFS client. If the client does not renew its lease within the defined period, all state
associated with the client's lease may be released by the server." The client can renew its lease explicitly
or implicitly by performing an operation, such as reading a file. Furthermore, ONTAP defines a grace
period, which is a period of special processing in which clients attempt to reclaim their locking state during
a server recovery.
Locks are issued by ONTAP to the clients on a lease basis. The server checks the lease on each client
by default every 30 seconds. In the case of a client reboot, the client can reclaim all the valid locks from
the server after it has restarted. If a server reboots, then upon restarting it does not issue any new locks
to the clients for a default grace period of 45 seconds (tunable in ONTAP to a maximum of 90 seconds).
After that time the locks are issued to the requesting clients. The lease time of 30 seconds can be tuned
based on the application requirements.
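A hedged example of adjusting the lease and grace periods on the NFS server (the values shown are the defaults, and the SVM name is illustrative):
cluster::> vserver nfs modify -vserver DEMO -v4-lease-seconds 30 -v4-grace-seconds 45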
NFSv4.x referrals
An NFS referral directs a client to another LIF in the SVM upon initial NFS mount request. The NFSv4.x
client uses this referral to direct its access over the referred path to the target LIF from that point forward.
Referrals are issued when there is a LIF in the SVM that resides on the node where the data volume
resides. In other words, if a cluster node receives an NFSv4.x request for a nonlocal volume, the cluster
node is able to refer the client to the local path for that volume by means of the LIF. Doing so allows
clients faster access to the data using a direct path and avoids extra traffic on the cluster network.
OPEN for file access
The NFSv4.x clients do use the following:
READ, WRITE, and SETATTR with special stateid of all bits 0
OPEN only to create a file and close it right away
The NFSv4.x clients do not have a state established on the NFS server.
NFS migration support can be useful in the following scenarios in ONTAP:
Volume moves
LIF migration/failover
NFSv4.1
NFSv4.1 is considered a minor version update of NFSv4. This section covers NFSv4.1 specifics. The
previous section covered NFSv4.0, as well as topics that apply to both NFSv4.0 and NFSv4.1 (denoted
by NFSv4.x in this document).
It is possible to enable NFSv4.1 and disable NFSv4.0. This is recommended if you wish to prevent clients
from using NFSv4.0 for any reason.
To mount a client using NFSv4.1, there must be client support for NFSv4.1. Check with the client vendor
for support for NFSv4.1. Mounting NFSv4.1 is generally done with the minorversion mount option, but
newer Linux kernels will autonegotiate the highest supported NFS version.
Example:
# mount -o nfsvers=4,minorversion=1 NFSSERVER:/unix /unix
NFSv4.1 features
NFSv4.1 introduced a number of new features to the NFSv4 protocol standard, as covered in RFC-5661.
These differences are covered in Section 1.8 of the RFC.
Some features are listed as Required, which means that the feature must be implemented in/supported
by the NFS server to claim RFC standard. Other features are listed as Recommended or Optional
features and are supported ad hoc by the NFS server but are not required to claim RFC compliance. For
example, pNFS is listed as an Optional feature for NFSv4.1 and is supported by ONTAP, but NFS
session trunking and directory delegations (also Optional features) are not currently supported by
ONTAP.
The device information is cached to the local node for improved performance.
To see pNFS devices in the cluster, run the following commands at the diagnostic privilege level:
cluster::> set diag
cluster::*> vserver nfs pnfs devices cache show
pNFS components
There are three main components of pNFS:
Metadata server (MDS):
Handles all nondata traffic such as GETATTR, SETATTR, and so on
Responsible for maintaining metadata that informs the clients of the file locations
Located on the Lenovo NFS server
Data server:
Stores file data and responds to read and write requests
Located on the Lenovo NFS server
Inode information also resides here
Clients
These components leverage three different protocols. The control protocol is the way the metadata and
data servers stay in sync. The pNFS protocol is used between clients and the MDS. pNFS supports file,
block, and object-based storage protocols, but Lenovo currently only supports file-based pNFS.
Figure 8) pNFS data workflow.
Object: nfsv4_1_diag
Instance: nfs4_1_diag
Start-time: 4/9/2020 16:29:50
End-time: 4/9/2020 16:31:03
Elapsed-time: 73s
Scope: node1
Counter Value
-------------------------------- --------------------------------
pnfs_layout_conversions 4053
For basic comparison testing of pNFS with NFSv3 and NFSv4.x, see the following sections in this
document:
“Performance comparison: NFSv3 and NFSv4 using nconnect and pNFS”
“NFSv3 vs. NFSv4.x — Performance comparisons”
“Performance examples for different TCP maximum transfer window sizes”
Note: When using pNFS, be sure to use the latest available client and ONTAP release to avoid bugs,
such as this one: SU323: Data corruption possible during pNFS I/O on Linux distributions.
You can prevent specific clients from using pNFS by blacklisting the pNFS module nfs_layout_nfsv41_files so that it does not load on the client, using the /etc/modprobe.d/nfs_layout_nfsv41_files-blacklist.conf file.
For example:
cat /etc/modprobe.d/nfs_layout_nfsv41_files-blacklist.conf
blacklist nfs_layout_nfsv41_files
For more information about blacklisting modules, see How do I prevent a kernel module from loading
automatically?
NFSv4.1 delegations
NFSv4.1 delegations are very similar to NFSv4.0 delegations, but are part of the v4.1 protocol rather than
v4.0. Table 10 covers the new additions to NFSv4.1 and how they benefit an environment over NFSv4.0.
These additions are covered in detail in RFC 5661, Section 10.2.
NFSv4.1 sessions
As per RFC 5661:
A session is a dynamically created, long-lived server object created by a client and used over time
from one or more transport connections. Its function is to maintain the server's state relative to the
connection(s) belonging to a client instance. This state is entirely independent of the connection itself,
and indeed the state exists whether or not the connection exists. A client may have one or more
sessions associated with it so that client-associated state may be accessed using any of the sessions
associated with that client's client ID, when connections are associated with those sessions. When no
connections are associated with any of a client ID's sessions for an extended time, such objects as
locks, opens, delegations, layouts, and so on, are subject to expiration. The session serves as an
object representing a means of access by a client to the associated client state on the server,
independent of the physical means of access to that state.
A single client may create multiple sessions. A single session MUST NOT serve multiple clients.
From SNIA:
NFSv4.1 has brought two major pieces of functionality: sessions and pNFS. Sessions bring
the advantages of correctness and simplicity to NFS semantics. In order to improve the correctness
of NFSv4, NFSv4.1 sessions introduce “exactly-once” semantics. This is important for supporting
operations that were non-idempotent (that is, operations that if executed twice or more return different
results, for example the file RENAME operation). Making such operations idempotent is a significant
practical problem when the file system and the storage are separated by a potentially unreliable
communications link, as is the case with NFS. Servers maintain one or more session states in
agreement with the client; a session maintains the server's state relative to the connections belonging
to a client. Clients can be assured that their requests to the server have been executed, and that they
will never be executed more than once. Sessions extend the idea of NFSv4 delegations, which
introduced server-initiated asynchronous callbacks; clients can initiate session requests for
connections to the server. For WAN based systems, this simplifies operations through firewalls.
NFSv4.2
NFSv4.2 is the latest NFSv4.x version available and is covered in RFC-7862. ONTAP 9.8 introduced
basic support for the NFSv4.2 protocol. ONTAP 9.9 added support for labeled NFS, but no other ancillary
features are currently supported. NFSv4.2 does not have an independent option to enable/disable it, but it
is enabled/disabled when you enable NFSv4.1 via the -v4.1 NFS server option in ONTAP. If a client
supports NFSv4.2, it negotiates the highest supported version of NFS during the mount command if not
specified. Otherwise, use the minorversion=2 mount option. There is no difference in performance
between NFSv4.1 and NFSv4.2.
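For example, a client that supports NFSv4.2 might mount it explicitly as follows (the server name and paths are illustrative):
# mount -o nfsvers=4,minorversion=2 NFSSERVER:/unix /unix
or, on newer kernels:
# mount -o vers=4.2 NFSSERVER:/unix /unix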
Support for labeled NFS means that ONTAP now recognizes and understands the NFS client’s SELinux
label settings.
Labeled NFS is covered in RFC-7204.
Use cases include:
MAC labeling of virtual machine (VM) images
Data security classification for the public sector (secret, top secret, and so on)
Security compliance
Diskless Linux
In this release, ONTAP supports the following enforcement modes:
Limited Server Mode. ONTAP cannot enforce the labels but can store and transmit them.
Note: The ability to change MAC labels is also up to the client to enforce.
Guest Mode. If the client is not labeled NFS-aware (v4.1 or lower), MAC labels are not transmitted.
ONTAP does not currently support Full Mode (storing and enforcing MAC labels).
Name services
In enterprise NAS environments, thousands of clients, users, and groups are interacting with storage
systems every day. These clients, users, and groups require easy management that is consistent across
all NAS clients. User1 in client A should not be different than user1 in client B, and client A and client B
should not use the same host names or IP addresses.
That’s where name services come in.
DNS
DNS servers provide a centralized way to create and manage IP addresses, host names, and business-
critical service records. When all clients and storage systems point to the same DNS configurations, then
there is consistency in host name <-> IP mapping without the need to manage thousands of local files.
DNS is critical in many applications and network services, such as:
Kerberos
LDAP
Active Directory
Use of DNS in NFS environments is highly recommended – especially when Kerberos and LDAP are
involved.
Dynamic DNS
In DNS, it’s possible to have clients send DNS updates to the DNS servers when IP addresses are added,
deleted, or modified. This feature reduces the amount of management overhead required for DNS, but is
only possible if the DNS server supports it.
ONTAP provides a method for data LIFs to send updates to DNS servers through dynamic DNS. This is
managed with the vserver services name-service dns dynamic-update commands.
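A hedged example of enabling dynamic DNS updates for an SVM (the SVM name is illustrative; additional parameters such as the TTL are available):
cluster::> vserver services name-service dns dynamic-update modify -vserver DEMO -is-enabled true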
Identity management name services
For identity management, LDAP and NIS provide a central repository for users and groups, as well as
netgroup functionality. These centralized services offer a way for clients and servers to maintain the same
information to ensure predictable, consistent identities when accessing NAS file systems.
ONTAP supports both LDAP and NIS for name services, but LDAP is recommended over NIS for its
security and replication support.
LDAP
The recommended method for identity management for users, groups, and netgroups is to use an LDAP
server. Not only does LDAP centralize the source for name services across NFS clients and servers, it
also provides a way to secure communication via secure binds and searches using SSL or Kerberos to
encrypt the LDAP packets. NIS servers don’t offer support for this by default.
Additionally, LDAP servers offer easier ways to replicate their information across multiple servers –
particularly when using Active Directory for UNIX identity management.
NIS
NIS databases can also be used for users, groups, and netgroup name services with ONTAP. ONTAP
uses the standard NIS *.byname functionality for lookups using ypserv calls. Any NIS server that
leverages standard NIS functionality can be used for lookups – including Windows Active Directory.
ONTAP can also enable local NIS group.byname and netgroup.byname functionality (similar to NIS slave)
for chatty NIS environments. This helps reduce the overall load on the network and NIS servers when
enabled.
Local files
Local files (such as passwd, group, and netgroup files) are also supported as name service sources. With ONTAP, a storage administrator can either build the files with ONTAP commands or the UI (UNIX user and group creation), or import flat files from servers by using the load-from-uri commands. By default, ONTAP SVMs support up to 64,000 entries for local UNIX users and groups.
If local files are going to be the primary name service and there will need to be more than 64,000 entries,
then enabling scaled/file-only mode would be an option.
Note: If loading files larger than 10MB for users and 25MB for groups, use the -skip-file-size-check option.
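A hedged example of loading a passwd-style file into the local UNIX users for an SVM with load-from-uri (the URI and SVM name are illustrative):
cluster::> vserver services name-service unix-user load-from-uri -vserver DEMO -uri ftp://ftp.example.com/passwd -skip-file-size-check true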
When using file-only mode, individual operations on users and groups are not allowed. This configuration
is not currently supported in Lenovo MetroCluster or SVM disaster recovery (SVM DR) scenarios.
Can you still use external name services when using file-only mode?
File-only mode does not mean you cannot use LDAP or NIS as a name service; it means that the local
users and groups are managed with files only (as opposed to replicated database entries). LDAP and NIS
lookups will still work properly when file-only mode is enabled.
When using file-only mode, be sure the preceding users exist in the files being used to manage
the cluster. After file-only mode is enabled, the default users are removed if the uploaded file
does not include them.
Limits
The following section covers the limits for using local users and groups in ONTAP. These limits are
cluster-wide.
Table) Limits for local UNIX users/groups and scaled-mode users/groups.
*Group and passwd file sizes can be overridden with -skip-file-size-check, but larger file sizes have not been tested.
As previously mentioned, the local UNIX user and group limits are cluster-wide and affect clusters with multiple SVMs. Thus, if a cluster has four SVMs, the combined number of local UNIX users across all four SVMs cannot exceed the limit set on the cluster.
For example:
SVM1 has 2,000 local UNIX users.
SVM2 has 40,000 local UNIX users.
SVM3 has 20 local UNIX users.
SVM4 would then have 23,516 local UNIX users available to be created.
Any attempted creation of any UNIX user or group beyond the limit would result in an error message.
For example:
cluster::> unix-group create -vserver NAS -name test -id 12345
Error: command failed: Failed to add "test" because the system limit of {limit number}
"local unix groups and members" has been reached.
Multiprotocol NAS
ONTAP supports multiprotocol NAS access to the same datasets in an SVM. With ONTAP multiprotocol
NAS access, users and groups can use CIFS/SMB and NFS to access volumes or qtrees and leverage
ACLs for users and groups, along with setting file and folder ownership as desired.
Qtrees
Qtrees allow a storage administrator to create folders from the ONTAP UI or CLI to provide logical
separation of data within a volume. Qtrees provide flexibility in data management by enabling unique
export policies, unique security styles, quotas, and granular statistics.
Qtrees have multiple use cases and are useful for home directory workloads because qtrees can be
named to reflect the user names of users accessing data, and dynamic shares can be created to provide
access based on a username.
The following list provides more information about qtrees in FlexGroup volumes:
Qtrees appear as directories to clients.
Qtrees can be created only at the volume level; you cannot currently create qtrees below directories to make subdirectory qtrees.
Qtrees are created and managed the same way as a FlexVol qtree is managed.
Qtrees cannot be replicated using SnapMirror. Currently, SnapMirror operates only at the volume level. If you want more granular replication within a volume, use junction paths.
A maximum of 4,995 qtrees is supported per volume. Quota monitoring and enforcement
(enforcement in ONTAP 9.5 and later for FlexGroup volumes) can be applied at the qtree or user
level.
Qtrees and file moves
A qtree is considered a unique file system in ONTAP. Although it looks like a directory from a NAS client
perspective, some operations might behave differently than if it were an actual directory. One example of
that is moving a file between qtrees in the same volume.
When a file move is performed across directories within a volume, the file is simply renamed to a new name, and the move completes within seconds because it occurs inside the same file system.
When a file move occurs between two qtrees, the file is copied to the new location rather than being
renamed. This causes the operation to take much longer.
This is a behavior that occurs whether the qtree lives in a FlexVol or a FlexGroup volume.
Enabled:
- Rename in same volume and qtree: SUCCESS
- Rename in same volume, different qtrees: EACCESS
- Rename between volumes where qtree IDs differ: EACCESS
- Rename between volumes where qtree IDs match: XDEV
Disabled:
- Rename in same volume and qtree: SUCCESS
- Rename in same volume, different qtrees: SUCCESS
- Rename between volumes where qtree IDs differ: XDEV
- Rename between volumes where qtree IDs match: XDEV
Mounting multiple Qtrees in the same volume on the same NFS client
Although qtrees effectively act as independent file systems, if they live in the same volume, then the NFS
conversation between client and server involves the same MSID/file handle from the parent volume. This
can result in the NFS client seeing the qtrees as the same file system mounted twice, and the used space
is the same regardless of what is actually being used in each qtree.
For example, these two qtrees are mounted to the same client at different mount points.
# mount | grep qtree
10.193.67.214:/testvol/qtree1 on /mnt/qtree1 type nfs
10.193.67.214:/testvol/qtree2 on /mnt/qtree2 type nfs
They both show the same space usage before we copy a file.
# df -h | grep qtree
10.193.67.214:/testvol/qtree1 973G 2.0M 973G 1% /mnt/qtree1
10.193.67.214:/testvol/qtree2 973G 2.0M 973G 1% /mnt/qtree2
Then we copy a 3.8GB file to qtree1. Both qtrees show the same space used.
# cp debian-8.2.0-amd64-DVD-1.iso /mnt/qtree1/
# df -h | grep qtree
10.193.67.214:/testvol/qtree1 973G 3.8G 970G 1% /mnt/qtree1
10.193.67.214:/testvol/qtree2 973G 3.8G 970G 1% /mnt/qtree2
We can get around this by applying a simple monitoring quota to one of the qtrees. Just by doing this, the
proper space usage is seen.
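A hedged sketch of how such a tracking (monitoring-only) quota might be created and activated before re-checking usage (the policy, volume, and SVM names are illustrative):
cluster::*> volume quota policy rule create -vserver NFS -policy-name default -volume testvol -type tree -target qtree1
cluster::*> volume quota on -vserver NFS -volume testvol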
cluster::*> quota report -vserver NFS
Vserver: NFS
# df -h | grep qtree
10.193.67.214:/testvol/qtree1 973G 3.8G 970G 1% /mnt/qtree1
10.193.67.214:/testvol/qtree2 970G 0 970G 0% /mnt/qtree2
Subdirectory exports
Qtrees can be exported through NFS, which provides a single-level subdirectory path to define unique
export policies and rules for clients. However, individual directories cannot have export policies and rules
applied to them, and qtrees currently can only be created at the volume level in ONTAP. If your
environment requires exports lower in the directory tree, a combination of volumes, qtrees, and junction
paths can be used to simulate subdirectory exports. However, this does not secure the entire path,
because each level in the junction path has to allow read access to the export policy rules for the clients
to allow traversal.
For example, you could create a subdirectory export like this:
/volume1/qtree1/volume2/qtree2/volume3/qtree3
Each object in this path can be exported to the NFS clients with unique policies and rules. If you need
greater levels of security for these folders, consider using NTFS security styles/ACLs or Kerberos for NFS.
User and group owners
Starting in ONTAP 9.8, you can set the user and group owner of a qtree from the ONTAP CLI with qtree
create or qtree modify. In previous releases, this was done through the NAS protocol from a client.
This is currently only available through the CLI or a REST API. There is no ZAPI or ThinkSystem Storage
Manager support.
[ -user <user name> ] User ID
[ -group <group name> ] Group ID
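A hedged example of setting the owner and group on an existing qtree from the CLI (names are illustrative; verify the exact syntax for your release):
cluster::> volume qtree modify -vserver DEMO -volume testvol -qtree qtree1 -user user1 -group group1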
Replay/reply cache
The replay (or reply) cache in ONTAP is crucial to preventing nonidempotent NFS requests from being attempted twice. Nonidempotent requests are requests that can change data structures. For example,
reading a document twice at the same time is an idempotent operation because it’s harmless. Editing that
document twice at the same time is a nonidempotent operation and can be harmful if the document
doesn’t have locking in place to protect it. The replay/reply cache in ONTAP helps keep track of which
operations have arrived on the storage in case a network issue causes a client to resend the same
operation. The cache is used to reply to the operation rather than retrying in the storage layer.
This cache is stored at the data layer with the volumes. When this cache is lost, CREATE operations can
fail with E_EXIST and REMOVE operations can fail with E_NOENT. Table 12 shows different scenarios in
which replay cache is kept or lost, which determines the disruptiveness of the operation.
File locking
File locking mechanisms were created to prevent a file from being accessed for write operations by more
than one user or application at a time. NFS leverages file locking either using the NLM process in NFSv3
or by leasing and locking, which is built in to the NFSv4.x protocols. Not all applications leverage file
locking, however; for example, the application “vi” does not lock files. Instead, it uses a file swap method
to save changes to a file.
When an NFS client requests a lock, the client interacts with the ONTAP system to save the lock state.
Where the lock state is stored depends on the NFS version being used. In NFSv3, the lock state is stored
at the data layer. In NFSv4.x, the lock states are stored in the NAS protocol stack.
In NFSv3 environments, locks are managed by the NLM protocol, which is ancillary to the NFS protocol.
As a result, when locks are used in NFSv3, there might be stale locks left over after failovers that need to
be cleaned up manually. NFSv4.x locks are reclaimed based on a lease model and do not need manual
cleanup. For more information about NFS file locking, see “File locking concepts.”
To view or remove file locks in an SVM, run the following commands in Advanced Privilege:
cluster::> set advanced
cluster::*> vserver locks
break show
When potentially disruptive operations occur, lock states do not transfer in some instances. As a result,
delays in NFS operations can occur as the locks are reclaimed by the clients and reestablished with their
new locations. Table 13 covers the scenarios in which locks are kept or lost.
out of the default grace period before the timer runs out, even if all clients have sent
RECLAIM_COMPLETE to the storage system. Bug 1392916 has been opened for this issue to
enhance the failover logic for lock reclamation.
If the grace period values are tuned too low, there is a risk that another client may try to write to the
same file or byte-range that was previously locked, which can result in file corruption.
For single write workloads (where there is no risk of other clients trying to write to a locked file),
lowering the grace periods doesn’t present the same type of risk as workloads where multiple
clients may be writing to a file at any given time.
Reclaim times can vary depending on the number of locks and clients. In environments with many
locks/clients, reclaims take longer than environments with fewer locks/clients. The number of
clients/locks are a factor in how low you can safely set the grace period values.
However, if you need to modify the grace period for data LIF failover operations, run the following
command:
cluster::> nfs server modify -vserver DEMO -v4-grace-seconds 45
To modify the grace period for storage failover operations, run the following command:
cluster::> node run [node names or *] options locking.grace_lease_seconds
By default, an NFSv4 lease is granted to a client for 30 seconds. If a failure event occurs (such as
network outage or storage failover), the lease will exist for 30 seconds. For an additional 15 seconds,
ONTAP and the client will try to reestablish those locks. If network or storage failures exceed 45 seconds,
those locks are released and the client/application must re-establish the lock on its own.
NFSv4.1 sessions
In ONTAP, NFSv4.1 sessions are supported. With NFSv4.1 sessions, LIF migrations can be disruptive to
NFSv4.1 operations, but they are less disruptive than with NFSv4.0. For more information, see RFC-5661,
section 2.10.13. For ways to get extra performance with sessions for both NFSv3 and NFSv4.x, see the
section called “Nconnect.”
migration, the LIF is then moved to the new location and lock states are reclaimed by NFS clients. Lock
state reclamation is controlled by the NFS option -v4-grace-seconds (45 seconds by default). With
NFSv4.1 sessions, this grace period is not needed, because the lock states are stored in the NFSv4.1
session. Busier systems cause longer latency in LIF migrations, because the system must wait longer for
the operations to quiesce and the LIF waits longer to migrate. However, disruptions occur only during the
lock reclamation process.
You can view the ARP caches on clients and compare the MAC entries with the port information on the
Lenovo cluster.
For example, this is what the client sees:
# arp -a x.x.x.a
demo.ntap.local (x.x.x.a) at 90:e2:ba:7f:d4:bc [ether] on eno16780032
cluster::*> net port show -node node2 -port e2a -fields mac
node port mac
------------------ ---- -----------------
Node2 e2a 90:e2:ba:7f:d4:bc
When the LIF is migrated, the client’s ARP cache gets updated with the new port’s MAC address.
cluster::*> net int migrate -lif data2 -destination-node node1 -destination-port e2a
cluster::*> net port show -node node1 -port e2a -fields mac
node port mac
------------------ ---- -----------------
node1 e2a 90:e2:ba:7f:da:bc
# arp -a x.x.x.a
demo.ntap.local (x.x.x.a) at 90:e2:ba:7f:da:bc [ether] on eno16780032
Wdelay/No_wdelay
Some applications (such as Red Hat OpenShift) make a specific requirement for the NFS server export option no_wdelay. This option is intended to protect against cache incoherency, performance issues, and other application interoperability problems on NFS servers that do not guarantee writes. ONTAP does not make use of this export option, because all writes in ONTAP are immediately available after the write has been acknowledged to the NFS client.
NFS auditing
The following section covers the setup and use of NFS auditing, which can use either NFSv4.x audit
ACEs (UNIX security styles) or Windows audit ACEs (NTFS security styles).
cluster::> vserver audit create -vserver nfs -destination /unix -rotate-size 100MB
This command allows auditing for NFS and CIFS access on the junction path /unix for the SVM named
nfs. After auditing is enabled on the ONTAP system, the AUDIT ACEs should be created.
After an AUDIT ACE is applied and the user being audited attempts access, the events are logged to an XML file on the volume.
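As a sketch, an AUDIT ACE could be added from an NFSv4.x client with nfs4_setfacl, using the U (audit) ACE type and the S (log successful access) flag (the principal, domain, and path are illustrative):
# nfs4_setfacl -a U:S:user@ntap.local:rwx /mnt/unix/file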
For an example of a logged NFS audit event, see “NFS audit event example.”
Follow best practices for aggregate and storage configurations as per the product documentation.
In most cases, leave the defaults unchanged unless there is a specific reason to change the settings.
Install Unified Manager and configure to monitor your ONTAP cluster.
Set up proactive alerts for events such as storage failovers, volume capacity alerts, and so on.
Consider deploying FlexGroup volumes for your NFS workloads, particularly in high-file-count environments.
Option #2: Performance approach — multiple data LIFs per SVM
In an ONTAP cluster, multiple nodes can be made available for NAS connectivity and storage.
Remember, ONTAP clusters using NAS can only scale up to 24 nodes. Multiple nodes mean multiple
physical resources, such as CPU/RAM/network interfaces. As a result, having more than one data LIF in
an SVM can add considerable performance to your NAS workloads. Spreading network connections
across nodes alleviates CPU and network port contention, as well as avoiding scenarios where a node
might have too many TCP connections. For network load balancing of NAS connections, round robin
DNS, on-box DNS, or standard load balancing hardware can be leveraged.
In situations where the best possible performance is required, or where many clients will be accessing a
NAS device at the same time, creating multiple data LIFs per SVM is a sound approach. Additionally, use
of load balancing NAS features such as NFS referrals, CIFS autolocation and pNFS will require a data
LIF on each node where data resides.
Figure 12 shows multiple LIFs NAS interaction.
Figure 12) Multiple LIFs NAS interaction.
interface that is local to the requested volume. If the volume being used is a FlexGroup volume, then NFS
referrals and CIFS autolocation should not be used.
pNFS provides a metadata path on the initial mount request, but all reads and writes are automatically
redirected to local volumes through pNFS layout calls. pNFS is only available for the NFSv4.1 protocol
and only with NFS clients that support it. FlexGroup volumes can only use pNFS in ONTAP 9.7 and later.
For more information about pNFS, see “Parallel network file system.”
Without autolocation features, managing data LIF locality to avoid the cluster network adds management
complexity that might not be worth the effort, as the performance impact for most NFS workloads is
negligible. Ideally, NAS connections connect to data LIFs that are local to the volumes, but with
FlexGroup volumes/scale-out NAS and larger cluster backend networks, this becomes less important.
If an environment has thousands of clients that are mounted through NFS and generating I/O, it is
possible to exhaust all ports on an NFS server. For example, consider ESX using NFS datastores: some legacy best practices called for a data LIF/IP address per datastore, and in environments with many volumes/datastores this created a situation where NFS ports were overrun. The remediation for this situation is to disable the mount-rootonly and/or the nfs-rootonly options on the NFS server. This removes the 1 to 1,024 port range limit and allows up to 65,534 ports to be used on an NFS server. For more information on these options, see "The rootonly options –
nfsrootonly and mountrootonly.”
This situation affects the source port (client-side) only. The server-side mountd, portmapper, NFS, and
nlm ports are designated by ONTAP.
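A hedged example of disabling the rootonly options to widen the client source port range (the SVM name is illustrative):
cluster::> vserver nfs modify -vserver DEMO -mount-rootonly disabled -nfs-rootonly disabled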
ONTAP creates several default LIF service policies, but you can add custom policies if you choose. The following default policies allow SAN, NAS, or management traffic. Only one policy can be assigned to a data LIF at a time.
cluster::*> network interface service-policy show -vserver DEMO
Vserver Policy Service: Allowed Addresses
--------- -------------------------- ----------------------------------------
DEMO
default-data-blocks data-core: 0.0.0.0/0, ::/0
data-iscsi: 0.0.0.0/0, ::/0
data-fpolicy-client: 0.0.0.0/0, ::/0
management-https: 0.0.0.0/0, ::/0
data-fpolicy-client: 0.0.0.0/0, ::/0
To create policies that allow only NFS or only CIFS/SMB, you can use network interface
service-policy create, or you can add or remove services with network interface service-
policy add-service or network interface service-policy remove-service. This can all
be done without taking an outage.
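A hedged sketch of creating an NFS-only service policy and assigning it to a data LIF (the policy, SVM, and LIF names are illustrative; confirm the available services with network interface service-policy show):
cluster::> network interface service-policy create -vserver DEMO -policy nfs-only -services data-core,data-nfs
cluster::> network interface modify -vserver DEMO -lif data1 -service-policy nfs-only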
Securing NFSv3
While NFSv4.x is superior to NFSv3 for security, there are still some steps you can take to better secure
the protocol.
If the export policy rule on the vsroot volume does not allow read access to a specific client, then that
client will be unable to mount the export. Read access is required on an export policy rule to allow
clients to traverse the paths to the volume. Vsroot is “/” in the export path.
Vsroot uses the default export policy when it’s created. By default, the “default policy” has no
export rules.
When setting the clientmatch for the vsroot’s export policy rule to allow path traversal, consider
setting it by netgroup, subnet, or a list of clients. Avoid setting the entry to 0.0.0.0/0 or 0/0; this allows
all clients to traverse/read the export. For more information, see “Access control to vsroot.”
To allow (or prevent) root access for a client, use the Superuser export policy rule option. For more
information, see “Mapping all UIDs to a single UID (squash_all).”
In multiprotocol NAS environments (CIFS/SMB and NFS access), the way ONTAP responds to
permission or owner changes on NFS mounts of volumes with NTFS security depends on the setting
of the export policy. The options are Fail (send an error) or Ignore (fail silently with no error).
Setting -rw-rule, -ro-rule or -superuser to None will squash a client’s access to the user set
with -anon. In most cases, -anon should be set to 65534. Exceptions would include setting -anon
to 0 to allow root access to an admin host or setting it to an explicit user to control file ownership for a
specific client/set of clients.
Rather than setting host names and IP addresses in the export clientmatch, consider using netgroups
to mask host/client lists in the -clientmatch field. Netgroups can live in local files on an SVM, in
LDAP, or NIS.
For clients that are doing read/write access, set -superuser to None and -anon to 65534 to
prevent root access.
Consider setting up a group of clients that are designated for administrative/root-level tasks and set
the export policy to allow root access. See “Examples of controlling the root user” for details on how
to allow root access.
If you’re using Kerberos for NFSv3 mounts, ensure you’ve included both sys and krb5* in your rules.
NFSv3 uses ancillary protocols and only uses Kerberos for the NFS portion. Limiting access to export
policy rules in NFSv3 to only krb5* will result in failed mounts, because sys access is needed for
portmapper, mount, and so on.
Avoid using any for -rw-rule, -ro-rule, -superuser; allowing any reduces the security
effectiveness of the policy rule. Instead, specify the authentication method you require in those
options.
Avoid using any for -protocol, because this allows access to protocols you might not want to use. If you require both NFSv3 and NFSv4.x access to mounts, set the protocol to nfs. If you only want NFSv3 or only NFSv4.x to access a mount, specify either nfs3 or nfs4. A sample rule that applies several of these recommendations follows this list.
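The following is a hedged example of an export policy rule that applies several of the preceding recommendations (the policy, netgroup, and SVM names are illustrative; verify parameter names for your release):
cluster::> vserver export-policy rule create -vserver DEMO -policyname secure_nfs -clientmatch @trusted-clients -rorule sys,krb5p -rwrule sys,krb5p -superuser none -anon 65534 -protocol nfs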
Client firewalls
By default, NFS clients enable security mechanisms such as firewalls (for example, firewalld) and mandatory access control frameworks (such as SELinux in RHEL or AppArmor in Ubuntu). In some cases, those mechanisms can impact NFS operations. Always check the client firewall and security rules if NFS access issues occur, and, if necessary during troubleshooting, temporarily stop the relevant service.
Consider changing the default port numbers for NFS ancillary protocols to ports that are not well-
known and adjusting firewall ports as needed.
Consider allowing only the NFSv3 ports that are needed by your environment. For example, if rquota is not being used, don't allow it through a firewall. Several ports are required for NFS mounts, however, so you should always allow those port numbers.
Portmapper (111), mountd (635) and NFS (2049) are required to allow NFS access. Other
ancillary ports (NLM, NSM, rquota) are only needed if an application or client needs them. Test in
your environment accordingly.
NFS mounts generate a source port on the client. By default, ONTAP sets the range of allowed
source ports between 1–1024 (-mount-rootonly). In some cases, this range of port numbers
might not be large enough to accommodate the number of clients that are mounting. If more source
port numbers are needed, set -mount-rootonly to Disabled and modify the firewall rules to
accommodate that change. For more information, see “The rootonly options – nfsrootonly and
mountrootonly.”
NFS operations also use a range of source ports that can be controlled through -nfs-rootonly. By
default, this value is set to Disabled, which means the port range is 1–65536. If a low number of NFS
clients is used (100 or less), consider setting this option to Enabled to reduce the port range to 1–
1024 for better security.
Avoid opening up NFS ports outside of your local network. If NFS over a WAN is required, strongly
consider using NFS Kerberos with krb5p for end-to-end encryption. Alternately, consider using
Lenovo FlexCache volumes to localize NFS traffic to a remote site. Lenovo FlexCache volumes use
TLS 1.2 to encrypt communications.
Export policy rules and name services for UNIX identity management might be dependent on external
name servers (such as DNS, LDAP, NIS, Active Directory, and so on). Make sure the firewall rules
allow traffic for those name servers.
Some firewalls might drop idle TCP connections after a set amount of time. For example, if a client
has an NFS mount connected, but doesn’t use it for a while, it’s considered Idle. When this occurs,
client access to mounts can hang, because the network connection has been severed by the firewall.
Keepalives can help prevent this, but it is better to address it either by configuring firewalls to actively reject packets from stale sessions or by configuring the ONTAP NFS server options -idle-connection-timeout and -allow-idle-connection (see the example after this list). These options were introduced in ONTAP 9.5.
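A hedged example of configuring those idle connection options (the SVM name and timeout value are illustrative):
cluster::> vserver nfs modify -vserver DEMO -idle-connection-timeout 360 -allow-idle-connection enabled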
Permissions
While it’s possible to set read and write permissions for clients in export policies, you cannot set
permissions for specific users and groups with just export rules. In addition, you can’t get more granular
than read or write with export policies. As a result, file and folder permissions would be needed. NFSv3,
by default, uses a fairly basic permission model for files and folders via mode-bits, which means that
permissions can only be explicitly controlled for the file/folder owner, the group owner, and everyone else. Additionally, the
level of permissions is limited to what is defined in RFC-1813.
Figure 14 shows what each permission mode bit does from 0–7.
Symbolic notation Mode bit value Access level
rwx 7 Read, write, and execute
rw- 6 Read and write
r-x 5 Read and execute
r-- 4 Read
-wx 3 Write and execute
-w- 2 Write
--x 1 Execute
--- 0 No access
It is also possible to leverage an NFSv4.x admin client to mount a file system and add NFSv4.x ACLs, which are then honored by NFSv3 clients. This provides more granular permissions for NFSv3 environments.
The following additional recommendations can help better secure NFSv3. These are general
recommendations and do not apply to all environments, so be sure to test in your environment.
For the vsroot volume, limit the owner and group to Read and Execute and everyone else to Execute-only privileges (551); an example follows this list. Vsroot is generally small, so you want to prevent users from writing to it when possible. Limiting everyone other than the owner and group to Execute permissions only allows traversal of the path and does not allow file/directory listings in the vsroot volume.
Avoid setting permissions to 7 on volumes/qtrees, unless you want to allow root access to the volume
owner or group.
Follow standard UNIX permission best practices when managing access to NFS file systems.
Consider using NFSv4.x ACLs or NTFS security style volume permissions (if multiprotocol NAS is in
use) for more robust permission management.
If using LDAP or NIS for UNIX identities, ensure the user and group lookups are returning the proper
users and group memberships from the NFS client and server.
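A hedged example of restricting the vsroot permissions as recommended above (the volume and SVM names are illustrative):
cluster::> volume modify -vserver DEMO -volume DEMO_root -unix-permissions 0551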
User authentication
When an NFS mount is accessed, the authentication method is negotiated between the client and server.
For standard AUTH_SYS access, the user credentials are passed over the wire in plain text, which
means anyone spying on the network can see those credentials and use them to gain access to the file
system. With standard AUTH_SYS, there is no user name/password gateway to safeguard against
unintended access.
For the best security result with authentication in NFSv3, consider enabling NFS Kerberos for your NFS
mounts. AUTH_GSS (or Kerberos) configuration provides a few levels of protection for access, including:
User name/password interaction with a Kerberos Key Distribution Center server (such as Windows
Active Directory, and FreeIPA)
Security settings protecting against changing the time on a client to reuse an expired Kerberos
credential.
Encryption of the NFS operations, ranging from logins (krb5), data integrity checks (krb5i) and end-to-
end NFS encryption (krb5p).
An increase of the maximum number of user group membership from 16 in AUTH_SYS to 32 in
AUTH_GSS. (ONTAP provides a way to increase this number to 1,024 in the “Auxiliary GIDs —
addressing the 16 GID limitation for NFS” section)
Encryption
ONTAP provides in-flight encryption for NFS through krb5p, but also offers a few options for encrypting
data at-rest. These options include:
SED and LSE drives
LVE
LAE
Showmount
Historically, showmount on NFS clients has been how users can see exported file systems on an NFS
server. By default, NFS servers in ONTAP enable the showmount functionality to show exported paths,
but do not list the allowed client access. Instead, showmount displays that (everyone) has access.
The “/” root path is not displayed in the showmount commands by default. To control the behavior of
showmount and show the root path, use the following NFS server options:
-showmount
-showmount-rootonly
This functionality might cause security scanners to flag the NFS server as having a vulnerability, because these scanners often use showmount to see what is being returned. In those scenarios, you might want to disable showmount on the NFS server.
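A hedged example of disabling showmount on the NFS server (the SVM name is illustrative):
cluster::> vserver nfs modify -vserver DEMO -showmount disabled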
However, some applications make use of showmount for functionality, such as Oracle OVM. In those
scenarios, inform the security team of the application requirements.
Securing NFSv4.x
NFSv4.x offers more robust security than NFSv3 and should be used in any NFS environment that
prioritizes security above all else.
NFSv4.x security features include:
Single port for all NFS operations
ID domain string matching
NFSv4.x ACLs
Integrated Kerberos support for user name/password authentication and end-to-end encryption
Client ID checks to help prevent client spoofing
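As a sketch, the following command enables NFSv4.1 with ACL support and sets the ID domain on an SVM
(the SVM name and domain are examples; option availability can vary by ONTAP release):
cluster::> vserver nfs modify -vserver DEMO -v4.1 enabled -v4.1-acl enabled -v4-id-domain example.com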
Client firewalls
NFS clients often enable host-based firewalls and security modules by default (such as firewalld and
SELinux in RHEL, or ufw and AppArmor in Ubuntu). In some cases, those services might impact NFS
operations. Always check the client firewall rules if NFS access issues occur and, if necessary during
troubleshooting, temporarily stop the firewall service.
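The following standard client commands (availability depends on the Linux distribution) can help confirm
whether a host firewall or security module is interfering, and can temporarily stop the firewall during
troubleshooting:
# systemctl status firewalld
# getenforce
# sudo ufw status
# systemctl stop firewalld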
Permissions
NFSv4.x provides permissions through mode-bits, as well as NFSv4.x ACLs. The advantages of ACLs
are covered in “Benefits of Enabling NFSv4 ACLs,” but for the most robust and granular security with
NFSv4.x, use ACLs.
User authentication
For user authentication in NFSv4.x, the same guidance discussed in “Securing NFSv3” should be
followed – use Kerberos whenever the highest level of security is required.
Encryption
For encryption in-flight and at-rest in NFSv4.x, the same guidance discussed in “Securing NFSv3” should
be followed.
If you mount 1,000 clients to a single node, you run the risk of exhausting system resources faster than if
you spread those 1,000 clients across four different cluster nodes. Avoid pointing all NAS clients to a
single data LIF in large-scale environments (such as EDA) to mitigate resource exhaustion.
In an HA pair, if each node has more than half of the maximum connection IDs allowed and a storage
failover occurs, then you might hit the connection ID limit on the surviving node as clients attempt to
reestablish connectivity to the NFS mounts, and connections will fail. If possible, keep NAS
connections to about 50% of the total allowed limit per node in case a failover occurs.
NFSv3 mounts have ancillary protocols that are used during the connection process, such as mount
and portmapper. These UDP-based protocols are also assigned unique IDs during the initial
connection process and age out after 60 seconds. In mount storm scenarios (thousands of clients
mounting at the same time), it might be possible to exhaust connection resources artificially and
prematurely due to the temporary ID assignments for the ancillary protocols. Consider staggering
client mounts and spreading your automounter workloads across multiple nodes in a cluster, or, if you
are using Cloud Volumes ONTAP, across multiple Cloud Volumes ONTAP instances that use
FlexCache volumes.
Unmounts also generate connection IDs for the mount/portmap protocols.
Both UDP and TCP connections count against the total network connections on a node.
Specifying tcp as a mount option eliminates those extra UDP calls and reduces the total number of
connections generated on each mount/unmount. In environments that generate a lot of
mounts/unmounts, use tcp as a mount option.
NFSv4.x always uses TCP and does not use mount/portmapper for mounts/unmounts, so no extra
connection IDs are generated for that NFS version.
When the maximum connection limits are exceeded, an EMS message is generated
(maxCID.threshold.exceeded or maxCID.limit.exceeded).
Modern NAS clients have features that add more parallel network connections to a single NAS mount or
share, such as SMB multichannel and NFS nconnect. Features that provide more TCP connections per
NAS share generally improve performance, but they also use more unique IDs in the NAS stack of
ONTAP. For example, a normal NFSv3 mount might use only a single unique ID, but an NFSv3 mount
using nconnect=4 might use up to four unique IDs per mount. Keep this in mind when designing for scale
in EDA environments.
Each unique ID in ONTAP has a limit of 128 concurrent NAS operations. If this limit is exceeded by
the client sending more concurrent operations than ONTAP is able to manage, a form of flow control
is enacted on the NAS stack in ONTAP until resources are freed up. You can mitigate this behavior
with a client-side configuration or by using more unique IDs per client. For more information about
network concurrency with NAS, see “Network connection concurrency and TCP slots: NFSv3”.
In addition to per-node connection ID limits and per-connection concurrent NAS operation limits, there
are also node-level limits for total available NAS operations at any given time. Each time a NAS
operation is performed, a resource context is reserved until that operation is completed. At that time,
the resource is released back to the system. If too many resources are requested at once on a single
node, then performance issues can occur. See “Exec context throttling” for information about these
resources and how to best prevent issues.
Connection IDs generated with mounts/unmounts not using the “tcp” mount option
Before mount:
cluster::> network connections active show -remote-ip x.x.x.x
There are no entries matching your query.
After mount:
cluster::> network connections active show -remote-ip x.x.x.x
Vserver Interface Remote
Name Name:Local Port Host:Port Protocol/Service
---------- ---------------------- ---------------------------- ----------------
Node: cluster-02
DEMO data2:111 centos83-perf.ntap.local:56131
UDP/port-map
DEMO data2:635 centos83-perf.ntap.local:44961
UDP/mount
DEMO data2:635 centos83-perf.ntap.local:1022
UDP/mount
DEMO data2:2049 centos83-perf.ntap.local:879 TCP/nfs
4 entries were displayed.
After ~60 seconds, the UDP connection IDs disappear and only the NFS connection remains:
cluster::> network connections active show -remote-ip x.x.x.x
Vserver Interface Remote
Name Name:Local Port Host:Port Protocol/Service
---------- ---------------------- ---------------------------- ----------------
Node: cluster-02
DEMO data2:2049 centos83-perf.ntap.local:879 TCP/nfs
Client unmounts:
# umount /mnt/client1
NFS connection ID is gone, but now there are mount/portmap UDP connections for ~60 seconds:
cluster::> network connections active show -remote-ip x.x.x.x
Vserver Interface Remote
Name Name:Local Port Host:Port Protocol/Service
---------- ---------------------- ---------------------------- ----------------
Node: cluster-02
DEMO data2:111 centos83-perf.ntap.local:36775
UDP/port-map
DEMO data2:635 centos83-perf.ntap.local:33867
UDP/mount
DEMO data2:635 centos83-perf.ntap.local:966 UDP/mount
3 entries were displayed.
Connection IDs generated with mounts/unmounts with the “tcp” mount option
Before mount:
cluster::> network connections active show -remote-ip x.x.x.x
There are no entries matching your query.
After mount:
cluster::> network connections active show -remote-ip x.x.x.x
Vserver Interface Remote
Name Name:Local Port Host:Port Protocol/Service
---------- ---------------------- ---------------------------- ----------------
Node: cluster-02
DEMO data2:2049 centos83-perf.ntap.local:931 TCP/nfs
Client unmounts:
# umount /mnt/client1
Node: cluster-01
Vserver: DEMO
Data-Ip: x.x.x.x
Client-Ip Volume-Name Protocol Idle-Time Local-Reqs Remote-Reqs
--------------- ---------------- -------- ------------- ---------- -----------
x.x.x.y scripts nfs3 55s 73 0
If a container mount is initiated to the same data LIF, a new connection ID is generated for NFS along
with new UDP connection IDs for mount and portmap:
[root@centos7-docker ~]# docker exec -it centos bash
[root@f8cac0b471dc /]# mount -o vers=3 10.193.67.237:/scripts /mnt
When you start a new container and mount within it, an additional NFS connection ID and UDP
connection IDs are created from the same client IP, even though you are using the same data LIF,
volume and node:
[root@centos7-docker ~]# docker run --name centos2 --rm -it --cap-add SYS_ADMIN -d
parisi/centos7-secure
16dec486692dbc1133b4c1f74c6e78aa7aab875c0aed71d0d087461a6bed8060
[root@centos7-docker ~]# docker exec -it centos2 bash
[root@16dec486692d /]# mount -o vers=3 10.193.67.237:/scripts /mnt
As a result, connection IDs can add up quickly in environments that generate a lot of NFS mounts in a
short period of time, particularly if containers are involved.
For an idea of how scaling container environments can create a potential impact on the NFS server,
consider the following scenario: If a container host starts 1,000 containers and each container makes an
NFS mount request to the same data LIF in a cluster, then the total number of connection IDs generated
on that single data LIF will range between 1,000 (if the tcp mount option is specified) and 4,000
(3,000 UDP connection IDs for mount/portmap that age out after ~60 seconds if the tcp mount option is
not specified).
If the containers mount to different data LIFs on the same node, the same per-node connection ID limits apply.
If the containers mount to different data LIFs on different nodes, then the connection IDs are distributed
across nodes, and it takes longer for the connection ID limits to be reached. The more cluster nodes and
data LIFs used, the more connection ID dispersal takes place. For two nodes, 1000 connection IDs can
divide into 500 per node. For a 4-node cluster, 1000 connection IDs can divide into 250 per node.
If you plan on using nconnect in environments that generate a large number of mounts, be aware of the
connection ID limits per node for your platform, and plan to distribute connections across multiple cluster
nodes to balance connections appropriately. In cases where a single node/data LIF is being used for NFS
mounts, connection IDs run out much faster than if multiple nodes are used. Connection ID limits should
be factored into scale discussions when architecting the solutions.
In the preceding command, XCP is creating 16 network threads per IP address specified. In this case,
both IP1 and IP2 are on the same node of the source cluster. As a result, that node gets 34 total
connection IDs established for this job (18 on the first IP specified; 16 on the second IP):
cluster::*> network connections active show -remote-ip x.x.x.y -fields cid,proto,service,remote-
ip,local-address,node
node cid vserver local-address remote-ip proto service
------------------ ---------- ------- ------------- ------------- ----- -------
cluster-01 1011516323 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516327 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516328 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516329 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516330 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516331 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516332 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516333 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516334 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516335 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516336 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516337 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516338 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516339 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516340 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516342 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516343 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516344 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516345 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516346 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516347 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516348 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516349 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516350 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516351 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516352 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516353 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516354 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516355 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516356 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516357 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516358 DEMO x.x.x.z x.x.x.y TCP nfs
cluster-01 1011516359 DEMO x.x.x.x x.x.x.y TCP nfs
cluster-01 1011516360 DEMO x.x.x.z x.x.x.y TCP nfs
34 entries were displayed.
When using XCP, try to spread the connections across multiple IP addresses on multiple nodes and be
aware that the number of parallel threads means more total network connection IDs are in use.
Viewing connection ID maximums and allocations
In ONTAP, it is possible to view how many connection IDs (CIDs) are available for a node, how many
connection IDs are currently in use, and which clients are using those connections.
To view connection ID information, note the following commands:
network connections active (admin privilege)
cluster::> network connections active ?
delete *Delete an active connection in this cluster
show Show the active connections in this cluster
show-clients Show a count of the active connections by client
show-lifs Show a count of the active connections by logical interface
show-protocols Show a count of the active connections by protocol
show-services Show a count of the active connections by service
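The per-node connection ID counters shown below are gathered with the cid statistics object at diag
privilege (the same object referenced later in this document):
cluster::> set diag
cluster::*> statistics start -object cid
cluster::*> statistics show -object cid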
The following is a sample of how those results might appear before an NFS mount is established:
Object: cid
Instance: cid
Start-time: 6/25/2021 16:54:27
End-time: 6/25/2021 16:56:37
Elapsed-time: 130s
Scope: cluster-01
Counter Value
-------------------------------- --------------------------------
alloc_failures_nomem 0
alloc_failures_reserved_toomany 0
alloc_failures_toomany 0
alloc_total 0
cid_max 115904
execs_blocked_on_cid 0
in_use 352
in_use_max 0
instance_name cid
node_name cluster-01
process_name -
reserved_cid 10526
In this output the node has a cid_max of 115904. Of those, 10526 are reserved_cid for ONTAP
system operations, which means the total number of available connection IDs for client operations is
115904 – 10526 = 105378. If a node exceeds that limit, the EMS triggers a maxCID.limit.exceeded
message.
Currently, there are 352 in_use CIDs.
After an NFS mount with nconnect=8 is established on node1, this is how the numbers change:
Object: cid
Instance: cid
Start-time: 6/25/2021 16:54:27
End-time: 6/25/2021 17:04:54
Elapsed-time: 627s
Scope: cluster-01
Counter Value
-------------------------------- --------------------------------
alloc_failures_nomem 0
alloc_failures_reserved_toomany 0
alloc_failures_toomany 0
alloc_total 14
cid_max 115904
execs_blocked_on_cid 0
in_use 360
in_use_max 0
instance_name cid
node_name cluster-01
process_name -
reserved_cid 10526
Now, instead of 352 CIDs in_use, there are 360 used. This aligns with the eight active network
connections created with the nconnect mount.
When an NFSv3 mount is issued without the tcp mount option specified, four new CIDs are in_use and
you can see it in the statistics.
Counter Value
-------------------------------- --------------------------------
alloc_failures_nomem 0
alloc_failures_reserved_toomany 0
alloc_failures_toomany 0
alloc_total 28
cid_max 115904
execs_blocked_on_cid 0
in_use 364
in_use_max 1
instance_name cid
node_name cluster-01
process_name -
reserved_cid 10526
These CIDs include the mount and portmap UDP entries seen with the network connections
active show command.
cluster::*> network connections active show -remote-ip x.x.x.x -fields cid,proto,service,remote-
ip,local-address,node
node cid vserver local-address remote-ip proto service
------------------ ---------- ------- ------------- ------------- ----- -------
cluster-01 1011516253 DEMO x.x.x.y x.x.x.x TCP nfs
cluster-01 1011516256 DEMO x.x.x.y x.x.x.x TCP nfs
cluster-01 1011516257 DEMO x.x.x.y x.x.x.x TCP nfs
cluster-01 1011516258 DEMO x.x.x.y x.x.x.x TCP nfs
cluster-01 1011516259 DEMO x.x.x.y x.x.x.x TCP nfs
cluster-01 1011516260 DEMO x.x.x.y x.x.x.x TCP nfs
cluster-01 1011516261 DEMO x.x.x.y x.x.x.x TCP nfs
cluster-01 1011516262 DEMO x.x.x.y x.x.x.x TCP nfs
cluster-01 1011516268 DEMO x.x.x.z x.x.x.x UDP port-map
cluster-01 1011516269 DEMO x.x.x.z x.x.x.x UDP mount
cluster-01 1011516270 DEMO x.x.x.z x.x.x.x UDP mount
cluster-01 1011516272 DEMO x.x.x.z x.x.x.x TCP nfs
After the UDP entries age out (after about 60 seconds), the CIDs are released for use by new connections,
and only one extra CID remains in use by the NFS connection on data LIF IP x.x.x.z.
Counter Value
-------------------------------- --------------------------------
alloc_failures_nomem 0
alloc_failures_reserved_toomany 0
alloc_failures_toomany 0
alloc_total 28
cid_max 115904
execs_blocked_on_cid 0
in_use 361
in_use_max 1
instance_name cid
node_name cluster-01
process_name -
reserved_cid 10526
For example, you can specify that the client must use a port outside of the reserved port range with the
mount option noresvport (resvport is the default if not specified and uses source ports between 1-
1024). When you do this and mount-rootonly is enabled for the NFS SVM, the mount fails:
cluster::*> nfs show -vserver DEMO -fields mount-rootonly,nfs-rootonly
vserver mount-rootonly nfs-rootonly
------- -------------- ------------
DEMO enabled disabled
# mount -o noresvport,vers=3 demo:/scripts /mnt/client1
mount.nfs: access denied by server while mounting demo:/scripts
From a packet trace, you can see that the client source port is outside of the allowed range (36643):
129 x.x.x.x x.x.x.y MOUNT 138 V3 MNT Call (Reply In 130) /scripts
User Datagram Protocol, Src Port: 36643, Dst Port: 635
A packet trace shows that the source port is within the 1024 range (703):
44 x.x.x.x x.x.x.y MOUNT 138 V3 MNT Call (Reply In 45) /scripts
User Datagram Protocol, Src Port: 703, Dst Port: 635
45 x.x.x.y x.x.x.x MOUNT 150 V3 MNT Reply (Call In 44)
The trace shows that the source port is outside of the 1024 range (58323)
101 x.x.x.x x.x.x.y MOUNT 138 V3 MNT Call (Reply In 102) /scripts
User Datagram Protocol, Src Port: 58323, Dst Port: 635
102 x.x.x.y x.x.x.x MOUNT 150 V3 MNT Reply (Call In 101)
Because NFSv4.x mounts do not use the ancillary mount protocols, the mount-rootonly option does not
factor into those operations and only affects NFSv3 mounts.
Clients can also specify the mount option noresvport to use non-privileged source ports; NFS mounts
default to the resvport mount option when it is not specified.
From ONTAP, you can see there are three connection IDs generated from the failed mount request that
expire after 60 seconds. In a mount storm, 60 seconds can be an eternity:
cluster::*> network connections active show -remote-ip x.x.x.x -fields remote-port,cid,service
node cid vserver remote-port service
------------------ ---------- ------- ----------- --------
cluster-02 2328843541 DEMO 35470 port-map
cluster-02 2328843542 DEMO 60414 mount
cluster-02 2328843543 DEMO 33649 mount
3 entries were displayed.
TCP uses acknowledgements in its conversations so that ONTAP does not need to maintain connection
IDs after the mount request is successful. As a result, using -o tcp for your NFS mount option means
that ports in the 1-1024 range are freed up faster, preventing most of the cases in which port exhaustion
for mounts might occur.
A data LIF migration consolidates NAS connections onto a single node and might contribute to faster
resource exhaustion. The same is true for storage failovers (planned or unplanned), because the failed
node’s data LIF migrates to the surviving partner and that node must maintain all of the NAS connections.
If a data LIF migration occurs, resolve the issue that caused the migration and revert the data LIF back to
its home node.
Monitor connection ID usage in the cluster by using statistics start -object cid or the
equivalent REST API functions. Try to keep connection ID in_use values to approximately 50% of
the cid_max value for the node in case of storage failover (roughly 50,000 in_use connection IDs
per node). You can do this by adding more nodes with new data LIFs to the cluster.
Monitor the EMS for the events maxCID.limit.exceeded, maxCID.threshold.exceeded, and
nblade.execsOverLimit, and contact Lenovo Support if these events are generated.
Multiplexing of NFS connections by way of more data LIFs per node or through nconnect can
increase performance for some workloads, but it can also use more available resources per storage
node (such as connection IDs and exec contexts). If you use multiplexing, be aware of the potential
side effects (as covered in “Impact of nconnect on total connections” and “Exec context throttling”).
If possible, use the tcp mount option for NFS mounts (-o tcp). This reduces the total number of
connection IDs generated per mount/unmount operation and helps to prevent resource exhaustion
(connection IDs and NFS mount ports) in mount storm scenarios.
If you require more than 1,024 available incoming NFS mount ports (for instance, if 2,000 clients are
mounting at the same time), consider disabling the NFS server option mount-rootonly, using the
noresvport mount option on the NFS clients/automounters, and using the mount option tcp to reduce
the number of connection IDs generated per mount; a sample follows this list. (UDP mount connections
remain in cache for up to 60 seconds; TCP mount connections are removed on client ACK.)
NFSv4.x is not susceptible to issues with UDP and the mount protocol but has its own challenges, as
covered in “NFSv4.x considerations.”
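The following is a sketch of the mount-rootonly recommendation above (the SVM name, export path, and
mount point are examples):
cluster::> vserver nfs modify -vserver DEMO -mount-rootonly disabled
# mount -o vers=3,tcp,noresvport demo:/scripts /mnt/client1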
For more information, including details on the performance impact, see “Network connection concurrency
and TCP slots: NFSv3.”
Mount options
NFS mount option recommendations depend solely on the workload and application being used. There is
general advice for specific mount options, but the decision on which NFS options to use is dependent on
the client OS administrators and application vendor recommendations. There is no single correct
recommendation for mount options. The following sections cover only a subset of NFS mount options. For
a complete list of supported NFS mount options, use man nfs on your NFS client.
Wsize/rsize
The mount options wsize and rsize determine how much data is sent between the NFS client and server
for each read or write request. These options might help optimize performance for specific applications,
but they should be set according to application vendor best practices; what is best for one application
might not be best for other applications.
Newer NFS clients autonegotiate the wsize and rsize values to the -tcp-max-xfer-size value set on the
ONTAP NFS server if the mount command does not explicitly set them. ONTAP defaults
-tcp-max-xfer-size to 64K, and it can be set to a maximum of 1MB.
The general recommendation for -tcp-max-xfer-size is to increase that value in ONTAP to
262144 (256K) and then specify explicit mount options as the applications require it.
For examples of some performance testing with different workload types and different wsize/rsize values,
see “Performance examples for different TCP maximum transfer window sizes.”
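For example, the ONTAP value is changed at advanced privilege, and the client can then either autonegotiate
or explicitly request the transfer sizes at mount time (the SVM name, export path, and mount point are
examples). Existing mounts generally keep their originally negotiated values until they are remounted.
cluster::> set advanced
cluster::*> vserver nfs modify -vserver DEMO -tcp-max-xfer-size 262144
# mount -o vers=3,rsize=262144,wsize=262144 demo:/scripts /mnt/client1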
NFS Readahead
Readahead in NFS is a way for NFS clients to predictively request blocks of a file to improve performance
and throughput for sequential I/O workloads. Until recently, the readahead value for NFS mounts was set
to 15 times the rsize value of the mount. For example, if you set rsize to 64KiB, then readahead size
would be 960KiB.
In modern NFS clients (such as RHEL 8.3 and later or Ubuntu 18.04 and later), the readahead value is no
longer determined by the mount rsize but is globally defaulted to 128KiB. This can cause severe negative
performance consequences on reads. In newer Linux client versions that use the default 128KiB
readahead value, it is recommended to set the value to a higher limit. Testing read performance with
different values is the preferred method, but internal Lenovo testing has found that the value can be
safely set to as high as 15360KiB for sequential read workloads.
For more information about setting and viewing readahead values, consult with your client OS vendor.
For example, this SUSE KB describes readahead for those OS client flavors: Tuning NFS client read
ahead on SLE 10 and 11.
For a CentOS/RedHat client, the process is similar.
# cat /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)
Find the BDEV information for the mount with this command (BDEV format is N:NN):
# cat /proc/self/mountinfo | grep /mnt/client1
125 39 0:46 / /mnt/client1 rw,relatime shared:107 - nfs DEMO:/files
rw,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mount
addr=10.193.67.219,mountvers=3,mountport=635,mountproto=udp,local_lock=none,addr=10.193.67.219
Use the BDEV information to find the readahead value (BDEV for that mount point is 0:46):
# cat /sys/class/bdi/0:46/read_ahead_kb
15360
In the above example, readahead is set to 15360KiB (15x the rsize) for the /mnt/client1 mount and
rsize is set to 1MB on my CentOS 7.8 client.
On CentOS 8.3, this is the value my mount is set to by default:
# cat /sys/class/bdi/0:50/read_ahead_kb
128
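To raise the value for an existing mount, you can write to the same sysfs file. The BDEV ID (0:46 here)
comes from the earlier example and differs per mount, and the setting does not persist across remounts or
reboots:
# echo 15360 > /sys/class/bdi/0:46/read_ahead_kb
# cat /sys/class/bdi/0:46/read_ahead_kb
15360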
In these instances, there is a known lag in picking up new content, and the application still works with
potentially out-of-date data. For those scenarios, nocto and actimeo can be used to control the time
period in which out-of-date data can be tolerated. For example, in EDA with tools, libraries, and other
static content, actimeo=600 works well because this data is typically updated infrequently. For small web
hosting where clients need to see their data updates sooner as they edit their sites, actimeo=10
might be acceptable. For large-scale web sites where content is pushed to multiple file systems,
actimeo=60 might be more effective. As always, test with your individual environments.
Using these mount options reduces the workload to storage significantly in these instances (for example, a
recent EDA experience reduced IOPS to the tool volume from >150K to ~6K), and applications can run
significantly faster because they can trust the data in memory rather than needing to query the NFS
storage. This also helps reduce overall CPU % and load on the ONTAP nodes.
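As a sketch, a mount for a mostly static tool or library volume might combine these options (the export path,
mount point, and timeout value are examples; the individual options are covered in the following sections):
# mount -o vers=3,nocto,actimeo=600 demo:/tools /mnt/tools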
Actimeo
The actimeo mount option controls the attribute cache timeout on NFS clients. The actimeo option covers
the entire range of available attribute caches, including:
acregmin=n
The minimum time (in seconds) that the NFS client caches attributes of a regular file before it
requests fresh attribute information from a server. If this option is not specified, the NFS
client uses a 3-second minimum.
acregmax=n
The maximum time (in seconds) that the NFS client caches attributes of a regular file before it
requests fresh attribute information from a server. If this option is not specified, the NFS
client uses a 60-second maximum.
acdirmin=n
The minimum time (in seconds) that the NFS client caches attributes of a directory before it
requests fresh attribute information from a server. If this option is not specified, the NFS
client uses a 30-second minimum.
acdirmax=n
The maximum time (in seconds) that the NFS client caches attributes of a directory before it
requests fresh attribute information from a server. If this option is not specified, the NFS
client uses a 60-second maximum.
Attribute caching provides some relief on networks by reducing the number of metadata calls. This also
helps reduce latency for some workloads, because these metadata operations can now occur locally on
the client. Attribute caching generally has no effect on the number of overall operations, unless all
operations to the storage were metadata, specifically ACCESS calls.
For example, in our Customer Proof of Concept (CPOC) labs, actimeo was set to 10 minutes (600
seconds), and latency was cut roughly in half with an EDA workload generated by vdbench (from ~2.08ms
to ~1.05ms). Figure 13 shows the default actimeo latency and Figure 14 shows the actimeo=600 latency.
Figure 13) Default actimeo latency — vdbench.
Figure 14) Actimeo=600 latency — vdbench.
The downside of setting the actimeo value too high is that attributes that change might not be reflected
properly until the cache timeout occurs, which could result in unpredictable access issues.
The recommendation for attribute caching is to leave the defaults, unless otherwise directed by
the application vendor or if testing shows a vast improvement in performance.
Nocto
Nocto stands for no close-to-open, which means that a file can close before a write has completed in
order to save time. What this means in NFS environments is that other clients that have a file open for
reading won’t get consistent updates to that file. By default, the nocto option is not set on NFS mounts,
which means that all files will wait to finish writes before allowing a close.
The nocto option is used primarily to increase raw performance. For example, in the same vdbench tests
run in our Customer Proof of Concept labs, the nocto mount option reduced latency by an additional
0.35ms to 0.7ms, as shown in Figure 15.
Figure 15) Actimeo=600, nocto latency — vdbench.
The recommendation for the nocto option is to use only with read-heavy/read-mostly workloads or
workloads where data is not shared between multiple systems (such as a single-writer workload).
Clientaddr
By default, the clientaddr mount option is set automatically by using the NFS client IP address. However,
in some cases, this option might need to be specified in NFS mounts.
Two scenarios where it might be necessary to specify clientaddr would be:
Multi-NIC clients might need to specify this option to ensure that NFS uses the desired IP address
for connectivity.
NFSv4.x clients might need to specify this option if two clients with the same host name (but different
IP addresses) try to access NFS exports in the same SVM. NFSv4.x sends a client ID to the NFS
server based on the host name. ONTAP responds with CLID_IN_USE and prevents the second client
from mounting if it uses the same client ID. Specifying the clientaddr option can force the client to
increment the client ID on subsequent mount attempts.
Note: In most instances, the clientaddr option does not need to be specified.
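When it is needed, the option is specified at mount time; a minimal sketch (the IP address, export path, and
mount point are placeholders):
# mount -o vers=4.1,clientaddr=x.x.x.z demo:/data /mnt/data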
Nconnect
The nconnect mount option is relatively new and is available only on newer Linux clients. Be sure to
check the OS vendor documentation to determine whether the option is supported in your kernel.
The purpose of nconnect is to provide multiple TCP transport connections per mount point on a client.
This helps increase parallelism and performance for NFS mounts.
ONTAP 9.8 and later officially supports the use of nconnect with NFS mounts, provided the NFS client
also supports it; no ONTAP option is needed. To use nconnect, verify that your client version provides it
and use ONTAP 9.8 or later.
Nconnect is not recommended for use with NFSv4.0. NFSv3, NFSv4.1, and NFSv4.2 should work fine
with nconnect.
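A typical nconnect mount looks like the following sketch (the export path, mount point, and thread count
are examples):
# mount -o vers=3,nconnect=8 demo:/data /mnt/data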
Table 15 shows results from a single Ubuntu client using different nconnect thread values.
Nconnect value    Threads per process    Throughput    Difference
2                 128                    2.4GB/s       +66%
4                 128                    3.9GB/s       +169%
8                 256                    4.07GB/s      +181%
The recommendation for using nconnect depends on client OS and application needs. Testing
with this new option is highly recommended before deploying in production.
To determine whether nconnect is indeed working in your environment, you can verify a few things.
When nconnect is not being used, a single CID is established per client mount. You can verify those cids
by running the following command:
cluster::> network connections active show -node [nodes] -service nfs* -remote-host [hostname]
For example, this is the output from an active NFS connection without nconnect:
cluster::> network connections active show -node * -service nfs* -remote-host centos83-
perf.ntap.local
Vserver Interface Remote
Name Name:Local Port Host:Port Protocol/Service
---------- ---------------------- ---------------------------- ----------------
Node: node1
DEMO data1:2049 centos83-perf.ntap.local:1013
TCP/nfs
When nconnect is in use, more cids per mount are present. In this example, we used nconnect=8.
cluster::> network connections active show -node * -service nfs* -remote-host centos83-
perf.ntap.local
Vserver Interface Remote
Name Name:Local Port Host:Port Protocol/Service
---------- ---------------------- ---------------------------- ----------------
Node: node1
DEMO data1:2049 centos83-perf.ntap.local:669 TCP/nfs
DEMO data1:2049 centos83-perf.ntap.local:875 TCP/nfs
DEMO data1:2049 centos83-perf.ntap.local:765 TCP/nfs
DEMO data1:2049 centos83-perf.ntap.local:750 TCP/nfs
DEMO data1:2049 centos83-perf.ntap.local:779 TCP/nfs
DEMO data1:2049 centos83-perf.ntap.local:773 TCP/nfs
DEMO data1:2049 centos83-perf.ntap.local:809 TCP/nfs
DEMO data1:2049 centos83-perf.ntap.local:897 TCP/nfs
Another way to determine whether nconnect is being used is through a statistics capture for the CID
object. You can start the statistics for that object by running the following command:
cluster::> set diag
cluster::*> statistics start -object cid
When that object runs, it tracks the number of total allocated cids (alloc_total).
For example, this is the number of alloc_total for a mount without nconnect:
cluster::*> statistics show -object cid -counter alloc_total
Counter Value
-------------------------------- --------------------------------
alloc_total 11
Counter Value
-------------------------------- --------------------------------
alloc_total 16
Counter Value
-------------------------------- --------------------------------
alloc_total 24
Hard/soft
The hard or soft mount options specify whether the program accessing a file on an NFS mount should
stop and wait (hard) for the server to come back online if the NFS server is unavailable, or whether it
should report an error (soft).
If hard is specified, processes directed to an NFS mount that is unavailable cannot be terminated unless
the intr option is also specified.
If soft is specified, the timeo=<value> option can be specified, where <value> is the number of seconds
before an error is reported.
For business-critical NFS exports, Lenovo recommends using hard mounts.
Intr/nointr
The intr option allows NFS processes to be interrupted when a mount is specified as a hard mount. This
option is deprecated in newer clients (such as RHEL 6.4 and later), where the behavior is hardcoded to
nointr; kill -9 (SIGKILL) is the only way to interrupt a hung NFS process in newer kernels.
For business-critical NFS exports, Lenovo recommends using intr with hard mounts with NFS
clients that support it.
Additionally, to view NFSv4.x locks, run the following command from diag privilege:
cluster::*> vserver locks nfsv4 show
NFS events cover a wide array of subsystems, from general NFS events (such as exports or other errors)
to name service and authentication issues. Event names can be filtered by portions of the event message
name in the CLI and in the GUI.
cluster::*> event log show -message-name nblade.nfs*
Time Node Severity Event
------------------- ---------------- ------------- ---------------------------
5/11/2020 18:37:50 node2
NOTICE nblade.nfs4SequenceInvalid: NFS client (IP:
x.x.x.x) sent sequence# 1, but server expected sequence# 2. Server error: OLD_STATEID.
5/11/2020 18:35:52 node1
NOTICE nblade.nfs4SequenceInvalid: NFS client (IP:
x.x.x.y) sent sequence# 1, but server expected sequence# 2. Server error: OLD_STATEID.
5/11/2020 18:34:23 node2
NOTICE nblade.nfs4SequenceInvalid: NFS client (IP:
x.x.x.x) sent sequence# 1, but server expected sequence# 2. Server error: OLD_STATEID.
Figure 17) Filtering events in ThinkSystem Storage Manager UI.
EMS events are triggered with a severity level, which lets you know which messages are important (such
as ERROR and EMERGENCY), and which messages are simply informative (such as INFORMATIONAL
and NOTICE). In general, DEBUG-level messages can be ignored unless there are other noticeable
issues occurring in the environment. If you see messages ERROR, ALERT or EMERGENCY, it might be
worth opening a support case.
You can filter based on severity, message name, and other variables. The following event log commands
can show NFS/multiprotocol NAS-relevant EMS events:
cluster::> event log show -message-name dns*
cluster::> event log show -message-name *export*
cluster::> event log show -message-name ldap*
cluster::> event log show -message-name mgmt.nfs*
cluster::> event log show -message-name nameserv*
cluster::> event log show -message-name nblade*
cluster::> event log show -message-name netgroup*
cluster::> event log show -message-name *nfs*
cluster::> event log show -message-name secd*
NFS statistics
Statistics for NFS are collected within the ONTAP Counter Manager archives and can be viewed by using
ThinkSystem Storage Manager for up to a year.
If individual performance counters for NFS are desired, use the statistics start command to enable
captures for a specific counter object or for multiple objects. After the statistics counters are started, they
will run until you stop them. These are most useful to run during periods where performance might be
suffering, or if you wish to see the workload trends for a specific volume.
The output of these statistics shows masks for specific VM types (Table 16).
VM Type            Mask
VMware ESX/ESXi    1
Citrix Xen         2
Red Hat KVM        4
If more than one VM application is being used, then the masks are added together to determine which
ones are in use. For example, if ESX/ESXi and Red Hat KVM are in use, then the masks would be 1 + 4
= 5.
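As a sketch, per-protocol NFS counters can be captured with a named sample and then viewed and stopped
(object and counter names can vary slightly between ONTAP releases; the sample name is an example):
cluster::> set advanced
cluster::*> statistics start -object nfsv3 -sample-id nfsv3_sample
cluster::*> statistics show -sample-id nfsv3_sample
cluster::*> statistics stop -sample-id nfsv3_sample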
Top clients
ONTAP provides the ability to track the top NAS clients writing data to a cluster by using the new ONTAP
9 feature called top clients. This feature tracks the overall incoming operations and lists the client IP
address, NAS protocol being used, total IOPS, node being accessed, and the SVM to which the client is
connecting.
You can also watch the top clients in real time by using the command line with the admin privilege
command statistics top client show. This command allows you to specify a minimum of 30-
second intervals, the number of iterations to show, as well as the maximum number of clients that should
be displayed. In the following example, a Python file create script was run from two clients to a Lenovo
FlexGroup volume over NFSv3 to show an example of what to expect from the command’s output.
cluster::> statistics top client show -interval 30 -iterations 5 -max 10
By default, the CLI orders by IOPS. It’s also possible to order by throughput by using the -sort-key
option.
For example:
cluster::> statistics top client show -interval 30 -iterations 5 -max 10 -sort-key write_data
A similar view is available for the busiest files by using the statistics top file show command, which
accepts the same interval, iteration, and sort-key options:
cluster::> statistics top file show -interval 30 -iterations 1 -max 10 -sort-key write_data
If you attempt to use an unsupported counter (such as latency) with sort-key, the following is displayed:
cluster::*> statistics top file show -interval 30 -iterations 1 -max 10 -sort-key latency
Error: command failed: Unsupported sort-key counter: only "selector" counters are valid for
statistically tracked objects. For a list of valid sort-key counters, use this diagnostic command:
"statistics catalog counter show -object top_file -properties *selector*"
Viewing storepool allocations
In some cases, you might need to view NFSv4.x storepool allocations, usually if you suspect a
performance problem is due to running out of storepool resources or if you’re unable to access NFSv4
mounts.
When storepool exhaustion occurs, an EMS message (Nblade.nfsV4PoolExhaust) is triggered.
To view storepool objects, run the following commands:
cluster::> set diag
cluster::*> statistics start -object nfsv4_diag
cluster::*> statistics show -object nfsv4_diag -counter *storePool_* -raw
If you suspect an issue with storepool exhaustion, contact Lenovo support. In general, run the latest
available ONTAP releases when using NFSv4.x.
Additionally, it is possible to view network connections in a LISTEN state with network connections
listening show.
ONTAP provides a way to view the NFS clients that are connected to the cluster. The use cases vary, but
usually fall under one of these scenarios:
The need to discover who’s using a volume before performing migrations, cutovers, and so on
Troubleshooting issues
Load distribution
To view connected NFS clients:
cluster::> nfs connected-clients show ?
[ -instance | -fields <fieldname>, ... ]
[[-node] <nodename>] Node Name
[[-vserver] <vserver>] Vserver
[[-data-lif-ip] <IP Address>] Data LIF IP Address
[[-client-ip] <IP Address>] Client IP Address
[[-volume] <volume name>] Volume Accessed
[[-protocol] <Client Access Protocol>] Protocol Version
[ -idle-time <[<integer>d][<integer>h][<integer>m][<integer>s]> ] Idle Time (Sec)
[ -local-reqs <integer> ] Number of Local Reqs
[ -remote-reqs <integer> ] Number of Remote Reqs
ONTAP 9.8 enables you to see these clients from ThinkSystem Storage Manager. Either click the NFS
Clients link in the dashboard or navigate to Hosts > NFS Clients in the left menu.
Figure 18) Viewing NFS client to volume mappings in ThinkSystem Storage Manager
Umask
In NFS operations, permissions can be controlled through mode bits, which leverage numerical attributes
to determine file and folder access. These mode bits determine read, write, execute, and special
attributes. Numerically, these are represented as:
Read = 4
Write = 2
Execute = 1
Total permissions are determined by adding or subtracting a combination of the preceding.
For example:
4 + 2 + 1 = 7 (can do everything)
4 + 2 = 6 (rw) and so on…
For more information about UNIX permissions, see UNIX Permissions Help.
Umask is a functionality that allows an administrator to restrict the level of permissions allowed to a client.
By default, the umask for most clients is set to 0022, which means that files created from that client are
assigned that umask. The umask is subtracted from the base permissions of the object. If a volume has
0777 permissions and is mounted using NFS to a client with a umask of 0022, objects written from the
client to that volume have 0755 access (0777 – 0022).
# umask
0022
# umask -S
u=rwx,g=rx,o=rx
However, many operating systems do not allow files to be created with execute permissions, but they do
allow folders to have the correct permissions. Thus, files created with a umask of 0022 might end up with
permissions of 0644.
The following is an example using RHEL 6.5:
# umask
0022
# cd /cdot
# mkdir umask_dir
# ls -la | grep umask_dir
drwxr-xr-x. 2 root root 4096 Apr 23 14:39 umask_dir
# touch umask_file
# ls -la | grep umask_file
-rw-r--r--. 1 root root 0 Apr 23 14:39 umask_file
In some cases, when you list a file with numeric IDs, you find that the owner:group is 65534.
# ls -lan | grep newfile
-rwxrwxrwx 1 65534 65534 0 May 19 13:30 newfile.txt
The user 65534 on most Linux clients is nfsnobody, but in ONTAP, that user is pcuser.
cluster::*> unix-user show -vserver DEMO -id 65534
User User Group Full
Vserver Name ID ID Name
-------------- --------------- ------ ------ --------------------------------
DEMO pcuser 65534 65534
When you look at the file permissions from the ONTAP cluster, you might see that the UNIX owner is
indeed 65534, but that there are also Windows ACLs and owners that are different.
cluster::*> vserver security file-directory show -vserver DEMO -path /data/newfile.txt
Vserver: DEMO
File Path: /data/newfile.txt
File Inode Number: 7088
Security Style: ntfs
Effective Style: ntfs
DOS Attributes: 20
DOS Attributes in Text: ---A----
Expanded Dos Attributes: -
UNIX User Id: 65534
UNIX Group Id: 65534
UNIX Mode Bits: 777
UNIX Mode Bits in Text: rwxrwxrwx
ACLs: NTFS Security Descriptor
Control:0x8004
Owner:NTAP\ntfs
Group:NTAP\DomainUsers
DACL - ACEs
ALLOW-Everyone-0x1f01ff-(Inherited)
When nfsnobody or 65534 is displayed in NFS listings, one of two things is most likely
occurring:
The volume that is being exported to NFS clients is also used by Windows SMB clients and the
Windows users writing to the shares don’t map to valid UNIX users and/or groups.
The volume that is being exported to NFS clients has the anonymous user set to 65534 and
something is causing the NFS user to squash to the anonymous user. For more information about
squashing, see “The anon user.”
You can view the Windows-user-to-UNIX-user mapping by running the following command in Advanced
Privilege:
cluster::*> access-check name-mapping show -vserver DEMO -direction win-unix -name ntfs
'ntfs' maps to 'pcuser'
cluster::*> access-check name-mapping show -vserver DEMO -direction win-unix -name prof1
'prof1' maps to 'prof1'
NFSv4.x: nobody:nobody
One of the most common issues seen with an NFSv4.x configuration is when a file or folder is shown in a
listing using ls as being owned by the user:group combination of nobody:nobody.
For example:
sh-4.2$ ls -la | grep prof1-file
-rw-r--r-- 1 nobody nobody 0 Apr 24 13:25 prof1-file
On the cluster (and using NFSv3), that file ownership appears to have a proper UID/GID.
cluster::*> vserver security file-directory show -vserver DEMO -path /home/prof1/prof1-file
Vserver: DEMO
File Path: /home/prof1/prof1-file
File Inode Number: 9996
Security Style: unix
Effective Style: unix
DOS Attributes: 20
DOS Attributes in Text: ---A----
Expanded Dos Attributes: -
UNIX User Id: 1002
UNIX Group Id: 10002
UNIX Mode Bits: 644
UNIX Mode Bits in Text: rw-r--r--
ACLs: -
In some instances, the file might show the correct owner, but nobody as the group.
sh-4.2$ ls -la | grep newfile1
-rw-r--r-- 1 prof1 nobody 0 Oct 9 2019 newfile1
With NFSv4.x, the nobody user is the default user defined by the idmapd.conf file and can be defined
as any user you want to use.
# cat /etc/idmapd.conf | grep nobody
#Nobody-User = nobody
#Nobody-Group = nobody
The client and server must both agree that a user is indeed who they are claiming to be, so the following
should be checked to ensure that the user that the client sees has the same information as the user that
ONTAP sees.
NFSv4.x ID domain (Client: idmapd.conf file; ONTAP: -v4-id-domain option)
User name and numeric IDs (name service switch configuration – client: nsswitch.conf and/or
local passwd and group files; ONTAP: ns-switch commands)
Group name and numeric IDs (Name service switch configuration – client: nsswitch.conf and/or
local passwd and group files; ONTAP: ns-switch commands)
In almost all cases, if you see nobody in user and group listings from clients but ONTAP reports the
correct user and group information (through vserver security file-directory show), the issue is user or
group name-to-ID-domain translation.
You can also make use of the ONTAP option -v4-numeric-ids, covered in “Bypassing the name string
— Numeric IDs.”
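A minimal sketch of checking the two sides follows (the domain value is an example): the client domain is
read from /etc/idmapd.conf and the client ID mapping keyring can be cleared after changes, and the ONTAP
SVM setting is compared to it.
# grep -i domain /etc/idmapd.conf
Domain = example.com
# nfsidmap -c
cluster::> vserver nfs show -vserver DEMO -fields v4-id-domain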
In addition to using /var/log/messages to find an issue with NFSv4 IDs, you can use the nfsidmap
-l command on the NFS client to view which usernames have properly mapped to the NFSv4 domain.
For example, this is output of the command after a user that exists in both the client and ONTAP SVM
accesses an NFSv4.x mount:
# nfsidmap -l
4 .id_resolver keys found:
gid:[email protected]
uid:[email protected]
gid:[email protected]
uid:[email protected]
When a user that does not map properly into the NFSv4 ID domain (in this case, lenovo-user) tries to
access the same mount and touches a file, they get assigned nobody:nobody, as expected.
# su lenovo-user
sh-4.2$ id
uid=482600012(lenovo-user), 2000(secondary)
sh-4.2$ cd /mnt/nfs4/
sh-4.2$ touch newfile
sh-4.2$ ls -la
total 16
drwxrwxrwx 5 root root 4096 Jan 14 17:13 .
drwxr-xr-x. 8 root root 81 Jan 14 10:02 ..
-rw-r--r-- 1 nobody nobody 0 Jan 14 17:13 newfile
drwxrwxrwx 2 root root 4096 Jan 13 13:20 qtree1
drwxrwxrwx 2 root root 4096 Jan 13 13:13 qtree2
drwxr-xr-x 2 nfs4 daemon 4096 Jan 11 14:30 testdir
The nfsidmap -l output shows the user pcuser in the display, but not lenovo-user; this is the
anonymous user in our export-policy rule (65534).
# nfsidmap -l
6 .id_resolver keys found:
gid:[email protected]
uid:[email protected]
gid:[email protected]
uid:[email protected]
gid:[email protected]
uid:[email protected]
Viewing and managing NFS credentials
In ONTAP 9.4, a global cache for name services was implemented to offer better performance, reliability,
resilience, and supportability for NAS credentials and name service server management.
Part of those changes was the implementation of the NFS credential cache, which stores NFS user and
group information in ONTAP when NFS exports are accessed.
These caches can be viewed and managed through the Advanced Privilege nfs credentials
commands.
cluster::*> nfs credentials ?
count *Count credentials cached by NFS
flush *Flush credentials cached by NFS
show *Show credentials cached by NFS
Cache entries will populate the node where the TCP connection for the NFS mount exists. This
information can be seen through the following command on the cluster:
cluster::*> nfs connected-clients show -vserver DEMO -client-ip x.x.x.x -fields data-lif-ip -
volume scripts
node vserver data-lif-ip client-ip volume protocol
------------------ ------- ------------- ------------- ------- --------
Node1 DEMO x.x.x.y x.x.x.x scripts nfs3
From the command above, we know the client IP x.x.x.x is connected to a data LIF on node1. That
helps us narrow down which node to focus on for cache entries.
The nfs credentials count command allows you to see how many credentials are currently stored
in the NFS credential cache. This can be useful in understanding the impact of clearing the cache.
cluster::*> nfs credentials count -node node1
Number of credentials cached by NFS on node "node1": 4
If a user traverses into an ONTAP NFS export, user IDs, group IDs, and so on are all added to the NFS
credential cache. For example, we have a user named prof1.
# id prof1
uid=1102(prof1) gid=10002(ProfGroup) groups=10002(ProfGroup),10000(Domain
Users),1202(group2),1101(group1),1220(sharedgroup),1203(group3)
That user has eight different entries – a numeric UID and seven group memberships. Then, the user
prof1 accesses an NFS export. Our credential cache increases by eight.
cluster::*> nfs credentials count -node node1
Number of credentials cached by NFS on node "node1": 12
That count is for the entire node – not just per SVM. If you have multiple SVMs in your environment, the
count might not be useful for troubleshooting.
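The individual entries can then be inspected. A likely form of the command, based on the nfs credentials
show usage elsewhere in this section, is:
cluster::*> nfs credentials show -node node1 -vserver DEMO -unix-user-id 1102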
Credentials
-----------
Node: node1
Vserver: DEMO
Client IP: -
Flags: unix-extended-creds-present, id-name-mapping-present
Time since Last Refresh: 52s
Time since Last Access: 44s
Hit Count: 4
UNIX Credentials:
Flags: 1
Domain ID: 0
UID: 1102
Primary GID: 10002
Additional GIDs: 10002
10000
1101
1202
1203
1220
Windows Credentials:
Flags: -
User SID: -
Primary Group SID: -
Domain SIDs: -
ID-Name Information:
Type: user
ID: 1102
Name: prof1
You can also see an entry for the user’s primary group.
cluster::*> nfs credentials show -node node1 -vserver DEMO -unix-group-name ProfGroup
Credentials
-----------
Node: node1
Vserver: DEMO
Client IP: -
Flags: id-name-mapping-present
Time since Last Refresh: 64s
Time since Last Access: 6s
Hit Count: 2
UNIX Credentials:
Flags: -
Domain ID: -
UID: -
Primary GID: -
Additional GIDs: -
Windows Credentials:
Flags: -
User SID: -
Primary Group SID: -
Domain SIDs: -
ID-Name Information:
Type: group
ID: 10002
Name: ProfGroup
You can also view credential cache entries for users and groups down to the client IP that tried the
access.
cluster::*> nfs credentials show -node node1 -vserver DEMO -client-ip x.x.x.x -unix-user-id 1102
Credentials
-----------
Node: node1
Vserver: DEMO
Client IP: x.x.x.x
Flags: unix-extended-creds-present, id-name-mapping-present
Time since Last Refresh: 35s
Time since Last Access: 34s
Hit Count: 2
Reference Count: 4
Result of Last Update Attempt: no error
UNIX Credentials:
Flags: 1
Domain ID: 0
UID: 1102
Primary GID: 10002
Additional GIDs: 10002
10000
1101
1202
1203
1220
Windows Credentials:
Flags: -
User SID: -
Primary Group SID: -
Domain SIDs: -
ID-Name Information:
Type: user
ID: 1102
Name: prof1
The credential cache also keeps negative entries (entries that could not be resolved) in cache. Negative
entries occur when ONTAP cannot resolve the numeric UID to a valid user. In this case, the UID 1236
cannot be resolved by ONTAP, but the user attempted to access the NFS export.
# su cifsuser
bash-4.2$ cd /scripts/
bash: cd: /scripts/: Permission denied
bash-4.2$ id
uid=1236(cifsuser) gid=1236(cifsuser) groups=1236(cifsuser)
cluster::*> nfs credentials show -node node1 -vserver DEMO -unix-user-id 1236
Credentials
-----------
Node: node1
Vserver: DEMO
Client IP: -
Flags: no-unix-extended-creds, no-id-name-mapping
Time since Last Refresh: 33s
Time since Last Access: 7s
Hit Count: 15
UNIX Credentials:
Flags: -
Domain ID: -
UID: -
Primary GID: -
Additional GIDs: -
Windows Credentials:
Flags: -
User SID: -
Primary Group SID: -
Domain SIDs: -
ID-Name Information:
Type: -
ID: -
Name: -
UNIX Credentials:
Flags: 1
Domain ID: 0
UID: 1102
Primary GID: 10002
Additional GIDs: 10002
10000
1101
1202
1203
1220
Windows Credentials:
Flags: -
User SID: -
Primary Group SID: -
Domain SIDs: -
ID-Name Information:
Type: user
ID: 1102
Name: prof1
If the user accesses an export that has NTFS permissions/security style, we’d see the flag cifs-creds-
present, as well as the domain SID information under Windows Credentials:
Credentials
-----------
Node: node1
Vserver: DEMO
Client IP: x.x.x.x
Flags: ip-qualifier-configured, unix-extended-creds-present, cifs-creds-
present
Time since Last Refresh: 19s
Time since Last Access: 1s
Hit Count: 9
Reference Count: 2
Result of Last Update Attempt: no error
UNIX Credentials:
Flags: 0
Domain ID: 0
UID: 1102
Primary GID: 10002
Additional GIDs: 10002
10000
1101
1202
1203
1220
Windows Credentials:
Flags: 8320
User SID: S-1-5-21-3552729481-4032800560-2279794651-1214
Primary Group SID: S-1-5-21-3552729481-4032800560-2279794651-513
Domain SIDs: S-1-5-21-3552729481-4032800560-2279794651
S-1-18
S-1-1
S-1-5
S-1-5-32
ID-Name Information:
Type: -
ID: -
Name: -
Cache entries maintain the time since last access/refresh (as seen in the show command). If an entry
stays idle for a period of time, it is eventually removed from the cache. If the entry is active, it is refreshed
and stays in cache.
These values can be modified to longer or shorter timeout values, depending on the desired effects:
Longer cache timeout values reduce network load and provide faster lookups of users but can
produce more false positives/false negatives as the cache entries are not always in sync with name
services.
Shorter cache timeout values increase load on the network and name servers and can add some
latency to name lookups (depending on name service source) but offer more accurate and up-to-date
entries.
The best practice is to leave the values as is. If you need to change the values, be sure to monitor the
results and adjust as needed.
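If you do decide to adjust the timeouts, the following is a minimal sketch of what that might look like. The -cached-cred-positive-ttl and -cached-cred-negative-ttl NFS server options and the millisecond values shown here are assumptions based on recent ONTAP releases; verify the option names and units in your version before changing them.
cluster::> set -privilege advanced
cluster::*> vserver nfs show -vserver DEMO -fields cached-cred-positive-ttl,cached-cred-negative-ttl
cluster::*> vserver nfs modify -vserver DEMO -cached-cred-positive-ttl 86400000 -cached-cred-negative-ttl 7200000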
Flushing the NFS credential cache
In cases where a user has been added or removed from a group and does not have the desired access,
the credential cache entry can be flushed manually, rather than waiting for the cache entry to timeout.
The command can be run for a UNIX user name or numeric UID, or for a UNIX group name or numeric GID. Additionally, the
command can be scoped as granularly as the specific client IP address having the issue.
cluster::*> nfs credentials flush -node node1 -vserver DEMO -client-ip x.x.x.x -unix-user-id 1102
You can only flush one NFS credential cache entry at a time.
The NFS credential cache is separate from the name service cache.
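As a hypothetical example, a group entry might be flushed in a similar way. The -unix-group-id parameter shown here is an assumption based on the command's documented scope; run nfs credentials flush ? to confirm the exact parameter names in your release.
cluster::*> nfs credentials flush -node node1 -vserver DEMO -unix-group-id 10002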
That is, unless we set NFSv4.x ACLs to allow a user full control.
[root@centos7 mnt]# nfs4_getfacl /mnt/root/file
A::[email protected]:rwaxtTnNcCy
A::OWNER@:rwaxtTnNcCy
A:g:GROUP@:rxtncy
A::EVERYONE@:rxtncy
Vserver: DEMO
File Path: /home/root/file
File Inode Number: 8644
Security Style: unix
Effective Style: unix
DOS Attributes: 20
DOS Attributes in Text: ---A----
Expanded Dos Attributes: -
UNIX User Id: 0
UNIX Group Id: 1
UNIX Mode Bits: 755
UNIX Mode Bits in Text: rwxr-xr-x
ACLs: NFSV4 Security Descriptor
Control:0x8014
DACL - ACEs
ALLOW-user-prof1-0x1601bf
ALLOW-OWNER@-0x1601bf
ALLOW-GROUP@-0x1200a9-IG
ALLOW-EVERYONE@-0x1200a9
In the above example, we provided prof1 full control over the file. We then mounted through NFSv3.
When you become a user that isn’t on the NFSv4.x ACL, you can’t write to the file or remove the file (the
expected behavior).
[root@centos7 /]# su student1
sh-4.2$ cd /mnt/root
sh-4.2$ ls -la
total 8
drwxr-xr-x 2 root root 4096 Jul 13 10:42 .
drwxrwxrwx 11 root root 4096 Jul 10 10:04 ..
-rwxr-xr-x 1 root bin 0 Jul 13 10:23 file
-rwxr-xr-x 1 root root 0 Mar 29 11:37 test.txt
When you switch to the prof1 user, you have full access to the file, even though the NFSv3 mode bits
say you should not. That is because the NFSv4.x ACL is in effect:
[root@centos7 /]# su prof1
sh-4.2$ cd /mnt/root
sh-4.2$ ls -la
total 8
drwxr-xr-x 2 root root 4096 Jul 13 10:42 .
drwxrwxrwx 11 root root 4096 Jul 10 10:04 ..
-rwxr-xr-x 1 root bin 0 Jul 13 10:23 file
-rwxr-xr-x 1 root root 0 Mar 29 11:37 test.txt
sh-4.2$ vi file
sh-4.2$ cat file
NFSv4ACLS!
When you run a chmod, however, the NFSv4 ACL for the user does not change. We set 700
on the file, which is reflected in the NFSv3 mode bits.
sh-4.2$ chmod 700 file
sh-4.2$ ls -la
total 8
drwxr-xr-x 2 root root 4096 Jul 13 10:42 .
drwxrwxrwx 11 root root 4096 Jul 10 10:04 ..
-rwx------ 1 root bin 11 Aug 11 09:58 file
-rwxr-xr-x 1 root root 0 Mar 29 11:37 test.txt
But notice how the prof1 user still has full control.
cluster::*> vserver security file-directory show -vserver DEMO -path /home/root/file
Vserver: DEMO
File Path: /home/root/file
File Inode Number: 8644
Security Style: unix
Effective Style: unix
DOS Attributes: 20
DOS Attributes in Text: ---A----
Expanded Dos Attributes: -
UNIX User Id: 0
UNIX Group Id: 1
UNIX Mode Bits: 700
UNIX Mode Bits in Text: rwx------
ACLs: NFSV4 Security Descriptor
Control:0x8014
DACL - ACEs
ALLOW-user-prof1-0x1601bf
ALLOW-OWNER@-0x1601bf
ALLOW-GROUP@-0x120088-IG
ALLOW-EVERYONE@-0x120088
That is because the NFSv4.x ACL preserve is enabled. If that option is disabled, chmod wipes the ACL.
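As a quick sketch, the preservation behavior is controlled by the -v4-acl-preserve NFS server option (shown here against the DEMO SVM used in these examples; verify the option in your ONTAP release):
cluster::> vserver nfs show -vserver DEMO -fields v4-acl-preserve
cluster::> vserver nfs modify -vserver DEMO -v4-acl-preserve enabled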
For basic UID/GID information for all UNIX users (local and name services; Advanced Privilege), run the
following command:
cluster::*> access-check authentication show-ontap-admin-unix-creds
Or:
cluster::*> getxxbyyy getpwbyname -node node1 -vserver DEMO -username prof1 -show-source true
(vserver services name-service getxxbyyy getpwbyname)
Source used for lookup: LDAP
pw_name: prof1
pw_passwd:
pw_uid: 1102
pw_gid: 10002
pw_gecos:
pw_dir:
pw_shell:
cluster::*> getxxbyyy getpwbyname -node node1 -vserver DEMO -username host -show-source true
(vserver services name-service getxxbyyy getpwbyname)
Source used for lookup: Files
pw_name: host
pw_passwd: *
pw_uid: 598
pw_gid: 0
pw_gecos:
pw_dir:
pw_shell:
To view user information and group memberships (local and name services; Advanced Privilege), run the
following commands:
cluster::*> getxxbyyy getgrlist -node node1 -vserver DEMO -username prof1
(vserver services name-service getxxbyyy getgrlist)
pw_name: prof1
Groups: 10002 10002 10000 1101 1202 1203 48
Windows Membership:
S-1-5-21-3552729481-4032800560-2279794651-1301 NTAP\apache-group (Windows Domain group)
S-1-5-21-3552729481-4032800560-2279794651-1106 NTAP\group2 (Windows Domain group)
S-1-5-21-3552729481-4032800560-2279794651-513 NTAP\DomainUsers (Windows Domain group)
S-1-5-21-3552729481-4032800560-2279794651-1105 NTAP\group1 (Windows Domain group)
S-1-5-21-3552729481-4032800560-2279794651-1107 NTAP\group3 (Windows Domain group)
S-1-5-21-3552729481-4032800560-2279794651-1111 NTAP\ProfGroup (Windows Domain group)
S-1-5-21-3552729481-4032800560-2279794651-1231 NTAP\local-group.ntap (Windows Alias)
S-1-18-2 Service asserted identity (Windows Well known group)
S-1-5-32-551 BUILTIN\Backup Operators (Windows Alias)
S-1-5-32-544 BUILTIN\Administrators (Windows Alias)
S-1-5-32-545 BUILTIN\Users (Windows Alias)
User is also a member of Everyone, Authenticated Users, and Network Users
Privileges (0x22b7):
SeBackupPrivilege
SeRestorePrivilege
SeTakeOwnershipPrivilege
SeSecurityPrivilege
SeChangeNotifyPrivilege
Vserver: DEMO
File Path: /home/prof1
File Inode Number: 8638
Security Style: ntfs
Effective Style: ntfs
DOS Attributes: 10
DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
UNIX User Id: 0
UNIX Group Id: 0
UNIX Mode Bits: 777
UNIX Mode Bits in Text: rwxrwxrwx
ACLs: NTFS Security Descriptor
Control:0x8504
Owner:NTAP\prof1
Group:BUILTIN\Administrators
DACL - ACEs
ALLOW-Everyone-0x1f01ff-OI|CI
ALLOW-NTAP\prof1-0x1f01ff-OI|CI
ALLOW-NTAP\sharedgroup-0x1200a9-OI|CI
ALLOW-NTAP\Administrator-0x1f01ff-OI|CI
You can also verify which effective permissions a specific user has to a specific file or directory.
cluster::> file-directory show-effective-permissions -vserver DEMO -unix-user-name prof1 -path
/home/prof1
(vserver security file-directory show-effective-permissions)
Vserver: DEMO
Windows User Name: NTAP\prof1
Unix User Name: prof1
File Path: /home/prof1
CIFS Share Path: -
Effective Permissions:
Effective File or Directory Permission: 0x1f01ff
Read
Write
Append
Read EA
Write EA
Execute
Delete Child
Read Attributes
Write Attributes
Delete
Read Control
Write DAC
Write Owner
Synchronize
{ [ -windows-name <TextNoCase> ] Windows User Name
| [ -unix-name <TextNoCase> ] } UNIX User Name or User ID
[ -trace-allow {yes|no} ] Trace Allow Events (default: no)
[ -enabled {enabled|disabled} ] Filter Enabled (default: enabled)
[ -time-enabled {1..720} ] Minutes Filter is Enabled (default: 60)
If you desire, you can narrow the trace down to specific usernames or IP addresses.
cluster::*> vserver security trace filter modify -vserver DEMO -index 1 -protocols nfs -client-ip
x.x.x.x -trace-allow yes -enabled enabled
After the trace is created, you can see results in real time. When you view the results, you can filter by
successes, failures, user IDs, protocol, and more.
cluster::> vserver security trace trace-result show ?
[ -instance | -fields <fieldname>, ... ]
[[-node] <nodename>] Node
[ -vserver <vserver name> ] Vserver
[[-seqnum] <integer>] Sequence Number
[ -keytime <Date> ] Time
[ -index <integer> ] Index of the Filter
[ -client-ip <IP Address> ] Client IP Address
[ -path <TextNoCase> ] Path of the File Being Accessed
[ -win-user <TextNoCase> ] Windows User Name
[ -security-style <security style> ] Effective Security Style On File
[ -result <TextNoCase> ] Result of Security Checks
[ -unix-user <TextNoCase> ] UNIX User Name
[ -session-id <integer> ] CIFS Session ID
[ -share-name <TextNoCase> ] Accessed CIFS Share Name
[ -protocol {cifs|nfs} ] Protocol
[ -volume-name <TextNoCase> ] Accessed Volume Name
Here is an example of what a permission/access failure looks like for a specific user:
cluster::> vserver security trace trace-result show -node * -vserver DEMO -unix-user 1102 -result
*denied*
Vserver: DEMO
This is not the same as the maximum transfer size value (-tcp-max-xfer-size) included under the NFS
server options.
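For reference, a quick way to check that NFS server value (a minimal sketch using the DEMO SVM from earlier examples):
cluster::> vserver nfs show -vserver DEMO -fields tcp-max-xfer-size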
Table 18 shows some examples of per-node values for available exec contexts.
In addition to per-node limits, there is also a limit of 128 concurrent operations (assigned exec contexts)
per TCP CID. If a client sends more than 128 concurrent operations, ONTAP blocks those operations
until a new resource is freed up. By default, Linux clients are configured to send up to 65,536 of these
operations concurrently, so the limits can start to be reached relatively quickly.
The implications of this process are covered in more detail in “Identifying potential issues with RPC slot
tables.”
In some cases, a single client can overwhelm a node's WAFL layer with requests to the same
volume, which increases latency in WAFL and, in turn, increases the amount of time it takes to free up exec
contexts and release them back to the system. This reduces the total number of exec contexts available
to other workloads connecting to the same node. This is commonly seen in grid computing applications,
where many clients are sending NFS operations to the same volumes at once. For example, if every NFS
client is sending 128 concurrent operations at a time, then it would only take 24 clients to overwhelm the
limits.
ONTAP 9.9.1 introduces a new feature for platforms with >256GB of RAM that helps limit the impact that
bully workloads have on other workloads in the system when they send more concurrent operations than
supported by ONTAP. This new feature works by throttling back the number of available exec contexts for
all connections to prevent one workload from overrunning others. This throttling is based on the total
utilization of exec contexts on the node and helps scale back the operations to ensure the node totals
don’t get exceeded.
After a node hits 70% of the total available exec contexts, each connection is only able to perform 16
concurrent operations until the node’s total utilization drops back to 60%. The exec limit then increases
back to 128.
Some considerations:
This feature is only available in ONTAP 9.9.1 and later, and only on platforms with more than 256GB
of memory.
These platforms also increase the total number of available exec contexts per node to 10,000.
Because throttling does not occur until over 6,000 execs are allocated (previous total exec limit was
3,000), existing workloads should not notice any negative difference in performance.
This throttling does not help reduce the number of blocked exec contexts due to per-client overruns.
For tuning Linux clients and/or using nconnect, you should still follow the guidance in the “Network
connection concurrency and TCP slots: NFSv3” section.
To verify whether exec context throttling is occurring, view the statistics. The following counters are displayed if exec throttling is in use:
throttle_scale. The scale-back factor currently in effect (1, 8, or 16).
throttle_increases. The number of times the scale-back factor has been increased.
throttle_decreases. The number of times the scale-back factor has been decreased.
throttle_hist. A histogram of allocations at each scale factor (incrementing counters in 1, 8, or 16).
For each CID, ONTAP allows 128 execs to be used at any given moment. If a client sends more than
128 operations at a time, then ONTAP pushes back on that client until a new resource is freed. These
pushbacks are only microseconds per operation in most cases, but over the course of millions of requests
across hundreds of clients, the pushback can manifest as performance issues that don't have the usual
signatures of storage system performance issues, such as protocol, disk, or node latency. As a result,
isolating these issues can be difficult and time-consuming.
Older Linux kernels (pre-RHEL 6.x days) had a static setting for RPC slot tables of 16. In newer Linux
clients, that setting was changed to a maximum of 65,536 and the NFS client would dynamically increase
the number of slot tables needed. As a result, newer NFS clients could potentially flood NFS servers with
more requests than they can handle at one time.
NFSv4.x operations are sent as compound requests (such as three or four NFS operations per packet)
and NFSv4.x sessions slots are used to parallelize requests instead of RPC slot tables. For more
information, see “NFSv4.x concurrency — session slots.”
Here are a few ways to address performance issues caused by slot tables:
Add more NFSv3 mount points per client (use different locations in the volume to be effective)
Throttle the number of NFSv3 requests a single client can send per TCP connection/session
Use the nconnect mount option to get more TCP connections/sessions per mount (ensure both the
client and ONTAP versions support nconnect)
However, before you decide to address RPC slots in your environment, it is important to remember that
lowering the RPC slot tables on an NFS client is effectively a form of throttling and can negatively affect
performance, depending on the workload. An NFS client that needs to send one million NFS requests will
send those requests regardless of the RPC slot table settings. Setting RPC slot tables essentially tells
the NFS client to limit the number of requests it can send at any given time. The decision of whether to
throttle the NFS client or let the storage system enact a form of flow control depends on your workload
and use case.
Before adjusting these values, it is important to test and identify whether too many slot tables cause
performance issues or application impact.
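If testing shows that throttling the client helps, the following is a minimal sketch of how RPC slot tables are commonly lowered on a Linux client. The sunrpc module parameter names are standard in mainline kernels but can vary by distribution and kernel version, and the value 128 is only an example, not a recommendation.
# echo "options sunrpc tcp_slot_table_entries=128 tcp_max_slot_table_entries=128" > /etc/modprobe.d/sunrpc.conf
# reboot (or reload the sunrpc module and remount so the new values take effect)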
Review the statistics over a period of time to see whether they are incrementing.
statistics show -object cid -instance cid -counter execs_blocked_on_cid
On NFS clients, you can leverage the nfsiostat command to show active in-flight slot tables.
# nfsiostat [/mount/path] [interval seconds]
When you set lower RPC slot table values on a client, the RPC slot table queue shifts from the storage to
the client, so the rpc bklog values are higher.
With the slot tables set to 128, the rpc bklog reached as high as 360 while creating 5,000,000 files from two
clients, which sent around 26,000 ops/s.
# nfsiostat /mnt/FGNFS 1 | grep "bklog" -A 1
ops/s rpc bklog
25319.000 354.091
--
ops/s rpc bklog
24945.000 351.105
--
ops/s rpc bklog
26022.000 360.763
Meanwhile, the ONTAP statistics show that no execs were blocked during this run:
Counter Value
-------------------------------- --------------------------------
execs_blocked_on_cid 0
Counter Value
-------------------------------- --------------------------------
execs_blocked_on_cid 0
If the RPC slot table values are set to higher values, then the RPC queue (rpc bklog) will be lower on
the client. In this case, the slot tables were left as the default 65,536 value. The client backlog was 0 and
the ops/s were higher.
# nfsiostat /mnt/FGNFS 1 | grep "bklog" -A 1
ops/s rpc bklog
22308.303 0.000
--
ops/s rpc bklog
30684.000 0.000
That means the storage would need to absorb more of those RPC calls because the client isn’t holding
back as many of those operations. We can see that in the ONTAP statistics.
cluster::*> statistics show -object cid -counter execs_blocked_on_cid -sample-id
All_Multi1_bs65536
Counter Value
-------------------------------- --------------------------------
execs_blocked_on_cid 145324
Counter Value
-------------------------------- --------------------------------
execs_blocked_on_cid 124982
When we exceed a certain number of blocked execs, we’ll log an EMS. The following EMS was
generated:
than the maximum number of in-flight requests allowed (128). The client might see degraded
performance due to request throttling.
In general, if you aren't seeing nblade.execOverLimit EMS events in ONTAP 9.8 and later, RPC slot
tables aren't likely causing problems for your workloads. In ONTAP 9.7 and earlier, these events do not
exist, so you would want to monitor the CID statistics and watch for large increments of
execs_blocked_on_cid. If you're unsure whether your environment is having an issue, contact Lenovo
support.
Example #1: RPC slot table impact on performance — high file count workload
In the example shown in Figure 19, a script was run to create 18 million files across 180,000
subdirectories. This load generation was done from three clients to the same NFS mount. The goal was
to generate enough NFS operations with clients that had the default RPC slot table settings to cause
ONTAP to enter a flow-control scenario. Then the same scripts were run again on the same clients—but
with the RPC slot tables set to 128.
The result was that the default slot tables (65,536) generated 18 million execs_blocked_on_cid
events and added 3ms of latency to the workload versus the run with the lower RPC slot table setting
(128).
Figure 19) Impact of RPC slot tables on NFSv3 performance.
Although 3ms might not seem like a lot of latency, it can add up over millions of operations, considerably
slowing down job completion.
Example #2: RPC slot table impact on performance — sequential I/O workload
In the section called “Performance examples for different TCP maximum transfer window sizes,” we show
a number of tests that illustrate NFSv3 vs. NFSv4.1 performance differences, along with different
wsize/rsize mount option values. While running these tests, we also saw the negative effects of RPC slot
tables: an increased number of execs blocked on CIDs caused performance bottlenecks that added
14.4ms of write latency to some of the performance runs, which in turn added 5.5 minutes to the overall
job completion times.
The tests were run across two clients on a 10Gb network, using a script that runs multiple dd operations
in parallel. Overall, eight folders with two 16GB files each were created and then read and deleted.
When the RPC slots were set to the maximum dynamic value of 65,536, the dd operations took 20
minutes, 53 seconds.
When the RPC slots were lowered to 128, the same script took just 15 minutes, 23 seconds.
With the 1MB wsize/rsize mount options and 65,536 RPC slots, the execs_blocked_on_cid
incremented to ~1,000 per node.
cluster::*> statistics show -object cid -counter execs_blocked_on_cid
Scope: node1
Counter Value
-------------------------------- --------------------------------
execs_blocked_on_cid 1001
Scope: node2
Counter Value
-------------------------------- --------------------------------
execs_blocked_on_cid 1063
Figure 20 shows the side-by-side comparison of latency, IOPS, and throughput for the jobs using a 1MB
wsize/rsize mount value.
Figure 20) Parallel dd performance — NFSv3 and RPC slot tables; 1MB rsize/wsize.
Figure 21 shows the side-by-side comparison of latency, IOPS, and throughput for the jobs using a 256K
wsize/rsize mount value.
Figure 21) Parallel dd performance — NFSv3 and RPC slot tables; 256K rsize/wsize.
Table 20 shows the comparison of job times and latency with 65,536 and 128 RPC slots.
Table 20) Job comparisons — parallel dd with 65,536 and 128 RPC slots.
Test Average read latency (ms) Average write latency (ms) Completion time
NFSv3 – 1MB wsize/rsize; 65,536 slot tables 9.2 (+3.2ms) 42.3 (+14.4ms) 20m53s (+5m30s)
NFSv3 – 1MB wsize/rsize; 128 slot tables 6 27.9 15m23s
NFSv3 – 256K wsize/rsize; 65,536 slot tables 1.3 (-.1ms) 3.9 (+1ms) 19m12s (+4m55s)
NFSv3 – 256K wsize/rsize; 128 slot tables 1.4 2.9 14m17s
NFSv3 – 64K wsize/rsize; 65,536 slot tables .2 (0ms) 3.4 (+1.2ms) 17m2s (+2m14s)
NFSv3 – 64K wsize/rsize; 128 slot tables .2 2.2 14m48s
Table 21) Job comparisons — parallel dd with 65,536, 128, and 64 RPC slots.
Test Average read latency (ms) Average write latency (ms) Completion time
NFSv3 – 1MB wsize/rsize; 65,536 slot tables 9.2 (+3.2ms) 42.3 (+19.4ms) 20m53s (+6m51s)
NFSv3 – 1MB wsize/rsize; 128 slot tables 6 (-.3ms) 27.9 (+5ms) 15m23s (+1m21s)
NFSv3 – 1MB wsize/rsize; 64 slot tables 6.3 22.9 14m2s
It is possible to get more performance out of a client's NFS connectivity by connecting more mount points
to different IP addresses in the cluster on the same client, but that approach can add complexity. Rather
than mounting a single volume at SVM:/volumename, multiple mount points could be created on the same
client across different folders in the volume and different IP addresses in the cluster (as shown in the mount
sketch after this list), for example:
LIF1:/volumename/folder1
LIF2:/volumename/folder2
LIF3:/volumename/folder3
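A hypothetical client-side layout for that approach might look like the following, where LIF1 through LIF3 and the mount paths are placeholders:
# mount -t nfs -o vers=3 LIF1:/volumename/folder1 /mnt/folder1
# mount -t nfs -o vers=3 LIF2:/volumename/folder2 /mnt/folder2
# mount -t nfs -o vers=3 LIF3:/volumename/folder3 /mnt/folder3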
Table 22) High file count creation (one million files) — NFSv3 — with and without nconnect — default slot tables.
Test Average completion time Average total execs blocked in ONTAP
NFSv3 – no nconnect ~69.5 seconds 214,770
NFSv3 – nconnect=2 ~70.14 seconds 88,038
NFSv3 – nconnect=4 ~70.1 seconds 11,658
NFSv3 – nconnect=8 ~71.8 seconds 0
NFSv3 – nconnect=16 ~71.7 seconds 0
Exec contexts being blocked can create performance issues (as described in "Example #1: RPC slot
table impact on performance — high file count workload") but does not always do so.
The same tests were run again using the 128 maximum RPC slot table setting, this time only without nconnect
and with nconnect=8. In this instance, throttling the RPC slot tables on the clients not only had a slightly
positive effect on this workload's average completion time, but it also made performance more predictable.
For instance, when the slot tables were set to 65,536, completion times across 10 runs varied widely
(from 55.7 seconds to 81.7 seconds), whereas 128 slot tables kept completion times in a more consistent
range of 68 to 71 seconds. This is because the storage never had to push back on the NFS operations.
With more clients added to the mix, this predictability becomes more valuable, as one or two clients can
potentially bully other clients into poor performance results, especially in ONTAP releases prior to 9.9.1
(where exec context throttling was added to help mitigate bully workloads).
When nconnect was added to these tests, performance suffered slightly, because each client could not
push as many NFS operations across its multiple TCP sessions while throttled to 128 slots. If you plan on
using nconnect for your workloads, consider not setting the RPC slot table values on the clients at all and
instead let nconnect spread the workload across TCP sessions, as in the following example.
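For example, a minimal nconnect mount (supported on Linux kernels 5.3 and later; the SVM name, export path, and mount point are placeholders):
# mount -t nfs -o vers=3,nconnect=8 SVM:/volumename /mnt/volumename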
Table 23) High file count creation (one million files) — NFSv3 — with and without nconnect — 128 slot tables.
Test Average completion time Average total execs blocked in ONTAP
NFSv3 – no nconnect ~69.4 seconds 0
NFSv3 – nconnect=8 ~71.2 seconds 0
When the slot tables were scaled back even further (to 16), performance suffered significantly without
nconnect, because the client sent just 16 requests at a time; the script took ~28.1 seconds longer to
complete. With nconnect, we were able to keep roughly the same average completion time as the other
tests with more slot tables (~70.6 seconds), because we get 16 slots per session across 8 sessions
(16 * 8 = 128) rather than 16 slots on a single session. Because the client can send 128 concurrent
operations with nconnect and only 16 without it, performance is greatly improved when using nconnect
with this configuration. However, in environments with hundreds of clients, setting the slot tables to 16
might be the only way to avoid performance problems caused by slot table overruns.
Table 24) High file count creation (one million files) — NFSv3 — with and without nconnect — 16 slot tables.
Test Average completion time Average total execs blocked in ONTAP
NFSv3 – no nconnect ~99.3 seconds 0
NFSv3 – nconnect=8 ~70.6 seconds 0
From these results, nconnect provides a way to better distribute the RPC slot requests across TCP
connections without costing much in the way of performance, removing the need to adjust RPC slot
tables in environments that require high performance and a large number of NFS operations. For more
information about nconnect, see the section called "Nconnect."
Because workloads vary and performance impacts differ, always test various client settings, mount
options, and so on in your environment; there is no single correct NFS configuration.
Setting RPC slot tables for environments with large client counts
Although the ONTAP session limit for slot tables per TCP connection is 128, there are also node-level
limits for exec contexts that can be exceeded.
For example, if a single node can handle up to 3,000 exec contexts, and each TCP session can handle
up to 128 exec contexts, then you can have up to 24 concurrent TCP sessions maxing out the RPC slot
tables at any given time before the resources are exhausted and ONTAP has to pause to let the
resources release back for use (3,000/128 = 23.47). For examples of how to see the available exec
contexts per node, see “Exec context throttling.”
That does not mean that ONTAP can only support up to 24 clients per node; ONTAP supports up to
100,000 TCP connections per node (platform dependent). Rather, what this means is that if 24 or more
clients are sending the maximum allowed slot entries per connection (128) to a single node at the same
time, then there will be some latency buildup as ONTAP works to free up resources.
Table 25 shows examples of how many clients can send the maximum concurrent operations per
connection to a single node.
Table 25) Total clients at maximum concurrent operations (128) before node exec context exhaustion.
Node type Total available exec contexts per node Total clients sending maximum operations per connection (128) per node
DM7000F 3,000 3,000/128 = ~23.4
Table 26) Total clients using 16 slot tables before node exec context exhaustion.
Node type Total available exec contexts per node Total clients sending maximum operations per connection (16) per node
DM7000F 3,000 3,000/16 = ~187.5
In addition, the following approaches can help improve overall performance for workloads with many
clients:
Clusters with more nodes and data LIFs on each node.
DNS round robin to spread network connections across more nodes on initial mount.
FlexGroup volumes to leverage more hardware in the cluster.
FlexCache volumes to spread read-heavy workloads across more nodes and mount points.
Consider setting RPC slot table values lower on NFS clients to reduce the number of concurrent
operations; values are platform/ONTAP version-dependent and client count-dependent. For example,
see “Table 25.”
Nconnect to increase the single client performance.
ONTAP 9.9.1 or later when using platforms with 256GB RAM or greater for benefits of “Exec context
throttling” to help mitigate bully workload impact.
The slot table recommendations vary based on ONTAP hardware, ONTAP version, NFS mount options,
and so on. Testing different values in your environment is highly recommended.
Does the RPC slot table limit affect other NAS protocols?
RPC slot table limits affect only NFSv3 traffic:
SMB clients use different connection methodologies for concurrency, such as SMB multichannel,
SMB multiplex, and SMB credits. The SMB connection methodology depends on client/server
configuration and protocol version. For example, SMB 1.0 uses SMB multiplex (mpx), whereas
SMB2.x uses SMB credits.
NFSv4.x clients do not use RPC slot tables—instead, they use state IDs and session slots to control
the flow of concurrent traffic from clients.
Table 27 shows test run results from NFSv3 and NFSv4.1 using the same 65,536 slot table values.
Table 27) Job comparisons — parallel dd — NFSv3 and NFSv4.1 with 65,536 RPC slots.
Test Average read latency (ms) Average write latency (ms) Completion time
NFSv3 – 1MB wsize/rsize; 65,536 slot tables 9.2 (+2.7ms) 42.3 (+5.5ms) 20m53s (+5m47s)
NFSv4.1 – 1MB wsize/rsize; 65,536 slot tables 6.5 36.8 15m6s
NFSv3 – 256K wsize/rsize; 65,536 slot tables 1.3 (-.1ms) 3.9 (+.7ms) 19m12s (+7m2s)
NFSv4.1 – 256K wsize/rsize; 65,536 slot tables 1.4 3.2 12m10s
NFSv3 – 64K wsize/rsize; 65,536 slot tables .2 (+.1ms) 3.4 (+2.2ms) 17m2s (+1m54s)
NFSv4.1 – 64K wsize/rsize; 65,536 slot tables .1 1.2 15m8s
The following describes the attributes we focus on in this section. These values can be
set without needing to remount or reboot the client for them to take effect.
vm.dirty_background_ratio | vm.dirty_background_bytes
These tunables define the starting point at which the Linux writeback mechanism begins flushing dirty
blocks to stable storage.
vm.dirty_expire_centisecs
This tunable defines how old a dirty buffer can be before it must be tagged for asynchronous writeout.
Some workloads might not close file handles immediately after writing data. Without a file close, there is
no flush until either memory pressure or 30 seconds passes (by default). Waiting for these values can
prove suboptimal for application performance, so reducing the wait time can help performance in some
use cases.
vm.dirty_writeback_centisecs
The kernel flusher thread is responsible for asynchronously flushing dirty buffers, sleeping between each
flush. This tunable defines the amount of time the flusher thread spends sleeping between buffer flushes.
Lowering this value in conjunction with vm.dirty_expire_centisecs can also improve performance for some
workloads.
Table 28) One million files using f.write — NFSv3, 65,536 slots — VM dirty bytes defaults versus tuned.
Test Average completion time Average total execs blocked in ONTAP Time delta
NFSv3 – no nconnect – default vm.dirty settings ~69.1 seconds 214,770 -
NFSv3 – no nconnect – vm.dirty settings tuned ~69.5 seconds 144,790 +.4 seconds
NFSv3 – nconnect=8 – default vm.dirty settings ~71.8 seconds 0 -
NFSv3 – nconnect=8 – vm.dirty settings tuned ~69.5 seconds 0 -2.3 seconds
When the file size was increased, the overall completion times drastically improved with the virtual
memory settings, while nconnect didn’t make a lot of difference because our bottleneck here was not due
to TCP session limitations.
In this test, dd was used to create 50 x 500MB (2 clients, 1 file per folder, 25 folders, 25 simultaneous
processes per client) files, with a 256K wsize/rsize mount value.
Table 29) 50x 500MB files using dd — NFSv3, 65,536 slots — VM dirty bytes defaults versus tuned.
Test Average completion time Average total execs blocked in ONTAP Time delta (vm.dirty defaults vs. vm.dirty set)
NFSv3 – no nconnect – default vm.dirty settings ~134.4 seconds 0 -
NFSv3 – no nconnect – vm.dirty settings tuned ~112.3 seconds 0 -22.1 seconds
NFSv3 – nconnect=8 – default vm.dirty settings ~132.8 seconds 0 -
NFSv3 – nconnect=8 – vm.dirty settings tuned ~112.9 seconds 0 -19.9 seconds
As seen in the high file count/small files example, these settings do not make a drastic difference for
every workload, so it’s important to test different values until you find the correct combination.
After you find the right values, you can use /etc/sysctl.conf to retain the values on reboot.
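As a minimal sketch, the tuning and persistence might look like the following; the specific numbers are examples only, not recommendations.
# sysctl -w vm.dirty_expire_centisecs=300
# sysctl -w vm.dirty_writeback_centisecs=100
# echo "vm.dirty_expire_centisecs = 300" >> /etc/sysctl.conf
# echo "vm.dirty_writeback_centisecs = 100" >> /etc/sysctl.conf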
Note: The stated maximum is 2,000, but it is not recommended to exceed 1,024, because you might
experience NFSv4.x session hangs as per bug 1392470.
When an NFSv4.x session is set up, the client and server negotiate the maximum requests allowed for
the session, with the lower value (client and server settings) being applied. Most Linux clients default to
64 session slots. This value is tunable in Linux clients through the modprobe configuration.
$ echo options nfs max_session_slots=180 > /etc/modprobe.d/nfsclient.conf
$ reboot
You can see the current value for an NFSv4.x mount with the systool command (found in the sysfsutils
package):
# systool -v -m nfs | grep session
max_session_slots = "64"
Despite the NFS server's default value of 180, ONTAP still has a per-connection limit of 128 exec
contexts per CID, so it is still possible for an NFSv4.x client to overrun the available exec contexts on a
single TCP connection to ONTAP and enter a pause state while resources are freed back to the system
if the value is set too high. If session slots are overrun, the execs_blocked_on_cid counter mentioned
in the "Identifying potential issues with RPC slot tables" section also increases.
In most cases, the default value of 180 does not encounter this issue, whereas setting the session slots
to a higher value can create conditions where you might run out of system resources. To get more
performance out of your clients, consider using the nconnect option for more parallel handling of these
operations rather than adjusting the session slot values.
Note: Be sure to use the latest patched release of ONTAP for NFSv4.x to avoid issues such as this:
High NFS Latency with multiple connections to node LIFs from common client TCP socket
Test Completion time (seconds) Average IOPS Average MBps
NFSv4.1 – 1 million x 4KB files (1,024 session slots) ~245.5 ~12378 ~16.3
NFSv4.1 – 32 x 2GB files (180 session slots) ~148.3 ~902 ~224.5
NFSv4.1 – 32 x 2GB files (256 session slots) ~149.6 ~889 ~221.5
NFSv4.1 – 32 x 2GB files (512 session slots) ~148.5 ~891 ~222
NFSv4.1 – 32 x 2GB files (1024 session slots) ~148.8 ~898 ~223.7
Table 31) NFSv4.x session slot performance — Percent change versus 180 slots.
Test Completion time IOPS MBps
NFSv4.1 – 1 million x 4KB files (256 session slots) -5.2% +8.5% +9.8%
NFSv4.1 – 1 million x 4KB files (512 session slots) -2.9% +5.6% +5.9%
NFSv4.1 – 1 million x 4KB files (1,024 session slots) -3.3% +5.9% +7.2%
NFSv4.1 – 32 x 2GB files (256 session slots) +0.9% -1.4% -1.3%
NFSv4.1 – 32 x 2GB files (512 session slots) +0.1% -1.2% -1.1%
NFSv4.1 – 32 x 2GB files (1,024 session slots) +0.3% -0.4% -0.4%
Observations
In the high metadata/high file count creation test, more session slots improved completion time, IOPS,
and throughput overall, but 256 session slots appeared to be the sweet spot when compared with 1,024
session slots.
In the sequential write workload, there was a slight performance degradation when using higher session
slots. This shows that increasing session slots can help, but not in all workload types.
As described to this point in the specification, the state model of NFSv4.1 is vulnerable to an
attacker that sends a SEQUENCE operation with a forged session ID and with a slot ID that it
expects the legitimate client to use next. When the legitimate client uses the slot ID with the
same sequence number, the server returns the attacker's result from the reply cache, which
disrupts the legitimate client and thus denies service to it. Similarly, an attacker could send
a CREATE_SESSION with a forged client ID to create a new session associated with the client
ID. The attacker could send requests using the new session that change locking state, such as
LOCKU operations to release locks the legitimate client has acquired. Setting a security policy
on the file that requires RPCSEC_GSS credentials when manipulating the file's state is one
potential work around, but has the disadvantage of preventing a legitimate client from releasing
state when RPCSEC_GSS is required to do so, but a GSS context cannot be obtained (possibly
because the user has logged off the client).
…
The SP4_MACH_CRED state protection option uses a machine credential where the principal that
creates the client ID MUST also be the principal that performs client ID and session maintenance
operations. The security of the machine credential state protection approach depends entirely on
safe guarding the per-machine credential. Assuming a proper safeguard using the per-machine
credential for operations like CREATE_SESSION, BIND_CONN_TO_SESSION, DESTROY_SESSION, and
DESTROY_CLIENTID will prevent an attacker from associating a rogue connection with a session, or
associating a rogue session with a client ID.
In these instances, the NFSv4.x client should retry the mount with a new client ID. In some cases, use of
the NFS mount option -clientaddr might be needed. For more information, see RedHat Bugzilla
#1821903.
Workarounds
Although the clients are behaving as expected per the RFC standards, there are a few ways to work
around this issue.
Workaround #3: Change the client name used with NFSv4.x mounts
By default, NFSv4.x uses the client’s host name for the client ID value when mounting to the NFS server.
However, there are client-side NFS options you can leverage to change that default behavior and
override the client ID used for NFSv4.x mounts.
To do this, set the NFS module option nfs4_unique_id on each client to a value that is unique to that client,
so that clients sharing the same host name can still be distinguished (see Red Hat Bugzilla 1582186). If you
add this value to the /etc/modprobe.d/nfsclient.conf file, it persists across reboots.
You can see the setting on the client as:
# systool -v -m nfs | grep -i nfs4_unique
nfs4_unique_id = ""
For example:
# echo options nfs nfs4_unique_id=uniquenfs4-1 > /etc/modprobe.d/nfsclient.conf
A command such as cat should update the atime for that file, and it does; after running cat, the atime shows Jan 14, 2021.
# cat vmware-1.log
# stat vmware-1.log
File: ‘vmware-1.log’
Size: 281796 Blocks: 552 IO Block: 4096 regular file
Device: fd00h/64768d Inode: 442679 Links: 1
Access: (0600/-rw-------) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2021-01-14 15:41:11.603404196 -0500
Modify: 2021-01-09 15:39:30.232596000 -0500
Change: 2021-01-14 15:40:29.946918292 -0500
In some instances, Linux clients might not update the atime properly due to client-side caching. If atimes
are not updating properly on commands such as cat, try remounting the NFS mount or dropping the
caches for the client (such as by running echo 1 > /proc/sys/vm/drop_caches; this drops all
caches on the client). From the CentOS 7.8 man pages:
The Linux client handles atime updates more loosely, however. NFS clients maintain good
performance by caching data, but that means that application reads, which normally update atime,
are not reflected to the server where a file's atime is actually maintained.
ONTAP updates atime whenever the client notifies the storage system of the update. If the client doesn’t
notify ONTAP, then there’s no way for the atime to update. You can disable atime updates with the
advanced privilege volume level option -atime-update.
[-atime-update {true|false}] - Access Time Update Enabled (privilege: advanced)
This optionally specifies whether the access time on inodes is updated when a file is read. The
default setting is true.
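For example, a sketch of disabling atime updates on a single volume at the advanced privilege level (the volume name is a placeholder):
cluster::> set -privilege advanced
cluster::*> volume modify -vserver DEMO -volume flexvol1 -atime-update false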
How it works
If a client sends too many packets to a node, the flow control adjusts the window size to zero and tells the
client to wait on sending any new NAS packets until the other packets are processed. If the client
continues to send packets during this zero window, then the NAS protocol stack flow control mechanism
sends a TCP reset to that client.
Contact Lenovo Technical Support if you suspect a performance problem related to NAS flow control.
How it works
The options to extend the group limitation work the same way as the manage-gids option on other NFS
servers. Basically, rather than dumping the entire list of auxiliary GIDs a user belongs to, the option
does a lookup for the GID on the file or folder and returns that value instead.
From the man page for mountd:
-g or --manage-gids
Accept requests from the kernel to map user id numbers into lists of group id numbers for
use in access control. An NFS request will normally (except when using Kerberos or other
cryptographic authentication) contain a user-id and a list of group-ids. Due to a
limitation in the NFS protocol, at most 16 group ids can be listed. If you use the -g flag,
then the list of group ids received from the client will be replaced by a list of group ids
determined by an appropriate lookup on the server.
When an access request is made, only 16 GIDs are passed in the RPC portion of the packet (Figure 22).
Figure 22) RPC packet with 16 GIDs.
Any GID beyond the limit of 16 is dropped by the protocol. Extended GIDs can be used with external
name services, or locally on the cluster if the users and groups are configured properly. To make sure
that a local UNIX user is a member of multiple groups, use the unix-group adduser(s) command:
COMMANDS
adduser - Add a user to a local UNIX group
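As a configuration sketch, enabling extended groups on the NFS server and adding a local UNIX user to a local group might look like the following. The -extended-groups-limit option name and value are assumptions to verify in your release; the SVM, group, and user names are the ones used in earlier examples.
cluster::*> vserver nfs modify -vserver DEMO -auth-sys-extended-groups enabled -extended-groups-limit 512
cluster::*> vserver services name-service unix-group adduser -vserver DEMO -name ProfGroup -username prof1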
NFSv4.x with ONTAP has a feature that allows NFSv4.x mounts to leverage numeric ID strings instead of
name strings, which allows NFSv4.x operations without needing centralized name services, matching
names/numeric IDs on client/server, matching ID domains, etc. (-v4-numeric-ids).
Enabling the -auth-sys-extended-groups option causes numeric ID authentication to fail if the UNIX
user numeric ID can’t be translated into a valid UNIX user name in name services. This counteracts the -
v4-numeric-ids option because ONTAP needs to query the incoming numeric user ID to search for
any auxiliary groups for authentication. If the incoming numeric ID cannot be resolved to a valid UNIX
user or the client's UNIX numeric UID differs from the numeric UID that ONTAP knows about, the
lookup fails with secd.authsys.lookup.failed in the event log, and ONTAP responds to the client
with the AUTH_ERROR "client must begin a new session," which appears as "Permission
denied" on the client.
To use both options, use the following guidance:
If you require users and groups that either cannot be queried from both NFS client and server or have
mismatched numeric IDs, you can leverage NFS Kerberos and NFSv4.x ACLs to provide proper
authentication with NFSv4.x, as clients will send name strings instead of numeric IDs.
If you are using -auth-sys-extended-groups with AUTH_SYS and without NFSv4.x ACLs, any
user that requires access through NFS requires a valid UNIX user in the name service database
specified in ns-switch (can also be a local user).
For more information about the -v4-numeric-ids option, see “Bypassing the name string — Numeric
IDs.”
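As a closing sketch for this section, enabling both options discussed here might look like the following (advanced privilege; the option names are the ones referenced above, and the guidance in this section still applies):
cluster::*> vserver nfs modify -vserver DEMO -v4-numeric-ids enabled -auth-sys-extended-groups enabled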
NFS on Windows
Red Hat’s Cygwin emulates NFS but leverages the SMB protocol rather than NFS, which requires a CIFS
license. True Windows NFS is available natively only through Services for Network File System or third-
party applications such as Hummingbird/OpenText.
struct nlm4_share {
string caller_name<LM_MAXSTRLEN>;
netobj fh;
netobj oh;
fsh4_mode mode;
fsh4_access access;
};
The way that Windows uses NLM is with nonmonitored lock calls. The following nonmonitored lock calls
are required for Windows NFS support:
NLM_SHARE
NLM_UNSHARE
NLM_NM_LOCK
Note: PCNFS, WebNFS, and HCLNFS (legacy Hummingbird NFS client) are not supported with
ONTAP storage systems and there are no plans to include support for these clients.
Note: These options are enabled by default. Disabling them does not harm NFS data, but might
cause some unexpected behavior, such as client crashes. If you want to prevent unexpected
issues, consider using a separate SVM for Windows NFS clients.
Always mount Windows NFS by using the mount option mtype=hard.
When using Windows NFS, the showmount option should be enabled. Otherwise, renames of files
and folders on Windows NFS clients might fail.
cluster::> nfs server modify -vserver SVM -showmount enabled
Windows NFS clients are not able to properly see the used space and space available through the df
commands.
Example of mounting NFS in Windows:
C:\>mount -o mtype=hard \\x.x.x.e\unix Z:
Z: is now successfully connected to \\x.x.x.e\unix
Appendix A
The following section contains command examples, configuration output, and other information that would
have cluttered the main sections of this document.
Examples
The following section shows examples of commands and of NFS feature functionality.
Squashing root
The following examples show how to squash root to anon in various configuration scenarios.
Vserver: vs0
Policy Name: root_squash
Rule Index: 1
Access Protocol: nfs only NFS is allowed (NFSv3 and v4)
Client Match Hostname, IP Address, Netgroup, or Domain: 0.0.0.0/0 all clients
RO Access Rule: sys only AUTH_SYS is allowed
RW Access Rule: sys only AUTH_SYS is allowed
User ID To Which Anonymous Users Are Mapped: 65534 mapped to 65534
Superuser Security Types: none superuser (root) squashed to anon user
Honor SetUID Bits in SETATTR: true
Allow Creation of Devices: true
Example 2: Root is squashed to the anon user using superuser for a specific client.
In this example, sec=sys and sec=none are allowed.
cluster::> vserver export-policy rule show –policyname root_squash_client -instance
(vserver export-policy rule show)
Vserver: vs0
Policy Name: root_squash_client
Rule Index: 1
Access Protocol: nfs only NFS is allowed (NFSv3 and v4)
Client Match Hostname, IP Address, Netgroup, or Domain: x.x.x.x just this client
RO Access Rule: sys,none AUTH_SYS and AUTH_NONE are allowed
RW Access Rule: sys,none AUTH_SYS and AUTH_NONE are allowed
User ID To Which Anonymous Users Are Mapped: 65534 mapped to 65534
Superuser Security Types: none superuser (root) squashed to anon user
Honor SetUID Bits in SETATTR: true
Allow Creation of Devices: true
[root@nfsclient mnt]# ls -lan
drwxrwxrwx. 3 0 0 106496 Apr 24 2013 .
dr-xr-xr-x. 26 0 0 4096 Apr 24 11:24 ..
drwxrwxrwx. 12 0 0 4096 Apr 24 11:05 .snapshot
drwxr-xr-x. 2 0 1 4096 Apr 18 12:54 junction
-rw-r--r--. 1 65534 65534 0 Apr 24 2013 root_squash_client
Example 3: Root is squashed to the anon user using superuser for a specific set of clients.
This approach uses sec=krb5 (Kerberos) and only NFSv4 and CIFS are allowed.
cluster::> vserver export-policy rule show –policyname root_squash_krb5 -instance
(vserver export-policy rule show)
Vserver: vs0
Policy Name: root_squash_krb5
Rule Index: 1
Access Protocol: nfs4,cifs only NFSv4 and CIFS are allowed
Client Match Hostname, IP Address, Netgroup, or Domain: x.x.x.0/24 just clients with an IP
address in the x.x.x.0/24 subnet
RO Access Rule: krb5 only AUTH_RPCGSSD is allowed
RW Access Rule: krb5 only AUTH_RPCGSSD is allowed
User ID To Which Anonymous Users Are Mapped: 65534 mapped to 65534
Superuser Security Types: none superuser (root) squashed to anon user
Honor SetUID Bits in SETATTR: true
Allow Creation of Devices: true
The UID of 99 in this example occurs in NFSv4 when the user name cannot map into the NFSv4
domain. A look at /var/log/messages confirms this:
Apr 23 10:54:23 nfsclient nfsidmap[1810]: nss_getpwnam: name 'pcuser' not found in domain
'nfsv4domain.lenovo.com'
In the preceding examples, when the root user requests access to a mount, it maps to the anon UID. In
this case, the UID is 65534. This mapping prevents unwanted root access from specified clients to the
NFS share. Because “sys” is specified as the rw and ro access rules in the first two examples, only clients
using sec=sys gain access. The third example shows a possible configuration using Kerberized NFS
authentication. Setting the access protocol to NFS allows only NFS access to the share (including NFSv3
and NFSv4). If multiprotocol access is desired, then the access protocol must be set to allow NFS and
CIFS. NFS access can be limited to only NFSv3 or NFSv4 here as well.
Example 1: Root is allowed access as root using superuser for all clients only for sec=sys.
In this example, sec=none and sec=sys are allowed rw and ro access; all other anon access is mapped to
65534.
cluster::> vserver export-policy rule show –policyname root_allow_anon_squash -instance
(vserver export-policy rule show)
Vserver: vs0
Policy Name: root_allow_anon_squash
Rule Index: 1
Access Protocol: nfs only NFS is allowed (NFSv3 and v4)
Client Match Hostname, IP Address, Netgroup, or Domain: 0.0.0.0/0 all clients
RO Access Rule: sys,none AUTH_SYS and AUTH_NONE allowed
RW Access Rule: sys,none AUTH_SYS and AUTH_NONE allowed
User ID To Which Anonymous Users Are Mapped: 65534 mapped to 65534
Superuser Security Types: sys superuser for AUTH_SYS only
Honor SetUID Bits in SETATTR: true
Example 2: Root is allowed access as root using superuser for sec=krb5 only.
In this example, anon access is mapped to 65534; sec=sys and sec=krb5 are allowed, but only using
NFSv4.
cluster::> vserver export-policy rule show –policyname root_allow_krb5_only -instance
(vserver export-policy rule show)
Vserver: vs0
Policy Name: root_allow_krb5_only
Rule Index: 1
Access Protocol: nfs4 only NFSv4 is allowed
Client Match Hostname, IP Address, Netgroup, or Domain: 0.0.0.0/0 all clients
RO Access Rule: sys,krb5 AUTH_SYS and AUTH_RPCGSS allowed
RW Access Rule: sys,krb5 AUTH_SYS and AUTH_RPCGSS allowed
User ID To Which Anonymous Users Are Mapped: 65534 mapped to 65534
Superuser Security Types: krb5 superuser via AUTH_RPCGSS only
Honor SetUID Bits in SETATTR: true
NOTE: Again, the UID of an unmapped user in NFSv4 is 99. This is controlled via /etc/idmapd.conf
in Linux.
Example 3: Root and all anonymous users are allowed access as root using anon=0.
This example allows only for sec=sys and sec=krb5 over NFSv4.
cluster::> vserver export-policy rule show –policyname root_allow_anon0 -instance
(vserver export-policy rule show)
Vserver: vs0
Policy Name: root_allow_anon0
Rule Index: 1
Access Protocol: nfs4 only NFSv4 is allowed
Client Match Hostname, IP Address, Netgroup, or Domain: 0.0.0.0/0 all clients
RO Access Rule: krb5, sys AUTH_SYS and AUTH_RPCGSS allowed
RW Access Rule: krb5, sys AUTH_SYS and AUTH_RPCGSS allowed
User ID To Which Anonymous Users Are Mapped: 0 mapped to 0
Superuser Security Types: none superuser (root) squashed to anon user
Honor SetUID Bits in SETATTR: true
Allow Creation of Devices: true
cluster::> volume show -vserver vs0 -volume nfsvol -fields policy
vserver volume policy
------- ------ -----------
vs0 nfsvol root_allow_anon0
cluster::*> vol create -vserver NFS -volume vsroot_mirror1 -aggregate aggr1_node1 -size 1g -type
DP
[Job 26705] Job succeeded: Successful
cluster::*> vol create -vserver NFS -volume vsroot_mirror2 -aggregate aggr1_node2 -size 1g -type
DP
[Job 26707] Job succeeded: Successful
[Job 26711] Job succeeded: SnapMirror: done
Vserver: nfs_svm
Policy Name: default
Rule Index: 1
Access Protocol: any
Client Match Hostname, IP Address, Netgroup, or Domain: x.x.x.x
RO Access Rule: any
RW Access Rule: any
User ID To Which Anonymous Users Are Mapped: 65534
Superuser Security Types: any
Honor SetUID Bits in SETATTR: true
Allow Creation of Devices: true
Vserver: nfs_svm
Policy Name: default
Rule Index: 2
Access Protocol: any
Client Match Hostname, IP Address, Netgroup, or Domain: 0.0.0.0/0
RO Access Rule: any
RW Access Rule: any
User ID To Which Anonymous Users Are Mapped: 65534
Superuser Security Types: none
Honor SetUID Bits in SETATTR: true
Allow Creation of Devices: true
2 entries were displayed.
As per the example in the section "Limiting Access to the SVM Root Volume," root is not able to list the
contents of the SVM root from any host other than x.x.x.x, based on the volume permissions (711) and
the existing export policy rules.
# ifconfig | grep "inet addr"
inet addr:x.x.x.y Bcast:x.x.225.255 Mask:255.255.255.0
inet addr:127.0.0.1 Mask:255.0.0.0
# id
uid=0(root) gid=0(root) groups=0(root) context=unconfined_u:unconfined_r:unconfined_t:s0-
s0:c0.c1023
# mount | grep mnt
x.x.x.e:/ on /mnt type nfs (rw,nfsvers=3,addr=x.x.x.e)
# cd /mnt
# ls
ls: cannot open directory .: Permission denied
If the data volumes in the SVM also are set to this export policy, they use the same rules, and only the
client set to have root access is able to log in as root.
If root access is desired to the data volumes, then a new export policy can be created and root access
can be specified for all hosts or a subset of hosts through subnet, netgroup, or multiple rules with
individual client IP addresses or host names.
The same concept applies to the other export policy rule attributes, such as RW.
For example, if the default export policy rule is changed to disallow write access to all clients except
x.x.x.x and to allow superuser, then even root is disallowed write access to volumes with that export
policy applied:
cluster::> export-policy rule modify -vserver nfs_svm -policyname default -ruleindex 2 -rwrule
never -superuser any
Vserver: nfs_svm
Policy Name: default
Rule Index: 1
Access Protocol: any
Client Match Hostname, IP Address, Netgroup, or Domain: x.x.x.x
RO Access Rule: any
RW Access Rule: any
User ID To Which Anonymous Users Are Mapped: 65534
Superuser Security Types: any
Honor SetUID Bits in SETATTR: true
Allow Creation of Devices: true
Vserver: nfs_svm
Policy Name: default
Rule Index: 2
Access Protocol: any
Client Match Hostname, IP Address, Netgroup, or Domain: 0.0.0.0/0
RO Access Rule: any
RW Access Rule: never
User ID To Which Anonymous Users Are Mapped: 65534
Superuser Security Types: any
Honor SetUID Bits in SETATTR: true
Allow Creation of Devices: true
2 entries were displayed.
When a new policy and rule are created and applied to the data volume, the same user is allowed to write
to the data volume mounted below the SVM root volume. This is the case despite the export policy rule at
the SVM root volume disallowing write access.
Example:
cluster::> export-policy create -vserver nfs_svm -policyname volume
cluster::> export-policy rule create -vserver nfs_svm -policyname volume -clientmatch 0.0.0.0/0 -
rorule any -rwrule any -allow-suid true -allow-dev true -ntfs-unix-security-ops fail -chown-mode
restricted -superuser any -protocol any -ruleindex 1 -anon 65534
Vserver: nfs_svm
Policy Name: volume
Rule Index: 1
Access Protocol: any
Client Match Hostname, IP Address, Netgroup, or Domain: 0.0.0.0/0
RO Access Rule: any
RW Access Rule: any
User ID To Which Anonymous Users Are Mapped: 65534
Superuser Security Types: any
Honor SetUID Bits in SETATTR: true
Allow Creation of Devices: true
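For the new rules to take effect, the policy must also be assigned to the data volume. A minimal sketch, assuming the data volume is named nfsvol:
cluster::> volume modify -vserver nfs_svm -volume nfsvol -policy volume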
However, the read-only (rorule) attribute in the parent volume’s export policy must allow read access for mounts of volumes below that parent to succeed. Setting rorule to never, or leaving the parent volume’s export policy without any rules (an empty policy), prevents mounting of volumes underneath that parent.
In the following example, the vsroot volume has an export policy that has rorule and rwrule set to
never, while the data volume has an export policy with a rule that is wide open:
cluster::> export-policy rule show -vserver nfs -policyname wideopen -instance
(vserver export-policy rule show)
Vserver: nfs
Policy Name: wideopen
Rule Index: 1
Access Protocol: any
Client Match Hostname, IP Address, Netgroup, or Domain: 0.0.0.0/0
RO Access Rule: any
RW Access Rule: any
User ID To Which Anonymous Users Are Mapped: 0
Superuser Security Types: any
Honor SetUID Bits in SETATTR: true
Allow Creation of Devices: true
Vserver: nfs
Policy Name: deny
Rule Index: 1
Access Protocol: any
Client Match Hostname, IP Address, Netgroup, or Domain: 0.0.0.0/0
RO Access Rule: never
RW Access Rule: never
User ID To Which Anonymous Users Are Mapped: 65534
Superuser Security Types: sys
Honor SetUID Bits in SETATTR: true
Allow Creation of Devices: true
When the deny policy is changed to allow read-only access, mounting is allowed.
cluster::> export-policy rule modify -vserver nfs -policyname deny -rorule any -ruleindex 1
As a result, storage administrators have complete and granular control over what users can see and access in file systems by using export policies, rules, and volume permissions.
If the order of those rules is flipped, the client is denied access even though the policy still contains a rule that allows access to everyone. Rule index numbers can be modified with the export-policy rule setindex command. In the following example, rule #1 is changed to rule #99; the existing rule #99 is moved to #98 by default.
cluster::> export-policy rule setindex -vserver NAS -policyname allow_all -ruleindex 1 -
newruleindex 99
Warning: You are about to flush the "all (but showmount)" cache for Vserver "NAS" on node "node2",
which will result in increased traffic to the name servers. Do you want to proceed with
flushing the cache? {y|n}: y
Permissions on the volume mixed are 775. The owner is root and the group is Domain Users:
[root@nfsclient /]# nfs4_getfacl /mixed
A::OWNER@:rwaDxtTnNcCy
D::OWNER@:
A:g:GROUP@:rwaDxtTnNcy
D:g:GROUP@:C
A::EVERYONE@:rxtncy
D::EVERYONE@:waDTC
Because ldapuser is a member of Domain Users, it should have write access to the volume, and it does:
sh-4.1$ cd /mixed
sh-4.1$ ls -la
total 12
drwxrwxr-x. 3 root Domain Users 4096 Apr 30 09:52 .
dr-xr-xr-x. 28 root root 4096 Apr 29 15:24 ..
drwxrwxrwx. 6 root root 4096 Apr 30 08:00 .snapshot
sh-4.1$ touch newfile
sh-4.1$ nfs4_getfacl /mixed
sh-4.1$ ls -la
total 12
drwxrwxr-x. 3 root Domain Users 4096 Apr 30 09:56 .
dr-xr-xr-x. 28 root root 4096 Apr 29 15:24 ..
drwxrwxrwx. 6 root root 4096 Apr 30 08:00 .snapshot
-rw-r--r--. 1 ldapuser Domain Users 0 Apr 30 09:56 newfile
However, if the ACEs are reordered and the explicit DENY for EVERYONE is placed ahead of the GROUP entry, then ldapuser is denied write access to the same volume it could previously write to:
[root@nfsclient /]# nfs4_getfacl /mixed
A::OWNER@:rwaDxtTnNcCy
D::OWNER@:
A::EVERYONE@:rxtncy
D::EVERYONE@:waDTC
A:g:GROUP@:rwaDxtTnNcy
sh-4.1$ cd /mixed
sh-4.1$ ls -la
total 12
drwxrwxr-x. 3 root Domain Users 4096 Apr 30 09:56 .
dr-xr-xr-x. 28 root root 4096 Apr 29 15:24 ..
drwxrwxrwx. 6 root root 4096 Apr 30 08:00 .snapshot
-rw-r--r--. 1 ldapuser Domain Users 0 Apr 30 09:56 newfile
sh-4.1$ ls -la
total 12
drwxrwxr-x. 3 root Domain Users 4096 Apr 30 10:06 .
dr-xr-xr-x. 28 root root 4096 Apr 29 15:24 ..
drwxrwxrwx. 6 root root 4096 Apr 30 08:00 .snapshot
-rw-r--r--. 1 ldapuser Domain Users 0 Apr 30 09:56 newfile
-rw-r--r--. 1 ldapuser Domain Users 0 Apr 30 10:06 newfile2
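To control ACE order explicitly from the client, the entire ACL can be replaced in the desired sequence with nfs4_setfacl -s. The following is a sketch using the ACEs from this example; it places the explicit DENY for EVERYONE ahead of the GROUP entry:
[root@nfsclient /]# nfs4_setfacl -s "A::OWNER@:rwaDxtTnNcCy,D::EVERYONE@:waDTC,A:g:GROUP@:rwaDxtTnNcy,A::EVERYONE@:rxtncy" /mixed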
Vserver: vs0
File Path: /unix
Security Style: unix
Effective Style: unix
DOS Attributes: 10
DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
Unix User Id: 0
Unix Group Id: 1
Unix Mode Bits: 755
Unix Mode Bits in Text: rwxr-xr-x
ACLs: -
In the preceding example, the volume (/unix) has 755 permissions. That means that the owner has ALL
access, the owning group has READ/EXECUTE access, and everyone else has READ/EXECUTE access.
Even though there are no NFSv4 ACLs in the fsecurity output, there are default values set that can be
viewed from the client:
[root@nfsclient /]# mount -t nfs4 krbsn:/unix /unix
[root@nfsclient /]# ls -la | grep unix
drwxr-xr-x. 2 root daemon 4096 Apr 30 11:24 unix
[root@nfsclient /]# nfs4_getfacl /unix
A::OWNER@:rwaDxtTnNcCy
A:g:GROUP@:rxtncy
A::EVERYONE@:rxtncy
The NFSv4 ACLs shown above reflect the same thing: the owner has ALL access, the owning group has READ/EXECUTE access, and everyone else has READ/EXECUTE access. The default mode bits are tied to the NFSv4 ACLs.
When mode bits are changed, the NFSv4 ACLs are also changed:
[root@nfsclient /]# chmod 775 /unix
[root@nfsclient /]# ls -la | grep unix
drwxrwxr-x. 2 root daemon 4096 Apr 30 11:24 unix
[root@nfsclient /]# nfs4_getfacl /unix
A::OWNER@:rwaDxtTnNcCy
A:g:GROUP@:rwaDxtTnNcy
A::EVERYONE@:rxtncy
When a user ACE is added to the ACL, the entry is reflected in the ACL on the storage system, and the entire ACL is now populated. Note that the ACL is stored in SID format.
[root@nfsclient /]# nfs4_setfacl -a A::[email protected]:ratTnNcCy /unix
[root@nfsclient /]# nfs4_getfacl /unix
A::[email protected]:ratTnNcCy
A::OWNER@:rwaDxtTnNcCy
A:g:GROUP@:rwaDxtTnNcy
A::EVERYONE@:rxtncy
Vserver: vs0
File Path: /unix
Security Style: unix
Effective Style: unix
DOS Attributes: 10
DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
Unix User Id: 0
Unix Group Id: 1
Unix Mode Bits: 775
Unix Mode Bits in Text: rwxrwxr-x
ACLs: NFSV4 Security Descriptor
Control:0x8014
DACL - ACEs
ALLOW-S-1-8-55-0x16019d
ALLOW-S-1-520-0-0x1601ff
ALLOW-S-1-520-1-0x1201ff-IG
ALLOW-S-1-520-2-0x1200a9
To see the translated ACLs, use fsecurity from the node shell on the node that owns the volume:
cluster::> node run -node node2 fsecurity show /vol/unix
Unix security:
uid: 0
gid: 1
mode: 0775 (rwxrwxr-x)
When the mode bits are changed while NFSv4 ACLs are present, the NFSv4 ACL that was just set is wiped by default:
[root@nfsclient /]# chmod 755 /unix
[root@nfsclient /]# ls -la | grep unix
drwxr-xr-x. 2 root daemon 4096 Apr 30 11:24 unix
[root@nfsclient /]# nfs4_getfacl /unix
A::OWNER@:rwaDxtTnNcCy
A:g:GROUP@:rxtncy
A::EVERYONE@:rxtncy
Unix security:
uid: 0
gid: 1
mode: 0755 (rwxr-xr-x)
After the ACL preservation option is enabled, the ACL stays intact when mode bits are set.
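The preservation behavior is controlled at the NFS server level; a minimal sketch of enabling it, assuming the SVM is named vs0:
cluster::> vserver nfs modify -vserver vs0 -v4-acl-preserve enabled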
[root@nfsclient /]# nfs4_setfacl -a A::[email protected]:ratTnNcCy /unix
[root@nfsclient /]# ls -la | grep unix
drwxr-xr-x. 2 root daemon 4096 Apr 30 11:24 unix
[root@nfsclient /]# nfs4_getfacl /unix
A::[email protected]:ratTnNcCy
A::OWNER@:rwaDxtTnNcCy
A:g:GROUP@:rxtncy
A::EVERYONE@:rxtncy
Vserver: vs0
File Path: /unix
Security Style: unix
Effective Style: unix
DOS Attributes: 10
DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
Unix User Id: 0
Unix Group Id: 1
Unix Mode Bits: 755
Unix Mode Bits in Text: rwxr-xr-x
ACLs: NFSV4 Security Descriptor
Control:0x8014
DACL - ACEs
ALLOW-S-1-8-55-0x16019d
ALLOW-S-1-520-0-0x1601ff
ALLOW-S-1-520-1-0x1200a9-IG
ALLOW-S-1-520-2-0x1200a9
Unix security:
uid: 0
gid: 1
mode: 0755 (rwxr-xr-x)
Note that the ACL is still intact after mode bits are set.
[root@nfsclient /]# chmod 777 /unix
[root@nfsclient /]# ls -la | grep unix
drwxrwxrwx. 2 root daemon 4096 Apr 30 11:24 unix
[root@nfsclient /]# nfs4_getfacl /unix
A::[email protected]:ratTnNcCy
A::OWNER@:rwaDxtTnNcCy
A:g:GROUP@:rwaDxtTnNcy
A::EVERYONE@:rwaDxtTnNcy
Vserver: vs0
File Path: /unix
Security Style: unix
Effective Style: unix
DOS Attributes: 10
DOS Attributes in Text: ----D---
Expanded Dos Attributes: -
Unix User Id: 0
Unix Group Id: 1
Unix Mode Bits: 777
Unix Mode Bits in Text: rwxrwxrwx
ACLs: NFSV4 Security Descriptor
Control:0x8014
DACL - ACEs
ALLOW-S-1-8-55-0x16019d
ALLOW-S-1-520-0-0x1601ff
ALLOW-S-1-520-1-0x1201ff-IG
ALLOW-S-1-520-2-0x1201ff
DOS attributes: 0x0010 (----D---)
Unix security:
uid: 0
gid: 1
mode: 0777 (rwxrwxrwx)
cluster::> net int show -vserver vs0 -curr-node node1 -role data
(network interface show)
Logical Status Network Current Current Is
Vserver Interface Admin/Oper Address/Mask Node Port Home
----------- ---------- ---------- ------------------ ------------- ------- ----
vs0
data1 up/up x.x.x.a/24 node1 e0a true
The client makes a mount request to the data LIF on node2, at the IP address x.x.x.a.
[root@nfsclient /]# mount -t nfs4 x.x.x.a:/nfsvol /mnt
But the cluster shows that the connection was actually established to node1, where the data volume lives.
No connection was made to node2.
cluster::> network connections active show -node node1 -service nfs*
Vserver Interface Remote
CID Ctx Name Name:Local Port Host:Port Protocol/Service
--------- --- --------- ----------------- -------------------- ----------------
Node: node1
286571835 6 vs0 data:2049 x.x.x.z:763 TCP/nfs
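The data volume's home node can be confirmed directly; a minimal sketch, assuming the volume is named nfsvol:
cluster::> volume show -vserver vs0 -volume nfsvol -fields node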
Node: node1
Vserver: DEMO
Data-Ip: 10.x.x.a
Client-Ip Volume-Name Protocol Idle-Time Local-Reqs Remote-Reqs
--------------- ---------------- -------- ------------- ---------- -----------
10.x.x.b scripts nfs3 1h 3m 46s 391 0
10.x.x.c XCP_catalog nfs3 17s 0 372
10.x.x.d scripts nfs3 17s 372 0
10.x.x.c home nfs4 12h 50m 28s 256 0
10.x.x.b home nfs4.1 9h 51m 48s 47 0
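Per-client, per-volume connection details such as the preceding output can be gathered with the connected-clients view; a sketch, assuming the SVM is named DEMO:
cluster::> vserver nfs connected-clients show -vserver DEMO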
In these tests, the following results were seen; latency for both was below 1ms.
The exact same workload using NFSv4.1 produces a very different statistical profile. Because NFSv4.x handles file creates very differently than NFSv3 does, you see the following:
A higher metadata percentage (GETATTR at 15%)
Fewer CREATE operations (about 1,000 versus 1,000,000 with NFSv3)
OPEN/CLOSE operations for each file creation
Several other metadata operation types (COMPOUND, SEQUENCE, GETFH, PUTFH)
Write totals that stay the same but, as a percentage of total operations, are much lower (7%)
Object: nfsv4_1
Instance: DEMO
Counter Value
-------------------------------- --------------------------------
access_percent 7%
access_total 1001014
close_percent 7%
close_total 1000000
compound_percent 23%
compound_total 3002078
create_session_total 2
create_total 1003
exchange_id_total 8
getattr_percent 15%
getattr_total 2002064
getfh_percent 7%
getfh_total 1001012
lookup_total 7
open_percent 7%
open_total 1000000
putfh_percent 23%
putfh_total 3002064
putrootfh_total 2
reclaim_complete_total 2
sequence_percent 23%
sequence_total 3002068
total_ops 6417
write_percent 7%
write_total 1000000
As a result, simply using NFSv4.1 for this workload generates more overall metadata operations, which NFSv4.1 tends to handle poorly. This is illustrated in a side-by-side comparison of performance metrics, as shown in Table 33.
Table 33) NFSv3 versus NFSv4.1 performance — High file creation workload.
Test      Average IOPS   Average MBps   Average latency   Completion time   Average CPU %
NFSv3     55118          71.9           1.6ms             54.7 seconds      51%
NFSv4.1   25068          13.9           11.5ms            283.5 seconds     24%
As you can see, latency is higher, IOPS and throughput are lower, and the completion time is nearly five times greater for this workload when using NFSv4.1. CPU usage is lower because the storage is not being asked to do as much (for example, it processes fewer IOPS).
This does not mean NFSv4.x always performs worse than NFSv3; in some cases, it can perform as well
or better. It all depends on the type of workload being used.
For a highly sequential write workload, NFSv4.1 is able to compete with NFSv3 because there is less
metadata to process. Using a multithreaded dd operation to create eight 10GB files, this is the NFSv3
workload profile:
Object: nfsv3
Instance: DEMO
Counter Value
-------------------------------- --------------------------------
access_total 18
create_total 8
fsinfo_total 4
getattr_total 7
lookup_total 11
mkdir_total 11
null_total 4
pathconf_total 2
total_ops 5357
write_percent 99%
write_total 1248306
In this case, the workload is nearly 100% writes. For NFSv4.1, the write percentage is lower, but the accompanying metadata operations are not the types that incur performance penalties (COMPOUND and PUTFH).
Object: nfsv4_1
Instance: DEMO
Counter Value
-------------------------------- --------------------------------
access_total 25
close_total 4
compound_percent 33%
compound_total 1238160
create_session_total 4
create_total 11
destroy_clientid_total 3
destroy_session_total 3
exchange_id_total 16
getattr_total 72
getdeviceinfo_total 8
getfh_total 28
layoutget_total 8
layoutreturn_total 1
lookup_total 7
open_total 8
putfh_percent 33%
putfh_total 1238107
putrootfh_total 2
reclaim_complete_total 4
sequence_percent 33%
sequence_total 1238134
total_ops 15285
write_percent 33%
write_total 1238032
This results in a much better performance comparison for NFSv4.1. IOPS, throughput, and latency are
nearly identical and NFSv4.1 takes 8 seconds longer (~3.7% more) than NFSv3, as shown in Table 34.
Table 34) NFSv3 versus NFSv4.1 performance — High sequential writes.
Test      Average IOPS   Average MBps   Average latency   Completion time   Average CPU %
NFSv3     6085           383.1          5.4ms             216.6 seconds     28%
NFSv4.1   6259           366.3          4.4ms             224.7 seconds     26%
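The multithreaded dd run used for this comparison could be approximated with a sketch like the following; the mount point and file names are hypothetical:
# create eight 10GB files in parallel on an NFS mount at /mnt/nfsvol
for i in $(seq 1 8); do
  dd if=/dev/zero of=/mnt/nfsvol/ddfile$i bs=1M count=10240 &
done
wait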
With a more standard benchmarking tool (vdbench), you can see the differences between read and write
performance with different workload types. NFSv3 performs better in most cases but the gap is not
especially wide for workloads that read/write existing datasets. Writes tend to have a wider performance
disparity but read performance is nearly identical. Sequential writes actually perform a bit better for
NFSv4.x in these tests as well, as shown in the following figures.
Figure 23) Random reads, 4K, NFSv3 versus NFSv4.x — IOPS/Latency
Figure 24) Random writes, 4K, NFSv3 versus NFSv4.x — IOPS/Latency
Figure 26) Sequential writes, 32K, NFSv3 versus NFSv4.x — IOPS/Latency
To find the scripts, see: https://github.com/whyistheinternetbroken/NetAppFlexGroup/.
The goal for these tests was not to show the maximum possible performance; to do that, we would have
used larger AFA systems, more nodes in the cluster, more clients, and so on.
Instead, there are three goals for these tests:
Show the impact of different wsize and rsize values on different workload types.
Show NFSv3, NFSv4.1, and NFSv4.2 performance comparisons.
Show the effect that NFS Kerberos has for these workloads.
These tests were run five times each and the values were averaged out for a more accurate
representation of the results.
Here are some key points about the test environment:
NFSv3, NFSv4.1 and NFSv4.2 (with and without pNFS)
Clients:
Two RHEL 8.3 VMs
nconnect=8
TCP slot tables left as defaults (65536)
Virtual memory tuned to vm.dirty_ratio=40 and vm.dirty_background_ratio=20 (see the client-side sketch after the note below)
AFA DM7000F HA pair — two nodes (ONTAP 9.9.1)
FlexGroup volume (16 member volumes, 8 per node)
10Gb network
Note: This test was not intended to show the maximum performance of NFS, or the systems being
used, but to show comparisons between NFS versions and block sizes by using tests on the same
network and hardware.
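A client-side setup matching these parameters might look like the following sketch; the LIF host name, export path, and mount point are hypothetical:
# mount with NFSv4.1, eight TCP connections, and 256K transfer sizes
mount -t nfs -o vers=4.1,nconnect=8,rsize=262144,wsize=262144 demo-lif:/flexgroup /mnt/flexgroup
# virtual memory writeback tuning used for the tests
sysctl -w vm.dirty_ratio=40
sysctl -w vm.dirty_background_ratio=20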
The NFS operation breakdown for the 4KB file test looked like this (an even split across CREATE, LOOKUP, and WRITE):
Object: nfsv3
Instance: DEMO
Counter Value
-------------------------------- --------------------------------
access_total 1017
create_percent 33%
create_total 1000000
fsinfo_total 4
getattr_total 3
lookup_percent 33%
lookup_total 1000003
mkdir_total 1003
null_total 4
pathconf_total 2
total_ops 36169
write_percent 33%
write_total 1000000
The NFS operation breakdown for the 2GB file test looked like this (nearly 100% WRITE):
Object: nfsv3
Instance: DEMO
Number of Constituents: 2 (complete_aggregation)
Counter Value
-------------------------------- --------------------------------
access_total 32
create_total 16
fsinfo_total 4
getattr_total 18
lookup_total 19
mkdir_total 19
null_total 4
pathconf_total 2
total_ops 3045
write_percent 99%
write_total 511520
Test #1: High file count test — Many folders and files
This test created approximately one million 4KB files across 1,000 folders and used multiple TCP transfer size values with the mounts. The file creation method was an f.write call from Python that wrote 16 characters repeatedly to a file until it reached 4KB in size, which generated more NFS operations. The test covered multiple transfer sizes, NFS versions (NFSv3, NFSv4.1, and NFSv4.2), and NFSv4.1 with and without pNFS, all using nconnect=8.
Table 35 breaks down the average completion times, average IOPS, and average throughput for each protocol and wsize/rsize value; Table 36 shows the average total latency and average CPU busy %.
The commands qos statistics performance show (for latency, IOPS, and throughput) and statistics show-periodic (for CPU busy %) were used to collect this information.
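A sketch of how these counters might be gathered during a run; the iteration and interval values are illustrative:
cluster::> qos statistics performance show -iterations 30
cluster::> statistics show-periodic -interval 5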
Table 35) High file count test results — one million files.
Test Completion time (seconds) Average IOPS Average MBps
NFSv3 – 64K wsize/rsize ~68.6 ~55836 ~72.6
NFSv3 – 256K wsize/rsize ~71 ~55574 ~72.2
NFSv3 – 1MB wsize/rsize ~73.1 ~55865 ~72.7
NFSv4.1 – 64K wsize/rsize ~251.1 ~11182 ~15.4
NFSv4.1 – 256K wsize/rsize ~259.5 ~12041 ~15.7
NFSv4.1 – 1MB wsize/rsize ~257.9 ~11956 ~15.6
NFSv4.1 – 64K wsize/rsize (pNFS) ~254 ~11818 ~15.4
NFSv4.1 – 256K wsize/rsize (pNFS) ~253.9 ~11688 ~15.2
NFSv4.1 – 1MB wsize/rsize (pNFS) ~253.1 ~11850 ~15.4
NFSv4.2 – 64K wsize/rsize (pNFS) ~256.3 ~11756 ~15.5
NFSv4.2 – 256K wsize/rsize (pNFS) ~255.5 ~11890 ~15.4
NFSv4.2 – 1MB wsize/rsize (pNFS) ~256.1 ~11764 ~15.2
Table 36) High file count test results — one million files — average CPU busy % and average latency.
Test Average total latency (ms) Average CPU busy %
NFSv3 – 64K wsize/rsize ~15.4 ~64%
NFSv3 – 256K wsize/rsize ~15.1 ~63%
NFSv3 – 1MB wsize/rsize ~15.3 ~63%
NFSv4.1 – 64K wsize/rsize ~30.3 ~27%
NFSv4.1 – 256K wsize/rsize ~29.8 ~27%
NFSv4.1 – 1MB wsize/rsize ~30 ~27%
NFSv4.1 (pNFS) – 64K wsize/rsize ~30.5 ~26%
NFSv4.1 (pNFS) – 256K wsize/rsize ~30.9 ~26%
NFSv4.1 (pNFS) – 1MB wsize/rsize ~30.4 ~26%
NFSv4.2 (pNFS) – 64K wsize/rsize ~30.3 ~26%
NFSv4.2 (pNFS) – 256K wsize/rsize ~30.1 ~26%
NFSv4.2 (pNFS) – 1MB wsize/rsize ~30.5 ~27%
Observations
Regardless of the mount’s wsize/rsize options, roughly the same number of IOPS was generated because of the nature of the workload (many small files). Whether blocks were sent in 64K or 1MB chunks, the files were only 4K in size, so the client sent them one by one. The average maximum throughput did not change much for NFSv3 but increased somewhat with larger wsize/rsize values.
Overall, NFSv3 performed substantially better than NFSv4.x for this type of workload (up to 3.5x better) in ONTAP 9.9.1, generally because of better throughput and higher IOPS.
NFSv4.x showed lower overall CPU busy % because it generated fewer overall IOPS; fewer operations to process means less CPU used.
pNFS did not help performance much here.
NFSv4.2 did not offer performance benefits over NFSv4.1.
Table 38) Low file count test results — Average CPU busy % and average latency.
Test Average total latency (ms) Average CPU busy %
NFSv3 – 64K wsize/rsize ~0.44 ~28%
NFSv3 – 256K wsize/rsize ~0.87 ~25%
NFSv3 – 1MB wsize/rsize ~3.98 ~25%
NFSv4.1 – 64K wsize/rsize ~0.44 ~30%
NFSv4.1 – 256K wsize/rsize ~0.89 ~27%
NFSv4.1 – 1MB wsize/rsize ~4.15 ~26%
NFSv4.1 (pNFS) – 64K wsize/rsize ~0.31 ~29%
NFSv4.1 (pNFS) – 256K wsize/rsize ~0.65 ~25%
NFSv4.1 (pNFS) – 1MB wsize/rsize ~2.51 ~26%
NFSv4.2 (pNFS) – 64K wsize/rsize ~0.35 ~28%
NFSv4.2 (pNFS) – 256K wsize/rsize ~0.65 ~25%
NFSv4.2 (pNFS) – 1MB wsize/rsize ~2.1 ~25%
Observations
Generally speaking, NFSv3 and NFSv4.x performed nearly identically for this type of workload. The
completion times for the tests were within 1–2 seconds of each other and the throughput numbers
were all similar across the board. The main difference was in average latency.
NFSv4.x had lower latency than NFSv3, especially when using pNFS to ensure data locality. Having data requests traverse the cluster network added between 0.1 and 1.5 ms of write latency to the workloads (depending on the mount wsize) and 100–110MB of additional traffic across the cluster network.
With NFSv4.2, there was no noticeable performance increase over NFSv4.1 except when using a 1MB wsize (~0.4ms improvement).
For mostly sequential read/write workloads, NFSv4.1 and later with pNFS performed just as well as NFSv3.
RFC 5661: Network File System (NFS) Version 4 Minor Version 1 Protocol
http://www.ietf.org/rfc/rfc5661.txt
RFC 5531: RPC: Remote Procedure Call Protocol Specification Version 2
http://tools.ietf.org/html/rfc5531
Contacting Support
You can contact Support to obtain help for your issue.
You can receive hardware service through a Lenovo Authorized Service Provider. To locate a service
provider authorized by Lenovo to provide warranty service, go to https://datacentersupport.lenovo.com/
serviceprovider and use filter searching for different countries. For Lenovo support telephone numbers,
see https://datacentersupport.lenovo.com/supportphonelist for your region support details.
Notices
Lenovo may not offer the products, services, or features discussed in this document in all countries.
Consult your local Lenovo representative for information on the products and services currently available
in your area.
Any reference to a Lenovo product, program, or service is not intended to state or imply that only that
Lenovo product, program, or service may be used. Any functionally equivalent product, program, or
service that does not infringe any Lenovo intellectual property right may be used instead. However, it is
the user's responsibility to evaluate and verify the operation of any other product, program, or service.
Lenovo may have patents or pending patent applications covering subject matter described in this
document. The furnishing of this document is not an offer and does not provide a license under any
patents or patent applications. You can send inquiries in writing to the following:
Lenovo (United States), Inc. 8001 Development Drive
Morrisville, NC 27560 U.S.A.
Attention: Lenovo Director of Licensing
LENOVO PROVIDES THIS PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-
INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some jurisdictions do not allow
disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply
to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically
made to the information herein; these changes will be incorporated in new editions of the publication.
Lenovo may make improvements and/or changes in the product(s) and/or the program(s) described in
this publication at any time without notice.
The products described in this document are not intended for use in implantation or other life support
applications where malfunction may result in injury or death to persons. The information contained in this
document does not affect or change Lenovo product specifications or warranties. Nothing in this
document shall operate as an express or implied license or indemnity under the intellectual property
rights of Lenovo or third parties. All information contained in this document was obtained in specific
environments and is presented as an illustration. The result obtained in other operating environments
may vary.
Lenovo may use or distribute any of the information you supply in any way it believes appropriate without
incurring any obligation to you.
Any references in this publication to non-Lenovo Web sites are provided for convenience only and do not
in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not
part of the materials for this Lenovo product, and use of those Web sites is at your own risk.
Any performance data contained herein was determined in a controlled environment. Therefore, the result
obtained in other operating environments may vary significantly. Some measurements may have been
made on development-level systems and there is no guarantee that these measurements will be the
same on generally available systems. Furthermore, some measurements may have been estimated
through extrapolation. Actual results may vary. Users of this document should verify the applicable data
for their specific environment.
Trademarks
LENOVO, LENOVO logo, and THINKSYSTEM are trademarks of Lenovo. All other trademarks are the
property of their respective owners. © 2023 Lenovo