From: Jeff Layton <jlayton@redhat.com>
Date: Wed, 11 Aug 2010 11:25:06 -0400
Subject: [fs] nfs: fix NFS4ERR_FILE_OPEN handling in Linux/NFS
Message-id: <1281525906-17559-1-git-send-email-jlayton@redhat.com>
Patchwork-id: 27501
O-Subject: [RHEL5.6 PATCH] BZ#604044: NFS4ERR_FILE_OPEN handling in Linux/NFS
Bugzilla: 604044
RH-Acked-by: Steve Dickson <SteveD@redhat.com>

From: NeilBrown <neilb@suse.de>

The following problem was reported by someone running a solaris server
as a dual NFS/CIFS host. Apparently there is a ZFS option that makes all
locks mandatory when you run the host in this configuration. When
accessing a file that is locked on such a filesystem via NFSv4, the
server will return NFS4ERR_FILE_OPEN. It's also conceivable that
windows-based NFSv4 servers or dual NFS/CIFS appliances could return
this error.

RHEL5 just errors out with -EIO when it hits this error. This patch
changes the client such that it retries for 1s and then fails with
-EBUSY if the problem hasn't been resolved.

Neil Brown's original patch description follows. In it he mentions that
the current behavior makes it retry the op indefinitely. That's not the
case in RHEL5 which doesn't have a mapping for this error and just
translates it to -EIO.

This patch was tested by the customer who reported that it resolved the
problem for them.

----------------------------[snip]-------------------------

NFS4ERR_FILE_OPEN is returned by the server when an operation cannot be
performed because the file is currently open and local (to the server)
semantics prohibit the operation while the file is open.
A typical case is a RENAME operation on an MS-Windows platform, which
prevents rename while the file is open.

While it is possible that such a condition is transitory, it is also
very possible that the file will be held open for an extended period
of time thus preventing the operation.

The current behaviour of Linux/NFS is to retry the operation indefinitely.
This is not appropriate - we do not expect a rename to take an arbitrary
amount of time to complete.

Rather, an error should be returned. The most obvious error code
would be EBUSY, which is legal at least for 'rename' and 'unlink',
and accurately captures the reason for the error.

This patch allows a few retries until about 2 seconds have elapsed, then
returns EBUSY.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9168cb1..67b5ed1 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2923,6 +2923,14 @@ int nfs4_handle_exception(const struct nfs_server *server, int errorcode, struct
 			if (ret == 0)
 				exception->retry = 1;
 			break;
+		case -NFS4ERR_FILE_OPEN:
+			if (exception->timeout > HZ) {
+				/* We have retried a decent amount, time to
+				 * fail
+				 */
+				ret = -EBUSY;
+				break;
+			}
 		case -NFS4ERR_GRACE:
 		case -NFS4ERR_DELAY:
 			ret = nfs4_delay(server->client, &exception->timeout);
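
For reviewers who want to sanity-check the retry budget, here is a rough
userspace sketch of how the new NFS4ERR_FILE_OPEN case interacts with the
backoff in nfs4_delay(). It is not kernel code. It assumes that nfs4_delay()
starts at HZ/10, doubles the timeout after every sleep and caps it at 15*HZ;
those bounds, the value of HZ, and every sketch_* name are illustrative
assumptions and are not taken from this patch. It also assumes the server
keeps answering NFS4ERR_FILE_OPEN on every retry.

```c
#include <errno.h>

/* Assumed values, not taken from the patch: a 1000-tick second and the
 * poll bounds nfs4_delay() is believed to use (HZ/10 and 15*HZ). */
#define SKETCH_HZ		1000L
#define SKETCH_RETRY_MIN	(SKETCH_HZ / 10)
#define SKETCH_RETRY_MAX	(15 * SKETCH_HZ)

/* Hypothetical summary of one run of the retry loop. */
struct sketch_result {
	int retries;		/* delayed retries before giving up */
	long ticks_slept;	/* total simulated sleep, in ticks */
	int err;		/* error finally returned to the caller */
};

/* Model of the delay step: clamp the timeout, "sleep" for that long,
 * then double it for the next round. Returns the ticks slept. */
static long sketch_delay(long *timeout)
{
	long slept;

	if (*timeout <= 0)
		*timeout = SKETCH_RETRY_MIN;
	if (*timeout > SKETCH_RETRY_MAX)
		*timeout = SKETCH_RETRY_MAX;
	slept = *timeout;
	*timeout <<= 1;
	return slept;
}

/* Model of a caller that sees NFS4ERR_FILE_OPEN on every attempt. */
static struct sketch_result sketch_file_open_loop(void)
{
	struct sketch_result r = { 0, 0, 0 };
	long timeout = 0;	/* a fresh exception starts at zero */

	for (;;) {
		/* Same test as the new case in the patch */
		if (timeout > SKETCH_HZ) {
			r.err = -EBUSY;
			break;
		}
		/* Otherwise take the same path as GRACE/DELAY */
		r.ticks_slept += sketch_delay(&timeout);
		r.retries++;
	}
	return r;
}
```

Under these assumptions sketch_file_open_loop() sleeps 0.1 + 0.2 + 0.4 + 0.8
seconds over four retries and then reports -EBUSY, i.e. the caller gives up
somewhere between one and two seconds after the first failure.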