Discussion:
[patch] unmount/mount race
(too old to reply)
Richard P
2015-01-22 19:04:59 UTC
Permalink
There appears to be a race condition in amd where if an unmount is in progress
and then a request comes in to mount the same fs (via another map entry) the
second mount can end up referencing an unmounted fs. You reproduce this with
df /mountpoint1; sleep 300; df /mountpoint2 - you should see the second df
displaying the root fs (might take a couple of goes.)

I have a naiive patch below that resolves this:

--- amd-6.1.5.orig/amd/autil.c 2009-06-04 11:06:44.000000000 -0400
+++ amd-6.1.5/amd/autil.c 2015-01-22 01:45:56.000000000 -0500
@@ -688,6 +688,11 @@
}

/*
+ * Clear the mounted flag in case there is a pending mount for the same fs
+ */
+ mf->mf_flags &= ~MFF_MOUNTED;
+
+ /*
* If this is a pseudo-directory then adjust the link count
* in the parent
*/

Also, while investigating this I found that amd does an unnecessary fork when
mounting a node that refers to an already-mounted fs. This is the proposed
patch (without re-indentation):

--- amd-6.1.5.orig/amd/amfs_generic.c 2009-08-11 05:57:18.000000000 -0400
+++ amd-6.1.5/amd/amfs_generic.c 2015-01-22 03:02:16.000000000 -0500
@@ -844,6 +844,11 @@
#endif /* HAVE_FS_AUTOFS */

already_mounted:
+#ifdef HAVE_FS_AUTOFS
+ if (mp->am_flags & AMF_AUTOFS || !(mf->mf_flags & MFF_MOUNTED)) {
+#else
+ if (!(mf->mf_flags & MFF_MOUNTED)) {
+#endif /* HAVE_FS_AUTOFS */
mf->mf_flags |= MFF_MOUNTING;
if (mf->mf_fsflags & FS_MBACKGROUND) {
dlog("backgrounding mount of \"%s\"", mf->mf_mount);
@@ -861,6 +866,7 @@
}

mf->mf_flags &= ~MFF_MOUNTING;
+ }
if (this_error > 0)
goto failed;
if (this_error == 0) {
--- amd-6.1.5.orig/amd/map.c 2009-08-11 05:57:18.000000000 -0400
+++ amd-6.1.5/amd/map.c 2015-01-22 02:39:03.000000000 -0500
@@ -765,8 +765,7 @@
error = autofs_mount_fs(mp, mf);
else
#endif /* HAVE_FS_AUTOFS */
- if (!(mf->mf_flags & MFF_MOUNTED))
- error = mf->mf_ops->mount_fs(mp, mf);
+ error = mf->mf_ops->mount_fs(mp, mf);

if (error > 0)
dlog("mount_node: call to mf_ops->mount_fs(%s) failed: %s",

Not sure if these patches are correct but we'd definitely like to see
a fix for the race condition. I've included some (bowdlerized) logs below
illustrating the problem:

Jan 20 08:16:33 client amd[4623]: Timing out automount points...
Jan 20 08:16:33 client amd[4623]: "/mountpoint1" on /a/server/share timed out
Jan 20 08:16:33 client amd[4623]: Trying unmount in background
Jan 20 08:16:33 client amd[4623]: forked process 33532
Jan 20 08:16:33 client amd[33532]: backgrounded
Jan 20 08:16:33 client amd[33532]: backgrounded
Jan 20 08:16:33 client amd[4623]: Next mount timeout in 120s
Jan 20 08:16:33 client amd[33532]: Unmounting </mountpoint1>
</a/server/share> (server:/share) flags 15
Jan 20 08:16:33 client amd[33532]: attempting nfs umount
Jan 20 08:16:33 client amd[4623]: Discarding filehandle for server:/share
Jan 20 08:16:33 client amd[4623]: Select waits for 120s
Jan 20 08:16:33 client amd[33532]: Trying unmount(/a/server/share)
Jan 20 08:16:33 client amd[33532]: unlock_mntlist: releasing
-%-
Jan 20 08:16:33 client amd[4623]: NFS: mounting remote server
"server", remote fs "/share" on "/a/server/share"
Jan 20 08:16:33 client amd[4623]: Locating mntfs reference to
(/a/server/share,server:/share)
Jan 20 08:16:33 client amd[4623]: mf->mf_flags = 0x115
Jan 20 08:16:33 client amd[4623]: Got a hit with nfs
Jan 20 08:16:33 client amd[4623]: free_mntfs <//nil//> type error
mf_refc 1 flags 0
Jan 20 08:16:33 client amd[4623]: Immediately discarding mntfs for //nil//
Jan 20 08:16:33 client amd[4623]: in amfs_generic_mount_child
Jan 20 08:16:33 client amd[4623]: free_loc 0x2999f50
Jan 20 08:16:33 client amd[4623]: free_mntfs <//nil//> type error
mf_refc 1 flags 0
Jan 20 08:16:33 client amd[4623]: Immediately discarding mntfs for //nil//
Jan 20 08:16:33 client amd[4623]: mount of "server:/share" already pending
Jan 20 08:16:33 client amd[4623]: will retry ...
Jan 20 08:16:33 client amd[4623]: Arranging to retry mount of /mountpoint2
Jan 20 08:16:33 client amd[4623]: SLEEP on 0x36f7230
Jan 20 08:16:33 client amd[4623]: Saving RPC transport for /mountpoint2
Jan 20 08:16:33 client amd[4623]: Timing out automount points...
Jan 20 08:16:33 client amd[4623]: Next mount timeout in 217s
Jan 20 08:16:33 client amd[4623]: Select waits for 2s
Jan 20 08:16:33 client amd[33532]: unmount(/a/server/share) succeeded
Jan 20 08:16:33 client amd[33532]: Finished unmount(/a/server/share)
Jan 20 08:16:33 client amd[33532]: unlock_mntlist: releasing
Jan 20 08:16:33 client amd[4623]: Process 33532 exited with status 0
Jan 20 08:16:33 client amd[4623]: select interrupted
Jan 20 08:16:33 client amd[4623]: free_loc 0x203c2b0
Jan 20 08:16:33 client amd[4623]: free_mntfs </a/server/share> type
nfs mf_refc 3 flags 11
Jan 20 08:16:33 client amd[4623]: Commencing retry for mount of /mountpoint2
Jan 20 08:16:33 client amd[4623]: free_loc 0x2658380
Jan 20 08:16:33 client amd[4623]: free_mntfs </a/server/share> type
nfs mf_refc 2 flags 11
Jan 20 08:16:33 client amd[4623]: duplicate mount of "server:/share" ...
Jan 20 08:16:33 client amd[4623]: backgrounding mount of "/a/server/share"
Jan 20 08:16:33 client amd[4623]: forked process 33534
Jan 20 08:16:33 client amd[33534]: backgrounded
Jan 20 08:16:33 client amd[4623]: Timing out automount points...
Jan 20 08:16:33 client amd[4623]: Next mount timeout in 217s
Jan 20 08:16:33 client amd[4623]: Select waits for 217s
Jan 20 08:16:33 client amd[4623]: Process 33534 exited with status 0
Jan 20 08:16:33 client amd[4623]: select interrupted
Jan 20 08:16:33 client amd[4623]: amfs_cont: '/mountpoint2'
Jan 20 08:16:33 client amd[4623]: Mounting /mountpoint2 returned success
Jan 20 08:16:33 client amd[4623]: server:/share referenced fstype nfs
on /a/server/share
Jan 20 08:16:33 client amd[4623]: /mountpoint2 set to timeout in 300 seconds
Jan 20 08:16:33 client amd[4623]: Quick reply sent for /a/server/share
Jan 20 08:16:33 client amd[4623]: free_continuation
Jan 20 08:16:33 client amd[4623]: free_loc 0x2658380
Jan 20 08:16:33 client amd[4623]: free_mntfs </a/server/share> type
nfs mf_refc 2 flags 11
Christos Zoulas
2015-01-28 16:57:17 UTC
Permalink
On Jan 22, 7:04pm, ***@gmail.com (Richard P) wrote:
-- Subject: [patch] unmount/mount race

| There appears to be a race condition in amd where if an unmount is in progress
| and then a request comes in to mount the same fs (via another map entry) the
| second mount can end up referencing an unmounted fs. You reproduce this with
| df /mountpoint1; sleep 300; df /mountpoint2 - you should see the second df
| displaying the root fs (might take a couple of goes.)

Hi Richard,

I am looking into this patch, but I am wondering is there a bugs database
for am-utils? I think it would be nice if reports like this did not get
lost in the noise of the mail... Erez?

Thanks,

christos
Erez Zadok
2015-01-28 18:59:15 UTC
Permalink
Yup: https://bugzilla.fsl.cs.sunysb.edu/

Erez.
Post by Christos Zoulas
-- Subject: [patch] unmount/mount race
| There appears to be a race condition in amd where if an unmount is in progress
| and then a request comes in to mount the same fs (via another map entry) the
| second mount can end up referencing an unmounted fs. You reproduce this with
| df /mountpoint1; sleep 300; df /mountpoint2 - you should see the second df
| displaying the root fs (might take a couple of goes.)
Hi Richard,
I am looking into this patch, but I am wondering is there a bugs database
for am-utils? I think it would be nice if reports like this did not get
lost in the noise of the mail... Erez?
Thanks,
christos
_______________________________________________
am-utils mailing list
http://www.fsl.cs.sunysb.edu/mailman/listinfo/am-utils
Ian Kent
2016-07-26 08:22:27 UTC
Permalink
Post by Richard P
There appears to be a race condition in amd where if an unmount is in progress
and then a request comes in to mount the same fs (via another map entry) the
second mount can end up referencing an unmounted fs. You reproduce this with
df /mountpoint1; sleep 300; df /mountpoint2 - you should see the second df
displaying the root fs (might take a couple of goes.)
Mmm .. I don't remember seeing this.

Sounds like a nasty problem.
Nothing like the suggestions below appear in the 6.2 tree.

Richard, any chance you could spend a little time to reproduce it on a 6.2
install?

Anyone have any thoughts on Richards problem description?
Post by Richard P
--- amd-6.1.5.orig/amd/autil.c 2009-06-04 11:06:44.000000000 -0400
+++ amd-6.1.5/amd/autil.c 2015-01-22 01:45:56.000000000 -0500
@@ -688,6 +688,11 @@
}
/*
+ * Clear the mounted flag in case there is a pending mount for the same fs
+ */
+ mf->mf_flags &= ~MFF_MOUNTED;
+
+ /*
* If this is a pseudo-directory then adjust the link count
* in the parent
*/
Also, while investigating this I found that amd does an unnecessary fork when
mounting a node that refers to an already-mounted fs. This is the proposed
--- amd-6.1.5.orig/amd/amfs_generic.c 2009-08-11 05:57:18.000000000 -0400
+++ amd-6.1.5/amd/amfs_generic.c 2015-01-22 03:02:16.000000000 -0500
@@ -844,6 +844,11 @@
#endif /* HAVE_FS_AUTOFS */
+#ifdef HAVE_FS_AUTOFS
+ if (mp->am_flags & AMF_AUTOFS || !(mf->mf_flags & MFF_MOUNTED)) {
+#else
+ if (!(mf->mf_flags & MFF_MOUNTED)) {
+#endif /* HAVE_FS_AUTOFS */
mf->mf_flags |= MFF_MOUNTING;
if (mf->mf_fsflags & FS_MBACKGROUND) {
dlog("backgrounding mount of \"%s\"", mf->mf_mount);
@@ -861,6 +866,7 @@
}
mf->mf_flags &= ~MFF_MOUNTING;
+ }
if (this_error > 0)
goto failed;
if (this_error == 0) {
--- amd-6.1.5.orig/amd/map.c 2009-08-11 05:57:18.000000000 -0400
+++ amd-6.1.5/amd/map.c 2015-01-22 02:39:03.000000000 -0500
@@ -765,8 +765,7 @@
error = autofs_mount_fs(mp, mf);
else
#endif /* HAVE_FS_AUTOFS */
- if (!(mf->mf_flags & MFF_MOUNTED))
- error = mf->mf_ops->mount_fs(mp, mf);
+ error = mf->mf_ops->mount_fs(mp, mf);
if (error > 0)
dlog("mount_node: call to mf_ops->mount_fs(%s) failed: %s",
Not sure if these patches are correct but we'd definitely like to see
a fix for the race condition. I've included some (bowdlerized) logs below
Jan 20 08:16:33 client amd[4623]: Timing out automount points...
Jan 20 08:16:33 client amd[4623]: "/mountpoint1" on /a/server/share timed out
Jan 20 08:16:33 client amd[4623]: Trying unmount in background
Jan 20 08:16:33 client amd[4623]: forked process 33532
Jan 20 08:16:33 client amd[33532]: backgrounded
Jan 20 08:16:33 client amd[33532]: backgrounded
Jan 20 08:16:33 client amd[4623]: Next mount timeout in 120s
Jan 20 08:16:33 client amd[33532]: Unmounting </mountpoint1>
</a/server/share> (server:/share) flags 15
Jan 20 08:16:33 client amd[33532]: attempting nfs umount
Jan 20 08:16:33 client amd[4623]: Discarding filehandle for server:/share
Jan 20 08:16:33 client amd[4623]: Select waits for 120s
Jan 20 08:16:33 client amd[33532]: Trying unmount(/a/server/share)
Jan 20 08:16:33 client amd[33532]: unlock_mntlist: releasing
-%-
Jan 20 08:16:33 client amd[4623]: NFS: mounting remote server
"server", remote fs "/share" on "/a/server/share"
Jan 20 08:16:33 client amd[4623]: Locating mntfs reference to
(/a/server/share,server:/share)
Jan 20 08:16:33 client amd[4623]: mf->mf_flags = 0x115
Jan 20 08:16:33 client amd[4623]: Got a hit with nfs
Jan 20 08:16:33 client amd[4623]: free_mntfs <//nil//> type error
mf_refc 1 flags 0
Jan 20 08:16:33 client amd[4623]: Immediately discarding mntfs for //nil//
Jan 20 08:16:33 client amd[4623]: in amfs_generic_mount_child
Jan 20 08:16:33 client amd[4623]: free_loc 0x2999f50
Jan 20 08:16:33 client amd[4623]: free_mntfs <//nil//> type error
mf_refc 1 flags 0
Jan 20 08:16:33 client amd[4623]: Immediately discarding mntfs for //nil//
Jan 20 08:16:33 client amd[4623]: mount of "server:/share" already pending
Jan 20 08:16:33 client amd[4623]: will retry ...
Jan 20 08:16:33 client amd[4623]: Arranging to retry mount of /mountpoint2
Jan 20 08:16:33 client amd[4623]: SLEEP on 0x36f7230
Jan 20 08:16:33 client amd[4623]: Saving RPC transport for /mountpoint2
Jan 20 08:16:33 client amd[4623]: Timing out automount points...
Jan 20 08:16:33 client amd[4623]: Next mount timeout in 217s
Jan 20 08:16:33 client amd[4623]: Select waits for 2s
Jan 20 08:16:33 client amd[33532]: unmount(/a/server/share) succeeded
Jan 20 08:16:33 client amd[33532]: Finished unmount(/a/server/share)
Jan 20 08:16:33 client amd[33532]: unlock_mntlist: releasing
Jan 20 08:16:33 client amd[4623]: Process 33532 exited with status 0
Jan 20 08:16:33 client amd[4623]: select interrupted
Jan 20 08:16:33 client amd[4623]: free_loc 0x203c2b0
Jan 20 08:16:33 client amd[4623]: free_mntfs </a/server/share> type
nfs mf_refc 3 flags 11
Jan 20 08:16:33 client amd[4623]: Commencing retry for mount of /mountpoint2
Jan 20 08:16:33 client amd[4623]: free_loc 0x2658380
Jan 20 08:16:33 client amd[4623]: free_mntfs </a/server/share> type
nfs mf_refc 2 flags 11
Jan 20 08:16:33 client amd[4623]: duplicate mount of "server:/share" ...
Jan 20 08:16:33 client amd[4623]: backgrounding mount of "/a/server/share"
Jan 20 08:16:33 client amd[4623]: forked process 33534
Jan 20 08:16:33 client amd[33534]: backgrounded
Jan 20 08:16:33 client amd[4623]: Timing out automount points...
Jan 20 08:16:33 client amd[4623]: Next mount timeout in 217s
Jan 20 08:16:33 client amd[4623]: Select waits for 217s
Jan 20 08:16:33 client amd[4623]: Process 33534 exited with status 0
Jan 20 08:16:33 client amd[4623]: select interrupted
Jan 20 08:16:33 client amd[4623]: amfs_cont: '/mountpoint2'
Jan 20 08:16:33 client amd[4623]: Mounting /mountpoint2 returned success
Jan 20 08:16:33 client amd[4623]: server:/share referenced fstype nfs
on /a/server/share
Jan 20 08:16:33 client amd[4623]: /mountpoint2 set to timeout in 300 seconds
Jan 20 08:16:33 client amd[4623]: Quick reply sent for /a/server/share
Jan 20 08:16:33 client amd[4623]: free_continuation
Jan 20 08:16:33 client amd[4623]: free_loc 0x2658380
Jan 20 08:16:33 client amd[4623]: free_mntfs </a/server/share> type
nfs mf_refc 2 flags 11
_______________________________________________
am-utils mailing list
http://www.fsl.cs.sunysb.edu/mailman/listinfo/am-utils
Loading...