From: Eric Sandeen <sandeen@redhat.com>
Date: Mon, 24 Sep 2007 12:59:12 -0500
Subject: ext3: orphan list check on destroy_inode
Message-id: 46F7FAF0.4080906@redhat.com
O-Subject: [PATCH RHEL5.2] ext3: orphan list check on destroy_inode
Bugzilla: 269401

For Bugzilla Bug 269401: kernel oops on corrupted ext3 directory removal & unmount

I hit this problem while trying to reproduce another customer problem,
with a corrupted ext3 filesystem.  Seems worth taking, since the end
result is an oops.

The bug below references some debug code that he merged upstream, but I
don't think it's critical that we take it; IOW the below fix works fine
without his previous mod.

Tested on image attached to the bug.

Thanks,
-Eric

----------------------------------------

From: Vasily Averin <vvs@sw.ru>
Date: Mon, 16 Jul 2007 06:40:46 +0000 (-0700)
Subject: ext3/ext4: orphan list corruption due bad inode
X-Git-Url: http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Ftorvalds%2Flinux-2.6.git;a=commitdiff_plain;h=a6c15c2b0fbfd5c0a84f5f0e1e3f20f85d2b8692

ext3/ext4: orphan list corruption due bad inode

After the ext3 orphan list check was added to ext3_destroy_inode()
(please see my previous patch), the following situation was detected:

 EXT3-fs warning (device sda6): ext3_unlink: Deleting nonexistent file (37901290), 0
 Inode 00000101a15b7840: orphan list check failed!
 00000773 6f665f00 74616d72 00000573 65725f00 06737270 66000000 616d726f
 ...
 Call Trace:
 [<ffffffff80211ea9>] ext3_destroy_inode+0x79/0x90
 [<ffffffff801a2b16>] sys_unlink+0x126/0x1a0
 [<ffffffff80111479>] error_exit+0x0/0x81
 [<ffffffff80110aba>] system_call+0x7e/0x83

The first message says that the unlinked inode has i_nlink=0; ext3_unlink()
then adds this inode to the orphan list.

The second message means that this inode has not been removed from the
orphan list.  The inode dump showed that i_fop = &bad_file_ops, which can be
set only in make_bad_inode().

Then I found that ext3_read_inode() can call make_bad_inode() without any
error or warning message, for example in the following case:

...
	if (inode->i_nlink == 0) {
		if (inode->i_mode == 0 ||
		    !(EXT3_SB(inode->i_sb)->s_mount_state & EXT3_ORPHAN_FS)) {
			/* this inode is deleted */
			brelse (bh);
			goto bad_inode;
...

A bad inode can live for some time, and ext3_unlink() can add it to the
orphan list, but ext3_delete_inode() does not remove it from the orphan
list.  As a result we can have orphan list corruption detected in
ext3_destroy_inode().

However, it is not clear to me how to fix this issue correctly.

As far as I can see, is_bad_inode() is called after iget() in all places
except ext3_lookup() and ext3_get_parent().  I believe it makes sense to add
a bad inode check to these functions too and call iput() if a bad inode is
detected.

Signed-off-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Acked-by: Jarod Wilson <jwilson@redhat.com>
---
 fs/ext3/namei.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/fs/ext3/namei.c b/fs/ext3/namei.c
index 19a3e25..e97619c 100644
--- a/fs/ext3/namei.c
+++ b/fs/ext3/namei.c
@@ -1025,6 +1025,11 @@ static struct dentry *ext3_lookup(struct inode * dir, struct dentry *dentry, str
 
 		if (!inode)
 			return ERR_PTR(-EACCES);
+
+		if (is_bad_inode(inode)) {
+			iput(inode);
+			return ERR_PTR(-ENOENT);
+		}
 	}
 	return d_splice_alias(inode, dentry);
 }
@@ -1060,6 +1065,11 @@ struct dentry *ext3_get_parent(struct dentry *child)
 	if (!inode)
 		return ERR_PTR(-EACCES);
 
+	if (is_bad_inode(inode)) {
+		iput(inode);
+		return ERR_PTR(-ENOENT);
+	}
+
 	parent = d_alloc_anon(inode);
 	if (!parent) {
 		iput(inode);
-- 
1.5.3.5.645.gbb47
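
Reviewer note: the check added in both hunks follows one pattern: take the
reference, test for a bad inode, and drop the reference before returning
-ENOENT.  The following is only a user-space sketch of that pattern, not
kernel code.  Every model_* name is hypothetical, the "bad" flag merely
stands in for i_fop == &bad_file_ops, and the error is returned as a plain
int instead of ERR_PTR().

```c
/* User-space model of the lookup-side check; all model_* names are
 * hypothetical and exist only for this illustration. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_inode {
	int  refcount;	/* stands in for the inode reference count */
	bool bad;	/* stands in for i_fop == &bad_file_ops */
};

/* Stand-in for is_bad_inode(). */
static bool model_is_bad_inode(const struct model_inode *inode)
{
	return inode->bad;
}

/* Stand-in for iget(): takes a reference if the inode exists. */
static struct model_inode *model_iget(struct model_inode *inode)
{
	if (inode)
		inode->refcount++;
	return inode;
}

/* Stand-in for iput(): drops one reference. */
static void model_iput(struct model_inode *inode)
{
	assert(inode->refcount > 0);
	inode->refcount--;
}

/*
 * Mirrors the control flow of the patched lookup path: returns 0 and
 * stores the referenced inode in *out, or returns a negative errno
 * with *out set to NULL and no reference held.
 */
static int model_lookup(struct model_inode *found, struct model_inode **out)
{
	struct model_inode *inode = model_iget(found);

	*out = NULL;
	if (!inode)
		return -EACCES;

	if (model_is_bad_inode(inode)) {
		model_iput(inode);	/* do not leak the reference */
		return -ENOENT;
	}

	*out = inode;
	return 0;
}
```

In this model a bad inode never reaches the caller, so nothing equivalent to
ext3_unlink() gets a chance to put it on an orphan list, and the reference
count is back at its starting value after the failed lookup.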