- Mar 22, 2023
-
-
Jan Kara via Ocfs2-devel authored
commit 90410bcf upstream. When buffered write fails to copy data into underlying page cache page, ocfs2_write_end_nolock() just zeroes out and dirties the page. This can leave dirty page beyond EOF and if page writeback tries to write this page before write succeeds and expands i_size, page gets into inconsistent state where page dirty bit is clear but buffer dirty bits stay set resulting in page data never getting written and so data copied to the page is lost. Fix the problem by invalidating page beyond EOF after failed write. Link: https://lkml.kernel.org/r/20230302153843.18499-1-jack@suse.cz Fixes: 6dbf7bb5 ("fs: Don't invalidate page buffers in block_write_full_page()") Signed-off-by:
Jan Kara <jack@suse.cz> Reviewed-by:
Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paulo Alcantara authored
commit 6284e46b upstream. Use DFS root session whenever possible to get new DFS referrals otherwise we might end up with an IPC tcon (tcon->ses->tcon_ipc) that doesn't respond to them. It should be safe accessing @ses->dfs_root_ses directly in cifs_inval_name_dfs_link_error() as it has same lifetime as of @tcon. Signed-off-by:
Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org # 6.2 Signed-off-by:
Steve French <stfrench@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paulo Alcantara authored
commit f446a630 upstream. Return the DFS root session id in /proc/fs/cifs/DebugData to make it easier to track which IPC tcon was used to get new DFS referrals for a specific connection, and aids in debugging. A simple output of it would be Sessions: 1) Address: 192.168.1.13 Uses: 1 Capability: 0x300067 Session Status: 1 Security type: RawNTLMSSP SessionId: 0xd80000000009 User: 0 Cred User: 0 DFS root session id: 0x128006c000035 Signed-off-by:
Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org # 6.2 Signed-off-by:
Steve French <stfrench@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paulo Alcantara authored
commit 396935de upstream. The UAF bug occurred because we were putting DFS root sessions in cifs_umount() while DFS cache refresher was being executed. Make DFS root sessions have same lifetime as DFS tcons so we can avoid the use-after-free bug is DFS cache refresher and other places that require IPCs to get new DFS referrals on. Also, get rid of mount group handling in DFS cache as we no longer need it. This fixes below use-after-free bug catched by KASAN [ 379.946955] BUG: KASAN: use-after-free in __refresh_tcon.isra.0+0x10b/0xc10 [cifs] [ 379.947642] Read of size 8 at addr ffff888018f57030 by task kworker/u4:3/56 [ 379.948096] [ 379.948208] CPU: 0 PID: 56 Comm: kworker/u4:3 Not tainted 6.2.0-rc7-lku #23 [ 379.948661] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552-rebuilt.opensuse.org 04/01/2014 [ 379.949368] Workqueue: cifs-dfscache refresh_cache_worker [cifs] [ 379.949942] Call Trace: [ 379.950113] <TASK> [ 379.950260] dump_stack_lvl+0x50/0x67 [ 379.950510] print_report+0x16a/0x48e [ 379.950759] ? __virt_addr_valid+0xd8/0x160 [ 379.951040] ? __phys_addr+0x41/0x80 [ 379.951285] kasan_report+0xdb/0x110 [ 379.951533] ? __refresh_tcon.isra.0+0x10b/0xc10 [cifs] [ 379.952056] ? __refresh_tcon.isra.0+0x10b/0xc10 [cifs] [ 379.952585] __refresh_tcon.isra.0+0x10b/0xc10 [cifs] [ 379.953096] ? __pfx___refresh_tcon.isra.0+0x10/0x10 [cifs] [ 379.953637] ? __pfx___mutex_lock+0x10/0x10 [ 379.953915] ? lock_release+0xb6/0x720 [ 379.954167] ? __pfx_lock_acquire+0x10/0x10 [ 379.954443] ? refresh_cache_worker+0x34e/0x6d0 [cifs] [ 379.954960] ? __pfx_wb_workfn+0x10/0x10 [ 379.955239] refresh_cache_worker+0x4ad/0x6d0 [cifs] [ 379.955755] ? __pfx_refresh_cache_worker+0x10/0x10 [cifs] [ 379.956323] ? __pfx_lock_acquired+0x10/0x10 [ 379.956615] ? read_word_at_a_time+0xe/0x20 [ 379.956898] ? lockdep_hardirqs_on_prepare+0x12/0x220 [ 379.957235] process_one_work+0x535/0x990 [ 379.957509] ? __pfx_process_one_work+0x10/0x10 [ 379.957812] ? lock_acquired+0xb7/0x5f0 [ 379.958069] ? __list_add_valid+0x37/0xd0 [ 379.958341] ? __list_add_valid+0x37/0xd0 [ 379.958611] worker_thread+0x8e/0x630 [ 379.958861] ? __pfx_worker_thread+0x10/0x10 [ 379.959148] kthread+0x17d/0x1b0 [ 379.959369] ? __pfx_kthread+0x10/0x10 [ 379.959630] ret_from_fork+0x2c/0x50 [ 379.959879] </TASK> Signed-off-by:
Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org # 6.2 Signed-off-by:
Steve French <stfrench@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paulo Alcantara authored
commit b56bce50 upstream. Set the DFS root session pointer earlier when creating a new SMB session to prevent racing with smb2_reconnect(), cifs_reconnect_tcon() and DFS cache refresher. Signed-off-by:
Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org # 6.2 Signed-off-by:
Steve French <stfrench@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Volker Lendecke authored
commit 211baef0 upstream. If cifs_get_writable_path() finds a writable file, smb2_compound_op() must use that file's FID and not the COMPOUND_FID. Cc: stable@vger.kernel.org Signed-off-by:
Volker Lendecke <vl@samba.org> Reviewed-by:
Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by:
Steve French <stfrench@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Shyam Prasad N authored
commit 05ce0448 upstream. Before my changes to how multichannel reconnects work, the primary channel was always used to do a non-binding session setup. With my changes, that is not the case anymore. Missed this place where channel at index 0 was forcibly updated with the signing key. Signed-off-by:
Shyam Prasad N <sprasad@microsoft.com> Reviewed-by:
Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org Signed-off-by:
Steve French <stfrench@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Theodore Ts'o authored
commit 70e42fea upstream. Fixes: 0813299c ("ext4: Fix possible corruption when moving a directory") Link: https://lore.kernel.org/r/5efbe1b9-ad8b-4a4f-b422-24824d2b775c@kili.mountain Reported-by:
Dan Carpenter <error27@gmail.com> Reported-by:
<syzbot+0c73d1d8b952c5f3d714@syzkaller.appspotmail.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Baokun Li authored
[ Upstream commit 0f7bfd6f ] Syzbot reported a hung task problem: ================================================================== INFO: task syz-executor232:5073 blocked for more than 143 seconds. Not tainted 6.2.0-rc2-syzkaller-00024-g512dee0c00ad #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-exec232 state:D stack:21024 pid:5073 ppid:5072 flags:0x00004004 Call Trace: <TASK> context_switch kernel/sched/core.c:5244 [inline] __schedule+0x995/0xe20 kernel/sched/core.c:6555 schedule+0xcb/0x190 kernel/sched/core.c:6631 __wait_on_freeing_inode fs/inode.c:2196 [inline] find_inode_fast+0x35a/0x4c0 fs/inode.c:950 iget_locked+0xb1/0x830 fs/inode.c:1273 __ext4_iget+0x22e/0x3ed0 fs/ext4/inode.c:4861 ext4_xattr_inode_iget+0x68/0x4e0 fs/ext4/xattr.c:389 ext4_xattr_inode_dec_ref_all+0x1a7/0xe50 fs/ext4/xattr.c:1148 ext4_xattr_delete_inode+0xb04/0xcd0 fs/ext4/xattr.c:2880 ext4_evict_inode+0xd7c/0x10b0 fs/ext4/inode.c:296 evict+0x2a4/0x620 fs/inode.c:664 ext4_orphan_cleanup+0xb60/0x1340 fs/ext4/orphan.c:474 __ext4_fill_super fs/ext4/super.c:5516 [inline] ext4_fill_super+0x81cd/0x8700 fs/ext4/super.c:5644 get_tree_bdev+0x400/0x620 fs/super.c:1282 vfs_get_tree+0x88/0x270 fs/super.c:1489 do_new_mount+0x289/0xad0 fs/namespace.c:3145 do_mount fs/namespace.c:3488 [inline] __do_sys_mount fs/namespace.c:3697 [inline] __se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fa5406fd5ea RSP: 002b:00007ffc7232f968 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fa5406fd5ea RDX: 0000000020000440 RSI: 0000000020000000 RDI: 00007ffc7232f970 RBP: 00007ffc7232f970 R08: 00007ffc7232f9b0 R09: 0000000000000432 R10: 0000000000804a03 R11: 0000000000000202 R12: 0000000000000004 R13: 0000555556a7a2c0 R14: 00007ffc7232f9b0 R15: 0000000000000000 </TASK> ================================================================== The problem is that the inode contains an xattr entry with ea_inum of 15 when cleaning up an orphan inode <15>. When evict inode <15>, the reference counting of the corresponding EA inode is decreased. When EA inode <15> is found by find_inode_fast() in __ext4_iget(), it is found that the EA inode holds the I_FREEING flag and waits for the EA inode to complete deletion. As a result, when inode <15> is being deleted, we wait for inode <15> to complete the deletion, resulting in an infinite loop and triggering Hung Task. To solve this problem, we only need to check whether the ino of EA inode and parent is the same before getting EA inode. Link: https://syzkaller.appspot.com/bug?extid=77d6fcc37bbb92f26048 Reported-by:
<syzbot+77d6fcc37bbb92f26048@syzkaller.appspotmail.com> Signed-off-by:
Baokun Li <libaokun1@huawei.com> Reviewed-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230110133436.996350-1-libaokun1@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Baokun Li authored
[ Upstream commit 3039d8b8 ] When mounting a crafted ext4 image, s_journal_inum may change after journal replay, which is obviously unreasonable because we have successfully loaded and replayed the journal through the old s_journal_inum. And the new s_journal_inum bypasses some of the checks in ext4_get_journal(), which may trigger a null pointer dereference problem. So if s_journal_inum changes after the journal replay, we ignore the change, and rewrite the current journal_inum to the superblock. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216541 Reported-by:
Luís Henriques <lhenriques@suse.de> Signed-off-by:
Baokun Li <libaokun1@huawei.com> Reviewed-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230107032126.4165860-3-libaokun1@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Zhang Xiaoxu authored
[ Upstream commit d0dc4111 ] When send SMB_COM_NT_CANCEL and RFC1002_SESSION_REQUEST, the in_send statistic was lost. Let's move the in_send statistic to the send function to avoid this scenario. Fixes: 7ee1af76 ("[CIFS]") Signed-off-by:
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by:
Steve French <stfrench@microsoft.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
- Mar 17, 2023
-
-
Seth Forshee authored
commit 42d0c4bd upstream. A user should be allowed to take out a lease via an idmapped mount if the fsuid matches the mapped uid of the inode. generic_setlease() is checking the unmapped inode uid, causing these operations to be denied. Fix this by comparing against the mapped inode uid instead of the unmapped uid. Fixes: 9caccd41 ("fs: introduce MOUNT_ATTR_IDMAP") Cc: stable@vger.kernel.org Signed-off-by:
Seth Forshee (DigitalOcean) <sforshee@kernel.org> Signed-off-by:
Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Kara authored
[ Upstream commit 3c92792d ] As lockdep properly warns, we should not be locking i_rwsem while having transactions started as the proper lock ordering used by all directory handling operations is i_rwsem -> transaction start. Fix the lock ordering by moving the locking of the directory earlier in ext4_rename(). Reported-by:
<syzbot+9d16c39efb5fade84574@syzkaller.appspotmail.com> Fixes: 0813299c ("ext4: Fix possible corruption when moving a directory") Link: https://syzkaller.appspot.com/bug?extid=9d16c39efb5fade84574 Signed-off-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230301141004.15087-1-jack@suse.cz Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Gao Xiang authored
[ Upstream commit 647dd2c3 ] Let's revert commit 12724ba3 ("erofs: fix kvcalloc() misuse with __GFP_NOFAIL") since kvmalloc() already supports __GFP_NOFAIL in commit a421ef30 ("mm: allow !GFP_KERNEL allocations for kvmalloc"). So the original fix was wrong. Actually there was some issue as [1] discussed, so before that mm fix is landed, the warn could still happen but applying this commit first will cause less. [1] https://lore.kernel.org/r/20230305053035.1911-1-hsiangkao@linux.alibaba.com Fixes: 12724ba3 ("erofs: fix kvcalloc() misuse with __GFP_NOFAIL") Reviewed-by:
Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230309053148.9223-1-hsiangkao@linux.alibaba.com Signed-off-by:
Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Chuck Lever authored
[ Upstream commit fd9a2e1d ] Flole observes this WARNING on occasion: [1210423.486503] WARNING: CPU: 8 PID: 1524732 at fs/ext4/ext4_jbd2.c:75 ext4_journal_check_start+0x68/0xb0 Reported-by:
<flole@flole.de> Suggested-by:
Jan Kara <jack@suse.cz> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217123 Fixes: 73da852e ("nfsd: use vfs_iter_read/write") Reviewed-by:
Jeff Layton <jlayton@kernel.org> Reviewed-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Filipe Manana authored
[ Upstream commit e4cc1483 ] At btrfs_drop_extent_map_range() we are clearing the EXTENT_FLAG_LOGGING bit on a 'flags' variable that was not initialized. This makes static checkers complain about it, so initialize the 'flags' variable before clearing the bit. In practice this has no consequences, because EXTENT_FLAG_LOGGING should not be set when btrfs_drop_extent_map_range() is called, as an fsync locks the inode in exclusive mode, locks the inode's mmap semaphore in exclusive mode too and it always flushes all delalloc. Also add a comment about why we clear EXTENT_FLAG_LOGGING on a copy of the flags of the split extent map. Reported-by:
Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/linux-btrfs/Y%2FyipSVozUDEZKow@kili/ Fixes: db21370b ("btrfs: drop extent map range more efficiently") Signed-off-by:
Filipe Manana <fdmanana@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jan Kara authored
[ Upstream commit 0813299c ] When we are renaming a directory to a different directory, we need to update '..' entry in the moved directory. However nothing prevents moved directory from being modified and even converted from the inline format to the normal format. When such race happens the rename code gets confused and we crash. Fix the problem by locking the moved directory. CC: stable@vger.kernel.org Fixes: 32f7f22c ("ext4: let ext4_rename handle inline dir") Signed-off-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230126112221.11866-1-jack@suse.cz Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jan Kara authored
[ Upstream commit f54aa97f ] The condition determining whether the preallocation can be used had an off-by-one error so we didn't discard preallocation when new allocation was just following it. This can then confuse code in inode_getblk(). CC: stable@vger.kernel.org Fixes: 16d05565 ("udf: Discard preallocation before extending file with a hole") Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Zhihao Cheng authored
commit f5361da1 upstream. If the boot loader inode has never been used before, the EXT4_IOC_SWAP_BOOT inode will initialize it, including setting the i_size to 0. However, if the "never before used" boot loader has a non-zero i_size, then i_disksize will be non-zero, and the inconsistency between i_size and i_disksize can trigger a kernel warning: WARNING: CPU: 0 PID: 2580 at fs/ext4/file.c:319 CPU: 0 PID: 2580 Comm: bb Not tainted 6.3.0-rc1-00004-g703695902cfa RIP: 0010:ext4_file_write_iter+0xbc7/0xd10 Call Trace: vfs_write+0x3b1/0x5c0 ksys_write+0x77/0x160 __x64_sys_write+0x22/0x30 do_syscall_64+0x39/0x80 Reproducer: 1. create corrupted image and mount it: mke2fs -t ext4 /tmp/foo.img 200 debugfs -wR "sif <5> size 25700" /tmp/foo.img mount -t ext4 /tmp/foo.img /mnt cd /mnt echo 123 > file 2. Run the reproducer program: posix_memalign(&buf, 1024, 1024) fd = open("file", O_RDWR | O_DIRECT); ioctl(fd, EXT4_IOC_SWAP_BOOT); write(fd, buf, 1024); Fix this by setting i_disksize as well as i_size to zero when initiaizing the boot loader inode. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217159 Cc: stable@kernel.org Signed-off-by:
Zhihao Cheng <chengzhihao1@huawei.com> Link: https://lore.kernel.org/r/20230308032643.641113-1-chengzhihao1@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ye Bin authored
commit 2b96b4a5 upstream. Syzbot found the following issue: EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000 without journal. Quota mode: none. fscrypt: AES-256-CTS-CBC using implementation "cts-cbc-aes-aesni" fscrypt: AES-256-XTS using implementation "xts-aes-aesni" ------------[ cut here ]------------ WARNING: CPU: 0 PID: 5071 at mm/page_alloc.c:5525 __alloc_pages+0x30a/0x560 mm/page_alloc.c:5525 Modules linked in: CPU: 1 PID: 5071 Comm: syz-executor263 Not tainted 6.2.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 RIP: 0010:__alloc_pages+0x30a/0x560 mm/page_alloc.c:5525 RSP: 0018:ffffc90003c2f1c0 EFLAGS: 00010246 RAX: ffffc90003c2f220 RBX: 0000000000000014 RCX: 0000000000000000 RDX: 0000000000000028 RSI: 0000000000000000 RDI: ffffc90003c2f248 RBP: ffffc90003c2f2d8 R08: dffffc0000000000 R09: ffffc90003c2f220 R10: fffff52000785e49 R11: 1ffff92000785e44 R12: 0000000000040d40 R13: 1ffff92000785e40 R14: dffffc0000000000 R15: 1ffff92000785e3c FS: 0000555556c0d300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f95d5e04138 CR3: 00000000793aa000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __alloc_pages_node include/linux/gfp.h:237 [inline] alloc_pages_node include/linux/gfp.h:260 [inline] __kmalloc_large_node+0x95/0x1e0 mm/slab_common.c:1113 __do_kmalloc_node mm/slab_common.c:956 [inline] __kmalloc+0xfe/0x190 mm/slab_common.c:981 kmalloc include/linux/slab.h:584 [inline] kzalloc include/linux/slab.h:720 [inline] ext4_update_inline_data+0x236/0x6b0 fs/ext4/inline.c:346 ext4_update_inline_dir fs/ext4/inline.c:1115 [inline] ext4_try_add_inline_entry+0x328/0x990 fs/ext4/inline.c:1307 ext4_add_entry+0x5a4/0xeb0 fs/ext4/namei.c:2385 ext4_add_nondir+0x96/0x260 fs/ext4/namei.c:2772 ext4_create+0x36c/0x560 fs/ext4/namei.c:2817 lookup_open fs/namei.c:3413 [inline] open_last_lookups fs/namei.c:3481 [inline] path_openat+0x12ac/0x2dd0 fs/namei.c:3711 do_filp_open+0x264/0x4f0 fs/namei.c:3741 do_sys_openat2+0x124/0x4e0 fs/open.c:1310 do_sys_open fs/open.c:1326 [inline] __do_sys_openat fs/open.c:1342 [inline] __se_sys_openat fs/open.c:1337 [inline] __x64_sys_openat+0x243/0x290 fs/open.c:1337 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Above issue happens as follows: ext4_iget ext4_find_inline_data_nolock ->i_inline_off=164 i_inline_size=60 ext4_try_add_inline_entry __ext4_mark_inode_dirty ext4_expand_extra_isize_ea ->i_extra_isize=32 s_want_extra_isize=44 ext4_xattr_shift_entries ->after shift i_inline_off is incorrect, actually is change to 176 ext4_try_add_inline_entry ext4_update_inline_dir get_max_inline_xattr_value_size if (EXT4_I(inode)->i_inline_off) entry = (struct ext4_xattr_entry *)((void *)raw_inode + EXT4_I(inode)->i_inline_off); free += EXT4_XATTR_SIZE(le32_to_cpu(entry->e_value_size)); ->As entry is incorrect, then 'free' may be negative ext4_update_inline_data value = kzalloc(len, GFP_NOFS); -> len is unsigned int, maybe very large, then trigger warning when 'kzalloc()' To resolve the above issue we need to update 'i_inline_off' after 'ext4_xattr_shift_entries()'. We do not need to set EXT4_STATE_MAY_INLINE_DATA flag here, since ext4_mark_inode_dirty() already sets this flag if needed. Setting EXT4_STATE_MAY_INLINE_DATA when it is needed may trigger a BUG_ON in ext4_writepages(). Reported-by:
<syzbot+d30838395804afc2fa6f@syzkaller.appspotmail.com> Cc: stable@kernel.org Signed-off-by:
Ye Bin <yebin10@huawei.com> Reviewed-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230307015253.2232062-3-yebin@huaweicloud.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ye Bin authored
commit 1dcdce59 upstream. The only caller of ext4_find_inline_data_nolock() that needs setting of EXT4_STATE_MAY_INLINE_DATA flag is ext4_iget_extra_inode(). In ext4_write_inline_data_end() we just need to update inode->i_inline_off. Since we are going to add one more caller that does not need to set EXT4_STATE_MAY_INLINE_DATA, just move setting of EXT4_STATE_MAY_INLINE_DATA out to ext4_iget_extra_inode(). Signed-off-by:
Ye Bin <yebin10@huawei.com> Cc: stable@kernel.org Reviewed-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230307015253.2232062-2-yebin@huaweicloud.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Darrick J. Wong authored
commit c993799b upstream. Apparently syzbot figured out that issuing this FSMAP call: struct fsmap_head cmd = { .fmh_count = ...; .fmh_keys = { { .fmr_device = /* ext4 dev */, .fmr_physical = 0, }, { .fmr_device = /* ext4 dev */, .fmr_physical = 0, }, }, ... }; ret = ioctl(fd, FS_IOC_GETFSMAP, &cmd); Produces this crash if the underlying filesystem is a 1k-block ext4 filesystem: kernel BUG at fs/ext4/ext4.h:3331! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 3 PID: 3227965 Comm: xfs_io Tainted: G W O 6.2.0-rc8-achx Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 RIP: 0010:ext4_mb_load_buddy_gfp+0x47c/0x570 [ext4] RSP: 0018:ffffc90007c03998 EFLAGS: 00010246 RAX: ffff888004978000 RBX: ffffc90007c03a20 RCX: ffff888041618000 RDX: 0000000000000000 RSI: 00000000000005a4 RDI: ffffffffa0c99b11 RBP: ffff888012330000 R08: ffffffffa0c2b7d0 R09: 0000000000000400 R10: ffffc90007c03950 R11: 0000000000000000 R12: 0000000000000001 R13: 00000000ffffffff R14: 0000000000000c40 R15: ffff88802678c398 FS: 00007fdf2020c880(0000) GS:ffff88807e100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffd318a5fe8 CR3: 000000007f80f001 CR4: 00000000001706e0 Call Trace: <TASK> ext4_mballoc_query_range+0x4b/0x210 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] ext4_getfsmap_datadev+0x713/0x890 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] ext4_getfsmap+0x2b7/0x330 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] ext4_ioc_getfsmap+0x153/0x2b0 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] __ext4_ioctl+0x2a7/0x17e0 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] __x64_sys_ioctl+0x82/0xa0 do_syscall_64+0x2b/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7fdf20558aff RSP: 002b:00007ffd318a9e30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00000000000200c0 RCX: 00007fdf20558aff RDX: 00007fdf1feb2010 RSI: 00000000c0c0583b RDI: 0000000000000003 RBP: 00005625c0634be0 R08: 00005625c0634c40 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fdf1feb2010 R13: 00005625be70d994 R14: 0000000000000800 R15: 0000000000000000 For GETFSMAP calls, the caller selects a physical block device by writing its block number into fsmap_head.fmh_keys[01].fmr_device. To query mappings for a subrange of the device, the starting byte of the range is written to fsmap_head.fmh_keys[0].fmr_physical and the last byte of the range goes in fsmap_head.fmh_keys[1].fmr_physical. IOWs, to query what mappings overlap with bytes 3-14 of /dev/sda, you'd set the inputs as follows: fmh_keys[0] = { .fmr_device = major(8, 0), .fmr_physical = 3}, fmh_keys[1] = { .fmr_device = major(8, 0), .fmr_physical = 14}, Which would return you whatever is mapped in the 12 bytes starting at physical offset 3. The crash is due to insufficient range validation of keys[1] in ext4_getfsmap_datadev. On 1k-block filesystems, block 0 is not part of the filesystem, which means that s_first_data_block is nonzero. ext4_get_group_no_and_offset subtracts this quantity from the blocknr argument before cracking it into a group number and a block number within a group. IOWs, block group 0 spans blocks 1-8192 (1-based) instead of 0-8191 (0-based) like what happens with larger blocksizes. The net result of this encoding is that blocknr < s_first_data_block is not a valid input to this function. The end_fsb variable is set from the keys that are copied from userspace, which means that in the above example, its value is zero. That leads to an underflow here: blocknr = blocknr - le32_to_cpu(es->s_first_data_block); The division then operates on -1: offset = do_div(blocknr, EXT4_BLOCKS_PER_GROUP(sb)) >> EXT4_SB(sb)->s_cluster_bits; Leaving an impossibly large group number (2^32-1) in blocknr. ext4_getfsmap_check_keys checked that keys[0].fmr_physical and keys[1].fmr_physical are in increasing order, but ext4_getfsmap_datadev adjusts keys[0].fmr_physical to be at least s_first_data_block. This implies that we have to check it again after the adjustment, which is the piece that I forgot. Reported-by:
<syzbot+6be2b977c89f79b6b153@syzkaller.appspotmail.com> Fixes: 4a495624 ("ext4: fix off-by-one fsmap error on 1k block filesystems") Link: https://syzkaller.appspot.com/bug?id=79d5768e9bfe362911ac1a5057a36fc6b5c30002 Cc: stable@vger.kernel.org Signed-off-by:
Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/Y+58NPTH7VNGgzdd@magnolia Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Whitney authored
commit c9f62c8b upstream. A significant number of xfstests can cause ext4 to log one or more warning messages when they are run on a test file system where the inline_data feature has been enabled. An example: "EXT4-fs warning (device vdc): ext4_dirblock_csum_set:425: inode #16385: comm fsstress: No space for directory leaf checksum. Please run e2fsck -D." The xfstests include: ext4/057, 058, and 307; generic/013, 051, 068, 070, 076, 078, 083, 232, 269, 270, 390, 461, 475, 476, 482, 579, 585, 589, 626, 631, and 650. In this situation, the warning message indicates a bug in the code that performs the RENAME_WHITEOUT operation on a directory entry that has been stored inline. It doesn't detect that the directory is stored inline, and incorrectly attempts to compute a dirent block checksum on the whiteout inode when creating it. This attempt fails as a result of the integrity checking in get_dirent_tail (usually due to a failure to match the EXT4_FT_DIR_CSUM magic cookie), and the warning message is then emitted. Fix this by simply collecting the inlined data state at the time the search for the source directory entry is performed. Existing code handles the rest, and this is sufficient to eliminate all spurious warning messages produced by the tests above. Go one step further and do the same in the code that resets the source directory entry in the event of failure. The inlined state should be present in the "old" struct, but given the possibility of a race there's no harm in taking a conservative approach and getting that information again since the directory entry is being reread anyway. Fixes: b7ff91fd ("ext4: find old entry again if failed to rename whiteout") Cc: stable@kernel.org Signed-off-by:
Eric Whitney <enwlinux@gmail.com> Reviewed-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230210173244.679890-1-enwlinux@gmail.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Biggers authored
commit ffec85d5 upstream. When writing a page from an encrypted file that is using filesystem-layer encryption (not inline encryption), ext4 encrypts the pagecache page into a bounce page, then writes the bounce page. It also passes the bounce page to wbc_account_cgroup_owner(). That's incorrect, because the bounce page is a newly allocated temporary page that doesn't have the memory cgroup of the original pagecache page. This makes wbc_account_cgroup_owner() not account the I/O to the owner of the pagecache page as it should. Fix this by always passing the pagecache page to wbc_account_cgroup_owner(). Fixes: 001e4a87 ("ext4: implement cgroup writeback support") Cc: stable@vger.kernel.org Reported-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by:
Eric Biggers <ebiggers@google.com> Acked-by:
Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20230203005503.141557-1-ebiggers@kernel.org Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Gao Xiang authored
commit 8f121dfb upstream. As the call trace shown, the root cause is kunmap incorrect pages: BUG: kernel NULL pointer dereference, address: 00000000 CPU: 1 PID: 40 Comm: kworker/u5:0 Not tainted 6.2.0-rc5 #4 Workqueue: erofs_worker z_erofs_decompressqueue_work EIP: z_erofs_lzma_decompress+0x34b/0x8ac z_erofs_decompress+0x12/0x14 z_erofs_decompress_queue+0x7e7/0xb1c z_erofs_decompressqueue_work+0x32/0x60 process_one_work+0x24b/0x4d8 ? process_one_work+0x1a4/0x4d8 worker_thread+0x14c/0x3fc kthread+0xe6/0x10c ? rescuer_thread+0x358/0x358 ? kthread_complete_and_exit+0x18/0x18 ret_from_fork+0x1c/0x28 ---[ end trace 0000000000000000 ]--- The bug is trivial and should be fixed now. It has no impact on !HIGHMEM platforms. Fixes: 622ceadd ("erofs: lzma compression support") Cc: <stable@vger.kernel.org> # 5.16+ Reviewed-by:
Yue Hu <huyue2@coolpad.com> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230305134455.88236-1-hsiangkao@linux.alibaba.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Filipe Manana authored
commit 675dfe12 upstream. We can often end up inserting a block group item, for a new block group, with a wrong value for the used bytes field. This happens if for the new allocated block group, in the same transaction that created the block group, we have tasks allocating extents from it as well as tasks removing extents from it. For example: 1) Task A creates a metadata block group X; 2) Two extents are allocated from block group X, so its "used" field is updated to 32K, and its "commit_used" field remains as 0; 3) Transaction commit starts, by some task B, and it enters btrfs_start_dirty_block_groups(). There it tries to update the block group item for block group X, which currently has its "used" field with a value of 32K. But that fails since the block group item was not yet inserted, and so on failure update_block_group_item() sets the "commit_used" field of the block group back to 0; 4) The block group item is inserted by task A, when for example btrfs_create_pending_block_groups() is called when releasing its transaction handle. This results in insert_block_group_item() inserting the block group item in the extent tree (or block group tree), with a "used" field having a value of 32K, but without updating the "commit_used" field in the block group, which remains with value of 0; 5) The two extents are freed from block X, so its "used" field changes from 32K to 0; 6) The transaction commit by task B continues, it enters btrfs_write_dirty_block_groups() which calls update_block_group_item() for block group X, and there it decides to skip the block group item update, because "used" has a value of 0 and "commit_used" has a value of 0 too. As a result, we end up with a block item having a 32K "used" field but no extents allocated from it. When this issue happens, a btrfs check reports an error like this: [1/7] checking root items [2/7] checking extents block group [1104150528 1073741824] used 39796736 but extent items used 0 ERROR: errors found in extent allocation tree or chunk allocation (...) Fix this by making insert_block_group_item() update the block group's "commit_used" field. Fixes: 7248e0ce ("btrfs: skip update of block group item if used bytes are the same") CC: stable@vger.kernel.org # 6.2+ Reviewed-by:
Qu Wenruo <wqu@suse.com> Signed-off-by:
Filipe Manana <fdmanana@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Johannes Thumshirn authored
commit 95cd356c upstream. We have a report, that the info message for block-group reclaim is crossing the 100% used mark. This is happening as we were truncating the divisor for the division (the block_group->length) to a 32bit value. Fix this by using div64_u64() to not truncate the divisor. In the worst case, it can lead to a div by zero error and should be possible to trigger on 4 disks RAID0, and each device is large enough: $ mkfs.btrfs -f /dev/test/scratch[1234] -m raid1 -d raid0 btrfs-progs v6.1 [...] Filesystem size: 40.00GiB Block group profiles: Data: RAID0 4.00GiB <<< Metadata: RAID1 256.00MiB System: RAID1 8.00MiB Reported-by:
Forza <forza@tnonline.net> Link: https://lore.kernel.org/linux-btrfs/e99483.c11a58d.1863591ca52@tnonline.net/ Fixes: 5f93e776 ("btrfs: zoned: print unusable percentage when reclaiming block groups") CC: stable@vger.kernel.org # 5.15+ Reviewed-by:
Anand Jain <anand.jain@oracle.com> Reviewed-by:
Qu Wenruo <wqu@suse.com> Signed-off-by:
Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by:
David Sterba <dsterba@suse.com> [ add Qu's note ] Signed-off-by:
David Sterba <dsterba@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Naohiro Aota authored
commit 98e8d36a upstream. Current btrfs_log_dev_io_error() increases the read error count even if the erroneous IO is a WRITE request. This is because it forget to use "else if", and all the error WRITE requests counts as READ error as there is (of course) no REQ_RAHEAD bit set. Fixes: c3a62baf ("btrfs: use chained bios when cloning") CC: stable@vger.kernel.org # 6.1+ Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by:
Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by:
David Sterba <dsterba@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Theodore Ts'o authored
commit 609d5444 upstream. Google-Bug-Id: 114199369 Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Mar 11, 2023
-
-
Zhang Yi authored
[ Upstream commit e3645d72 ] Current _ext4_show_options() do not distinguish MOPT_2 flag, so it mixed extend sbi->s_mount_opt2 options with sbi->s_mount_opt, it could lead to show incorrect options, e.g. show fc_debug_force if we mount with errors=continue mode and miss it if we set. $ mkfs.ext4 /dev/pmem0 $ mount -o errors=remount-ro /dev/pmem0 /mnt $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force #empty $ mount -o remount,errors=continue /mnt $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force fc_debug_force $ mount -o remount,errors=remount-ro,fc_debug_force /mnt $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force #empty Fixes: 995a3ed6 ("ext4: add fast_commit feature and handling for extended mount options") Signed-off-by:
Zhang Yi <yi.zhang@huawei.com> Reviewed-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230129034939.3702550-1-yi.zhang@huaweicloud.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Daeho Jeong authored
[ Upstream commit a46bebd5 ] To fix a race condition between atomic write aborts, I use the inode lock and make COW inode to be re-usable thoroughout the whole atomic file inode lifetime. Reported-by:
<syzbot+823000d23b3400619f7c@syzkaller.appspotmail.com> Fixes: 3db1de0e ("f2fs: change the current atomic write way") Signed-off-by:
Daeho Jeong <daehojeong@google.com> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Wang Jianjian authored
[ Upstream commit 934b0de1 ] If commit interval is 0, it means using default value. Fixes: 6e47a3cc ("ext4: get rid of super block and sbi from handle_mount_ops()") Signed-off-by:
Wang Jianjian <wangjianjian3@huawei.com> Link: https://lore.kernel.org/r/20221219015128.876717-1-wangjianjian3@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Eric Biggers authored
[ Upstream commit 11768cfd ] To avoid 'sparse' warnings about missing endianness conversions, don't store native endianness values into struct ext4_fc_tl. Instead, use a separate struct type, ext4_fc_tl_mem. Fixes: dcc58274 ("ext4: factor out ext4_fc_get_tl()") Cc: Ye Bin <yebin10@huawei.com> Signed-off-by:
Eric Biggers <ebiggers@google.com> Reviewed-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20221217050212.150665-1-ebiggers@kernel.org Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Yangtao Li authored
[ Upstream commit c5bf8348 ] For LFS mode, it should update outplace and no need inplace update. When using LFS mode for small-volume devices, IPU will not be used, and the OPU writing method is actually used, but F2FS_IPU_FORCE can be read from the ipu_policy node, which is different from the actual situation. And remount to lfs mode should be disallowed when f2fs ipu is enabled, let's fix it. Fixes: 84b89e5d ("f2fs: add auto tuning for small devices") Signed-off-by:
Yangtao Li <frank.li@vivo.com> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Yangtao Li authored
[ Upstream commit fdb7ccc3 ] IS_F2FS_IPU_* macro can be used to identify whether f2fs ipu related policies are enabled. BTW, convert to use BIT() instead of open code. Signed-off-by:
Yangtao Li <frank.li@vivo.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: c5bf8348 ("f2fs: fix to set ipu policy") Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Chao Yu authored
[ Upstream commit a84153f9 ] We should update age extent in f2fs_do_zero_range() like we did in f2fs_truncate_data_blocks_range(). Fixes: 71644dff ("f2fs: add block_age-based extent cache") Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Chao Yu authored
[ Upstream commit 8c0ed062 ] nr_free may be less than len, we should update age extent cache w/ range [fofs, len] rather than [fofs, nr_free]. Fixes: 71644dff ("f2fs: add block_age-based extent cache") Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Yangtao Li authored
[ Upstream commit 0dbbf0fb ] Add iotype sanity check to avoid potential memory corruption. This is to fix the compile error below: fs/f2fs/iostat.c:231 __update_iostat_latency() error: buffer overflow 'io_lat->peak_lat[type]' 3 <= 3 vim +228 fs/f2fs/iostat.c 211 static inline void __update_iostat_latency(struct bio_iostat_ctx *iostat_ctx, 212 enum iostat_lat_type type) 213 { 214 unsigned long ts_diff; 215 unsigned int page_type = iostat_ctx->type; 216 struct f2fs_sb_info *sbi = iostat_ctx->sbi; 217 struct iostat_lat_info *io_lat = sbi->iostat_io_lat; 218 unsigned long flags; 219 220 if (!sbi->iostat_enable) 221 return; 222 223 ts_diff = jiffies - iostat_ctx->submit_ts; 224 if (page_type >= META_FLUSH) ^^^^^^^^^^ 225 page_type = META; 226 227 spin_lock_irqsave(&sbi->iostat_lat_lock, flags); @228 io_lat->sum_lat[type][page_type] += ts_diff; ^^^^^^^^^ Mixup between META_FLUSH and NR_PAGE_TYPE leads to memory corruption. Fixes: a4b68176 ("f2fs: introduce periodic iostat io latency traces") Reported-by:
kernel test robot <lkp@intel.com> Reported-by:
Dan Carpenter <error27@gmail.com> Suggested-by:
Chao Yu <chao@kernel.org> Suggested-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Yangtao Li <frank.li@vivo.com> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Chao Yu authored
[ Upstream commit 933141e4 ] Otherwise, 32-bits binary call ioctl(F2FS_IOC_START_ATOMIC_REPLACE) will fail in 64-bits kernel. Fixes: 41e8f85a ("f2fs: introduce F2FS_IOC_START_ATOMIC_REPLACE") Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Zhihao Cheng authored
[ Upstream commit 66f4742e ] There are two states for ubifs writing pages: 1. Dirty, Private 2. Not Dirty, Not Private The normal process cannot go to ubifs_releasepage() which means there exists pages being private but not dirty. Reproducer[1] shows that it could occur (which maybe related to [2]) with following process: PA PB PC lock(page)[PA] ubifs_write_end attach_page_private // set Private __set_page_dirty_nobuffers // set Dirty unlock(page) write_cache_pages[PA] lock(page) clear_page_dirty_for_io(page) // clear Dirty ubifs_writepage do_truncation[PB] truncate_setsize i_size_write(inode, newsize) // newsize = 0 i_size = i_size_read(inode) // i_size = 0 end_index = i_size >> PAGE_SHIFT if (page->index > end_index) goto out // jump out: unlock(page) // Private, Not Dirty generic_fadvise[PC] lock(page) invalidate_inode_page try_to_release_page ubifs_releasepage ubifs_assert(c, 0) // bad assertion! unlock(page) truncate_pagecache[PB] Then we may get following assertion failed: UBIFS error (ubi0:0 pid 1683): ubifs_assert_failed [ubifs]: UBIFS assert failed: 0, in fs/ubifs/file.c:1513 UBIFS warning (ubi0:0 pid 1683): ubifs_ro_mode [ubifs]: switched to read-only mode, error -22 CPU: 2 PID: 1683 Comm: aa Not tainted 5.16.0-rc5-00184-g0bca5994cacc-dirty #308 Call Trace: dump_stack+0x13/0x1b ubifs_ro_mode+0x54/0x60 [ubifs] ubifs_assert_failed+0x4b/0x80 [ubifs] ubifs_releasepage+0x67/0x1d0 [ubifs] try_to_release_page+0x57/0xe0 invalidate_inode_page+0xfb/0x130 __invalidate_mapping_pages+0xb9/0x280 invalidate_mapping_pagevec+0x12/0x20 generic_fadvise+0x303/0x3c0 ksys_fadvise64_64+0x4c/0xb0 [1] https://bugzilla.kernel.org/show_bug.cgi?id=215373 [2] https://linux-mtd.infradead.narkive.com/NQoBeT1u/patch-rfc-ubifs-fix-assert-failed-in-ubifs-set-page-dirty Fixes: 1e51764a ("UBIFS: add new flash file system") Signed-off-by:
Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by:
Richard Weinberger <richard@nod.at> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-