  1. Feb 02, 2024
  2. Feb 01, 2024
  3. Jan 31, 2024
  4. Jan 30, 2024
    • xfs: remove conditional building of rt geometry validator functions · 881f78f4
      Darrick J. Wong authored
      
      I mistakenly turned off CONFIG_XFS_RT in the Kconfig file for the arm64
      variant of the djwong-wtf git branch.  Unfortunately, it took me a good
      hour to figure out that RT wasn't built because this is what got printed
      to dmesg:
      
      XFS (sda2): realtime geometry sanity check failed
      XFS (sda2): Metadata corruption detected at xfs_sb_read_verify+0x170/0x190 [xfs], xfs_sb block 0x0
      
      Whereas I would have expected:
      
      XFS (sda2): Not built with CONFIG_XFS_RT
      XFS (sda2): RT mount failed
      
      The root cause of these problems is the conditional compilation of the
      new functions xfs_validate_rtextents and xfs_compute_rextslog that I
      introduced in the two commits listed below.  The !RT versions of these
      functions return false and 0, respectively, which causes primary
      superblock validation to fail, which explains the first message.
      
      Move the two functions to other parts of libxfs that are not
      conditionally defined by CONFIG_XFS_RT and remove the broken stubs so
      that validation works again.
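
      The failure mode described above can be sketched in plain C.  This is
      a minimal illustration, not the libxfs code: every demo_* name is
      hypothetical, and the validation rule (reject zero extents) is a
      simplifying assumption.  It shows why a stub that always returns false
      trips the superblock check before the "not built" message can appear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical stand-ins, not the libxfs API. */
enum demo_mount_result {
	DEMO_MOUNT_OK,
	DEMO_MOUNT_CORRUPT,      /* "realtime geometry sanity check failed" */
	DEMO_MOUNT_RT_NOT_BUILT  /* "Not built with CONFIG_XFS_RT" */
};

typedef bool (*demo_validator_fn)(uint64_t rtextents);

/* Broken stub: the kind of body a !RT build compiled in. */
static bool demo_validate_stub(uint64_t rtextents)
{
	(void)rtextents;
	return false;
}

/* Unconditional validator: same answer whether or not RT is built.
 * The rule (reject zero extents) is a simplified assumption. */
static bool demo_validate_always(uint64_t rtextents)
{
	return rtextents != 0;
}

static enum demo_mount_result demo_mount(uint64_t rtextents, bool rt_built,
					 demo_validator_fn validate)
{
	bool has_rt_volume = rtextents != 0;

	assert(validate != NULL);

	/* Step 1: superblock verification, which runs on every build. */
	if (has_rt_volume && !validate(rtextents))
		return DEMO_MOUNT_CORRUPT;

	/* Step 2: feature check, reached only if the geometry looked sane. */
	if (has_rt_volume && !rt_built)
		return DEMO_MOUNT_RT_NOT_BUILT;

	return DEMO_MOUNT_OK;
}
```

      With demo_validate_stub, a filesystem that has a realtime volume is
      reported as corrupt on a !RT build; with demo_validate_always, the same
      mount reaches the feature check and reports the missing RT support.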
      
      Fixes: e1429380 ("xfs: don't allow overly small or large realtime volumes")
      Fixes: a6a38f30 ("xfs: make rextslog computation consistent with mkfs")
      Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
  5. Jan 29, 2024
  6. Jan 28, 2024
  7. Jan 27, 2024
    • erofs: relaxed temporary buffers allocation on readahead · d9281660
      Chunhai Guo authored
      Even with inplace decompression, a few temporary buffers may still be
      needed for a single decompression shot (e.g. 16 pages for a 64k sliding
      window or 4 pages for a 16k sliding window).  In low-memory scenarios,
      it is better to try allocating with GFP_NOWAIT on readahead first.
      That can help reduce the time spent on page allocation under sustained
      memory pressure.
      
      Here are detailed performance numbers under a multi-app launch benchmark
      workload [1] on ARM64 Android devices (8-core CPU and 8GB of memory)
      running a 5.15 LTS kernel with EROFS of 4k pclusters:
      
      +----------------------------------------------+
      |      LZ4       | vanilla | patched |  diff   |
      |----------------+---------+---------+---------|
      |  Average (ms)  |  3364   |  2684   | -20.21% | [64k sliding window]
      |----------------+---------+---------+---------|
      |  Average (ms)  |  2079   |  1610   | -22.56% | [16k sliding window]
      +----------------------------------------------+
      
      The total size of system images for 4k pclusters is almost unchanged:
      (64k sliding window)  9,117,044 KB
      (16k sliding window)  9,113,096 KB
      
      Therefore, switching the sliding window from 64k to 16k and applying
      this patch together save 52.14% (3364 -> 1610) on average with no
      memory reservation.  That is particularly useful for embedded devices
      with limited resources.
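
      The percentages quoted above can be rechecked with a few lines of C.
      This is only an arithmetic sketch; demo_percent_saved is a hypothetical
      helper, and the inputs are the millisecond averages from the table.

```c
#include <assert.h>

/* Percentage reduction from `before` to `after`; `before` must be positive. */
static double demo_percent_saved(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}
```

      Calling it with (3364, 2684), (2079, 1610) and (3364, 1610) reproduces
      the per-window figures and the combined figure to two decimal places.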
      
      [1] https://lore.kernel.org/r/20240109074143.4138783-1-guochunhai@vivo.com
      
      Suggested-by: Gao Xiang <xiang@kernel.org>
      Signed-off-by: Chunhai Guo <guochunhai@vivo.com>
      Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
      Reviewed-by: Yue Hu <huyue2@coolpad.com>
      Link: https://lore.kernel.org/r/20240126140142.201718-1-hsiangkao@linux.alibaba.com
  8. Jan 26, 2024
  9. Jan 25, 2024
    • ksmbd: fix global oob in ksmbd_nl_policy · ebeae8ad
      Lin Ma authored
      
      Similar to a reported issue (see commit b33fb5b8 ("net:
      qualcomm: rmnet: fix global oob in rmnet_policy")), my local fuzzer
      finds another global out-of-bounds read for policy ksmbd_nl_policy.
      See the bug trace below:
      
      ==================================================================
      BUG: KASAN: global-out-of-bounds in validate_nla lib/nlattr.c:386 [inline]
      BUG: KASAN: global-out-of-bounds in __nla_validate_parse+0x24af/0x2750 lib/nlattr.c:600
      Read of size 1 at addr ffffffff8f24b100 by task syz-executor.1/62810
      
      CPU: 0 PID: 62810 Comm: syz-executor.1 Tainted: G                 N 6.1.0 #3
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x8b/0xb3 lib/dump_stack.c:106
       print_address_description mm/kasan/report.c:284 [inline]
       print_report+0x172/0x475 mm/kasan/report.c:395
       kasan_report+0xbb/0x1c0 mm/kasan/report.c:495
       validate_nla lib/nlattr.c:386 [inline]
       __nla_validate_parse+0x24af/0x2750 lib/nlattr.c:600
       __nla_parse+0x3e/0x50 lib/nlattr.c:697
       __nlmsg_parse include/net/netlink.h:748 [inline]
       genl_family_rcv_msg_attrs_parse.constprop.0+0x1b0/0x290 net/netlink/genetlink.c:565
       genl_family_rcv_msg_doit+0xda/0x330 net/netlink/genetlink.c:734
       genl_family_rcv_msg net/netlink/genetlink.c:833 [inline]
       genl_rcv_msg+0x441/0x780 net/netlink/genetlink.c:850
       netlink_rcv_skb+0x14f/0x410 net/netlink/af_netlink.c:2540
       genl_rcv+0x24/0x40 net/netlink/genetlink.c:861
       netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
       netlink_unicast+0x54e/0x800 net/netlink/af_netlink.c:1345
       netlink_sendmsg+0x930/0xe50 net/netlink/af_netlink.c:1921
       sock_sendmsg_nosec net/socket.c:714 [inline]
       sock_sendmsg+0x154/0x190 net/socket.c:734
       ____sys_sendmsg+0x6df/0x840 net/socket.c:2482
       ___sys_sendmsg+0x110/0x1b0 net/socket.c:2536
       __sys_sendmsg+0xf3/0x1c0 net/socket.c:2565
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x7fdd66a8f359
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007fdd65e00168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007fdd66bbcf80 RCX: 00007fdd66a8f359
      RDX: 0000000000000000 RSI: 0000000020000500 RDI: 0000000000000003
      RBP: 00007fdd66ada493 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 00007ffc84b81aff R14: 00007fdd65e00300 R15: 0000000000022000
       </TASK>
      
      The buggy address belongs to the variable:
       ksmbd_nl_policy+0x100/0xa80
      
      The buggy address belongs to the physical page:
      page:0000000034f47940 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1ccc4b
      flags: 0x200000000001000(reserved|node=0|zone=2)
      raw: 0200000000001000 ffffea00073312c8 ffffea00073312c8 0000000000000000
      raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffffffff8f24b000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffffffff8f24b080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      >ffffffff8f24b100: f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9 00 00 07 f9
                         ^
       ffffffff8f24b180: f9 f9 f9 f9 00 05 f9 f9 f9 f9 f9 f9 00 00 00 05
       ffffffff8f24b200: f9 f9 f9 f9 00 00 03 f9 f9 f9 f9 f9 00 00 04 f9
      ==================================================================
      
      To fix it, add a placeholder named __KSMBD_EVENT_MAX and let
      KSMBD_EVENT_MAX be its original value - 1, as other netlink families
      do. Also change the two sites that refer to KSMBD_EVENT_MAX so that
      they use the correct value.
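
      The placeholder idiom can be sketched as follows.  This is an
      illustration only: the demo_* events, the policy struct and the lookup
      helper are hypothetical, and the sketch assumes the parser may index
      the policy with every type from 0 up to and including the advertised
      maximum, so the array needs MAX + 1 slots.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event list; only the MAX idiom mirrors the commit text. */
enum demo_event {
	DEMO_EVENT_UNSPEC = 0,
	DEMO_EVENT_LOGIN,
	DEMO_EVENT_LOGOUT,

	__DEMO_EVENT_MAX,                      /* placeholder: one past the last event */
	DEMO_EVENT_MAX = __DEMO_EVENT_MAX - 1  /* highest valid event number */
};

struct demo_policy {
	size_t min_len;
};

/* One slot per type from 0 through DEMO_EVENT_MAX inclusive. */
static const struct demo_policy demo_nl_policy[DEMO_EVENT_MAX + 1] = {
	[DEMO_EVENT_LOGIN]  = { .min_len = 8 },
	[DEMO_EVENT_LOGOUT] = { .min_len = 4 },
};

/* Returns the policy for `type`, or NULL when `type` is out of range.
 * Because the accepted maximum is DEMO_EVENT_MAX, every accepted type
 * indexes inside the array. */
static const struct demo_policy *demo_policy_lookup(int type)
{
	if (type < 0 || type > DEMO_EVENT_MAX)
		return NULL;
	assert((size_t)type < sizeof(demo_nl_policy) / sizeof(demo_nl_policy[0]));
	return &demo_nl_policy[type];
}
```

      If the maximum were instead the placeholder itself while the array kept
      only that many slots, a lookup at the maximum would read one element
      past the end, which is the kind of access the trace above reports.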
      
      Cc: stable@vger.kernel.org
      Fixes: 0626e664 ("cifsd: add server handler for central processing and tranport layers")
      Signed-off-by: Lin Ma <linma@zju.edu.cn>
      Acked-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • erofs: get rid of unneeded GFP_NOFS · 97cf5d53
      Jingbo Xu authored
      
      Clean up some leftovers since there is no way for EROFS to be called
      again from a reclaim context.
      
      Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
      Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
      Link: https://lore.kernel.org/r/20240124031945.130782-1-jefflexu@linux.alibaba.com
      Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
  10. Jan 24, 2024
    • bcachefs: discard path uses unlock_long() · 096386a5
      Kent Overstreet authored
      
      Some (bad) devices can have really terrible discard latency; we don't
      want them blocking memory reclaim and causing warnings.
      
      Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
    • uselib: remove use of __FMODE_EXEC · 3eab8301
      Linus Torvalds authored
      
      Jann Horn points out that uselib() really shouldn't trigger the new
      FMODE_EXEC logic introduced by commit 4759ff71 ("exec: __FMODE_EXEC
      instead of in_execve for LSMs").
      
      In fact, it shouldn't even have ever triggered the old pre-existing
      logic for __FMODE_EXEC (like the NFS code that makes executables not
      need read permissions).  Unlike a real execve(), which can work even with
      files that are purely executable by the user (not readable), uselib()
      has that MAY_READ requirement because it's really just a convenience
      wrapper around mmap() for legacy shared libraries.
      
      The whole FMODE_EXEC bit was originally introduced by commit
      b500531e ("[PATCH] Introduce FMODE_EXEC file flag"), primarily to
      give ETXTBUSY error returns for distributed filesystems.
      
      It has since grown a few other warts (like that NFS thing), but there
      really isn't any reason to use it for uselib(), and now that we are
      trying to use it to replace the horrid 'tsk->in_execve' flag, it's
      actively wrong.
      
      Of course, as Jann Horn also points out, nobody should be enabling
      CONFIG_USELIB in the first place in this day and age, but that's a
      different discussion entirely.
      
      Reported-by: Jann Horn <jannh@google.com>
      Fixes: 4759ff71 ("exec: __FMODE_EXEC instead of in_execve for LSMs")
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • exec: Distinguish in_execve from in_exec · 90383cc0
      Kees Cook authored
      
      Just to help distinguish the fs->in_exec flag from the current->in_execve
      flag, add comments in check_unsafe_exec() and copy_fs() for more
      context. Also note that in_execve is only used by TOMOYO now.
      
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: linux-mm@kvack.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • nfsd: fix RELEASE_LOCKOWNER · edcf9725
      NeilBrown authored
      
      The test on so_count in nfsd4_release_lockowner() is nonsense and
      harmful.  Revert to using check_for_locks(), changing that to not sleep.
      
      First: harmful.
      As is documented in the kdoc comment for nfsd4_release_lockowner(), the
      test on so_count can transiently return a false positive resulting in a
      return of NFS4ERR_LOCKS_HELD when in fact no locks are held.  This is
      clearly a protocol violation and with the Linux NFS client it can cause
      incorrect behaviour.
      
      If RELEASE_LOCKOWNER is sent while some other thread is still
      processing a LOCK request which failed because, at the time that request
      was received, the given owner held a conflicting lock, then the nfsd
      thread processing that LOCK request can hold a reference (conflock) to
      the lock owner that causes nfsd4_release_lockowner() to return an
      incorrect error.
      
      The Linux NFS client ignores that NFS4ERR_LOCKS_HELD error because it
      never sends NFS4_RELEASE_LOCKOWNER without first releasing any locks, so
      it knows that the error is impossible.  It assumes the lock owner was in
      fact released so it feels free to use the same lock owner identifier in
      some later locking request.
      
      When it does reuse a lock owner identifier for which a previous RELEASE
      failed, it will naturally use a lock_seqid of zero.  However the server,
      which didn't release the lock owner, will expect a larger lock_seqid and
      so will respond with NFS4ERR_BAD_SEQID.
      
      So clearly it is harmful to allow a false positive, which testing
      so_count allows.
      
      The test is nonsense because ... well... it doesn't mean anything.
      
      so_count is the sum of three different counts.
      1/ the set of states listed on so_stateids
      2/ the set of active vfs locks owned by any of those states
      3/ various transient counts such as for conflicting locks.
      
      When it is tested against '2' it is clear that one of these is the
      transient reference obtained by find_lockowner_str_locked().  It is not
      clear what the other one is expected to be.
      
      In practice, the count is often 2 because there is precisely one state
      on so_stateids.  If there were more, this would fail.
      
      In my testing I see two circumstances when RELEASE_LOCKOWNER is called.
      In one case, CLOSE is called before RELEASE_LOCKOWNER.  That results in
      all the lock states being removed, and so the lockowner being discarded
      (it is removed when there are no more references which usually happens
      when the lock state is discarded).  When nfsd4_release_lockowner() finds
      that the lock owner doesn't exist, it returns success.
      
      The other case shows an so_count of '2' and precisely one state listed
      in so_stateids.  It appears that the Linux client uses a separate lock
      owner for each file resulting in one lock state per lock owner, so this
      test on '2' is safe.  For another client it might not be safe.
      
      So this patch changes check_for_locks() to use the (newish)
      find_any_file_locked() so that it doesn't take a reference on the
      nfs4_file and so never calls nfsd_file_put(), and so never sleeps.  With
      this check it is safe to restore the use of check_for_locks() rather
      than testing so_count against the mysterious '2'.
      
      Fixes: ce3c4ad7 ("NFSD: Fix possible sleep during nfsd4_release_lockowner()")
      Signed-off-by: NeilBrown <neilb@suse.de>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Cc: stable@vger.kernel.org # v6.2+
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • cifs: fix stray unlock in cifs_chan_skip_or_disable · 993d1c34
      Shyam Prasad N authored
      
      A recent change moved the code that decides to skip
      a channel or disable multichannel entirely, into a
      helper function.
      
      During this, a mutex_unlock of the session_mutex
      should have been removed. Doing that here.
      
      Fixes: f591062b ("cifs: handle servers that still advertise multichannel after disabling")
      Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: set replay flag for retries of write command · 4cdad802
      Shyam Prasad N authored
      
      Similar to the rest of the commands, this is a change
      to add replay flags on retry. This one does not add a
      back-off, considering that we may want to flush a write
      ASAP to the server. Considering that this will be a
      flush of cached pages, the retrans value is also not
      honoured.
      
      Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: commands that are retried should have replay flag set · 4f1fffa2
      Shyam Prasad N authored
      
      MS-SMB2 states that the header flag SMB2_FLAGS_REPLAY_OPERATION
      needs to be set when a command needs to be retried, so that
      the server is aware that this is a replay for an operation that
      appeared before.
      
      This can be very important, for example, for state-changing
      operations and opens which get retried following a reconnect,
      since the client may be unaware of the status of the previous
      open.
      
      This is particularly important for multichannel scenarios, since
      disconnection of one connection does not mean that the session
      is lost. The requests can be replayed on another channel.
      
      This change also makes use of exponential back-off before replays
      and also limits the number of retries to "retrans" mount option
      value.
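
      The retry behaviour described above can be sketched as a small loop.
      This is not the cifs client code: all demo_* names are hypothetical,
      the starting delay of 1 ms and the doubling factor are assumptions,
      and the delay is only accumulated rather than slept.  The flag value
      is assumed to match SMB2_FLAGS_REPLAY_OPERATION in MS-SMB2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to equal SMB2_FLAGS_REPLAY_OPERATION; treat as illustrative. */
#define DEMO_FLAGS_REPLAY_OPERATION 0x20000000u

struct demo_request {
	uint32_t flags;
};

struct demo_stats {
	int attempts;             /* sends issued, including the first */
	unsigned int delay_ms;    /* total back-off that would have been slept */
};

typedef bool (*demo_send_fn)(const struct demo_request *req, void *ctx);

/* Test double: fails until the counter behind `ctx` reaches zero. */
static bool demo_flaky_send(const struct demo_request *req, void *ctx)
{
	int *failures_left = ctx;

	(void)req;
	if (*failures_left > 0) {
		(*failures_left)--;
		return false;
	}
	return true;
}

/* Send once, then retry at most `retrans` times.  Each retry marks the
 * request as a replay and doubles the back-off delay. */
static bool demo_send_with_replay(struct demo_request *req, demo_send_fn send,
				  void *ctx, int retrans,
				  struct demo_stats *stats)
{
	unsigned int next_delay_ms = 1;

	assert(send != NULL && stats != NULL);
	stats->attempts = 0;
	stats->delay_ms = 0;

	for (int retries = 0; ; retries++) {
		if (retries > 0) {
			req->flags |= DEMO_FLAGS_REPLAY_OPERATION;
			stats->delay_ms += next_delay_ms; /* real code would sleep */
			next_delay_ms *= 2;
		}
		stats->attempts++;
		if (send(req, ctx))
			return true;
		if (retries >= retrans)
			return false;
	}
}
```

      The first attempt goes out without the replay flag; only retries carry
      it, so the server can tell a replay from a new operation.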
      
      Also, this change does not modify the read/write codepath.
      
      Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>