  1. Jan 23, 2024
    • netlink: fix potential sleeping issue in mqueue_flush_file · 234ec0b6
      Zhengchao Shao authored and Paolo Abeni committed
      
      The following interleaving of two threads can sleep while holding a spinlock:
      Thread A                                Thread B
      ...                                     netlink_create  //ref = 1
      do_mq_notify                            ...
        sock = netlink_getsockbyfilp          ...     //ref = 2
        info->notify_sock = sock;             ...
      ...                                     netlink_sendmsg
      ...                                       skb = netlink_alloc_large_skb  //skb->head is vmalloced
      ...                                       netlink_unicast
      ...                                         sk = netlink_getsockbyportid //ref = 3
      ...                                         netlink_sendskb
      ...                                           __netlink_sendskb
      ...                                             skb_queue_tail //put skb to sk_receive_queue
      ...                                         sock_put //ref = 2
      ...                                     ...
      ...                                     netlink_release
      ...                                       deferred_put_nlk_sk //ref = 1
      mqueue_flush_file
        spin_lock
        remove_notification
          netlink_sendskb
            sock_put  //ref = 0
              sk_free
                ...
                __sk_destruct
                  netlink_sock_destruct
                    skb_queue_purge  //get skb from sk_receive_queue
                      ...
                      __skb_queue_purge_reason
                        kfree_skb_reason
                          __kfree_skb
                          ...
                          skb_release_all
                            skb_release_head_state
                              netlink_skb_destructor
                                vfree(skb->head)  //sleeping while holding spinlock
      
      In netlink_sendmsg, if the memory pointed to by skb->head is allocated by
      vmalloc and the skb is put on sk_receive_queue without being freed, a
      later mqueue flush drops the last socket reference and frees that skb
      with vfree() while holding a spinlock, and vfree() may sleep. Use
      vfree_atomic instead of vfree in netlink_skb_destructor to solve the issue.
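
      The idea behind a deferred free can be sketched in user space. This is
      a minimal single-threaded model, not the kernel implementation: the
      names deferred_free(), drain_deferred(), blocking_free() and
      in_atomic_section are hypothetical, and free() stands in for the
      release that may sleep. The atomic path only queues the buffer (reusing
      the dead buffer itself as the list node) and the real release happens
      later, outside the atomic section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model: a freed buffer is reused as its own list node. */
struct deferred_node {
	struct deferred_node *next;
};

static struct deferred_node *deferred_head; /* buffers awaiting release */
static int in_atomic_section;               /* stand-in for "spinlock held" */

/* Models a release that may sleep: not allowed in an atomic section. */
static void blocking_free(void *buf)
{
	assert(!in_atomic_section);
	free(buf);
}

/* Models the atomic-safe variant: queue the buffer, never sleep.
 * The buffer must be at least sizeof(struct deferred_node) bytes. */
static void deferred_free(void *buf)
{
	struct deferred_node *n = buf;

	n->next = deferred_head;
	deferred_head = n;
}

/* Runs later from a context that may sleep; returns buffers released. */
static size_t drain_deferred(void)
{
	size_t freed = 0;

	while (deferred_head) {
		struct deferred_node *n = deferred_head;

		deferred_head = n->next;
		blocking_free(n);
		freed++;
	}
	return freed;
}
```

      Calling deferred_free() with in_atomic_section set is safe in this
      model, whereas blocking_free() would trip its assertion.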
      
      Fixes: c05cdb1b ("netlink: allow large data transfers from user-space")
      Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
      Link: https://lore.kernel.org/r/20240122011807.2110357-1-shaozhengchao@huawei.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
  2. Dec 29, 2023
  3. Dec 19, 2023
  4. Dec 07, 2023
    • drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group · e0378187
      Ido Schimmel authored
      
      The "NET_DM" generic netlink family notifies drop locations over the
      "events" multicast group. This is problematic since by default generic
      netlink allows non-root users to listen to these notifications.
      
      Fix by adding a new field to the generic netlink multicast group
      structure that when set prevents non-root users or root without the
      'CAP_SYS_ADMIN' capability (in the user namespace owning the network
      namespace) from joining the group. Set this field for the "events"
      group. Use 'CAP_SYS_ADMIN' rather than 'CAP_NET_ADMIN' because of the
      nature of the information that is shared over this group.
      
      Note that the capability check in this case will always be performed
      against the initial user namespace since the family is not netns aware
      and only operates in the initial network namespace.
      
      A new field is added to the structure rather than using the "flags"
      field because the existing field uses uAPI flags and it is inappropriate
      to add a new uAPI flag for an internal kernel check. In net-next we can
      rework the "flags" field to use internal flags and fold the new field
      into it. But for now, in order to reduce the amount of changes, add a
      new field.
      
      Since the information can only be consumed by root, mark the control
      plane operations that start and stop the tracing as root-only using the
      'GENL_ADMIN_PERM' flag.
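
      The shape of the check can be sketched as follows. This is a
      simplified user-space model under stated assumptions: struct
      mcast_group, struct caller and mcast_group_join() are hypothetical
      names for illustration, and the real check is made against the user
      namespace owning the network namespace, which is not modeled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified description of a multicast group. */
struct mcast_group {
	const char *name;
	unsigned int flags;  /* uAPI-visible flags, left untouched */
	bool cap_sys_admin;  /* internal only: joining needs CAP_SYS_ADMIN */
};

/* Hypothetical caller credentials. */
struct caller {
	bool has_cap_sys_admin;
};

/* Returns 0 if the caller may join the group, -EPERM otherwise. */
static int mcast_group_join(const struct mcast_group *grp,
			    const struct caller *who)
{
	assert(grp && who);
	if (grp->cap_sys_admin && !who->has_cap_sys_admin)
		return -EPERM;
	return 0;
}
```

      Keeping the restriction in a separate boolean mirrors the reasoning
      above: the uAPI "flags" value is not given a new meaning.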
      
      Tested using [1].
      
      Before:
      
       # capsh -- -c ./dm_repo
       # capsh --drop=cap_sys_admin -- -c ./dm_repo
      
      After:
      
       # capsh -- -c ./dm_repo
       # capsh --drop=cap_sys_admin -- -c ./dm_repo
       Failed to join "events" multicast group
      
      [1]
       $ cat dm.c
       #include <stdio.h>
       #include <netlink/genl/ctrl.h>
       #include <netlink/genl/genl.h>
       #include <netlink/socket.h>
      
       int main(int argc, char **argv)
       {
       	struct nl_sock *sk;
       	int grp, err;
      
       	sk = nl_socket_alloc();
       	if (!sk) {
       		fprintf(stderr, "Failed to allocate socket\n");
       		return -1;
       	}
      
       	err = genl_connect(sk);
       	if (err) {
       		fprintf(stderr, "Failed to connect socket\n");
       		return err;
       	}
      
       	grp = genl_ctrl_resolve_grp(sk, "NET_DM", "events");
       	if (grp < 0) {
       		fprintf(stderr,
       			"Failed to resolve \"events\" multicast group\n");
       		return grp;
       	}
      
       	err = nl_socket_add_memberships(sk, grp, NFNLGRP_NONE);
       	if (err) {
       		fprintf(stderr, "Failed to join \"events\" multicast group\n");
       		return err;
       	}
      
       	return 0;
       }
       $ gcc -I/usr/include/libnl3 -lnl-3 -lnl-genl-3 -o dm_repo dm.c
      
      Fixes: 9a8afc8d ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol")
      Reported-by: "The UK's National Cyber Security Centre (NCSC)" <security@ncsc.gov.uk>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Link: https://lore.kernel.org/r/20231206213102.1824398-3-idosch@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  5. Nov 19, 2023
    • rtnetlink: introduce nlmsg_new_large and use it in rtnl_getlink · ac40916a
      Li RongQing authored
      
      If a PF has 256 or more VFs, the ip link command makes the kernel
      allocate order-3 or larger memory for the reply, which can trigger the
      OOM killer when memory is fragmented. The memory needed for the VFs is
      computed in rtnl_vfinfo_size.
      
      Introduce nlmsg_new_large, which calls netlink_alloc_large_skb and so
      uses vmalloc for large allocations, to avoid the allocation failure.
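
      To see what "order 3" means in bytes, here is a small arithmetic
      sketch. It assumes 4 KiB pages and ignores any per-skb overhead;
      alloc_order() and MODEL_PAGE_SIZE are hypothetical names. An order-n
      allocation needs 2^n physically contiguous pages, so with these
      assumptions any request above 16 KiB needs at least order 3.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Smallest order n such that (MODEL_PAGE_SIZE << n) >= size. */
static unsigned int alloc_order(size_t size)
{
	unsigned int order = 0;
	size_t span = MODEL_PAGE_SIZE;

	assert(size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}
```

      A vmalloc-backed buffer only needs virtually contiguous memory, so
      fragmentation of physical pages matters much less for it.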
      
          ip invoked oom-killer: gfp_mask=0xc2cc0(GFP_KERNEL|__GFP_NOWARN|\
      	__GFP_COMP|__GFP_NOMEMALLOC), order=3, oom_score_adj=0
          CPU: 74 PID: 204414 Comm: ip Kdump: loaded Tainted: P           OE
          Call Trace:
          dump_stack+0x57/0x6a
          dump_header+0x4a/0x210
          oom_kill_process+0xe4/0x140
          out_of_memory+0x3e8/0x790
          __alloc_pages_slowpath.constprop.116+0x953/0xc50
          __alloc_pages_nodemask+0x2af/0x310
          kmalloc_large_node+0x38/0xf0
          __kmalloc_node_track_caller+0x417/0x4d0
          __kmalloc_reserve.isra.61+0x2e/0x80
          __alloc_skb+0x82/0x1c0
          rtnl_getlink+0x24f/0x370
          rtnetlink_rcv_msg+0x12c/0x350
          netlink_rcv_skb+0x50/0x100
          netlink_unicast+0x1b2/0x280
          netlink_sendmsg+0x355/0x4a0
          sock_sendmsg+0x5b/0x60
          ____sys_sendmsg+0x1ea/0x250
          ___sys_sendmsg+0x88/0xd0
          __sys_sendmsg+0x5e/0xa0
          do_syscall_64+0x33/0x40
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
          RIP: 0033:0x7f95a65a5b70
      
      Cc: Yunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: Li RongQing <lirongqing@baidu.com>
      Link: https://lore.kernel.org/r/20231115120108.3711-1-lirongqing@baidu.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  6. Nov 03, 2023
  7. Oct 23, 2023
  8. Oct 20, 2023
    • netlink: add variable-length / auto integers · 374d345d
      Jakub Kicinski authored
      We currently push everyone to use padding to align 64b values
      in netlink. Un-padded nla_put_u64() doesn't even exist any more.
      
      The story behind this possibly starts with this thread:
      https://lore.kernel.org/netdev/20121204.130914.1457976839967676240.davem@davemloft.net/
      
      where DaveM was concerned about the alignment of a structure
      containing 64b stats. If user space tries to access such struct
      directly:
      
      	struct some_stats *stats = nla_data(attr);
      	printf("A: %llu", stats->a);
      
      lack of alignment may become problematic for some architectures.
      These days we most often put every single member in a separate
      attribute, meaning that the code above would use a helper like
      nla_get_u64(), which can deal with alignment internally.
      Even for arches which don't have good unaligned access - access
      aligned to 4B should be pretty efficient.
      Kernel and well known libraries deal with unaligned input already.
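
      One common way for a helper to cope with a payload that is only
      4-byte aligned is to copy the bytes out instead of dereferencing a
      u64 pointer. The sketch below illustrates that technique only;
      read_u64_unaligned() is a hypothetical name and is not claimed to be
      how any particular netlink helper is written.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a host-endian u64 from a payload that may not be 8-byte aligned. */
static uint64_t read_u64_unaligned(const void *payload)
{
	uint64_t v;

	assert(payload);
	memcpy(&v, payload, sizeof(v)); /* byte copy: no alignment requirement */
	return v;
}
```

      Compilers typically turn the fixed-size memcpy() into plain loads on
      architectures that handle unaligned access well.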
      
      Padded 64b is quite space-inefficient (64b + pad means at worst 16B
      per attr vs 32b which takes 8B). It is also more typing:
      
          if (nla_put_u64_pad(rsp, NETDEV_A_SOMETHING_SOMETHING,
                              value, NETDEV_A_SOMETHING_PAD))
      
      Create a new attribute type which will use 32 bits at netlink
      level if value is small enough (probably most of the time?),
      and (4B-aligned) 64 bits otherwise. Kernel API is just:
      
          if (nla_put_uint(rsp, NETDEV_A_SOMETHING_SOMETHING, value))
      
      Calling this new type "just" sint / uint with no specific size
      will hopefully also make people more comfortable with using it.
      Currently telling people "don't use u8, you may need the bits,
      and netlink will round up to 4B, anyway" is the #1 comment
      we give to newcomers.
      
      In terms of netlink layout it looks like this:
      
               0       4       8       12      16
      32b:     [nlattr][ u32  ]
      64b:     [  pad ][nlattr][     u64      ]
      uint(32) [nlattr][ u32  ]
      uint(64) [nlattr][     u64      ]
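
      The size selection described above can be sketched in a few lines.
      This is a user-space model for illustration: ATTR_HDR_LEN,
      uint_payload_len() and uint_attr_len() are hypothetical names, and the
      4-byte header size is taken from the layout diagram.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ATTR_HDR_LEN 4u /* attribute header size, per the diagram above */

/* Payload bytes a variable-length uint needs for this value. */
static size_t uint_payload_len(uint64_t value)
{
	return value <= UINT32_MAX ? sizeof(uint32_t) : sizeof(uint64_t);
}

/* Header plus payload for one attribute. */
static size_t uint_attr_len(uint64_t value)
{
	size_t len = ATTR_HDR_LEN + uint_payload_len(value);

	assert(len % 4 == 0); /* both cases stay 4-byte aligned, no pad */
	return len;
}
```

      Under these assumptions a small value costs 8 bytes and a large one
      12 bytes, compared with up to 16 bytes for a padded 64-bit attribute.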
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. Oct 06, 2023
  10. Oct 05, 2023
    • netlink: annotate data-races around sk->sk_err · d0f95894
      Eric Dumazet authored
      
      syzbot caught another data-race in netlink when
      setting sk->sk_err.
      
      Annotate all of them for good measure.
      
      BUG: KCSAN: data-race in netlink_recvmsg / netlink_recvmsg
      
      write to 0xffff8881613bb220 of 4 bytes by task 28147 on cpu 0:
      netlink_recvmsg+0x448/0x780 net/netlink/af_netlink.c:1994
      sock_recvmsg_nosec net/socket.c:1027 [inline]
      sock_recvmsg net/socket.c:1049 [inline]
      __sys_recvfrom+0x1f4/0x2e0 net/socket.c:2229
      __do_sys_recvfrom net/socket.c:2247 [inline]
      __se_sys_recvfrom net/socket.c:2243 [inline]
      __x64_sys_recvfrom+0x78/0x90 net/socket.c:2243
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      write to 0xffff8881613bb220 of 4 bytes by task 28146 on cpu 1:
      netlink_recvmsg+0x448/0x780 net/netlink/af_netlink.c:1994
      sock_recvmsg_nosec net/socket.c:1027 [inline]
      sock_recvmsg net/socket.c:1049 [inline]
      __sys_recvfrom+0x1f4/0x2e0 net/socket.c:2229
      __do_sys_recvfrom net/socket.c:2247 [inline]
      __se_sys_recvfrom net/socket.c:2243 [inline]
      __x64_sys_recvfrom+0x78/0x90 net/socket.c:2243
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0x00000000 -> 0x00000016
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 28146 Comm: syz-executor.0 Not tainted 6.6.0-rc3-syzkaller-00055-g9ed22ae6be81 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20231003183455.3410550-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  11. Aug 15, 2023
  12. Aug 13, 2023
  13. Jul 23, 2023
  14. Jul 22, 2023
  15. Jul 11, 2023
  16. Jun 27, 2023
    • netlink: Add __sock_i_ino() for __netlink_diag_dump(). · 25a9c8a4
      Kuniyuki Iwashima authored
      
      syzbot reported a warning in __local_bh_enable_ip(). [0]
      
      Commit 8d61f926 ("netlink: fix potential deadlock in
      netlink_set_err()") converted read_lock(&nl_table_lock) to
      read_lock_irqsave() in __netlink_diag_dump() to prevent a deadlock.
      
      However, __netlink_diag_dump() calls sock_i_ino() that uses
      read_lock_bh() and read_unlock_bh().  If CONFIG_TRACE_IRQFLAGS=y,
      read_unlock_bh() finally enables IRQ even though it should stay
      disabled until the following read_unlock_irqrestore().
      
      Using read_lock() in sock_i_ino() would trigger a lockdep splat
      in another place that was fixed in commit f064af1e ("net: fix
      a lockdep splat"), so let's add __sock_i_ino() that would be safe
      to use under BH disabled.
      
      [0]:
      WARNING: CPU: 0 PID: 5012 at kernel/softirq.c:376 __local_bh_enable_ip+0xbe/0x130 kernel/softirq.c:376
      Modules linked in:
      CPU: 0 PID: 5012 Comm: syz-executor487 Not tainted 6.4.0-rc7-syzkaller-00202-g6f68fc395f49 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023
      RIP: 0010:__local_bh_enable_ip+0xbe/0x130 kernel/softirq.c:376
      Code: 45 bf 01 00 00 00 e8 91 5b 0a 00 e8 3c 15 3d 00 fb 65 8b 05 ec e9 b5 7e 85 c0 74 58 5b 5d c3 65 8b 05 b2 b6 b4 7e 85 c0 75 a2 <0f> 0b eb 9e e8 89 15 3d 00 eb 9f 48 89 ef e8 6f 49 18 00 eb a8 0f
      RSP: 0018:ffffc90003a1f3d0 EFLAGS: 00010046
      RAX: 0000000000000000 RBX: 0000000000000201 RCX: 1ffffffff1cf5996
      RDX: 0000000000000000 RSI: 0000000000000201 RDI: ffffffff8805c6f3
      RBP: ffffffff8805c6f3 R08: 0000000000000001 R09: ffff8880152b03a3
      R10: ffffed1002a56074 R11: 0000000000000005 R12: 00000000000073e4
      R13: dffffc0000000000 R14: 0000000000000002 R15: 0000000000000000
      FS:  0000555556726300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000000045ad50 CR3: 000000007c646000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       sock_i_ino+0x83/0xa0 net/core/sock.c:2559
       __netlink_diag_dump+0x45c/0x790 net/netlink/diag.c:171
       netlink_diag_dump+0xd6/0x230 net/netlink/diag.c:207
       netlink_dump+0x570/0xc50 net/netlink/af_netlink.c:2269
       __netlink_dump_start+0x64b/0x910 net/netlink/af_netlink.c:2374
       netlink_dump_start include/linux/netlink.h:329 [inline]
       netlink_diag_handler_dump+0x1ae/0x250 net/netlink/diag.c:238
       __sock_diag_cmd net/core/sock_diag.c:238 [inline]
       sock_diag_rcv_msg+0x31e/0x440 net/core/sock_diag.c:269
       netlink_rcv_skb+0x165/0x440 net/netlink/af_netlink.c:2547
       sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:280
       netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
       netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1365
       netlink_sendmsg+0x925/0xe30 net/netlink/af_netlink.c:1914
       sock_sendmsg_nosec net/socket.c:724 [inline]
       sock_sendmsg+0xde/0x190 net/socket.c:747
       ____sys_sendmsg+0x71c/0x900 net/socket.c:2503
       ___sys_sendmsg+0x110/0x1b0 net/socket.c:2557
       __sys_sendmsg+0xf7/0x1c0 net/socket.c:2586
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x7f5303aaabb9
      Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007ffc7506e548 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5303aaabb9
      RDX: 0000000000000000 RSI: 0000000020000180 RDI: 0000000000000003
      RBP: 00007f5303a6ed60 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f5303a6edf0
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       </TASK>
      
      Fixes: 8d61f926 ("netlink: fix potential deadlock in netlink_set_err()")
      Reported-by: syzbot+5da61cf6a9bc1902d422@syzkaller.appspotmail.com
      Link: https://syzkaller.appspot.com/bug?extid=5da61cf6a9bc1902d422
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20230626164313.52528-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  17. Jun 24, 2023
  18. Jun 23, 2023
  19. Jun 12, 2023
    • netlink: support extack in dump ->start() · 5ab8c41c
      Jakub Kicinski authored
      
      Commit 4a19edb6 ("netlink: Pass extack to dump handlers")
      added extack support to netlink dumps. It was focused on rtnl
      and since rtnl does not use ->start(), ->done() callbacks
      it ignored those. Genetlink on the other hand uses ->start()
      extensively, for parsing and input validation.
      
      Pass the extack in via struct netlink_dump_control and link
      it to cb for the time of ->start(). Both struct netlink_dump_control
      and extack itself live on the stack so we can't keep the same
      extack for the duration of the dump. This means that the extack
      visible in ->start() and in each ->dump() callback will be different.
      Corner cases like reporting a warning message in DONE across dump
      calls are still not supported.
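
      The lifetime rule can be illustrated with a small user-space model.
      All names here (struct ext_ack, struct dump_cb, struct dump_control,
      dump_begin(), failing_start()) are hypothetical simplifications, not
      the kernel definitions. The pointer is linked into the callback state
      only while start() runs and is cleared afterwards, because the object
      it points to lives on the caller's stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct ext_ack {
	const char *msg;
};

struct dump_cb {
	struct ext_ack *extack; /* valid only while start() runs */
};

struct dump_control {
	int (*start)(struct dump_cb *cb);
	struct ext_ack *extack; /* lives on the caller's stack */
};

static int dump_begin(struct dump_cb *cb, const struct dump_control *ctl)
{
	int ret = 0;

	assert(cb && ctl);
	if (ctl->start) {
		cb->extack = ctl->extack; /* link for the duration of start() */
		ret = ctl->start(cb);
		cb->extack = NULL;        /* do not keep a pointer to the stack */
	}
	return ret;
}

/* Example start() that rejects the request and says why. */
static int failing_start(struct dump_cb *cb)
{
	if (cb->extack)
		cb->extack->msg = "request header missing";
	return -22; /* -EINVAL, as in the example output below */
}
```

      In this model the error message set by start() reaches the caller,
      while later callbacks see no stale pointer.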
      
      We could put the extack (for dumps) in the socket struct,
      but layering makes it slightly awkward (extack pointer is decided
      before the DO / DUMP split).
      
      The genetlink dump error extacks are now surfaced:
      
        $ cli.py --spec netlink/specs/ethtool.yaml --dump channels-get
        lib.ynl.NlError: Netlink error: Invalid argument
        nl_len = 64 (48) nl_flags = 0x300 nl_type = 2
      	error: -22	extack: {'msg': 'request header missing'}
      
      Previously extack was missing:
      
        $ cli.py --spec netlink/specs/ethtool.yaml --dump channels-get
        lib.ynl.NlError: Netlink error: Invalid argument
        nl_len = 36 (20) nl_flags = 0x100 nl_type = 2
      	error: -22
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  20. May 31, 2023
  21. May 10, 2023
    • netlink: annotate accesses to nlk->cb_running · a939d149
      Eric Dumazet authored
      
      Both netlink_recvmsg() and netlink_native_seq_show() read
      nlk->cb_running locklessly. Use READ_ONCE() there.
      
      Add corresponding WRITE_ONCE() to netlink_dump() and
      __netlink_dump_start()
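
      For readers unfamiliar with the annotation, here is a user-space
      approximation. The macros below are simplified stand-ins built on a
      volatile cast and the GNU __typeof__ extension (gcc and clang), not
      the kernel definitions; struct model_sock and the two helpers are
      hypothetical. Each macro asks the compiler to perform the marked
      access as a single load or store rather than caching or repeating it.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified approximations of the kernel macros (assumption). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical stand-in for the socket state touched by the dump code. */
struct model_sock {
	bool cb_running;
};

/* Writer side: mark the store that lockless readers may observe. */
static void dump_set_running(struct model_sock *nlk, bool running)
{
	assert(nlk);
	WRITE_ONCE(nlk->cb_running, running);
}

/* Reader side: lockless check, paired with the marked store above. */
static bool dump_is_running(struct model_sock *nlk)
{
	assert(nlk);
	return READ_ONCE(nlk->cb_running);
}
```

      The annotations do not add locking; they document that the race is
      intentional and keep the compiler from transforming the access.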
      
      syzbot reported:
      BUG: KCSAN: data-race in __netlink_dump_start / netlink_recvmsg
      
      write to 0xffff88813ea4db59 of 1 bytes by task 28219 on cpu 0:
      __netlink_dump_start+0x3af/0x4d0 net/netlink/af_netlink.c:2399
      netlink_dump_start include/linux/netlink.h:308 [inline]
      rtnetlink_rcv_msg+0x70f/0x8c0 net/core/rtnetlink.c:6130
      netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2577
      rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6192
      netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
      netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365
      netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1942
      sock_sendmsg_nosec net/socket.c:724 [inline]
      sock_sendmsg net/socket.c:747 [inline]
      sock_write_iter+0x1aa/0x230 net/socket.c:1138
      call_write_iter include/linux/fs.h:1851 [inline]
      new_sync_write fs/read_write.c:491 [inline]
      vfs_write+0x463/0x760 fs/read_write.c:584
      ksys_write+0xeb/0x1a0 fs/read_write.c:637
      __do_sys_write fs/read_write.c:649 [inline]
      __se_sys_write fs/read_write.c:646 [inline]
      __x64_sys_write+0x42/0x50 fs/read_write.c:646
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      read to 0xffff88813ea4db59 of 1 bytes by task 28222 on cpu 1:
      netlink_recvmsg+0x3b4/0x730 net/netlink/af_netlink.c:2022
      sock_recvmsg_nosec+0x4c/0x80 net/socket.c:1017
      ____sys_recvmsg+0x2db/0x310 net/socket.c:2718
      ___sys_recvmsg net/socket.c:2762 [inline]
      do_recvmmsg+0x2e5/0x710 net/socket.c:2856
      __sys_recvmmsg net/socket.c:2935 [inline]
      __do_sys_recvmmsg net/socket.c:2958 [inline]
      __se_sys_recvmmsg net/socket.c:2951 [inline]
      __x64_sys_recvmmsg+0xe2/0x160 net/socket.c:2951
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0x00 -> 0x01
      
      Fixes: 16b304f3 ("netlink: Eliminate kmalloc in netlink dump operation.")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. Apr 25, 2023
  23. Apr 05, 2023
    • netlink: annotate lockless accesses to nlk->max_recvmsg_len · a1865f2e
      Eric Dumazet authored
      
      syzbot reported a data-race in netlink_recvmsg() [1]
      
      Indeed, netlink_recvmsg() can be run concurrently,
      and netlink_dump() also needs protection.
      
      [1]
      BUG: KCSAN: data-race in netlink_recvmsg / netlink_recvmsg
      
      read to 0xffff888141840b38 of 8 bytes by task 23057 on cpu 0:
      netlink_recvmsg+0xea/0x730 net/netlink/af_netlink.c:1988
      sock_recvmsg_nosec net/socket.c:1017 [inline]
      sock_recvmsg net/socket.c:1038 [inline]
      __sys_recvfrom+0x1ee/0x2e0 net/socket.c:2194
      __do_sys_recvfrom net/socket.c:2212 [inline]
      __se_sys_recvfrom net/socket.c:2208 [inline]
      __x64_sys_recvfrom+0x78/0x90 net/socket.c:2208
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      write to 0xffff888141840b38 of 8 bytes by task 23037 on cpu 1:
      netlink_recvmsg+0x114/0x730 net/netlink/af_netlink.c:1989
      sock_recvmsg_nosec net/socket.c:1017 [inline]
      sock_recvmsg net/socket.c:1038 [inline]
      ____sys_recvmsg+0x156/0x310 net/socket.c:2720
      ___sys_recvmsg net/socket.c:2762 [inline]
      do_recvmmsg+0x2e5/0x710 net/socket.c:2856
      __sys_recvmmsg net/socket.c:2935 [inline]
      __do_sys_recvmmsg net/socket.c:2958 [inline]
      __se_sys_recvmmsg net/socket.c:2951 [inline]
      __x64_sys_recvmmsg+0xe2/0x160 net/socket.c:2951
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0x0000000000000000 -> 0x0000000000001000
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 23037 Comm: syz-executor.2 Not tainted 6.3.0-rc4-syzkaller-00195-g5a57b48fdfcb #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
      
      Fixes: 9063e21f ("netlink: autosize skb lengthes")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230403214643.768555-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  24. Mar 10, 2023
  25. Feb 10, 2023
  26. Jan 24, 2023
    • netlink: annotate data races around sk_state · 9b663b5c
      Eric Dumazet authored
      
      netlink_getsockbyportid() reads sk_state while a concurrent
      netlink_connect() can change its value.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • netlink: annotate data races around dst_portid and dst_group · 004db64d
      Eric Dumazet authored
      
      netlink_getname(), netlink_sendmsg() and netlink_getsockbyportid()
      can read nlk->dst_portid and nlk->dst_group while another
      thread is changing them.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • netlink: annotate data races around nlk->portid · c1bb9484
      Eric Dumazet authored
      
      syzbot reminds us netlink_getname() runs locklessly [1]
      
      This first patch annotates the race against nlk->portid.
      
      Following patches take care of the remaining races.
      
      [1]
      BUG: KCSAN: data-race in netlink_getname / netlink_insert
      
      write to 0xffff88814176d310 of 4 bytes by task 2315 on cpu 1:
      netlink_insert+0xf1/0x9a0 net/netlink/af_netlink.c:583
      netlink_autobind+0xae/0x180 net/netlink/af_netlink.c:856
      netlink_sendmsg+0x444/0x760 net/netlink/af_netlink.c:1895
      sock_sendmsg_nosec net/socket.c:714 [inline]
      sock_sendmsg net/socket.c:734 [inline]
      ____sys_sendmsg+0x38f/0x500 net/socket.c:2476
      ___sys_sendmsg net/socket.c:2530 [inline]
      __sys_sendmsg+0x19a/0x230 net/socket.c:2559
      __do_sys_sendmsg net/socket.c:2568 [inline]
      __se_sys_sendmsg net/socket.c:2566 [inline]
      __x64_sys_sendmsg+0x42/0x50 net/socket.c:2566
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      read to 0xffff88814176d310 of 4 bytes by task 2316 on cpu 0:
      netlink_getname+0xcd/0x1a0 net/netlink/af_netlink.c:1144
      __sys_getsockname+0x11d/0x1b0 net/socket.c:2026
      __do_sys_getsockname net/socket.c:2041 [inline]
      __se_sys_getsockname net/socket.c:2038 [inline]
      __x64_sys_getsockname+0x3e/0x50 net/socket.c:2038
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0x00000000 -> 0xc9a49780
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 2316 Comm: syz-executor.2 Not tainted 6.2.0-rc3-syzkaller-00030-ge8f60cd7db24-dirty #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  27. Nov 19, 2022
  28. Nov 18, 2022
  29. Nov 10, 2022
  30. Nov 09, 2022
  31. Nov 08, 2022
  32. Nov 07, 2022