Skip to content
Snippets Groups Projects
  1. Jan 04, 2022
  2. Jan 03, 2022
  3. Jan 02, 2022
  4. Jan 01, 2022
    • Haimin Zhang's avatar
      net ticp:fix a kernel-infoleak in __tipc_sendmsg() · d6d86830
      Haimin Zhang authored
      
      struct tipc_socket_addr.ref has a 4-byte hole,and __tipc_getname() currently
      copying it to user space,causing kernel-infoleak.
      
      BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:121 [inline]
      BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:121 [inline] lib/usercopy.c:33
      BUG: KMSAN: kernel-infoleak in _copy_to_user+0x1c9/0x270 lib/usercopy.c:33 lib/usercopy.c:33
       instrument_copy_to_user include/linux/instrumented.h:121 [inline]
       instrument_copy_to_user include/linux/instrumented.h:121 [inline] lib/usercopy.c:33
       _copy_to_user+0x1c9/0x270 lib/usercopy.c:33 lib/usercopy.c:33
       copy_to_user include/linux/uaccess.h:209 [inline]
       copy_to_user include/linux/uaccess.h:209 [inline] net/socket.c:287
       move_addr_to_user+0x3f6/0x600 net/socket.c:287 net/socket.c:287
       __sys_getpeername+0x470/0x6b0 net/socket.c:1987 net/socket.c:1987
       __do_sys_getpeername net/socket.c:1997 [inline]
       __se_sys_getpeername net/socket.c:1994 [inline]
       __do_sys_getpeername net/socket.c:1997 [inline] net/socket.c:1994
       __se_sys_getpeername net/socket.c:1994 [inline] net/socket.c:1994
       __x64_sys_getpeername+0xda/0x120 net/socket.c:1994 net/socket.c:1994
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_x64 arch/x86/entry/common.c:51 [inline] arch/x86/entry/common.c:82
       do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Uninit was stored to memory at:
       tipc_getname+0x575/0x5e0 net/tipc/socket.c:757 net/tipc/socket.c:757
       __sys_getpeername+0x3b3/0x6b0 net/socket.c:1984 net/socket.c:1984
       __do_sys_getpeername net/socket.c:1997 [inline]
       __se_sys_getpeername net/socket.c:1994 [inline]
       __do_sys_getpeername net/socket.c:1997 [inline] net/socket.c:1994
       __se_sys_getpeername net/socket.c:1994 [inline] net/socket.c:1994
       __x64_sys_getpeername+0xda/0x120 net/socket.c:1994 net/socket.c:1994
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_x64 arch/x86/entry/common.c:51 [inline] arch/x86/entry/common.c:82
       do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Uninit was stored to memory at:
       msg_set_word net/tipc/msg.h:212 [inline]
       msg_set_destport net/tipc/msg.h:619 [inline]
       msg_set_word net/tipc/msg.h:212 [inline] net/tipc/socket.c:1486
       msg_set_destport net/tipc/msg.h:619 [inline] net/tipc/socket.c:1486
       __tipc_sendmsg+0x44fa/0x5890 net/tipc/socket.c:1486 net/tipc/socket.c:1486
       tipc_sendmsg+0xeb/0x140 net/tipc/socket.c:1402 net/tipc/socket.c:1402
       sock_sendmsg_nosec net/socket.c:704 [inline]
       sock_sendmsg net/socket.c:724 [inline]
       sock_sendmsg_nosec net/socket.c:704 [inline] net/socket.c:2409
       sock_sendmsg net/socket.c:724 [inline] net/socket.c:2409
       ____sys_sendmsg+0xe11/0x12c0 net/socket.c:2409 net/socket.c:2409
       ___sys_sendmsg net/socket.c:2463 [inline]
       ___sys_sendmsg net/socket.c:2463 [inline] net/socket.c:2492
       __sys_sendmsg+0x704/0x840 net/socket.c:2492 net/socket.c:2492
       __do_sys_sendmsg net/socket.c:2501 [inline]
       __se_sys_sendmsg net/socket.c:2499 [inline]
       __do_sys_sendmsg net/socket.c:2501 [inline] net/socket.c:2499
       __se_sys_sendmsg net/socket.c:2499 [inline] net/socket.c:2499
       __x64_sys_sendmsg+0xe2/0x120 net/socket.c:2499 net/socket.c:2499
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_x64 arch/x86/entry/common.c:51 [inline] arch/x86/entry/common.c:82
       do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Local variable skaddr created at:
       __tipc_sendmsg+0x2d0/0x5890 net/tipc/socket.c:1419 net/tipc/socket.c:1419
       tipc_sendmsg+0xeb/0x140 net/tipc/socket.c:1402 net/tipc/socket.c:1402
      
      Bytes 4-7 of 16 are uninitialized
      Memory access of size 16 starts at ffff888113753e00
      Data copied to user address 0000000020000280
      
      Reported-by: default avatar <syzbot+cdbd40e0c3ca02cae3b7@syzkaller.appspotmail.com>
      Signed-off-by: default avatarHaimin Zhang <tcs_kernel@tencent.com>
      Acked-by: default avatarJon Maloy <jmaloy@redhat.com>
      Link: https://lore.kernel.org/r/1640918123-14547-1-git-send-email-tcs.kernel@gmail.com
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      d6d86830
  5. Dec 31, 2021
  6. Dec 30, 2021
  7. Dec 29, 2021
  8. Dec 28, 2021
    • Dust Li's avatar
      net/smc: fix kernel panic caused by race of smc_sock · 349d4312
      Dust Li authored
      
      A crash occurs when smc_cdc_tx_handler() tries to access smc_sock
      but smc_release() has already freed it.
      
      [ 4570.695099] BUG: unable to handle page fault for address: 000000002eae9e88
      [ 4570.696048] #PF: supervisor write access in kernel mode
      [ 4570.696728] #PF: error_code(0x0002) - not-present page
      [ 4570.697401] PGD 0 P4D 0
      [ 4570.697716] Oops: 0002 [#1] PREEMPT SMP NOPTI
      [ 4570.698228] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc4+ #111
      [ 4570.699013] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8c24b4c 04/0
      [ 4570.699933] RIP: 0010:_raw_spin_lock+0x1a/0x30
      <...>
      [ 4570.711446] Call Trace:
      [ 4570.711746]  <IRQ>
      [ 4570.711992]  smc_cdc_tx_handler+0x41/0xc0
      [ 4570.712470]  smc_wr_tx_tasklet_fn+0x213/0x560
      [ 4570.712981]  ? smc_cdc_tx_dismisser+0x10/0x10
      [ 4570.713489]  tasklet_action_common.isra.17+0x66/0x140
      [ 4570.714083]  __do_softirq+0x123/0x2f4
      [ 4570.714521]  irq_exit_rcu+0xc4/0xf0
      [ 4570.714934]  common_interrupt+0xba/0xe0
      
      Though smc_cdc_tx_handler() checked the existence of smc connection,
      smc_release() may have already dismissed and released the smc socket
      before smc_cdc_tx_handler() further visits it.
      
      smc_cdc_tx_handler()           |smc_release()
      if (!conn)                     |
                                     |
                                     |smc_cdc_tx_dismiss_slots()
                                     |      smc_cdc_tx_dismisser()
                                     |
                                     |sock_put(&smc->sk) <- last sock_put,
                                     |                      smc_sock freed
      bh_lock_sock(&smc->sk) (panic) |
      
      To make sure we won't receive any CDC messages after we free the
      smc_sock, add a refcount on the smc_connection for inflight CDC
      message(posted to the QP but haven't received related CQE), and
      don't release the smc_connection until all the inflight CDC messages
      haven been done, for both success or failed ones.
      
      Using refcount on CDC messages brings another problem: when the link
      is going to be destroyed, smcr_link_clear() will reset the QP, which
      then remove all the pending CQEs related to the QP in the CQ. To make
      sure all the CQEs will always come back so the refcount on the
      smc_connection can always reach 0, smc_ib_modify_qp_reset() was replaced
      by smc_ib_modify_qp_error().
      And remove the timeout in smc_wr_tx_wait_no_pending_sends() since we
      need to wait for all pending WQEs done, or we may encounter use-after-
      free when handling CQEs.
      
      For IB device removal routine, we need to wait for all the QPs on that
      device been destroyed before we can destroy CQs on the device, or
      the refcount on smc_connection won't reach 0 and smc_sock cannot be
      released.
      
      Fixes: 5f08318f ("smc: connection data control (CDC)")
      Reported-by: default avatarWen Gu <guwen@linux.alibaba.com>
      Signed-off-by: default avatarDust Li <dust.li@linux.alibaba.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      349d4312
    • Dust Li's avatar
      net/smc: don't send CDC/LLC message if link not ready · 90cee52f
      Dust Li authored
      
      We found smc_llc_send_link_delete_all() sometimes wait
      for 2s timeout when testing with RDMA link up/down.
      It is possible when a smc_link is in ACTIVATING state,
      the underlaying QP is still in RESET or RTR state, which
      cannot send any messages out.
      
      smc_llc_send_link_delete_all() use smc_link_usable() to
      checks whether the link is usable, if the QP is still in
      RESET or RTR state, but the smc_link is in ACTIVATING, this
      LLC message will always fail without any CQE entering the
      CQ, and we will always wait 2s before timeout.
      
      Since we cannot send any messages through the QP before
      the QP enter RTS. I add a wrapper smc_link_sendable()
      which checks the state of QP along with the link state.
      And replace smc_link_usable() with smc_link_sendable()
      in all LLC & CDC message sending routine.
      
      Fixes: 5f08318f ("smc: connection data control (CDC)")
      Signed-off-by: default avatarDust Li <dust.li@linux.alibaba.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      90cee52f
  9. Dec 27, 2021
    • yangxingwu's avatar
      net: udp: fix alignment problem in udp4_seq_show() · 6c25449e
      yangxingwu authored
      
      $ cat /pro/net/udp
      
      before:
      
        sl  local_address rem_address   st tx_queue rx_queue tr tm->when
      26050: 0100007F:0035 00000000:0000 07 00000000:00000000 00:00000000
      26320: 0100007F:0143 00000000:0000 07 00000000:00000000 00:00000000
      27135: 00000000:8472 00000000:0000 07 00000000:00000000 00:00000000
      
      after:
      
         sl  local_address rem_address   st tx_queue rx_queue tr tm->when
      26050: 0100007F:0035 00000000:0000 07 00000000:00000000 00:00000000
      26320: 0100007F:0143 00000000:0000 07 00000000:00000000 00:00000000
      27135: 00000000:8472 00000000:0000 07 00000000:00000000 00:00000000
      
      Signed-off-by: default avataryangxingwu <xingwu.yang@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6c25449e
    • Karsten Graul's avatar
      net/smc: fix using of uninitialized completions · 6d7373da
      Karsten Graul authored
      
      In smc_wr_tx_send_wait() the completion on index specified by
      pend->idx is initialized and after smc_wr_tx_send() was called the wait
      for completion starts. pend->idx is used to get the correct index for
      the wait, but the pend structure could already be cleared in
      smc_wr_tx_process_cqe().
      Introduce pnd_idx to hold and use a local copy of the correct index.
      
      Fixes: 09c61d24 ("net/smc: wait for departure of an IB message")
      Signed-off-by: default avatarKarsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6d7373da
    • William Zhao's avatar
      ip6_vti: initialize __ip6_tnl_parm struct in vti6_siocdevprivate · c1833c39
      William Zhao authored
      
      The "__ip6_tnl_parm" struct was left uninitialized causing an invalid
      load of random data when the "__ip6_tnl_parm" struct was used elsewhere.
      As an example, in the function "ip6_tnl_xmit_ctl()", it tries to access
      the "collect_md" member. With "__ip6_tnl_parm" being uninitialized and
      containing random data, the UBSAN detected that "collect_md" held a
      non-boolean value.
      
      The UBSAN issue is as follows:
      ===============================================================
      UBSAN: invalid-load in net/ipv6/ip6_tunnel.c:1025:14
      load of value 30 is not a valid value for type '_Bool'
      CPU: 1 PID: 228 Comm: kworker/1:3 Not tainted 5.16.0-rc4+ #8
      Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      Workqueue: ipv6_addrconf addrconf_dad_work
      Call Trace:
      <TASK>
      dump_stack_lvl+0x44/0x57
      ubsan_epilogue+0x5/0x40
      __ubsan_handle_load_invalid_value+0x66/0x70
      ? __cpuhp_setup_state+0x1d3/0x210
      ip6_tnl_xmit_ctl.cold.52+0x2c/0x6f [ip6_tunnel]
      vti6_tnl_xmit+0x79c/0x1e96 [ip6_vti]
      ? lock_is_held_type+0xd9/0x130
      ? vti6_rcv+0x100/0x100 [ip6_vti]
      ? lock_is_held_type+0xd9/0x130
      ? rcu_read_lock_bh_held+0xc0/0xc0
      ? lock_acquired+0x262/0xb10
      dev_hard_start_xmit+0x1e6/0x820
      __dev_queue_xmit+0x2079/0x3340
      ? mark_lock.part.52+0xf7/0x1050
      ? netdev_core_pick_tx+0x290/0x290
      ? kvm_clock_read+0x14/0x30
      ? kvm_sched_clock_read+0x5/0x10
      ? sched_clock_cpu+0x15/0x200
      ? find_held_lock+0x3a/0x1c0
      ? lock_release+0x42f/0xc90
      ? lock_downgrade+0x6b0/0x6b0
      ? mark_held_locks+0xb7/0x120
      ? neigh_connected_output+0x31f/0x470
      ? lockdep_hardirqs_on+0x79/0x100
      ? neigh_connected_output+0x31f/0x470
      ? ip6_finish_output2+0x9b0/0x1d90
      ? rcu_read_lock_bh_held+0x62/0xc0
      ? ip6_finish_output2+0x9b0/0x1d90
      ip6_finish_output2+0x9b0/0x1d90
      ? ip6_append_data+0x330/0x330
      ? ip6_mtu+0x166/0x370
      ? __ip6_finish_output+0x1ad/0xfb0
      ? nf_hook_slow+0xa6/0x170
      ip6_output+0x1fb/0x710
      ? nf_hook.constprop.32+0x317/0x430
      ? ip6_finish_output+0x180/0x180
      ? __ip6_finish_output+0xfb0/0xfb0
      ? lock_is_held_type+0xd9/0x130
      ndisc_send_skb+0xb33/0x1590
      ? __sk_mem_raise_allocated+0x11cf/0x1560
      ? dst_output+0x4a0/0x4a0
      ? ndisc_send_rs+0x432/0x610
      addrconf_dad_completed+0x30c/0xbb0
      ? addrconf_rs_timer+0x650/0x650
      ? addrconf_dad_work+0x73c/0x10e0
      addrconf_dad_work+0x73c/0x10e0
      ? addrconf_dad_completed+0xbb0/0xbb0
      ? rcu_read_lock_sched_held+0xaf/0xe0
      ? rcu_read_lock_bh_held+0xc0/0xc0
      process_one_work+0x97b/0x1740
      ? pwq_dec_nr_in_flight+0x270/0x270
      worker_thread+0x87/0xbf0
      ? process_one_work+0x1740/0x1740
      kthread+0x3ac/0x490
      ? set_kthread_struct+0x100/0x100
      ret_from_fork+0x22/0x30
      </TASK>
      ===============================================================
      
      The solution is to initialize "__ip6_tnl_parm" struct to zeros in the
      "vti6_siocdevprivate()" function.
      
      Signed-off-by: default avatarWilliam Zhao <wizhao@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c1833c39
  10. Dec 25, 2021
    • Xin Long's avatar
      sctp: use call_rcu to free endpoint · 5ec7d18d
      Xin Long authored
      
      This patch is to delay the endpoint free by calling call_rcu() to fix
      another use-after-free issue in sctp_sock_dump():
      
        BUG: KASAN: use-after-free in __lock_acquire+0x36d9/0x4c20
        Call Trace:
          __lock_acquire+0x36d9/0x4c20 kernel/locking/lockdep.c:3218
          lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3844
          __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
          _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
          spin_lock_bh include/linux/spinlock.h:334 [inline]
          __lock_sock+0x203/0x350 net/core/sock.c:2253
          lock_sock_nested+0xfe/0x120 net/core/sock.c:2774
          lock_sock include/net/sock.h:1492 [inline]
          sctp_sock_dump+0x122/0xb20 net/sctp/diag.c:324
          sctp_for_each_transport+0x2b5/0x370 net/sctp/socket.c:5091
          sctp_diag_dump+0x3ac/0x660 net/sctp/diag.c:527
          __inet_diag_dump+0xa8/0x140 net/ipv4/inet_diag.c:1049
          inet_diag_dump+0x9b/0x110 net/ipv4/inet_diag.c:1065
          netlink_dump+0x606/0x1080 net/netlink/af_netlink.c:2244
          __netlink_dump_start+0x59a/0x7c0 net/netlink/af_netlink.c:2352
          netlink_dump_start include/linux/netlink.h:216 [inline]
          inet_diag_handler_cmd+0x2ce/0x3f0 net/ipv4/inet_diag.c:1170
          __sock_diag_cmd net/core/sock_diag.c:232 [inline]
          sock_diag_rcv_msg+0x31d/0x410 net/core/sock_diag.c:263
          netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2477
          sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:274
      
      This issue occurs when asoc is peeled off and the old sk is freed after
      getting it by asoc->base.sk and before calling lock_sock(sk).
      
      To prevent the sk free, as a holder of the sk, ep should be alive when
      calling lock_sock(). This patch uses call_rcu() and moves sock_put and
      ep free into sctp_endpoint_destroy_rcu(), so that it's safe to try to
      hold the ep under rcu_read_lock in sctp_transport_traverse_process().
      
      If sctp_endpoint_hold() returns true, it means this ep is still alive
      and we have held it and can continue to dump it; If it returns false,
      it means this ep is dead and can be freed after rcu_read_unlock, and
      we should skip it.
      
      In sctp_sock_dump(), after locking the sk, if this ep is different from
      tsp->asoc->ep, it means during this dumping, this asoc was peeled off
      before calling lock_sock(), and the sk should be skipped; If this ep is
      the same with tsp->asoc->ep, it means no peeloff happens on this asoc,
      and due to lock_sock, no peeloff will happen either until release_sock.
      
      Note that delaying endpoint free won't delay the port release, as the
      port release happens in sctp_endpoint_destroy() before calling call_rcu().
      Also, freeing endpoint by call_rcu() makes it safe to access the sk by
      asoc->base.sk in sctp_assocs_seq_show() and sctp_rcv().
      
      Thanks Jones to bring this issue up.
      
      v1->v2:
        - improve the changelog.
        - add kfree(ep) into sctp_endpoint_destroy_rcu(), as Jakub noticed.
      
      Reported-by: default avatar <syzbot+9276d76e83e3bcde6c99@syzkaller.appspotmail.com>
      Reported-by: default avatarLee Jones <lee.jones@linaro.org>
      Fixes: d25adbeb ("sctp: fix an use-after-free issue in sctp_sock_dump")
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5ec7d18d
  11. Dec 24, 2021
  12. Dec 23, 2021
  13. Dec 21, 2021
    • Eric Dumazet's avatar
      inet: fully convert sk->sk_rx_dst to RCU rules · 8f905c0e
      Eric Dumazet authored
      
      syzbot reported various issues around early demux,
      one being included in this changelog [1]
      
      sk->sk_rx_dst is using RCU protection without clearly
      documenting it.
      
      And following sequences in tcp_v4_do_rcv()/tcp_v6_do_rcv()
      are not following standard RCU rules.
      
      [a]    dst_release(dst);
      [b]    sk->sk_rx_dst = NULL;
      
      They look wrong because a delete operation of RCU protected
      pointer is supposed to clear the pointer before
      the call_rcu()/synchronize_rcu() guarding actual memory freeing.
      
      In some cases indeed, dst could be freed before [b] is done.
      
      We could cheat by clearing sk_rx_dst before calling
      dst_release(), but this seems the right time to stick
      to standard RCU annotations and debugging facilities.
      
      [1]
      BUG: KASAN: use-after-free in dst_check include/net/dst.h:470 [inline]
      BUG: KASAN: use-after-free in tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792
      Read of size 2 at addr ffff88807f1cb73a by task syz-executor.5/9204
      
      CPU: 0 PID: 9204 Comm: syz-executor.5 Not tainted 5.16.0-rc5-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
       print_address_description.constprop.0.cold+0x8d/0x320 mm/kasan/report.c:247
       __kasan_report mm/kasan/report.c:433 [inline]
       kasan_report.cold+0x83/0xdf mm/kasan/report.c:450
       dst_check include/net/dst.h:470 [inline]
       tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792
       ip_rcv_finish_core.constprop.0+0x15de/0x1e80 net/ipv4/ip_input.c:340
       ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583
       ip_sublist_rcv net/ipv4/ip_input.c:609 [inline]
       ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644
       __netif_receive_skb_list_ptype net/core/dev.c:5508 [inline]
       __netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556
       __netif_receive_skb_list net/core/dev.c:5608 [inline]
       netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699
       gro_normal_list net/core/dev.c:5853 [inline]
       gro_normal_list net/core/dev.c:5849 [inline]
       napi_complete_done+0x1f1/0x880 net/core/dev.c:6590
       virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline]
       virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557
       __napi_poll+0xaf/0x440 net/core/dev.c:7023
       napi_poll net/core/dev.c:7090 [inline]
       net_rx_action+0x801/0xb40 net/core/dev.c:7177
       __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
       invoke_softirq kernel/softirq.c:432 [inline]
       __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
       irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
       common_interrupt+0x52/0xc0 arch/x86/kernel/irq.c:240
       asm_common_interrupt+0x1e/0x40 arch/x86/include/asm/idtentry.h:629
      RIP: 0033:0x7f5e972bfd57
      Code: 39 d1 73 14 0f 1f 80 00 00 00 00 48 8b 50 f8 48 83 e8 08 48 39 ca 77 f3 48 39 c3 73 3e 48 89 13 48 8b 50 f8 48 89 38 49 8b 0e <48> 8b 3e 48 83 c3 08 48 83 c6 08 eb bc 48 39 d1 72 9e 48 39 d0 73
      RSP: 002b:00007fff8a413210 EFLAGS: 00000283
      RAX: 00007f5e97108990 RBX: 00007f5e97108338 RCX: ffffffff81d3aa45
      RDX: ffffffff81d3aa45 RSI: 00007f5e97108340 RDI: ffffffff81d3aa45
      RBP: 00007f5e97107eb8 R08: 00007f5e97108d88 R09: 0000000093c2e8d9
      R10: 0000000000000000 R11: 0000000000000000 R12: 00007f5e97107eb0
      R13: 00007f5e97108338 R14: 00007f5e97107ea8 R15: 0000000000000019
       </TASK>
      
      Allocated by task 13:
       kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
       kasan_set_track mm/kasan/common.c:46 [inline]
       set_alloc_info mm/kasan/common.c:434 [inline]
       __kasan_slab_alloc+0x90/0xc0 mm/kasan/common.c:467
       kasan_slab_alloc include/linux/kasan.h:259 [inline]
       slab_post_alloc_hook mm/slab.h:519 [inline]
       slab_alloc_node mm/slub.c:3234 [inline]
       slab_alloc mm/slub.c:3242 [inline]
       kmem_cache_alloc+0x202/0x3a0 mm/slub.c:3247
       dst_alloc+0x146/0x1f0 net/core/dst.c:92
       rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613
       ip_route_input_slow+0x1817/0x3a20 net/ipv4/route.c:2340
       ip_route_input_rcu net/ipv4/route.c:2470 [inline]
       ip_route_input_noref+0x116/0x2a0 net/ipv4/route.c:2415
       ip_rcv_finish_core.constprop.0+0x288/0x1e80 net/ipv4/ip_input.c:354
       ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583
       ip_sublist_rcv net/ipv4/ip_input.c:609 [inline]
       ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644
       __netif_receive_skb_list_ptype net/core/dev.c:5508 [inline]
       __netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556
       __netif_receive_skb_list net/core/dev.c:5608 [inline]
       netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699
       gro_normal_list net/core/dev.c:5853 [inline]
       gro_normal_list net/core/dev.c:5849 [inline]
       napi_complete_done+0x1f1/0x880 net/core/dev.c:6590
       virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline]
       virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557
       __napi_poll+0xaf/0x440 net/core/dev.c:7023
       napi_poll net/core/dev.c:7090 [inline]
       net_rx_action+0x801/0xb40 net/core/dev.c:7177
       __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
      
      Freed by task 13:
       kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
       kasan_set_track+0x21/0x30 mm/kasan/common.c:46
       kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:370
       ____kasan_slab_free mm/kasan/common.c:366 [inline]
       ____kasan_slab_free mm/kasan/common.c:328 [inline]
       __kasan_slab_free+0xff/0x130 mm/kasan/common.c:374
       kasan_slab_free include/linux/kasan.h:235 [inline]
       slab_free_hook mm/slub.c:1723 [inline]
       slab_free_freelist_hook+0x8b/0x1c0 mm/slub.c:1749
       slab_free mm/slub.c:3513 [inline]
       kmem_cache_free+0xbd/0x5d0 mm/slub.c:3530
       dst_destroy+0x2d6/0x3f0 net/core/dst.c:127
       rcu_do_batch kernel/rcu/tree.c:2506 [inline]
       rcu_core+0x7ab/0x1470 kernel/rcu/tree.c:2741
       __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
      
      Last potentially related work creation:
       kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
       __kasan_record_aux_stack+0xf5/0x120 mm/kasan/generic.c:348
       __call_rcu kernel/rcu/tree.c:2985 [inline]
       call_rcu+0xb1/0x740 kernel/rcu/tree.c:3065
       dst_release net/core/dst.c:177 [inline]
       dst_release+0x79/0xe0 net/core/dst.c:167
       tcp_v4_do_rcv+0x612/0x8d0 net/ipv4/tcp_ipv4.c:1712
       sk_backlog_rcv include/net/sock.h:1030 [inline]
       __release_sock+0x134/0x3b0 net/core/sock.c:2768
       release_sock+0x54/0x1b0 net/core/sock.c:3300
       tcp_sendmsg+0x36/0x40 net/ipv4/tcp.c:1441
       inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:819
       sock_sendmsg_nosec net/socket.c:704 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:724
       sock_write_iter+0x289/0x3c0 net/socket.c:1057
       call_write_iter include/linux/fs.h:2162 [inline]
       new_sync_write+0x429/0x660 fs/read_write.c:503
       vfs_write+0x7cd/0xae0 fs/read_write.c:590
       ksys_write+0x1ee/0x250 fs/read_write.c:643
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      The buggy address belongs to the object at ffff88807f1cb700
       which belongs to the cache ip_dst_cache of size 176
      The buggy address is located 58 bytes inside of
       176-byte region [ffff88807f1cb700, ffff88807f1cb7b0)
      The buggy address belongs to the page:
      page:ffffea0001fc72c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7f1cb
      flags: 0xfff00000000200(slab|node=0|zone=1|lastcpupid=0x7ff)
      raw: 00fff00000000200 dead000000000100 dead000000000122 ffff8881413bb780
      raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as allocated
      page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112a20(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_HARDWALL), pid 5, ts 108466983062, free_ts 108048976062
       prep_new_page mm/page_alloc.c:2418 [inline]
       get_page_from_freelist+0xa72/0x2f50 mm/page_alloc.c:4149
       __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5369
       alloc_pages+0x1a7/0x300 mm/mempolicy.c:2191
       alloc_slab_page mm/slub.c:1793 [inline]
       allocate_slab mm/slub.c:1930 [inline]
       new_slab+0x32d/0x4a0 mm/slub.c:1993
       ___slab_alloc+0x918/0xfe0 mm/slub.c:3022
       __slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3109
       slab_alloc_node mm/slub.c:3200 [inline]
       slab_alloc mm/slub.c:3242 [inline]
       kmem_cache_alloc+0x35c/0x3a0 mm/slub.c:3247
       dst_alloc+0x146/0x1f0 net/core/dst.c:92
       rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613
       __mkroute_output net/ipv4/route.c:2564 [inline]
       ip_route_output_key_hash_rcu+0x921/0x2d00 net/ipv4/route.c:2791
       ip_route_output_key_hash+0x18b/0x300 net/ipv4/route.c:2619
       __ip_route_output_key include/net/route.h:126 [inline]
       ip_route_output_flow+0x23/0x150 net/ipv4/route.c:2850
       ip_route_output_key include/net/route.h:142 [inline]
       geneve_get_v4_rt+0x3a6/0x830 drivers/net/geneve.c:809
       geneve_xmit_skb drivers/net/geneve.c:899 [inline]
       geneve_xmit+0xc4a/0x3540 drivers/net/geneve.c:1082
       __netdev_start_xmit include/linux/netdevice.h:4994 [inline]
       netdev_start_xmit include/linux/netdevice.h:5008 [inline]
       xmit_one net/core/dev.c:3590 [inline]
       dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3606
       __dev_queue_xmit+0x299a/0x3650 net/core/dev.c:4229
      page last free stack trace:
       reset_page_owner include/linux/page_owner.h:24 [inline]
       free_pages_prepare mm/page_alloc.c:1338 [inline]
       free_pcp_prepare+0x374/0x870 mm/page_alloc.c:1389
       free_unref_page_prepare mm/page_alloc.c:3309 [inline]
       free_unref_page+0x19/0x690 mm/page_alloc.c:3388
       qlink_free mm/kasan/quarantine.c:146 [inline]
       qlist_free_all+0x5a/0xc0 mm/kasan/quarantine.c:165
       kasan_quarantine_reduce+0x180/0x200 mm/kasan/quarantine.c:272
       __kasan_slab_alloc+0xa2/0xc0 mm/kasan/common.c:444
       kasan_slab_alloc include/linux/kasan.h:259 [inline]
       slab_post_alloc_hook mm/slab.h:519 [inline]
       slab_alloc_node mm/slub.c:3234 [inline]
       kmem_cache_alloc_node+0x255/0x3f0 mm/slub.c:3270
       __alloc_skb+0x215/0x340 net/core/skbuff.c:414
       alloc_skb include/linux/skbuff.h:1126 [inline]
       alloc_skb_with_frags+0x93/0x620 net/core/skbuff.c:6078
       sock_alloc_send_pskb+0x783/0x910 net/core/sock.c:2575
       mld_newpack+0x1df/0x770 net/ipv6/mcast.c:1754
       add_grhead+0x265/0x330 net/ipv6/mcast.c:1857
       add_grec+0x1053/0x14e0 net/ipv6/mcast.c:1995
       mld_send_initial_cr.part.0+0xf6/0x230 net/ipv6/mcast.c:2242
       mld_send_initial_cr net/ipv6/mcast.c:1232 [inline]
       mld_dad_work+0x1d3/0x690 net/ipv6/mcast.c:2268
       process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298
       worker_thread+0x658/0x11f0 kernel/workqueue.c:2445
      
      Memory state around the buggy address:
       ffff88807f1cb600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff88807f1cb680: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
      >ffff88807f1cb700: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                              ^
       ffff88807f1cb780: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
       ffff88807f1cb800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      Fixes: 41063e9d ("ipv4: Early TCP socket demux.")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20211220143330.680945-1-eric.dumazet@gmail.com
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      8f905c0e
  14. Dec 20, 2021
  15. Dec 18, 2021
    • Lin Ma's avatar
      ax25: NPD bug when detaching AX25 device · 1ade48d0
      Lin Ma authored
      
      The existing cleanup routine implementation is not well synchronized
      with the syscall routine. When a device is detaching, below race could
      occur.
      
      static int ax25_sendmsg(...) {
        ...
        lock_sock()
        ax25 = sk_to_ax25(sk);
        if (ax25->ax25_dev == NULL) // CHECK
        ...
        ax25_queue_xmit(skb, ax25->ax25_dev->dev); // USE
        ...
      }
      
      static void ax25_kill_by_device(...) {
        ...
        if (s->ax25_dev == ax25_dev) {
          s->ax25_dev = NULL;
          ...
      }
      
      Other syscall functions like ax25_getsockopt, ax25_getname,
      ax25_info_show also suffer from similar races. To fix them, this patch
      introduce lock_sock() into ax25_kill_by_device in order to guarantee
      that the nullify action in cleanup routine cannot proceed when another
      socket request is pending.
      
      Signed-off-by: default avatarHanjie Wu <nagi@zju.edu.cn>
      Signed-off-by: default avatarLin Ma <linma@zju.edu.cn>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1ade48d0
    • Hoang Le's avatar
      Revert "tipc: use consistent GFP flags" · f845fe58
      Hoang Le authored
      
      This reverts commit 86c3a3e9.
      
      The tipc_aead_init() function can be calling from an interrupt routine.
      This allocation might sleep with GFP_KERNEL flag, hence the following BUG
      is reported.
      
      [   17.657509] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:230
      [   17.660916] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/3
      [   17.664093] preempt_count: 302, expected: 0
      [   17.665619] RCU nest depth: 2, expected: 0
      [   17.667163] Preemption disabled at:
      [   17.667165] [<0000000000000000>] 0x0
      [   17.669753] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Tainted: G        W         5.16.0-rc4+ #1
      [   17.673006] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
      [   17.675540] Call Trace:
      [   17.676285]  <IRQ>
      [   17.676913]  dump_stack_lvl+0x34/0x44
      [   17.678033]  __might_resched.cold+0xd6/0x10f
      [   17.679311]  kmem_cache_alloc_trace+0x14d/0x220
      [   17.680663]  tipc_crypto_start+0x4a/0x2b0 [tipc]
      [   17.682146]  ? kmem_cache_alloc_trace+0xd3/0x220
      [   17.683545]  tipc_node_create+0x2f0/0x790 [tipc]
      [   17.684956]  tipc_node_check_dest+0x72/0x680 [tipc]
      [   17.686706]  ? ___cache_free+0x31/0x350
      [   17.688008]  ? skb_release_data+0x128/0x140
      [   17.689431]  tipc_disc_rcv+0x479/0x510 [tipc]
      [   17.690904]  tipc_rcv+0x71c/0x730 [tipc]
      [   17.692219]  ? __netif_receive_skb_core+0xb7/0xf60
      [   17.693856]  tipc_l2_rcv_msg+0x5e/0x90 [tipc]
      [   17.695333]  __netif_receive_skb_list_core+0x20b/0x260
      [   17.697072]  netif_receive_skb_list_internal+0x1bf/0x2e0
      [   17.698870]  ? dev_gro_receive+0x4c2/0x680
      [   17.700255]  napi_complete_done+0x6f/0x180
      [   17.701657]  virtnet_poll+0x29c/0x42e [virtio_net]
      [   17.703262]  __napi_poll+0x2c/0x170
      [   17.704429]  net_rx_action+0x22f/0x280
      [   17.705706]  __do_softirq+0xfd/0x30a
      [   17.706921]  common_interrupt+0xa4/0xc0
      [   17.708206]  </IRQ>
      [   17.708922]  <TASK>
      [   17.709651]  asm_common_interrupt+0x1e/0x40
      [   17.711078] RIP: 0010:default_idle+0x18/0x20
      
      Fixes: 86c3a3e9 ("tipc: use consistent GFP flags")
      Acked-by: default avatarJon Maloy <jmaloy@redhat.com>
      Signed-off-by: default avatarHoang Le <hoang.h.le@dektech.com.au>
      Link: https://lore.kernel.org/r/20211217030059.5947-1-hoang.h.le@dektech.com.au
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      f845fe58
    • Paul Blakey's avatar
      net: openvswitch: Fix matching zone id for invalid conns arriving from tc · 635d448a
      Paul Blakey authored
      
      Zone id is not restored if we passed ct and ct rejected the connection,
      as there is no ct info on the skb.
      
      Save the zone from tc skb cb to tc skb extension and pass it on to
      ovs, use that info to restore the zone id for invalid connections.
      
      Fixes: d29334c1 ("net/sched: act_api: fix miss set post_ct for ovs after do conntrack in act_ct")
      Signed-off-by: default avatarPaul Blakey <paulb@nvidia.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      635d448a
Loading