  1. Jan 05, 2022
  2. Jan 04, 2022
  3. Dec 15, 2021
  4. Dec 14, 2021
    • RDMA/hns: Fix RNR retransmission issue for HIP08 · 4ad81814
      Yangyang Li authored
      Due to the discrete nature of the HIP08 timer unit, a requester might
      finish the timeout period sooner, in elapsed real time, than its responder
      does, even when both sides share the identical RNR timeout length included
      in the RNR Nak packet and the responder indeed starts the timing prior to
      the requester. Furthermore, if a 'providential' resend packet arrives
      before the responder's timeout period expires, the responder is
      entitled to drop the packet silently under the IB protocol.
      
      To address this problem, our team made good use of certain hardware facts:
      
      1) The timing resolution for the transmission arrangements is 1
         microsecond, e.g. if the cq_period field is set to 3, the hardware
         interprets it as 3 microseconds
      
      2) A QPC field tells the hardware how many timing units (ticks)
         constitute a full microsecond; the default is 1000
      
      3) It takes 14ns for the processor to handle a packet in the buffer, so
         the RNR timeout length of 10ns would ensure our processing mechanism is
         disabled during the entire timeout period and the packet won't be
         dropped silently
      
      To achieve (3), we permanently set the QPC field mentioned in (2) to zero,
      which nominally indicates that every time tick is equivalent to a
      microsecond in wall-clock time; now, an RNR timeout period with a face
      value of 10 only lasts 10 ticks, which is 10ns in wall-clock time.
      
      It's worth noting that we adapt the driver by magnifying certain
      configuration parameters (cq_period, eq_period and ack_timeout) by 1000,
      given that the user assumes the configured timing unit is microseconds.
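      
      The magnification described above can be sketched roughly as follows.
      This is a userspace illustration only: CLOCK_ADJUST, period_to_hw() and
      the clamping to max_field are assumptions made for the example and are
      not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: nanosecond ticks per microsecond on HIP08. */
#define CLOCK_ADJUST 1000u

/*
 * Convert a user-supplied period in microseconds to the value written to
 * hardware. On HIP08 the value is magnified by 1000; other revisions pass
 * it through unchanged. Clamping to max_field is an assumption of this
 * sketch so that the result always fits the register field.
 */
static uint32_t period_to_hw(uint32_t period_us, int is_hip08,
                             uint32_t max_field)
{
    uint64_t v = period_us;

    assert(max_field > 0);
    if (is_hip08)
        v *= CLOCK_ADJUST;
    return v > max_field ? max_field : (uint32_t)v;
}
```

      With a 16-bit field, a cq_period of 3 becomes 3000 on HIP08 and stays
      3 elsewhere; a value of 100 would be clamped to the field maximum.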
      
      Also, this particular improvisation is only deployed on HIP08 since other
      hardware has already solved this issue.
      
      Fixes: cfc85f3e ("RDMA/hns: Add profile support for hip08 driver")
      Link: https://lore.kernel.org/r/20211209140655.49493-1-liangwenpeng@huawei.com
      
      
      Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
      Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
  5. Dec 07, 2021
  6. Nov 29, 2021
    • RDMA/rtrs: Call {get,put}_cpu_ptr to silence a debug kernel warning · db6169b5
      Guoqing Jiang authored
      With preemption debugging enabled (CONFIG_DEBUG_PREEMPT=y), the following
      appears when the rnbd client tries to map a remote block device:
      
        BUG: using smp_processor_id() in preemptible [00000000] code: bash/1733
        caller is debug_smp_processor_id+0x17/0x20
        CPU: 0 PID: 1733 Comm: bash Not tainted 5.16.0-rc1 #5
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
        Call Trace:
         <TASK>
         dump_stack_lvl+0x5d/0x78
         dump_stack+0x10/0x12
         check_preemption_disabled+0xe4/0xf0
         debug_smp_processor_id+0x17/0x20
         rtrs_clt_update_all_stats+0x3b/0x70 [rtrs_client]
         rtrs_clt_read_req+0xc3/0x380 [rtrs_client]
         ? rtrs_clt_init_req+0xe3/0x120 [rtrs_client]
         rtrs_clt_request+0x1a7/0x320 [rtrs_client]
         ? 0xffffffffc0ab1000
         send_usr_msg+0xbf/0x160 [rnbd_client]
         ? rnbd_clt_put_sess+0x60/0x60 [rnbd_client]
         ? send_usr_msg+0x160/0x160 [rnbd_client]
         ? sg_alloc_table+0x27/0xb0
         ? sg_zero_buffer+0xd0/0xd0
         send_msg_sess_info+0xe9/0x180 [rnbd_client]
         ? rnbd_clt_put_sess+0x60/0x60 [rnbd_client]
         ? blk_mq_alloc_tag_set+0x2ef/0x370
         rnbd_clt_map_device+0xba8/0xcd0 [rnbd_client]
         ? send_msg_open+0x200/0x200 [rnbd_client]
         rnbd_clt_map_device_store+0x3e5/0x620 [rnbd_client
      
      To suppress the call trace, call the get_cpu_ptr()/put_cpu_ptr() pair in
      rtrs_clt_update_rdma_stats() to disable preemption while accessing the
      per-cpu variable.
      
      While at it, make the same change in rtrs_clt_update_wc_stats(). As for
      rtrs_clt_inc_failover_cnt(), although it is only called inside an RCU
      section, it can still be preempted when CONFIG_PREEMPT_RCU is enabled,
      so change it to the {get,put}_cpu_ptr pair as well.
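      
      The get/put pairing can be illustrated with a small userspace model.
      It is not kernel code: model_get_cpu_ptr(), model_put_cpu_ptr(),
      current_cpu and preempt_depth are hypothetical stand-ins that only
      mimic the idea of pinning the task around the per-CPU access.

```c
#include <assert.h>

#define NR_CPUS 4

struct stats { unsigned long cnt; };

static struct stats pcpu_stats[NR_CPUS]; /* one slot per CPU */
static int current_cpu;                  /* stand-in for the running CPU id */
static int preempt_depth;                /* > 0 models "preemption disabled" */

/* Model of get_cpu_ptr(): disable preemption, then return this CPU's slot. */
static struct stats *model_get_cpu_ptr(void)
{
    preempt_depth++;
    return &pcpu_stats[current_cpu];
}

/* Model of put_cpu_ptr(): re-enable preemption after the access. */
static void model_put_cpu_ptr(void)
{
    assert(preempt_depth > 0);
    preempt_depth--;
}

/* The slot lookup and the update both happen inside the pinned region. */
static void update_stats(void)
{
    struct stats *s = model_get_cpu_ptr();

    s->cnt++;
    model_put_cpu_ptr();
}
```

      The point of the pairing is that the CPU id used for the lookup cannot
      change before the counter is updated.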
      
      Link: https://lore.kernel.org/r/20211128133501.38710-1-guoqing.jiang@linux.dev
      
      
      Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
  7. Nov 25, 2021
    • RDMA/hns: Do not destroy QP resources in the hw resetting phase · b0969f83
      Yangyang Li authored
      When hns_roce_v2_destroy_qp() is called, the brief calling process of the
      driver is as follows:
      
       ......
       hns_roce_v2_destroy_qp
       hns_roce_v2_qp_modify
          hns_roce_cmd_mbox
       hns_roce_qp_destroy
      
      If hns_roce_cmd_mbox() detects during its execution that the hardware is
      being reset, the driver will not be able to get the return value from
      the hardware (the firmware cannot respond to the driver's mailbox during
      the hardware reset phase).
      
      The driver needs to wait for the hardware reset to complete before
      continuing to execute hns_roce_qp_destroy(); otherwise the driver may
      release resources that the hardware is still accessing. To fix this
      problem, HNS RoCE needs to add code that waits for the hardware reset
      to complete.
      
      The original interface get_hw_reset_stat() reports the instantaneous
      state of the hardware reset, which cannot accurately reflect whether the
      hardware reset has completed, so it needs to be replaced with the
      ae_dev_reset_cnt interface.
      
      The sign that the hardware reset is complete is that the return value of
      the ae_dev_reset_cnt interface is greater than the original value
      reset_cnt recorded by the driver.
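      
      A minimal sketch of this completion check, assuming a bounded polling
      loop. The callback type reset_cnt_fn, wait_reset_done() and the
      fake_hw counter source are hypothetical; they only stand in for the
      ae_dev_reset_cnt interface and are not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback standing in for the ae_dev_reset_cnt interface. */
typedef unsigned long (*reset_cnt_fn)(void *ctx);

/*
 * Poll until the reported counter exceeds the value recorded by the driver,
 * or give up after max_polls reads. Returns true if the reset was observed
 * to complete.
 */
static bool wait_reset_done(reset_cnt_fn get_cnt, void *ctx,
                            unsigned long recorded_cnt,
                            unsigned int max_polls)
{
    assert(get_cnt != NULL);
    for (unsigned int i = 0; i < max_polls; i++) {
        if (get_cnt(ctx) > recorded_cnt)
            return true;
        /* a real driver would sleep here between polls */
    }
    return false;
}

/* Demo counter source: the count goes up on the bump_after-th read. */
struct fake_hw {
    unsigned long cnt;
    unsigned int reads;
    unsigned int bump_after;
};

static unsigned long fake_reset_cnt(void *ctx)
{
    struct fake_hw *hw = ctx;

    if (++hw->reads == hw->bump_after)
        hw->cnt++;
    return hw->cnt;
}
```

      Comparing against a recorded count, rather than sampling a status bit,
      means a reset that started and finished between two polls is still
      detected.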
      
      Fixes: 6a04aed6 ("RDMA/hns: Fix the chip hanging caused by sending mailbox&CMQ during reset")
      Link: https://lore.kernel.org/r/20211123142402.26936-1-liangwenpeng@huawei.com
      
      
      Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
      Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • RDMA/hns: Do not halt commands during reset until later · 52414e27
      Yangyang Li authored
      is_reset is used to indicate whether the hardware has started to reset.
      When hns_roce_hw_v2_reset_notify_down() is called, the hardware has not
      yet started to reset. If is_reset is set at this time, all mailbox
      operations of resource destroy actions will be intercepted by the driver.
      The driver then cleans up resources while the hardware is still accessing
      them, and the following errors appear:
      
        arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000350100000010
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x000002088000003f
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x00000000a50e0800
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000000000000000
        arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000350100000010
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x000002088000043e
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x00000000a50a0800
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000000000000000
        arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000350100000010
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000020880000436
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x00000000a50a0880
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000000000000000
        arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000350100000010
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x000002088000043a
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x00000000a50e0840
        hns3 0000:35:00.0: INT status: CMDQ(0x0) HW errors(0x0) other(0x0)
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000000000000000
        hns3 0000:35:00.0: received unknown or unhandled event of vector0
        arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
        arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000350100000010
        {34}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 7
      
      is_reset will be set correctly in check_aedev_reset_status(), so the
      setting in hns_roce_hw_v2_reset_notify_down() should be deleted.
      
      Fixes: 726be12f ("RDMA/hns: Set reset flag when hw resetting")
      Link: https://lore.kernel.org/r/20211123084809.37318-1-liangwenpeng@huawei.com
      
      
      Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
      Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • RDMA/mlx5: Fix releasing unallocated memory in dereg MR flow · f0ae4afe
      Alaa Hleihel authored
      For the case of IB_MR_TYPE_DM, the MR doesn't have a umem even though it
      is a user MR. This causes mlx5_free_priv_descs() to think that it is a
      kernel MR, so it wrongly accesses mr->descs, which holds wrong values
      from the union, and attempts to release resources that were never
      allocated in the first place.
      
      For example:
       DMA-API: mlx5_core 0000:08:00.1: device driver tries to free DMA memory it has not allocated [device address=0x0000000000000000] [size=0 bytes]
       WARNING: CPU: 8 PID: 1021 at kernel/dma/debug.c:961 check_unmap+0x54f/0x8b0
       RIP: 0010:check_unmap+0x54f/0x8b0
       Call Trace:
        debug_dma_unmap_page+0x57/0x60
        mlx5_free_priv_descs+0x57/0x70 [mlx5_ib]
        mlx5_ib_dereg_mr+0x1fb/0x3d0 [mlx5_ib]
        ib_dereg_mr_user+0x60/0x140 [ib_core]
        uverbs_destroy_uobject+0x59/0x210 [ib_uverbs]
        uobj_destroy+0x3f/0x80 [ib_uverbs]
        ib_uverbs_cmd_verbs+0x435/0xd10 [ib_uverbs]
        ? uverbs_finalize_object+0x50/0x50 [ib_uverbs]
        ? lock_acquire+0xc4/0x2e0
        ? lock_acquired+0x12/0x380
        ? lock_acquire+0xc4/0x2e0
        ? lock_acquire+0xc4/0x2e0
        ? ib_uverbs_ioctl+0x7c/0x140 [ib_uverbs]
        ? lock_release+0x28a/0x400
        ib_uverbs_ioctl+0xc0/0x140 [ib_uverbs]
        ? ib_uverbs_ioctl+0x7c/0x140 [ib_uverbs]
        __x64_sys_ioctl+0x7f/0xb0
        do_syscall_64+0x38/0x90
      
      Fix it by reorganizing the dereg flow and mlx5_ib_mr structure:
       - Move the ib_umem field into the user MRs structure in the union as it's
         applicable only there.
       - Function mlx5_ib_dereg_mr() will now call mlx5_free_priv_descs() only
         in case there isn't udata, which indicates that this isn't a user MR.
      
      Fixes: f18ec422 ("RDMA/mlx5: Use a union inside mlx5_ib_mr")
      Link: https://lore.kernel.org/r/66bb1dd253c1fd7ceaa9fc411061eefa457b86fb.1637581144.git.leonro@nvidia.com
      
      
      Signed-off-by: Alaa Hleihel <alaa@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • RDMA: Fix use-after-free in rxe_queue_cleanup · 84b01721
      Pavel Skripkin authored
      On the error handling path in rxe_qp_from_init(), qp->sq.queue is freed,
      and then rxe_create_qp() drops the last reference to this object. The QP
      cleanup function then tries to free this queue a second time, which
      causes a UAF bug.
      
      Fix it by zeroing queue pointer after freeing queue in rxe_qp_from_init().
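      
      The zero-after-free idea can be shown with a small userspace sketch.
      The struct layouts, queue_cleanup() and the frees counter are
      hypothetical and simplified; they are not the rxe data structures.

```c
#include <assert.h>
#include <stdlib.h>

struct queue { int depth; };
struct qp { struct queue *sq; };

static int frees; /* counts real frees, for illustration only */

/*
 * Release a queue and clear the caller's pointer, so that a later cleanup
 * pass sees NULL and does nothing instead of freeing the memory again.
 */
static void queue_cleanup(struct queue **q)
{
    assert(q != NULL);
    if (*q == NULL)
        return;
    free(*q);
    frees++;
    *q = NULL;
}
```

      Calling queue_cleanup(&qp.sq) once on the error path and once more from
      the final cleanup then releases the memory only one time.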
      
      Fixes: 514aee66 ("RDMA: Globally allocate and release QP memory")
      Link: https://lore.kernel.org/r/20211121202239.3129-1-paskripkin@gmail.com
      
      
      Reported-by: <syzbot+aab53008a5adf26abe91@syzkaller.appspotmail.com>
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
  8. Nov 17, 2021
    • RDMA/nldev: Check stat attribute before accessing it · d821f7c1
      Leon Romanovsky authored
      Accessing a non-existent netlink attribute causes the following kernel
      panic. Fix it by checking that the attribute exists before reading it.
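      
      A rough userspace sketch of such an existence check, assuming a parsed
      attribute table in which absent attributes are NULL. The attribute
      ids, struct attr and get_stat_mode() are hypothetical and do not
      reproduce the netlink API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute ids; ATTR_MAX is the table size. */
enum { ATTR_UNSPEC, ATTR_STAT_MODE, ATTR_MAX };

struct attr { uint32_t value; };

/*
 * Read the mode attribute from a parsed table. An attribute the sender
 * did not supply is a NULL entry, so reject the request instead of
 * dereferencing it.
 */
static int get_stat_mode(struct attr *tb[], uint32_t *mode)
{
    assert(tb != NULL && mode != NULL);
    if (!tb[ATTR_STAT_MODE])
        return -EINVAL;
    *mode = tb[ATTR_STAT_MODE]->value;
    return 0;
}
```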
      
        general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
        KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
        CPU: 0 PID: 6744 Comm: syz-executor.0 Not tainted 5.15.0-syzkaller #0
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        RIP: 0010:nla_get_u32 include/net/netlink.h:1554 [inline]
        RIP: 0010:nldev_stat_set_mode_doit drivers/infiniband/core/nldev.c:1909 [inline]
        RIP: 0010:nldev_stat_set_doit+0x578/0x10d0 drivers/infiniband/core/nldev.c:2040
        Code: fa 4c 8b a4 24 f8 02 00 00 48 b8 00 00 00 00 00 fc ff df c7 84 24 80 00 00 00 00 00 00 00 49 8d 7c 24 04 48 89
        fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 02
        RSP: 0018:ffffc90004acf2e8 EFLAGS: 00010247
        RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc90002b94000
        RDX: 0000000000000000 RSI: ffffffff8684c5ff RDI: 0000000000000004
        RBP: ffff88807cda4000 R08: 0000000000000000 R09: ffff888023fb8027
        R10: ffffffff8684c5d7 R11: 0000000000000000 R12: 0000000000000000
        R13: 0000000000000001 R14: ffff888041024280 R15: ffff888031ade780
        FS:  00007eff9dddd700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000001b2ef24000 CR3: 0000000036902000 CR4: 00000000003506f0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         <TASK>
         rdma_nl_rcv_msg+0x36d/0x690 drivers/infiniband/core/netlink.c:195
         rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
         rdma_nl_rcv+0x2ee/0x430 drivers/infiniband/core/netlink.c:259
         netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
         netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1345
         netlink_sendmsg+0x86d/0xda0 net/netlink/af_netlink.c:1916
         sock_sendmsg_nosec net/socket.c:704 [inline]
         sock_sendmsg+0xcf/0x120 net/socket.c:724
         ____sys_sendmsg+0x6e8/0x810 net/socket.c:2409
         ___sys_sendmsg+0xf3/0x170 net/socket.c:2463
         __sys_sendmsg+0xe5/0x1b0 net/socket.c:2492
         do_syscall_x64 arch/x86/entry/common.c:50 [inline]
         do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
         entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: 822cf785 ("RDMA/nldev: Split nldev_stat_set_mode_doit out of nldev_stat_set_doit")
      Link: https://lore.kernel.org/r/b21967c366f076ff1988862f9c8a1aa0244c599f.1637151999.git.leonro@nvidia.com
      
      
      Reported-by: <syzbot+9111d2255a9710e87562@syzkaller.appspotmail.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • RDMA/mlx4: Do not fail the registration on port stats · 378c6741
      Jack Wang authored
      If the FW doesn't support MLX4_DEV_CAP_FLAG2_DIAG_PER_PORT, the mlx4
      driver fails ib_setup_port_attrs(), which is called from
      ib_register_device()/enable_device_and_get(), and in the end the device
      is not detected [1][2].
      
      To fix it, add a new mlx4_ib_hw_stats_ops1 without alloc_hw_port_stats
      for the case where the FW does not support
      MLX4_DEV_CAP_FLAG2_DIAG_PER_PORT.
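      
      Choosing an ops table by capability might look roughly like the sketch
      below. All names here (struct stats_ops, ops_full, ops_no_port,
      pick_stats_ops() and the helper functions) are hypothetical; the
      sketch only illustrates leaving the per-port callback unset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct stats_ops {
    int (*alloc_device_stats)(void);
    int (*alloc_port_stats)(void); /* NULL: per-port stats are not offered */
};

static int alloc_dev(void)  { return 0; }
static int alloc_port(void) { return 0; }

/* Two tables: one with the per-port callback, one without it. */
static const struct stats_ops ops_full    = { alloc_dev, alloc_port };
static const struct stats_ops ops_no_port = { alloc_dev, NULL };

/* Pick the table that matches what the firmware reports. */
static const struct stats_ops *pick_stats_ops(bool fw_has_per_port_diag)
{
    return fw_has_per_port_diag ? &ops_full : &ops_no_port;
}

/* A caller can test for the callback instead of calling it and failing. */
static bool offers_port_stats(const struct stats_ops *ops)
{
    assert(ops != NULL);
    return ops->alloc_port_stats != NULL;
}
```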
      
      [1] https://bugzilla.redhat.com/show_bug.cgi?id=2014094
      [2] https://lore.kernel.org/linux-rdma/CAMGffEn2wvEnmzc0xe=xYiCLqpphiHDBxCxqAELrBofbUAMQxw@mail.gmail.com
      
      Fixes: 4b5f4d3f ("RDMA: Split the alloc_hw_stats() ops to port and device variants")
      Link: https://lore.kernel.org/r/20211115101519.27210-1-jinpu.wang@ionos.com
      
      
      Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
  9. Nov 16, 2021
    • IB/hfi1: Properly allocate rdma counter desc memory · da86dc17
      Dennis Dalessandro authored
      When optional counter support was added, the memory allocated to hold
      the counter descriptors was not zeroed. This caused WARN_ON()s in the
      IB/sysfs code to be hit.
      
      The uninitialized memory made some of the counters wrongly look like
      optional counters. Use kzalloc.
      
      While here, change the sizeof() calls to use the pointer rather than the
      name of the type.
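      
      The same two ideas can be sketched in userspace, with calloc() playing
      the role of a zeroing allocator. The struct counter_desc layout,
      DESC_FLAG_OPTIONAL and alloc_descs() are hypothetical and chosen only
      for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical descriptor layout and flag. */
struct counter_desc {
    const char *name;
    unsigned int flags;
};
#define DESC_FLAG_OPTIONAL 0x1u

/*
 * Allocate n zeroed descriptors. sizeof(*descs) follows the pointer's
 * type, so the size stays right if the declaration changes later.
 * Returns NULL on allocation failure.
 */
static struct counter_desc *alloc_descs(size_t n)
{
    struct counter_desc *descs;

    assert(n > 0);
    descs = calloc(n, sizeof(*descs));
    return descs;
}
```

      Because the memory starts out zeroed, no descriptor carries a stray
      optional flag before it is explicitly set.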
      
        WARNING: CPU: 0 PID: 32644 at drivers/infiniband/core/sysfs.c:1064 ib_setup_port_attrs+0x7e1/0x890 [ib_core]
        CPU: 0 PID: 32644 Comm: kworker/0:2 Tainted: G S      W 5.15.0+ #36
        Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0018.C4.072020161249 07/20/2016
        Workqueue: events work_for_cpu_fn
        RIP: 0010:ib_setup_port_attrs+0x7e1/0x890 [ib_core]
        RSP: 0018:ffffc90006ea3c40 EFLAGS: 00010202
        RAX: 0000000000000068 RBX: ffff888106ad8000 RCX: 0000000000000138
        RDX: ffff888126c84c00 RSI: ffff888103c41000 RDI: 0000000000000124
        RBP: ffff88810f63a801 R08: ffff888126c8a000 R09: 0000000000000001
        R10: ffffffffa09acf20 R11: 0000000000000065 R12: ffff88810f63a800
        R13: ffff88810f63a800 R14: ffff88810f63a8e0 R15: 0000000000000001
        FS:  0000000000000000(0000) GS:ffff888667a00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00005590102cb078 CR3: 000000000240a003 CR4: 00000000001706f0
        Call Trace:
         ib_register_device.cold.44+0x23e/0x2d0 [ib_core]
         rvt_register_device+0xfa/0x230 [rdmavt]
         hfi1_register_ib_device+0x623/0x690 [hfi1]
         init_one.cold.36+0x2d1/0x49b [hfi1]
         local_pci_probe+0x45/0x80
         work_for_cpu_fn+0x16/0x20
         process_one_work+0x1b1/0x360
         worker_thread+0x1d4/0x3a0
         kthread+0x11a/0x140
         ret_from_fork+0x22/0x30
      
      Fixes: 5e2ddd1e ("RDMA/counter: Add optional counter support")
      Link: https://lore.kernel.org/r/20211115200913.124104.47770.stgit@awfm-01.cornelisnetworks.com
      
      
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • RDMA/core: Set send and receive CQ before forwarding to the driver · 6cd7397d
      Leon Romanovsky authored
      Preset both the receive and send CQ pointers before calling into the
      drivers, and overwrite them again afterwards until mlx4 is changed so
      that it does not overwrite ibqp properties.
      
      This change is needed for mlx5 because, when QP creation fails, it takes
      the QP destroy path, which relies on proper CQ pointers.
      
       BUG: KASAN: use-after-free in create_qp.cold+0x164/0x16e [mlx5_ib]
       Write of size 8 at addr ffff8880064c55c0 by task a.out/246
      
       CPU: 0 PID: 246 Comm: a.out Not tainted 5.15.0+ #291
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
       Call Trace:
        dump_stack_lvl+0x45/0x59
        print_address_description.constprop.0+0x1f/0x140
        kasan_report.cold+0x83/0xdf
        create_qp.cold+0x164/0x16e [mlx5_ib]
        mlx5_ib_create_qp+0x358/0x28a0 [mlx5_ib]
        create_qp.part.0+0x45b/0x6a0 [ib_core]
        ib_create_qp_user+0x97/0x150 [ib_core]
        ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0x92c/0x1250 [ib_uverbs]
        ib_uverbs_cmd_verbs+0x1c38/0x3150 [ib_uverbs]
        ib_uverbs_ioctl+0x169/0x260 [ib_uverbs]
        __x64_sys_ioctl+0x866/0x14d0
        do_syscall_64+0x3d/0x90
        entry_SYSCALL_64_after_hwframe+0x44/0xae
      
       Allocated by task 246:
        kasan_save_stack+0x1b/0x40
        __kasan_kmalloc+0xa4/0xd0
        create_qp.part.0+0x92/0x6a0 [ib_core]
        ib_create_qp_user+0x97/0x150 [ib_core]
        ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0x92c/0x1250 [ib_uverbs]
        ib_uverbs_cmd_verbs+0x1c38/0x3150 [ib_uverbs]
        ib_uverbs_ioctl+0x169/0x260 [ib_uverbs]
        __x64_sys_ioctl+0x866/0x14d0
        do_syscall_64+0x3d/0x90
        entry_SYSCALL_64_after_hwframe+0x44/0xae
      
       Freed by task 246:
        kasan_save_stack+0x1b/0x40
        kasan_set_track+0x1c/0x30
        kasan_set_free_info+0x20/0x30
        __kasan_slab_free+0x10c/0x150
        slab_free_freelist_hook+0xb4/0x1b0
        kfree+0xe7/0x2a0
        create_qp.part.0+0x52b/0x6a0 [ib_core]
        ib_create_qp_user+0x97/0x150 [ib_core]
        ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0x92c/0x1250 [ib_uverbs]
        ib_uverbs_cmd_verbs+0x1c38/0x3150 [ib_uverbs]
        ib_uverbs_ioctl+0x169/0x260 [ib_uverbs]
        __x64_sys_ioctl+0x866/0x14d0
        do_syscall_64+0x3d/0x90
        entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: 514aee66 ("RDMA: Globally allocate and release QP memory")
      Link: https://lore.kernel.org/r/2dbb2e2cbb1efb188a500e5634be1d71956424ce.1636631035.git.leonro@nvidia.com
      
      
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
  10. Nov 03, 2021
  11. Nov 01, 2021
  12. Oct 29, 2021
  13. Oct 28, 2021