  1. Dec 05, 2024
  2. Nov 26, 2024
  3. Nov 25, 2024
  4. Nov 15, 2024
    • dma-fence: Use kernel's sort for merging fences · fe52c649
      Tvrtko Ursulin authored and Christian König committed
      One alternative to the fix Christian proposed in
      https://lore.kernel.org/dri-devel/20241024124159.4519-3-christian.koenig@amd.com/
      is to replace the rather complex open coded sorting loops with the kernel
      standard sort followed by a context squashing pass.
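As a rough illustration of "sort followed by a context squashing pass", here is a minimal userspace sketch. It is not the kernel code: struct toy_fence, toy_fence_cmp() and toy_fence_merge() are hypothetical names, qsort() stands in for the kernel's sort(), and seqno wraparound and signalled-status checks are ignored. It assumes the ordering mentioned under v3, where seqnos sort latest-first so the squashing pass only has to keep the first fence of each context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fence: only the fields the merge looks at. */
struct toy_fence {
	uint64_t context;	/* timeline the fence belongs to */
	uint64_t seqno;		/* position on that timeline; higher is later */
};

/* Order by context ascending, then by seqno descending (latest first). */
static int toy_fence_cmp(const void *a, const void *b)
{
	const struct toy_fence *fa = a, *fb = b;

	if (fa->context != fb->context)
		return fa->context < fb->context ? -1 : 1;
	if (fa->seqno != fb->seqno)
		return fa->seqno > fb->seqno ? -1 : 1;
	return 0;
}

/*
 * Sort the array, then keep only the first (latest) fence of each context.
 * Returns the number of fences left at the front of the array.
 */
static size_t toy_fence_merge(struct toy_fence *f, size_t count)
{
	size_t i, j;

	/* Zero or one fences need no sorting or squashing. */
	if (count < 2)
		return count;

	qsort(f, count, sizeof(*f), toy_fence_cmp);

	/* f[j] is the last kept fence; copy down the first of each new context. */
	for (i = 1, j = 0; i < count; i++) {
		if (f[i].context != f[j].context)
			f[++j] = f[i];
	}
	assert(j < count);

	return j + 1;
}
```

With two input fences, which is the dominant case in the measurements in this message, the sort reduces to a single comparison and at most one swap.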
      
      The proposed advantage is readability, but one concern Christian
      raised was that there could be many fences, that they are typically mostly
      sorted, and so the kernel's heap sort would perform much worse than the
      proposed algorithm.
      
      I ran some games and vkcube to see what the typical number of input
      fences is. Tested scenarios:
      
      1) Hogwarts Legacy under Gamescope
      
      450 calls per second to __dma_fence_unwrap_merge.
      
      Percentage of calls per fence-count bucket, before and after checking
      for signalled status, sorting and flattening:
      
         N       Before      After
         0       0.91%
         1      69.40%
        2-3     28.72%       9.4%  (90.6% resolved to one fence)
        4-5      0.93%
        6-9      0.03%
        10+
      
      2) Cyberpunk 2077 under Gamescope
      
      1050 calls per second, amounting to 0.01% CPU time according to perf top.
      
         N       Before      After
         0       1.13%
         1      52.30%
        2-3     40.34%       55.57%
        4-5      1.46%        0.50%
        6-9      2.44%
        10+      2.34%
      
      3) vkcube under Plasma
      
      90 calls per second.
      
         N       Before      After
         0
         1
        2-3      100%         0%   (i.e. all resolved to a single fence)
        4-5
        6-9
        10+
      
      In the case of vkcube all invocations in the 2-3 bucket were actually
      just two input fences.
      
      From these numbers it looks like the heap sort should not be a
      disadvantage, given that the dominant case is <= 2 input fences, which
      heap sort solves with just one compare and swap. (And for the case of one
      input fence we have a fast path in the previous patch.)
      
      A complementary possibility is to implement a different sorting algorithm
      under the same API as the kernel's sort() and so keep the simplicity,
      potentially moving the new sort under lib/ if it proves more widely
      useful.
      
      v2:
       * Hold on to fence references and reduce commentary. (Christian)
       * Record and use latest signaled timestamp in the 2nd loop too.
       * Consolidate zero or one fences fast paths.
      
      v3:
       * Reverse the seqno sort order for a simpler squashing pass. (Christian)
      
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
      Fixes: 245a4a7b ("dma-buf: generalize dma_fence unwrap & merging v3")
      Closes: drm/amd#3617
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: Gustavo Padovan <gustavo@padovan.org>
      Cc: Friedrich Vock <friedrich.vock@gmx.de>
      Cc: linux-media@vger.kernel.org
      Cc: dri-devel@lists.freedesktop.org
      Cc: linaro-mm-sig@lists.linaro.org
      Cc: <stable@vger.kernel.org> # v6.0+
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Christian König <christian.koenig@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20241115102153.1980-3-tursulin@igalia.com
    • dma-fence: Fix reference leak on fence merge failure path · 949291c5
      Tvrtko Ursulin authored and Christian König committed
      
      Release all fence references if the output dma-fence-array could not be
      allocated.
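
The shape of such a failure path can be sketched in plain C. This is an illustration only, not the dma-fence code: struct toy_ref, toy_get(), toy_put() and toy_merge() are hypothetical, and the fail_alloc flag exists purely to simulate the allocation failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a fence. */
struct toy_ref {
	int refcount;
};

static void toy_get(struct toy_ref *r)
{
	r->refcount++;
}

static void toy_put(struct toy_ref *r)
{
	assert(r->refcount > 0);
	r->refcount--;
}

/*
 * Take a reference on each input, then allocate the output container.
 * If the allocation fails, drop every reference taken above before
 * returning, so the caller sees the refcounts unchanged.
 * 'fail_alloc' simulates the allocation failure for illustration.
 */
static struct toy_ref **toy_merge(struct toy_ref **in, size_t count,
				  int fail_alloc)
{
	struct toy_ref **out;
	size_t i;

	for (i = 0; i < count; i++)
		toy_get(in[i]);

	out = fail_alloc ? NULL : malloc(count * sizeof(*out));
	if (!out) {
		/* Failure path: release everything we are holding. */
		for (i = 0; i < count; i++)
			toy_put(in[i]);
		return NULL;
	}

	/* Success: the references are now owned by the output array. */
	for (i = 0; i < count; i++)
		out[i] = in[i];
	return out;
}
```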
      
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
      Fixes: 245a4a7b ("dma-buf: generalize dma_fence unwrap & merging v3")
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: Gustavo Padovan <gustavo@padovan.org>
      Cc: Friedrich Vock <friedrich.vock@gmx.de>
      Cc: linux-media@vger.kernel.org
      Cc: dri-devel@lists.freedesktop.org
      Cc: linaro-mm-sig@lists.linaro.org
      Cc: <stable@vger.kernel.org> # v6.0+
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Christian König <christian.koenig@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20241115102153.1980-2-tursulin@igalia.com
  5. Nov 14, 2024
  6. Nov 13, 2024
    • drm/panthor: Fix handling of partial GPU mapping of BOs · 3387e043
      Akash Goel authored and Liviu Dudau committed
      
      Fix a bug in the handling of partial mappings of buffer objects to the
      GPU, which caused kernel warnings.

      Panthor did not correctly handle the case where a partial mapping
      spanned multiple scatterlists and the mapping offset did not point
      to the first page of the starting scatterlist: the offset variable was
      not cleared after reaching the starting scatterlist.
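
The offset handling described above can be sketched in userspace C. This is not the panthor code: struct toy_seg stands in for a scatterlist entry, and struct toy_chunk and toy_map_range() are hypothetical. The point is the final `offset = 0`: the offset applies only to the starting segment, and leaving it set would shift every later segment as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct toy_seg {
	uint64_t addr;
	uint64_t len;
};

/* One contiguous chunk that would be handed to the MMU mapping code. */
struct toy_chunk {
	uint64_t addr;
	uint64_t len;
};

/*
 * Walk the segments and record the chunks covering 'size' bytes starting
 * 'offset' bytes into the buffer. Returns the number of chunks written.
 */
static size_t toy_map_range(const struct toy_seg *segs, size_t nsegs,
			    uint64_t offset, uint64_t size,
			    struct toy_chunk *out, size_t max_out)
{
	size_t i, n = 0;

	for (i = 0; i < nsegs && size; i++) {
		uint64_t addr = segs[i].addr;
		uint64_t len = segs[i].len;

		/* Skip segments that lie entirely before the offset. */
		if (len <= offset) {
			offset -= len;
			continue;
		}

		/* Starting segment: begin part-way in; later ones: offset is 0. */
		addr += offset;
		len -= offset;
		if (len > size)
			len = size;

		assert(n < max_out);
		out[n].addr = addr;
		out[n].len = len;
		n++;

		size -= len;
		/* The offset only applies to the starting segment. */
		offset = 0;
	}

	return n;
}
```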
      
      The following warning messages were seen:
      WARNING: CPU: 1 PID: 650 at drivers/iommu/io-pgtable-arm.c:659 __arm_lpae_unmap+0x254/0x5a0
      <snip>
      pc : __arm_lpae_unmap+0x254/0x5a0
      lr : __arm_lpae_unmap+0x2cc/0x5a0
      <snip>
      Call trace:
       __arm_lpae_unmap+0x254/0x5a0
       __arm_lpae_unmap+0x108/0x5a0
       __arm_lpae_unmap+0x108/0x5a0
       __arm_lpae_unmap+0x108/0x5a0
       arm_lpae_unmap_pages+0x80/0xa0
       panthor_vm_unmap_pages+0xac/0x1c8 [panthor]
       panthor_gpuva_sm_step_unmap+0x4c/0xc8 [panthor]
       op_unmap_cb.isra.23.constprop.30+0x54/0x80
       __drm_gpuvm_sm_unmap+0x184/0x1c8
       drm_gpuvm_sm_unmap+0x40/0x60
       panthor_vm_exec_op+0xa8/0x120 [panthor]
       panthor_vm_bind_exec_sync_op+0xc4/0xe8 [panthor]
       panthor_ioctl_vm_bind+0x10c/0x170 [panthor]
       drm_ioctl_kernel+0xbc/0x138
       drm_ioctl+0x210/0x4b0
       __arm64_sys_ioctl+0xb0/0xf8
       invoke_syscall+0x4c/0x110
       el0_svc_common.constprop.1+0x98/0xf8
       do_el0_svc+0x24/0x38
       el0_svc+0x34/0xc8
       el0t_64_sync_handler+0xa0/0xc8
       el0t_64_sync+0x174/0x178
      <snip>
      panthor : [drm] drm_WARN_ON(unmapped_sz != pgsize * pgcount)
      WARNING: CPU: 1 PID: 650 at drivers/gpu/drm/panthor/panthor_mmu.c:922 panthor_vm_unmap_pages+0x124/0x1c8 [panthor]
      <snip>
      pc : panthor_vm_unmap_pages+0x124/0x1c8 [panthor]
      lr : panthor_vm_unmap_pages+0x124/0x1c8 [panthor]
      <snip>
      panthor : [drm] *ERROR* failed to unmap range ffffa388f000-ffffa3890000 (requested range ffffa388c000-ffffa3890000)
      
      Fixes: 647810ec ("drm/panthor: Add the MMU/VM logical block")
      Signed-off-by: Akash Goel <akash.goel@arm.com>
      Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
      Reviewed-by: Steven Price <steven.price@arm.com>
      Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20241111134720.780403-1-akash.goel@arm.com
      Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
  7. Nov 11, 2024
  8. Nov 10, 2024
    • Linux 6.12-rc7 · 2d5404ca
      Linus Torvalds authored
    • Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 541f3d87
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "A handful of Qualcomm clk driver fixes:
      
         - Correct flags for X Elite USB MP GDSC and pcie pipediv2 clocks
      
         - Fix alpha PLL post_div mask for the cases where width is not
           specified
      
         - Avoid hangs in the SM8350 video driver (venus) by setting HW_CTRL
           trigger feature on the video clocks"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: qcom: gcc-x1e80100: Fix USB MP SS1 PHY GDSC pwrsts flags
        clk: qcom: gcc-x1e80100: Fix halt_check for pipediv2 clocks
        clk: qcom: clk-alpha-pll: Fix pll post div mask when width is not set
        clk: qcom: videocc-sm8350: use HW_CTRL_TRIGGER for vcodec GDSCs
    • Merge tag 'i2c-for-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · d7e67a9e
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "i2c-host fixes for v6.12-rc7 (from Andi):
      
         - Fix designware incorrect behavior when concluding a transmission
      
         - Fix Mule multiplexer error value evaluation"
      
      * tag 'i2c-for-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: designware: do not hold SCL low when I2C_DYNAMIC_TAR_UPDATE is not set
        i2c: muxes: Fix return value check in mule_i2c_mux_probe()
    • filemap: Fix bounds checking in filemap_read() · ace149e0
      Trond Myklebust authored
      
      If the caller supplies an iocb->ki_pos value that is close to the
      filesystem upper limit, and an iterator with a count that causes us to
      overflow that limit, then filemap_read() enters an infinite loop.
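
To illustrate the kind of bounds check involved (a sketch, not the patch itself), the helper below clamps a requested read length so that position plus length never goes past the limit. toy_clamp_read() and its parameters are hypothetical names; pos plays the role of iocb->ki_pos and max_bytes the filesystem upper limit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return how many of 'count' bytes may be read starting at 'pos' without
 * going past 'max_bytes'. The subtraction is done only after checking
 * pos < max_bytes, so it cannot go negative or overflow.
 */
static uint64_t toy_clamp_read(int64_t pos, uint64_t count, int64_t max_bytes)
{
	uint64_t room;

	assert(pos >= 0 && max_bytes >= 0);

	/* Already at or past the limit: nothing can be read. */
	if (pos >= max_bytes)
		return 0;

	room = (uint64_t)(max_bytes - pos);
	return count < room ? count : room;
}
```

Clamping against the remaining room, rather than against the limit alone, is what keeps a position close to the limit from being combined with a count that would carry the read past it.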
      
      This behaviour was discovered when testing xfstests generic/525 with the
      "localio" optimisation for loopback NFS mounts.
      
      Reported-by: Mike Snitzer <snitzer@kernel.org>
      Fixes: c2a9737f ("vfs,mm: fix a dead loop in truncate_inode_pages_range()")
      Tested-by: Mike Snitzer <snitzer@kernel.org>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'irq_urgent_for_v6.12_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a9cda7c0
      Linus Torvalds authored
      Pull irq fix from Borislav Petkov:
      
       - Make sure GICv3 controller interrupt activation doesn't race with a
         concurrent deactivation due to propagation delays of the register
         write
      
      * tag 'irq_urgent_for_v6.12_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/gic-v3: Force propagation of the active state with a read-back
    • Merge tag 'mm-hotfixes-stable-2024-11-09-22-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 28e43197
      Linus Torvalds authored
      
      Pull misc fixes from Andrew Morton:
       "20 hotfixes, 14 of which are cc:stable.
      
        Three affect DAMON. Lorenzo's five-patch series to address the
        mmap_region error handling is here also.
      
        Apart from that, various singletons"
      
      * tag 'mm-hotfixes-stable-2024-11-09-22-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        mailmap: add entry for Thorsten Blum
        ocfs2: remove entry once instead of null-ptr-dereference in ocfs2_xa_remove()
        signal: restore the override_rlimit logic
        fs/proc: fix compile warning about variable 'vmcore_mmap_ops'
        ucounts: fix counter leak in inc_rlimit_get_ucounts()
        selftests: hugetlb_dio: check for initial conditions to skip in the start
        mm: fix docs for the kernel parameter ``thp_anon=``
        mm/damon/core: avoid overflow in damon_feed_loop_next_input()
        mm/damon/core: handle zero schemes apply interval
        mm/damon/core: handle zero {aggregation,ops_update} intervals
        mm/mlock: set the correct prev on failure
        objpool: fix to make percpu slot allocation more robust
        mm/page_alloc: keep track of free highatomic
        mm: resolve faulty mmap_region() error path behaviour
        mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
        mm: refactor map_deny_write_exec()
        mm: unconditionally close VMAs on error
        mm: avoid unsafe VMA hook invocation when error arises on mmap hook
        mm/thp: fix deferred split unqueue naming and locking
        mm/thp: fix deferred split queue not partially_mapped
    • Merge tag 'usb-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · a558cc34
      Linus Torvalds authored
      Pull USB/Thunderbolt fixes from Greg KH:
       "Here are some small remaining USB and Thunderbolt fixes and device ids
        for 6.12-rc7. Included in here are:
      
         - new USB serial driver device ids
      
         - thunderbolt driver fixes for reported problems
      
         - typec bugfixes
      
         - dwc3 driver fix
      
         - musb driver fix
      
        All of these have been in linux-next this past week with no reported
        issues"
      
      * tag 'usb-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        USB: serial: qcserial: add support for Sierra Wireless EM86xx
        thunderbolt: Fix connection issue with Pluggable UD-4VPD dock
        usb: typec: fix potential out of bounds in ucsi_ccg_update_set_new_cam_cmd()
        usb: dwc3: fix fault at system suspend if device was already runtime suspended
        usb: typec: qcom-pmic: init value of hdr_len/txbuf_len earlier
        usb: musb: sunxi: Fix accessing an released usb phy
        USB: serial: io_edgeport: fix use after free in debug printk
        USB: serial: option: add Quectel RG650V
        USB: serial: option: add Fibocom FG132 0x0112 composition
        thunderbolt: Add only on-board retimers when !CONFIG_USB4_DEBUGFS_MARGINING
    • Merge tag 'staging-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 023d4fc0
      Linus Torvalds authored
      Pull staging driver fixes from Greg KH:
       "Here are two small memory leak fixes for the vchiq_arm staging driver
        that have been sitting in my tree for weeks and should get merged for
        6.12-rc7 so that people don't keep tripping over them.
      
        They both have been in linux-next for a while with no reported
        problems"
      
      * tag 'staging-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: vchiq_arm: Use devm_kzalloc() for drv_mgmt allocation
        staging: vchiq_arm: Use devm_kzalloc() for vchiq_arm_state allocation
  9. Nov 09, 2024
  10. Nov 08, 2024