  1. Jan 22, 2024
    • drm/panthor: Add the MMU/VM logical block · 2bd5a89c
      Boris Brezillon authored
      
      MMU and VM management are related and placed in the same source file.
      
      Page table updates are delegated to the io-pgtable-arm driver that's in
      the iommu subsystem.
      
      The VM management logic is based on drm_gpuva_mgr, and assumes the
      VA space is mostly managed by the usermode driver, except for a reserved
      portion of this VA-space that's used for kernel objects (like the heap
      contexts/chunks).
      
      Both asynchronous and synchronous VM operations are supported, and
      internal helpers are exposed to allow other logical blocks to map their
      buffers in the GPU VA space.
      
      There's one VM_BIND queue per-VM (meaning the Vulkan driver can only
      expose one sparse-binding queue), and this bind queue is managed with
      a 1:1 drm_sched_entity:drm_gpu_scheduler, such that each VM gets its own
      independent execution queue, avoiding VM operation serialization at the
      device level (things are still serialized at the VM level).
      
      The rest is just implementation details that are hopefully well explained
      in the documentation.
      
      v4:
      - Add a helper to return the VM state
      - Check drmm_mutex_init() return code
      - Remove the VM from the AS reclaim list when panthor_vm_active() is
        called
      - Count the number of active VM users instead of considering there's
        at most one user (several scheduling groups can point to the same
        VM)
      - Pre-allocate a VMA object for unmap operations (unmaps can trigger
        a sm_step_remap() call)
      - Check vm->root_page_table instead of vm->pgtbl_ops to detect if
        the io-pgtable is trying to allocate the root page table
      - Don't memset() the va_node in panthor_vm_alloc_va(), make it a
        caller requirement
      - Fix the kernel doc in a few places
      - Drop the panthor_vm::base offset constraint and modify
        panthor_vm_put() to explicitly check for a NULL value
      - Fix unbalanced vm_bo refcount in panthor_gpuva_sm_step_remap()
      - Drop stale comments about the shared_bos list
      - Patch mmu_features::va_bits on 32-bit builds to reflect the
        io_pgtable limitation and let the UMD know about it
      
      v3:
      - Add acks for the MIT/GPL2 relicensing
      - Propagate MMU faults to the scheduler
      - Move pages pinning/unpinning out of the dma_signalling path
      - Fix 32-bit support
      - Rework the user/kernel VA range calculation
      - Make the auto-VA range explicit (auto-VA range doesn't cover the full
        kernel-VA range on the MCU VM)
      - Let callers of panthor_vm_alloc_va() allocate the drm_mm_node
        (embedded in panthor_kernel_bo now)
      - Adjust things to match the latest drm_gpuvm changes (extobj tracking,
        resv prep and more)
      - Drop the per-AS lock and use slots_lock (fixes a race on vm->as.id)
      - Set as.id to -1 when reusing an address space from the LRU list
      - Drop misleading comment about page faults
      - Remove check for irq being assigned in panthor_mmu_unplug()
      
      Co-developed-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
      Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
      Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
      Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
    • drm/panthor: Add the devfreq logical block · a54b067f
      Boris Brezillon authored
      
      Everything related to devfreq is placed in panthor_devfreq.c, and
      helpers that can be called by other logical blocks are exposed through
      panthor_devfreq.h.
      
      This implementation is loosely based on the panfrost implementation,
      the only difference being that we don't count device users, because
      the idle/active state will be managed by the scheduler logic.
      
      v4:
      - Add Clément's A-b for the relicensing
      
      v3:
      - Add acks for the MIT/GPL2 relicensing
      
      v2:
      - Added in v2
      
      Cc: Clément Péron <peron.clem@gmail.com> # MIT+GPL2 relicensing
      Reviewed-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
      Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
      Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
      Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
      Acked-by: Clément Péron <peron.clem@gmail.com> # MIT+GPL2 relicensing
    • drm/panthor: Add GEM logical block · f51a013e
      Boris Brezillon authored
      
      Anything relating to GEM object management is placed here. Nothing
      particularly interesting here, given the implementation is based on
      drm_gem_shmem_object, which is doing most of the work.
      
      v4:
      - Force kernel BOs to be GPU mapped
      - Make panthor_kernel_bo_destroy() robust against ERR/NULL BO pointers
        to simplify the call sites
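
      The "robust against ERR/NULL BO pointers" item can be pictured with a
      small userspace model. This is only a sketch: the ex_* names are
      hypothetical, ex_err_ptr()/ex_is_err_or_null() are stand-ins for the
      kernel's ERR_PTR()/IS_ERR_OR_NULL() helpers, and the real
      panthor_kernel_bo_destroy() does more work than shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR_OR_NULL() helpers. */
#define EX_MAX_ERRNO 4095

static void *ex_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static int ex_is_err_or_null(const void *ptr)
{
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-EX_MAX_ERRNO;
}

/* Hypothetical kernel-managed buffer object. */
struct ex_kernel_bo {
	void *mem;
};

static int ex_live_bos; /* number of BOs currently allocated */

static struct ex_kernel_bo *ex_kernel_bo_create(size_t size)
{
	struct ex_kernel_bo *bo = calloc(1, sizeof(*bo));

	if (!bo)
		return ex_err_ptr(-ENOMEM);

	bo->mem = calloc(1, size);
	if (!bo->mem) {
		free(bo);
		return ex_err_ptr(-ENOMEM);
	}

	ex_live_bos++;
	return bo;
}

/* Safe to call with NULL or an encoded error: both are silently ignored. */
static void ex_kernel_bo_destroy(struct ex_kernel_bo *bo)
{
	if (ex_is_err_or_null(bo))
		return;

	free(bo->mem);
	free(bo);
	ex_live_bos--;
	assert(ex_live_bos >= 0);
}
```

      With the check inside the destroy helper, an error path can call it
      unconditionally on every BO pointer it owns, whether creation
      succeeded, failed, or never happened.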
      
      v3:
      - Add acks for the MIT/GPL2 relicensing
      - Provide a panthor_kernel_bo abstraction for buffer objects managed by
        the kernel (will replace panthor_fw_mem and be used everywhere we were
        using panthor_gem_create_and_map() before)
      - Adjust things to match drm_gpuvm changes
      - Change return of panthor_gem_create_with_handle() to int
      
      Co-developed-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
      Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
      Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
      Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
    • drm/panthor: Add the GPU logical block · 8dc00a8f
      Boris Brezillon authored
      
      Handles everything that's not related to the FW, the MMU or the
      scheduler. This is the block dealing with the GPU property retrieval,
      the GPU block power on/off logic, and some global operations, like
      global cache flushing.
      
      v4:
      - Expose CORE_FEATURES through DEV_QUERY
      
      v3:
      - Add acks for the MIT/GPL2 relicensing
      - Use macros to extract GPU ID info
      - Make sure we clear pending_reqs bits when wait_event_timeout()
        times out but the corresponding bit is cleared in GPU_INT_RAWSTAT
        (can happen if the IRQ is masked or HW takes too long to call the IRQ
        handler)
      - GPU_MODEL now takes separate arch and product majors to be more
        readable.
      - Drop GPU_IRQ_MCU_STATUS_CHANGED from interrupt mask.
      - Handle GPU_IRQ_PROTM_FAULT correctly (don't output registers that are
        not updated for protected interrupts).
      - Minor code tidy ups
      
      Cc: Alexey Sheplyakov <asheplyakov@basealt.ru> # MIT+GPL2 relicensing
      Co-developed-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
      Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
      Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
      Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
    • drm/panthor: Add the device logical block · c455d5bd
      Boris Brezillon authored
      
      The panthor driver is designed in a modular way, where each logical
      block is dealing with a specific HW-block or software feature. In order
      for those blocks to communicate with each other, we need a central
      panthor_device collecting all the blocks, and exposing some common
      features, like interrupt handling, power management, reset, ...
      
      This is what this panthor_device logical block is about.
      
      v4:
      - Check drmm_mutex_init() return code
      - Fix panthor_device_reset_work() out path
      - Fix the race in the unplug logic
      - Fix typos
      - Unplug blocks when something fails in panthor_device_init()
      - Add Steve's R-b
      
      v3:
      - Add acks for the MIT+GPL2 relicensing
      - Fix 32-bit support
      - Shorten the sections protected by panthor_device::pm::mmio_lock to fix
        lock ordering issues.
      - Rename panthor_device::pm::lock into panthor_device::pm::mmio_lock to
        better reflect what this lock is protecting
      - Use dev_err_probe()
      - Make sure we call drm_dev_exit() when something fails half-way in
        panthor_device_reset_work()
      - Replace CSF_GPU_LATEST_FLUSH_ID_DEFAULT with a constant '1' and a
        comment to explain. Also remove setting the dummy flush ID on suspend.
      - Remove drm_WARN_ON() in panthor_exception_name()
      - Check pirq->suspended in panthor_xxx_irq_raw_handler()
      
      Co-developed-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
      Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
      Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
      Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
      Reviewed-by: Steven Price <steven.price@arm.com>
    • drm/panthor: Add GPU register definitions · 4a05b612
      Boris Brezillon authored
      
      Those are the registers directly accessible through the MMIO range.
      
      FW registers are exposed in panthor_fw.h.
      
      v4:
      - Add the CORE_FEATURES register (needed for GPU variants)
      - Add Steve's R-b
      
      v3:
      - Add macros to extract GPU ID info
      - Formatting changes
      - Remove AS_TRANSCFG_ADRMODE_LEGACY - it doesn't exist post-CSF
      - Remove CSF_GPU_LATEST_FLUSH_ID_DEFAULT
      - Add GPU_L2_FEATURES_LINE_SIZE for extracting the GPU cache line size
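
      As an illustration of the "macros to extract GPU ID info" item, here
      is a minimal sketch of shift-and-mask accessors for a 32-bit ID
      register. The EX_* macro names, the field positions and the sample
      register value are assumptions made for this example, not copied from
      the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative field layout (an assumption, see the lead-in). */
#define EX_GPU_ARCH_MAJOR(id)	(((id) >> 28) & 0xfu)
#define EX_GPU_ARCH_MINOR(id)	(((id) >> 24) & 0xfu)
#define EX_GPU_ARCH_REV(id)	(((id) >> 20) & 0xfu)
#define EX_GPU_PROD_MAJOR(id)	(((id) >> 16) & 0xfu)
#define EX_GPU_VER_MAJOR(id)	(((id) >> 12) & 0xfu)
#define EX_GPU_VER_MINOR(id)	(((id) >> 4) & 0xffu)
#define EX_GPU_VER_STATUS(id)	((id) & 0xfu)

/* Decode a made-up register value field by field. */
static void ex_decode_demo(void)
{
	uint32_t id = 0xa8670005u;

	assert(EX_GPU_ARCH_MAJOR(id) == 10);
	assert(EX_GPU_ARCH_MINOR(id) == 8);
	assert(EX_GPU_ARCH_REV(id) == 6);
	assert(EX_GPU_PROD_MAJOR(id) == 7);
	assert(EX_GPU_VER_MAJOR(id) == 0);
	assert(EX_GPU_VER_MINOR(id) == 0);
	assert(EX_GPU_VER_STATUS(id) == 5);
}
```

      Keeping one accessor per field avoids open-coded shifts at every
      call site that needs to compare architecture or product versions.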
      
      Co-developed-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
      Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
      Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
      Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
      Reviewed-by: Steven Price <steven.price@arm.com>
    • drm/panthor: Add uAPI · 5f20384a
      Boris Brezillon authored
      
      Panthor follows the lead of other recently submitted drivers with
      ioctls allowing us to support modern Vulkan features, like sparse memory
      binding:
      
      - Pretty standard GEM management ioctls (BO_CREATE and BO_MMAP_OFFSET),
        with the 'exclusive-VM' bit to speed-up BO reservation on job submission
      - VM management ioctls (VM_CREATE, VM_DESTROY and VM_BIND). The VM_BIND
        ioctl is loosely based on the Xe model, and can handle both
        asynchronous and synchronous requests
      - GPU execution context creation/destruction, tiler heap context creation
        and job submission. Those ioctls reflect how the hardware/scheduler
        works and are thus driver specific.
      
      We also have a way to expose IO regions, such that the usermode driver
      can directly access specific/well-isolated registers, like the
      LATEST_FLUSH register used to implement cache-flush reduction.
      
      This uAPI intentionally keeps usermode queues out of the scope, which
      explains why doorbell registers and command stream ring-buffers are not
      directly exposed to userspace.
      
      v4:
      - Add a VM_GET_STATE ioctl
      - Fix doc
      - Expose the CORE_FEATURES register so we can deal with variants in the
        UMD
      - Add Steve's R-b
      
      v3:
      - Add the concept of sync-only VM operation
      - Fix support for 32-bit userspace
      - Rework drm_panthor_vm_create to pass the user VA size instead of
        the kernel VA size (suggested by Robin Murphy)
      - Typo fixes
      - Explicitly cast enums with top bit set to avoid compiler warnings in
        -pedantic mode.
      - Drop property core_group_count as it can be easily calculated by the
        number of bits set in l2_present.
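
      The last v3 item says the dropped core_group_count property is just
      the number of bits set in l2_present. A usermode driver could
      recompute it along these lines; this is a sketch, and
      ex_core_group_count() is a hypothetical helper name, not part of the
      uAPI.

```c
#include <assert.h>
#include <stdint.h>

/* Count the bits set in the l2_present mask, one bit at a time. */
static unsigned int ex_core_group_count(uint64_t l2_present)
{
	unsigned int count = 0;

	while (l2_present) {
		l2_present &= l2_present - 1; /* clear the lowest set bit */
		count++;
	}

	return count;
}

/* Usage with made-up mask values. */
static void ex_core_group_demo(void)
{
	assert(ex_core_group_count(0x1) == 1);
	assert(ex_core_group_count(0x5) == 2);
}
```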
      
      Co-developed-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Steven Price <steven.price@arm.com>
      Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
      Reviewed-by: Steven Price <steven.price@arm.com>
    • iommu: Extend LPAE page table format to support custom allocators · 0206d6ca
      Boris Brezillon authored
      
      We need that in order to implement the VM_BIND ioctl in the GPU driver
      targeting new Mali GPUs.
      
      VM_BIND is about executing MMU map/unmap requests asynchronously,
      possibly after waiting for external dependencies encoded as dma_fences.
      We intend to use the drm_sched framework to automate the dependency
      tracking and VM job dequeuing logic, but this comes with its own set
      of constraints, one of them being the fact we are not allowed to
      allocate memory in drm_gpu_scheduler_ops::run_job(), to avoid this
      sort of deadlock:
      
      - VM_BIND map job needs to allocate a page table to map some memory
        to the VM. No memory available, so kswapd is kicked
      - GPU driver shrinker backend ends up waiting on the fence attached to
        the VM map job or any other job fence depending on this VM operation.
      
      With custom allocators, we will be able to pre-reserve enough pages to
      guarantee the map/unmap operations we queued will take place without
      going through the system allocator. But we can also optimize
      allocation/reservation by not free-ing pages immediately, so any
      upcoming page table allocation requests can be serviced by some free
      page table pool kept at the driver level.
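
      To pre-reserve "enough pages", a driver needs an upper bound on the
      page tables a map operation can allocate. The sketch below computes
      such a bound under stated assumptions: a 4 KiB granule with 9 bits
      resolved per level, an already-allocated root table, and no
      intermediate table present yet. The ex_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 4 KiB granule, 9 bits resolved per level. */
#define EX_L3_TABLE_SPAN_SHIFT	21 /* one level-3 table maps 2 MiB */
#define EX_L2_TABLE_SPAN_SHIFT	30 /* one level-2 table maps 1 GiB */
#define EX_L1_TABLE_SPAN_SHIFT	39 /* one level-1 table maps 512 GiB */

/* Number of 2^shift-aligned regions touched by [va, va + size). */
static uint64_t ex_regions_touched(uint64_t va, uint64_t size,
				   unsigned int shift)
{
	uint64_t first = va >> shift;
	uint64_t last = (va + size - 1) >> shift;

	return last - first + 1;
}

/*
 * Worst-case number of page-table pages needed to map [va, va + size):
 * one table per region touched at each non-root level.
 */
static uint64_t ex_worst_case_pt_count(uint64_t va, uint64_t size)
{
	/* Reject empty ranges and ranges that wrap the address space. */
	assert(size > 0 && va + (size - 1) >= va);

	return ex_regions_touched(va, size, EX_L1_TABLE_SPAN_SHIFT) +
	       ex_regions_touched(va, size, EX_L2_TABLE_SPAN_SHIFT) +
	       ex_regions_touched(va, size, EX_L3_TABLE_SPAN_SHIFT);
}
```

      A single 4 KiB mapping at address 0 needs at most three tables under
      these assumptions, while an 8 KiB mapping straddling a 2 MiB boundary
      needs four, because it touches two level-3 tables.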
      
      It might also be valuable for other aspects of GPU and similar
      use-cases, like fine-grained memory accounting and resource limiting.
      
      Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
      Reviewed-by: Steven Price <steven.price@arm.com>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Reviewed-by: Gaurav Kohli <quic_gkohli@quicinc.com>
      Tested-by: Gaurav Kohli <quic_gkohli@quicinc.com>
      ---
      v4:
      - Add Gaurav's R-b/T-b
      v3:
      - Don't pass __GFP_ZERO to the custom ->alloc() hook. Returning zeroed
        mem is part of the agreement between the io-pgtable and its user
      - Add Robin R-b
      v2:
      - Add Steven R-b
      - Expand on possible use-cases for custom allocators
    • iommu: Allow passing custom allocators to pgtable drivers · 15293624
      Boris Brezillon authored
      
      This will be useful for GPU drivers that want to keep page tables in a
      pool so they can:
      
      - keep freed page tables in a free pool and speed-up upcoming page
        table allocations
      - batch page table allocation instead of allocating one page at a time
      - pre-reserve pages for page tables needed for map/unmap operations,
        to ensure map/unmap operations don't try to allocate memory in paths
        where they're not allowed to block or fail
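
      The three uses listed above can be sketched with a small userspace
      pool. This is a model only: the ex_* names, the fixed 4 KiB table
      size and the (cookie, size) hook signatures are assumptions for the
      example, and malloc()/free() stand in for the system page allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define EX_PT_SIZE 4096 /* assumed page-table size */

struct ex_pt_pool {
	void **free_pts; /* stack of cached page tables */
	size_t count;    /* tables currently cached */
	size_t capacity; /* maximum tables the pool keeps */
};

static int ex_pt_pool_init(struct ex_pt_pool *pool, size_t capacity)
{
	pool->free_pts = calloc(capacity, sizeof(*pool->free_pts));
	pool->count = 0;
	pool->capacity = pool->free_pts ? capacity : 0;
	return pool->free_pts ? 0 : -1;
}

/* Batch-reserve up to `want` cached tables. May fail: call before queuing. */
static int ex_pt_pool_reserve(struct ex_pt_pool *pool, size_t want)
{
	if (want > pool->capacity)
		return -1;

	while (pool->count < want) {
		void *pt = malloc(EX_PT_SIZE);

		if (!pt)
			return -1;
		pool->free_pts[pool->count++] = pt;
	}
	return 0;
}

/* Alloc hook: served from the pool only, and always returns zeroed memory. */
static void *ex_pt_alloc(void *cookie, size_t size)
{
	struct ex_pt_pool *pool = cookie;
	void *pt;

	assert(size == EX_PT_SIZE);
	if (!pool->count)
		return NULL;

	pt = pool->free_pts[--pool->count];
	memset(pt, 0, size);
	return pt;
}

/* Free hook: keep the table for later reuse while the pool has room. */
static void ex_pt_free(void *cookie, void *pt, size_t size)
{
	struct ex_pt_pool *pool = cookie;

	assert(size == EX_PT_SIZE);
	if (pool->count < pool->capacity)
		pool->free_pts[pool->count++] = pt;
	else
		free(pt);
}

static void ex_pt_pool_fini(struct ex_pt_pool *pool)
{
	while (pool->count)
		free(pool->free_pts[--pool->count]);
	free(pool->free_pts);
	pool->free_pts = NULL;
	pool->capacity = 0;
}
```

      The reserve step is the only place that touches the system
      allocator, so it can run where blocking or failing is acceptable,
      while the alloc hook only pops from the pool.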
      
      It might also be valuable for other aspects of GPU and similar
      use-cases, like fine-grained memory accounting and resource limiting.
      
      We will extend the Arm LPAE format to support custom allocators in a
      separate commit.
      
      Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
      Reviewed-by: Steven Price <steven.price@arm.com>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Reviewed-by: Gaurav Kohli <quic_gkohli@quicinc.com>
      Tested-by: Gaurav Kohli <quic_gkohli@quicinc.com>
      ---
      v4:
      - Add Gaurav's R-b/T-b
      v3:
      - Add Robin R-b
      - Move caps definition around
      - Add extra constraints to the ->alloc() callback documentation
      v2:
      - Add Steven R-b
      - Expand on possible use-cases for custom allocators
      - Add a caps field to io_pgtable_init_fns so we can simplify the
        check_custom_allocator() logic (Robin Murphy)
  2. Jan 19, 2024
  3. Jan 17, 2024
  4. Jan 16, 2024
  5. Jan 15, 2024
  6. Jan 12, 2024
  7. Jan 11, 2024
    • drm/v3d: Show the memory-management stats on debugfs · 502756e2
      Maíra Canal authored
      
      Dump the contents of the DRM MM allocator of the V3D driver. This will
      help us to debug the VA ranges allocated.
      
      Signed-off-by: Maíra Canal <mcanal@igalia.com>
      Reviewed-by: Melissa Wen <mwen@igalia.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240105145851.193492-1-mcanal@igalia.com
    • drm/vc4: don't check if plane->state->fb == state->fb · 5ee0d47d
      Maíra Canal authored
      
      Currently, when using non-blocking commits, we can see the following
      kernel warning:
      
      [  110.908514] ------------[ cut here ]------------
      [  110.908529] refcount_t: underflow; use-after-free.
      [  110.908620] WARNING: CPU: 0 PID: 1866 at lib/refcount.c:87 refcount_dec_not_one+0xb8/0xc0
      [  110.908664] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device cmac algif_hash aes_arm64 aes_generic algif_skcipher af_alg bnep hid_logitech_hidpp vc4 brcmfmac hci_uart btbcm brcmutil bluetooth snd_soc_hdmi_codec cfg80211 cec drm_display_helper drm_dma_helper drm_kms_helper snd_soc_core snd_compress snd_pcm_dmaengine fb_sys_fops sysimgblt syscopyarea sysfillrect raspberrypi_hwmon ecdh_generic ecc rfkill libaes i2c_bcm2835 binfmt_misc joydev snd_bcm2835(C) bcm2835_codec(C) bcm2835_isp(C) v4l2_mem2mem videobuf2_dma_contig snd_pcm bcm2835_v4l2(C) raspberrypi_gpiomem bcm2835_mmal_vchiq(C) videobuf2_v4l2 snd_timer videobuf2_vmalloc videobuf2_memops videobuf2_common snd videodev vc_sm_cma(C) mc hid_logitech_dj uio_pdrv_genirq uio i2c_dev drm fuse dm_mod drm_panel_orientation_quirks backlight ip_tables x_tables ipv6
      [  110.909086] CPU: 0 PID: 1866 Comm: kodi.bin Tainted: G         C         6.1.66-v8+ #32
      [  110.909104] Hardware name: Raspberry Pi 3 Model B Rev 1.2 (DT)
      [  110.909114] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [  110.909132] pc : refcount_dec_not_one+0xb8/0xc0
      [  110.909152] lr : refcount_dec_not_one+0xb4/0xc0
      [  110.909170] sp : ffffffc00913b9c0
      [  110.909177] x29: ffffffc00913b9c0 x28: 000000556969bbb0 x27: 000000556990df60
      [  110.909205] x26: 0000000000000002 x25: 0000000000000004 x24: ffffff8004448480
      [  110.909230] x23: ffffff800570b500 x22: ffffff802e03a7bc x21: ffffffecfca68c78
      [  110.909257] x20: ffffff8002b42000 x19: ffffff802e03a600 x18: 0000000000000000
      [  110.909283] x17: 0000000000000011 x16: ffffffffffffffff x15: 0000000000000004
      [  110.909308] x14: 0000000000000fff x13: ffffffed577e47e0 x12: 0000000000000003
      [  110.909333] x11: 0000000000000000 x10: 0000000000000027 x9 : c912d0d083728c00
      [  110.909359] x8 : c912d0d083728c00 x7 : 65646e75203a745f x6 : 746e756f63666572
      [  110.909384] x5 : ffffffed579f62ee x4 : ffffffed579eb01e x3 : 0000000000000000
      [  110.909409] x2 : 0000000000000000 x1 : ffffffc00913b750 x0 : 0000000000000001
      [  110.909434] Call trace:
      [  110.909441]  refcount_dec_not_one+0xb8/0xc0
      [  110.909461]  vc4_bo_dec_usecnt+0x4c/0x1b0 [vc4]
      [  110.909903]  vc4_cleanup_fb+0x44/0x50 [vc4]
      [  110.910315]  drm_atomic_helper_cleanup_planes+0x88/0xa4 [drm_kms_helper]
      [  110.910669]  vc4_atomic_commit_tail+0x390/0x9dc [vc4]
      [  110.911079]  commit_tail+0xb0/0x164 [drm_kms_helper]
      [  110.911397]  drm_atomic_helper_commit+0x1d0/0x1f0 [drm_kms_helper]
      [  110.911716]  drm_atomic_commit+0xb0/0xdc [drm]
      [  110.912569]  drm_mode_atomic_ioctl+0x348/0x4b8 [drm]
      [  110.913330]  drm_ioctl_kernel+0xec/0x15c [drm]
      [  110.914091]  drm_ioctl+0x24c/0x3b0 [drm]
      [  110.914850]  __arm64_sys_ioctl+0x9c/0xd4
      [  110.914873]  invoke_syscall+0x4c/0x114
      [  110.914897]  el0_svc_common+0xd0/0x118
      [  110.914917]  do_el0_svc+0x38/0xd0
      [  110.914936]  el0_svc+0x30/0x8c
      [  110.914958]  el0t_64_sync_handler+0x84/0xf0
      [  110.914979]  el0t_64_sync+0x18c/0x190
      [  110.914996] ---[ end trace 0000000000000000 ]---
      
      This happens because, although `prepare_fb` and `cleanup_fb` are
      perfectly balanced, we cannot guarantee consistency in the check
      plane->state->fb == state->fb. This means that sometimes we can increase
      the refcount in `prepare_fb` and not decrease it in `cleanup_fb`. The
      opposite can also be true.
      
      In fact, the struct drm_plane .state shouldn't be accessed directly
      but instead, the `drm_atomic_get_new_plane_state()` helper function should
      be used. So, we could stick to this check, but using
      `drm_atomic_get_new_plane_state()`. But actually, this check is not really
      needed. We can increase and decrease the refcount symmetrically without
      problems.
      
      This is going to make the code simpler and more consistent.
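
      The imbalance can be reproduced with a simplified userspace model.
      This is a sketch, not the driver code: the ex_* types and functions
      are hypothetical, and the plane's current state is swapped by hand
      between the prepare and cleanup calls to mimic what a non-blocking
      commit can do.

```c
#include <assert.h>
#include <stddef.h>

struct ex_bo {
	int usecnt;
};

struct ex_plane_state {
	struct ex_bo *fb;
};

struct ex_plane {
	/* Current state; may be swapped between prepare and cleanup. */
	struct ex_plane_state *state;
};

/* Checked variant: skips the refcount update when the fb looks unchanged. */
static void ex_prepare_fb_checked(struct ex_plane *plane,
				  struct ex_plane_state *new_state)
{
	if (!new_state->fb || plane->state->fb == new_state->fb)
		return;
	new_state->fb->usecnt++;
}

static void ex_cleanup_fb_checked(struct ex_plane *plane,
				  struct ex_plane_state *old_state)
{
	if (!old_state->fb || plane->state->fb == old_state->fb)
		return;
	old_state->fb->usecnt--;
}

/* Symmetric variant: every prepare is matched by exactly one cleanup. */
static void ex_prepare_fb(struct ex_plane_state *new_state)
{
	if (new_state->fb)
		new_state->fb->usecnt++;
}

static void ex_cleanup_fb(struct ex_plane_state *old_state)
{
	if (old_state->fb)
		old_state->fb->usecnt--;
}

/* Returns the final use count of the fb after one prepare/cleanup pair. */
static int ex_run(int symmetric)
{
	struct ex_bo a = { 0 }, b = { 0 };
	struct ex_plane_state cur = { &a }, next = { &a }, other = { &b };
	struct ex_plane plane = { &cur };

	if (symmetric)
		ex_prepare_fb(&next);
	else
		ex_prepare_fb_checked(&plane, &next);

	plane.state = &other; /* state swapped before cleanup runs */

	if (symmetric)
		ex_cleanup_fb(&next);
	else
		ex_cleanup_fb_checked(&plane, &next);

	return a.usecnt;
}
```

      In this model ex_run(0) ends with a use count of -1, because the
      check skipped the increment but not the decrement, while ex_run(1)
      ends balanced at 0.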
      
      Signed-off-by: Maíra Canal <mcanal@igalia.com>
      Acked-by: Maxime Ripard <mripard@kernel.org>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240105175908.242000-1-mcanal@igalia.com
    • drm/edid: Clean up errors in drm_edid.c · cbe7cea7
      chenxuebing authored and Jani Nikula committed
      
      Fix the following errors reported by checkpatch:
      
      ERROR: do not use assignment in if condition
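
      For readers unfamiliar with this checkpatch error, the sketch below
      shows the flagged pattern in a comment and the rewritten form in
      code. ex_parse_block() and ex_load() are hypothetical functions used
      only for illustration; they are not taken from drm_edid.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper standing in for any call whose result is tested. */
static int ex_parse_block(const unsigned char *buf, size_t len)
{
	return (buf && len >= 128) ? 0 : -1;
}

/*
 * Pattern checkpatch complains about:
 *
 *	if ((ret = ex_parse_block(buf, len)) < 0)
 *		return ret;
 *
 * Rewritten below with the assignment on its own line.
 */
static int ex_load(const unsigned char *buf, size_t len)
{
	int ret;

	ret = ex_parse_block(buf, len);
	if (ret < 0)
		return ret;

	return 0;
}

/* Usage with a zero-filled 128-byte buffer and with no buffer at all. */
static void ex_load_demo(void)
{
	unsigned char block[128] = { 0 };

	assert(ex_load(block, sizeof(block)) == 0);
	assert(ex_load(NULL, 0) == -1);
}
```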
      
      Signed-off-by: chenxuebing <chenxb_99091@126.com>
      Reviewed-by: Jani Nikula <jani.nikula@intel.com>
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240111063921.8701-1-chenxb_99091@126.com