
drm/pancsf: Add a new driver for Mali CSF-based GPUs · a194f85c
Boris Brezillon authored Jan 03, 2023
Mali v10 (second Valhall iteration) and later GPUs replaced the Job
Manager block with a command stream based interface called CSF (for
Command Stream Frontend). This interface not only turns the job chain
based submission model into a command stream based one, but also
introduces FW-assisted scheduling of command stream queues. This is a
fundamental shift in both how userspace is supposed to submit jobs and
how the driver is architected. We initially tried to retrofit the CSF
model into panfrost, but this ended up introducing unneeded complexity
to the existing driver, which we all know is a potential source of
regressions.

So here comes a brand new driver for CSF-based hardware. This is a
preliminary version and some important features are missing (like
devfreq, PM support and a memory shrinker implementation, to name a
few). The goal of this RFC is to gather some preliminary feedback on
both the uAPI and some basic building blocks, like the MMU/VM code or
the tiler heap allocation logic.

It's also here to give concrete code to refer to for the discussion
around scheduling and VM_BIND support that started on the Xe/nouveau
threads[1][2]. Right now, I'm still using a custom timesharing-based
scheduler, but I plan to give Daniel's suggestion a try (having one
drm_gpu_scheduler per drm_sched_entity, and replacing the tick-based
scheduler with a group slot manager that has an LRU-based group eviction
mechanism). I also have a bunch of things I need to figure out regarding
the VM-based memory management code. The current design assumes explicit
syncs everywhere, but we don't use resv objects yet. I see other modern
drivers are adding BOOKKEEP fences to the VM resv object and using this
VM resv to synchronize with kernel operations on the VM, but we
currently don't do any of that. As Daniel pointed out, it's likely to
become an issue when we throw the memory shrinker into the mix. And of
course, the plan is to transition to the drm_gpuva_manager infrastructure
being discussed here [2] before merging the driver. Kind of related to
this shrinker topic, I'm wondering if it wouldn't make sense to use
the TTM infra for our buffer management (AFAIU, we'd get LRU-based BO
eviction for free, without needing to expose an MADVISE(DONT_NEED) kind
of API), but I'm a bit worried about the extra complexity this would pull
in.

Note that DT bindings are currently undocumented. For those who really
care, they're based on the panfrost bindings, so I don't expect any
pain points on that front. I'll provide a proper doc once all other
aspects have been sorted out.

Regards,

Boris

[1] https://lore.kernel.org/dri-devel/20221222222127.34560-1-matthew.brost@intel.com/
[2] https://lore.kernel.org/lkml/Y8jOCE%2FPyNZ2Z6aX@DUT025-TGLU.fm.intel.com/

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Faith Ekstrand <faith.ekstrand@collabora.com>