Boris Brezillon authored
Mali v10 (second Valhall iteration) and later GPUs replaced the Job
Manager block with a command stream based interface called CSF (for
Command Stream Frontend). This interface not only turns the job chain
based submission model into a command stream based one, it also
introduces FW-assisted scheduling of command stream queues. This is a
fundamental shift both in how userspace is supposed to submit jobs and
in how the driver is architected.

We initially tried to retrofit the CSF model into panfrost, but this
ended up introducing unneeded complexity to the existing driver, which
we all know is a potential source of regressions. So here comes a brand
new driver for CSF-based hardware.

This is a preliminary version and some important features are missing
(like devfreq, PM support and a memory shrinker implementation, to name
a few). The goal of this RFC is to gather some preliminary feedback on
both the uAPI and some basic building blocks, like the MMU/VM code and
the tiler heap allocation logic. It's also here to give concrete code
to refer to for the discussion around scheduling and VM_BIND support
that started on the Xe/nouveau threads [1][2].

Right now, I'm still using a custom timesharing-based scheduler, but I
plan to give Daniel's suggestion a try (having one drm_gpu_scheduler
per drm_sched_entity, and replacing the tick-based scheduler with some
group slot manager with an LRU-based group eviction mechanism).

I also have a bunch of things I need to figure out regarding the
VM-based memory management code. The current design assumes explicit
syncs everywhere, but we don't use resv objects yet. I see other modern
drivers are adding BOOKKEEP fences to the VM resv object and using this
VM resv to synchronize with kernel operations on the VM, but we
currently don't do any of that. As Daniel pointed out, it's likely to
become an issue when we throw the memory shrinker into the mix. And of
course, the plan is to transition to the drm_gpuva_manager
infrastructure being discussed here [2] before merging the driver.

Kind of related to this shrinker topic, I'm wondering if it wouldn't
make sense to use the TTM infra for our buffer management (AFAIU, we'd
get LRU-based BO eviction for free, without needing to expose an
MADVISE(DONT_NEED) kind of API), but I'm a bit worried about the extra
complexity this would pull in.

Note that DT bindings are currently undocumented. For those who really
care, they're based on the panfrost bindings, so I don't expect any
pain points on that front. I'll provide a proper doc once all other
aspects have been sorted out.

Regards,

Boris

[1] https://lore.kernel.org/dri-devel/20221222222127.34560-1-matthew.brost@intel.com/
[2] https://lore.kernel.org/lkml/Y8jOCE%2FPyNZ2Z6aX@DUT025-TGLU.fm.intel.com/

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Faith Ekstrand <faith.ekstrand@collabora.com>
a194f85c