v3d: Add a perfetto datasource
What does this MR do and why?
This MR adds support for a v3d perfetto datasource. There are some limitations that we need to take care of.
There can be only one perfmon active at a time.
There is a new state that a perfmon can enter: global perfmon. If a global perfmon exists, every job submit will automatically use the global perfmon. This ensures that no application-configured perfmon will be used (e.g. via the AMD_performance_monitor extension or GALLIUM_HUD).
This comes with some new rules/behaviours:
- There can only be ONE global perfmon.
- If applications are using the perfmon infra, pps_producer will overrule it and the kernel will return -EAGAIN for jobs with a perfmon.
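The rules above can be illustrated with a small toy model. This is only a sketch of the described behaviour, not the kernel implementation: the class and method names are hypothetical, and how the driver reports an attempt to create a second global perfmon is not stated in this MR, so the exception used here is an assumption.

```python
import errno


class PerfmonModel:
    """Toy model of the global perfmon rules; names are hypothetical."""

    def __init__(self):
        # At most one global perfmon can exist at a time.
        self.global_perfmon = None

    def set_global(self, perfmon_id):
        # Rule: there can only be ONE global perfmon.
        # Assumption: the real error reporting is not described in the MR,
        # so this sketch simply raises.
        if self.global_perfmon is not None:
            raise RuntimeError("a global perfmon already exists")
        self.global_perfmon = perfmon_id

    def clear_global(self):
        self.global_perfmon = None

    def submit_job(self, job_perfmon=None):
        """Return (status, perfmon used for the job)."""
        if self.global_perfmon is not None:
            if job_perfmon is not None:
                # Rule: jobs carrying their own perfmon get -EAGAIN
                # while a global perfmon is active.
                return -errno.EAGAIN, None
            # Every other job automatically uses the global perfmon.
            return 0, self.global_perfmon
        # No global perfmon: the job's own perfmon (if any) is used.
        return 0, job_perfmon
```

In this model, an application that gets -EAGAIN can retry its submit once the global perfmon has been cleared.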
A perfmon can use at most 32 perf signals.
We need to define the performance counters of interest when we start pps_producer:
V3D_DS_COUNTER=cycle-count,CLE-bin-thread-active-cycles,CLE-render-thread-active-cycles,QPU-total-uniform-cache-hit ./src/tool/pps/pps-producer
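As a rough illustration of how such a counter list interacts with the 32-signal limit, the following sketch splits a comma-separated V3D_DS_COUNTER value and rejects lists that are too long. The function names are hypothetical and the actual parsing in the datasource may differ; the sketch also does not check that the counter names exist on the hardware.

```python
import os

MAX_PERF_SIGNALS = 32  # per-perfmon limit stated in this MR


def parse_counter_list(value):
    """Split a comma-separated counter list and enforce the limit."""
    # Ignore surrounding whitespace and empty entries (e.g. trailing comma).
    names = [name.strip() for name in value.split(",") if name.strip()]
    if not names:
        raise ValueError("no counters given")
    if len(names) > MAX_PERF_SIGNALS:
        raise ValueError(
            f"{len(names)} counters requested, at most "
            f"{MAX_PERF_SIGNALS} fit in one perfmon"
        )
    return names


def counters_from_env(env=os.environ):
    """Read V3D_DS_COUNTER; return an empty list if it is unset or empty."""
    value = env.get("V3D_DS_COUNTER")
    return parse_counter_list(value) if value else []
```

With the command line above, `counters_from_env()` would yield the four counter names in the order given.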
This MR stays marked as draft until the needed kernel patch has landed in next: https://lore.kernel.org/all/20241202140615.74802-1-christian.gmeiner@gmail.com/