RADV: Extreme overhead in vkQueueSubmit
A change between two adjacent vkd3d-proton commits triggers an extreme slowdown in vkQueueSubmit. This is on Mesa master with an RX 6800, on both kernel 5.10 and 5.11:
db1b425d2aa7729503ad13db7cafdefa3999be68 is slow
8437eea2c0551270961a3abfcd9f004c502ba430 is fine
We're observing that vkQueueSubmit takes 20x as long on the slow commit as on the fast one. In the slow case it takes ~2ms to complete vkQueueSubmit, which makes the submission thread the main CPU bottleneck. Based on ad-hoc instrumentation, the culprit is the amdgpu CS submission.
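For anyone wanting to reproduce the measurement, ad-hoc timing of this kind can be sketched roughly as below. This is a minimal sketch and not the instrumentation actually used for the numbers above: `now_ns`, `call_stats`, and `TIMED` are hypothetical helper names, and where the timer is placed (around vkQueueSubmit in the application, or deeper in the driver's submit path) is an assumption left to the reader.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Monotonic timestamp in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Running statistics for one instrumented call site (hypothetical helper). */
struct call_stats {
    uint64_t count;
    uint64_t total_ns;
    uint64_t max_ns;
};

/* Record one measured duration. */
static void call_stats_add(struct call_stats *s, uint64_t elapsed_ns)
{
    assert(s != NULL);
    s->count++;
    s->total_ns += elapsed_ns;
    if (elapsed_ns > s->max_ns)
        s->max_ns = elapsed_ns;
}

/* Mean duration in milliseconds, or 0 if nothing was recorded. */
static double call_stats_mean_ms(const struct call_stats *s)
{
    return s->count ? (double)s->total_ns / (double)s->count / 1e6 : 0.0;
}

/* Print a one-line summary to stderr. */
static void call_stats_report(const char *name, const struct call_stats *s)
{
    fprintf(stderr, "%s: %llu calls, mean %.3f ms, max %.3f ms\n", name,
            (unsigned long long)s->count, call_stats_mean_ms(s),
            (double)s->max_ns / 1e6);
}

/* Time a single statement and record its duration in `stats`. */
#define TIMED(stats, stmt)                                   \
    do {                                                     \
        uint64_t timed_begin_ = now_ns();                    \
        stmt;                                                \
        call_stats_add((stats), now_ns() - timed_begin_);    \
    } while (0)

/* Example call site (assumes queue, submit_info and fence already exist):
 *
 *   static struct call_stats submit_stats;
 *   VkResult result;
 *   TIMED(&submit_stats, result = vkQueueSubmit(queue, 1, &submit_info, fence));
 *   call_stats_report("vkQueueSubmit", &submit_stats);
 */
```

Running the same build of the application against both commits and comparing the reported means should show whether the regression reproduces on a given setup.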
The odd part here is that the slow commit actually has fewer live BOs, yet it's much slower, so we're likely hitting some interesting corner case.
RADV_PERFTEST=localbos makes no difference in my testing.