v3d: Add a fallback scheduler when register allocation fails

Neil Roberts requested to merge nroberts/mesa:v3d/schedule-fallback into master

v3d uses nir_schedule to schedule instructions. The scheduler aggressively parallelises the shader until it reaches a certain register pressure threshold. Once it reaches that threshold, it instead tries to pick paths that reduce register pressure so that register allocation won’t fail. Unfortunately, the algorithm that tries to reduce pressure doesn’t seem to work very well. Combined with the limited spilling that is possible on v3d, it can easily reach a situation where register allocation fails. This is the case for these dEQP tests using geometry shaders:


This is probably also related to issue 2990.

Note that disabling the scheduler altogether makes the tests pass. It’s also worth noting that if you set the register pressure threshold to 0 so that it will always use the register pressure reduction algorithm, that isn’t enough to make the tests pass. So it seems like the scheduler algorithm could do with some work to make it handle register pressure better.

I tried various experiments to improve how the scheduler selects instructions under register pressure, but so far I haven’t found a balance that makes these tests pass without breaking other tests. It seems to be a balancing act with the spilling code, which easily gets stuck and doesn’t spill values that it seems like it ought to.

This MR adds a fallback scheduling algorithm that just aggressively picks the node with the smallest max delay, in the hope that this is the choice most likely to pass register allocation. The compiler then retries compilation with this fallback algorithm when compilation fails due to register allocation.
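
As a rough illustration of the selection rule described above, here is a minimal, self-contained sketch of a list scheduler that always takes the ready node with the smallest max delay. The `sched_node` struct, `MAX_NODES`, `pick_fallback` and `schedule_fallback` are hypothetical names made up for this example; they are not the real nir_schedule data structures or functions, and the real code tracks far more state.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 16 /* arbitrary limit for this sketch */

/* Hypothetical, simplified scheduling node (not the nir_schedule one). */
struct sched_node {
    int max_delay;        /* longest latency path from this node to the end */
    int num_deps;         /* predecessors that are not scheduled yet */
    int num_users;        /* number of valid entries in users[] */
    int users[MAX_NODES]; /* indices of nodes that depend on this one */
    bool scheduled;
};

/* Among the ready nodes (unscheduled, no pending predecessors), return the
 * index of the one with the smallest max_delay, or -1 if none is ready. */
static int pick_fallback(const struct sched_node *nodes, int count)
{
    int best = -1;

    for (int i = 0; i < count; i++) {
        if (nodes[i].scheduled || nodes[i].num_deps != 0)
            continue;
        if (best < 0 || nodes[i].max_delay < nodes[best].max_delay)
            best = i;
    }
    return best;
}

/* Schedule every node using the fallback rule. Writes the chosen node
 * indices into order[] and returns how many nodes were scheduled. */
static int schedule_fallback(struct sched_node *nodes, int count, int *order)
{
    int n = 0;
    int pick;

    assert(count <= MAX_NODES);

    while ((pick = pick_fallback(nodes, count)) >= 0) {
        nodes[pick].scheduled = true;
        order[n++] = pick;

        /* Scheduling a node releases one dependency of each of its users. */
        for (int u = 0; u < nodes[pick].num_users; u++)
            nodes[nodes[pick].users[u]].num_deps--;
    }
    return n;
}
```

In this sketch a node close to the end of a dependency chain is preferred as soon as it becomes ready, so a value tends to be consumed shortly after it is produced rather than being kept live while unrelated work is started. A driver would only switch to a rule like this after the normal schedule has failed register allocation, as described above.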

The fallback algorithm exposes a separate issue: the scheduler doesn’t know that, in the geometry shader, the input primitive ID is stored in the same location in the VPM output stream as the output header. That means there needs to be a dependency between the load_primitive_id intrinsic and any write to VPM offset 0. This MR tries to fix that as well by modifying nir_schedule to have a mechanism to call back into the backend driver when determining dependencies for intrinsics. I think this problem already exists independently of the register allocation issue, but without the fallback scheduling algorithm it seems very unlikely to be hit.
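
To make the callback idea concrete, here is a small sketch of how a scheduler could ask a backend which hardware resource an intrinsic touches and then decide whether two intrinsics must stay in order. Everything in it is hypothetical and heavily simplified: the `intrinsic`, `backend_dep` and `intrinsic_dep_cb` types, the `OP_*` opcodes and both functions are invented for illustration and are not the interface this MR adds to nir_schedule.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for NIR intrinsics. */
enum intrinsic_op { OP_LOAD_PRIMITIVE_ID, OP_STORE_VPM, OP_OTHER };

struct intrinsic {
    enum intrinsic_op op;
    int vpm_offset; /* only meaningful for OP_STORE_VPM */
};

enum dep_kind { DEP_READ, DEP_WRITE };

/* What the backend reports: which resource is touched, and how. */
struct backend_dep {
    enum dep_kind kind;
    int resource; /* in this sketch: a VPM offset */
};

/* Backend callback: returns true and fills *dep if the intrinsic touches
 * a resource that the generic scheduler cannot know about. */
typedef bool (*intrinsic_dep_cb)(const struct intrinsic *instr,
                                 struct backend_dep *dep, void *data);

/* Example backend: the primitive ID lives at VPM offset 0, which is also
 * where the output header is written. */
static bool example_backend_dep(const struct intrinsic *instr,
                                struct backend_dep *dep, void *data)
{
    (void)data;

    switch (instr->op) {
    case OP_LOAD_PRIMITIVE_ID:
        dep->kind = DEP_READ;
        dep->resource = 0;
        return true;
    case OP_STORE_VPM:
        dep->kind = DEP_WRITE;
        dep->resource = instr->vpm_offset;
        return true;
    default:
        return false;
    }
}

/* Scheduler side: two intrinsics must keep their relative order if they
 * touch the same resource and at least one of them writes it. */
static bool needs_ordering(const struct intrinsic *a,
                           const struct intrinsic *b,
                           intrinsic_dep_cb cb, void *data)
{
    struct backend_dep da, db;

    if (cb == NULL || !cb(a, &da, data) || !cb(b, &db, data))
        return false;
    if (da.resource != db.resource)
        return false;
    return da.kind == DEP_WRITE || db.kind == DEP_WRITE;
}
```

With a callback like this, the generic scheduler stays unaware of the VPM layout: it only sees that a read and a write hit the same backend-defined resource and adds a dependency edge between them. Without a callback it has no way to know that the primitive ID load and the header write conflict.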

The MR fixes the dEQP tests mentioned above but I haven’t tested it with issue 2990.
