anv: Leave push range checking to the back-end
This MR fixes performance regressions introduced by e03f9652 in which we started bounds checking our push constants. This added a LOT of shader code to shaders which use the robustBufferAccess feature and led to substantial spilling. The checking we can do in the FS back-end is far more efficient for two reasons:
- It can be done at whole-register granularity rather than per-scalar, so we emit one SIMD8 `SEL` per 32B GRF rather than one SIMD16 `SEL` (executed as two `SEL`s) for each component loaded.
- Because we do it with `NoMask` instructions, we can do it on whole pushed GRFs without splatting them out to SIMD8 or SIMD16 values. This means that robust buffer access no longer explodes our register pressure for no good reason.
As a tiny side-benefit, we can now use `AND` instead of `SEL`, which means no need for the flag and better scheduling.
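To illustrate the idea of masking at register granularity with `AND`, here is a rough CPU-side model in plain C. It is not the driver or compiler code from this MR: the helper names `push_reg_valid_mask` and `mask_pushed_regs`, the one-bit-per-GRF mask layout, and the rule that a register counts as valid only when it lies entirely inside the bound range are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DWORDS_PER_GRF 8 /* one 32B GRF holds eight 32-bit values */

/* Hypothetical helper: build a bitmask with one bit per pushed GRF.
 * A bit is set when the whole register lies inside the bound range
 * (an assumption of this sketch, not taken from the MR). */
static uint64_t
push_reg_valid_mask(uint32_t bound_size_bytes, unsigned num_regs)
{
   assert(num_regs <= 64);

   uint64_t mask = 0;
   for (unsigned r = 0; r < num_regs; r++) {
      uint32_t reg_end = (r + 1) * DWORDS_PER_GRF * sizeof(uint32_t);
      if (reg_end <= bound_size_bytes)
         mask |= UINT64_C(1) << r;
   }
   return mask;
}

/* Hypothetical helper: zero every out-of-bounds register by ANDing it
 * with 0 or ~0. The choice is made once per register, not once per
 * component, and needs no compare-and-select on the data itself. */
static void
mask_pushed_regs(uint32_t *push, unsigned num_regs, uint64_t valid_mask)
{
   for (unsigned r = 0; r < num_regs; r++) {
      /* 0 - 1 wraps to 0xffffffff, 0 - 0 stays 0 */
      uint32_t keep = 0u - (uint32_t)((valid_mask >> r) & 1);
      for (unsigned c = 0; c < DWORDS_PER_GRF; c++)
         push[r * DWORDS_PER_GRF + c] &= keep;
   }
}
```

With three pushed registers and a 64-byte bound range, `push_reg_valid_mask(64, 3)` marks the first two registers as valid, and `mask_pushed_regs` leaves them untouched while zeroing the third.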
Vulkan pipeline database results on ICL:
Instructions in all programs: 293586059 -> 238009118 (-18.9%)
SENDs in all programs: 13568515 -> 13568515 (+0.0%)
Loops in all programs: 149720 -> 149720 (+0.0%)
Cycles in all programs: 88499234498 -> 84348917496 (-4.7%)
Spills in all programs: 1229018 -> 184339 (-85.0%)
Fills in all programs: 1348397 -> 246061 (-81.8%)
This improves the performance of the Shadow of the Tomb Raider benchmark by about 3-5%.