Skip to content

nir/opt_if: Be more aggressive in opt_split_alu_of_phi

Faith Ekstrand requested to merge gfxstrand/mesa:review/opt-split-alu into main

The opt_split_alu_of_phi pass in nir_opt_if was originally designed to handle cases where the continue portion of the loop was eaten up by nir_opt_peephole_select before opt_peel_loop_initial_if got to it. However, it only peels instructions around if their sources either come from the tail of the loop or are immediates or undefs. This is a good way to avoid register pressure regressions but it also means that it can't peel them off in many cases. This commit removes those restrictions and makes the pass more aggressive.

By itself, this change only ever adds instructions since it duplicates every instruction it peels. However, those instructions aren't ever executed any more times and it does sometimes allow other optimization passes such as opt_simplify_bcsel_of_phi to make progress. The one downside, as mentioned above, is that it can increase register pressure in some cases (I'm still not 100% sure how) and resulted in a few lost SIMD16 programs.

Shader-db results on ICL:

total instructions in shared programs: 17129594 -> 17130180 (<.01%)
instructions in affected programs: 32537 -> 33123 (1.80%)
helped: 0
HURT: 82
HURT stats (abs)   min: 7 max: 10 x̄: 7.15 x̃: 7
HURT stats (rel)   min: 1.21% max: 3.76% x̄: 1.91% x̃: 1.77%
95% mean confidence interval for instructions value: 7.00 7.29
95% mean confidence interval for instructions %-change: 1.79% 2.02%
Instructions are HURT.

total cycles in shared programs: 364257702 -> 364316288 (0.02%)
cycles in affected programs: 627727 -> 686313 (9.33%)
helped: 0
HURT: 82
HURT stats (abs)   min: 36 max: 10956 x̄: 714.46 x̃: 258
HURT stats (rel)   min: 0.78% max: 16.03% x̄: 6.05% x̃: 5.32%
95% mean confidence interval for cycles value: 270.02 1158.91
95% mean confidence interval for cycles %-change: 5.39% 6.70%
Cycles are HURT.

LOST:   8
GAINED: 0

The lost programs are all in Talos Principal (which doesn't even have an OpenGL back-end anymore) and Serious Sam 3.

Fossil-db results on ICL:

Instructions in all programs: 263442482 -> 263438116 (-0.0%)
SENDs in all programs: 15062268 -> 15062268 (+0.0%)
Loops in all programs: 150022 -> 150022 (+0.0%)
Cycles in all programs: 84967273066 -> 84728637038 (-0.3%)
Spills in all programs: 243458 -> 243414 (-0.0%)
Fills in all programs: 341009 -> 340983 (-0.0%)

Virtually all the change there was in GeekBench and it was all help except for two shaders hurt by a few cycles. There were also four shaders in Shadow of the Tomb Rader which were hurt by < 1% instructions but helped by 2% cycles on average.

Merge request reports