intel/brw: Misc optimizations and improvements
These are improvements and optimizations that I found while fixing functional regressions in an update to !29884 (merged).
I suspect "intel/brw: Unconditionally run optimizations after nir_opt_uniform_subgroup" will help some ray tracing workloads.
fossil-db results across the whole series on Meteor Lake (-1,523 loops is pretty cool):
Totals:
Instrs: 151531355 -> 151496069 (-0.02%); split: -0.02%, +0.00%
Send messages: 7459339 -> 7459396 (+0.00%)
Loop count: 49111 -> 47588 (-3.10%)
Cycle count: 17209372399 -> 17193247702 (-0.09%); split: -0.11%, +0.02%
Spill count: 80830 -> 80839 (+0.01%); split: -0.02%, +0.03%
Fill count: 152754 -> 152671 (-0.05%); split: -0.07%, +0.02%
Scratch Memory Size: 4136960 -> 4130816 (-0.15%)
Max live registers: 32016490 -> 32015955 (-0.00%); split: -0.00%, +0.00%
Totals from 18963 (3.01% of 630198) affected shaders:
Instrs: 5398894 -> 5363608 (-0.65%); split: -0.69%, +0.04%
Send messages: 238314 -> 238371 (+0.02%)
Loop count: 8145 -> 6622 (-18.70%)
Cycle count: 5964817450 -> 5948692753 (-0.27%); split: -0.33%, +0.06%
Spill count: 39396 -> 39405 (+0.02%); split: -0.04%, +0.07%
Fill count: 75293 -> 75210 (-0.11%); split: -0.14%, +0.03%
Scratch Memory Size: 1767424 -> 1761280 (-0.35%)
Max live registers: 549689 -> 549154 (-0.10%); split: -0.13%, +0.04%
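
For anyone cross-checking the numbers, here is a small sketch of how the loop-count percentages above relate to the raw before/after values. It is plain arithmetic on figures copied from the report, not fossil-db's own reporting code; the helper name `pct_change` is made up for illustration.

```python
# Hypothetical helper (not part of fossil-db): relative change in percent.
def pct_change(before: int, after: int) -> float:
    return (after - before) / before * 100.0


# Loop counts copied from the report above: (before, after).
loops_all = (49111, 47588)       # all shaders
loops_affected = (8145, 6622)    # affected shaders only

# The absolute delta is the same in both views; only the baseline differs.
delta_all = loops_all[1] - loops_all[0]
delta_affected = loops_affected[1] - loops_affected[0]

print(f"delta: {delta_all} loops")
print(f"all shaders:      {pct_change(*loops_all):.2f}%")
print(f"affected shaders: {pct_change(*loops_affected):.2f}%")

# Share of shaders affected, from "18963 (3.01% of 630198)".
print(f"affected share:   {18963 / 630198 * 100.0:.2f}%")
```

The same absolute change (1,523 fewer loops) shows up as -3.10% against all shaders and -18.70% against only the affected ones, because the second baseline is much smaller.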