
Draft: anv/brw: split driver & application push constant allocations

What does this MR do and why?

This MR implements the splitting of push constants mentioned in #11055.

There are multiple changes in this MR :

  • for Mesh/Task/RT stages, implement push constant loading similar to the CS stage (in assign_curb_setup)
  • implement packing of constants pulled by shaders (only on Gfx12.5+ for CS/Mesh/Task/RT stages)
  • allow parameters to be loaded from one of the ubo_ranges rather than from push constant parameters
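
To illustrate the packing idea from the list above, here is a minimal sketch and not the code from this MR. It assumes the compiler knows which dwords of the push constant block a shader actually reads, and it remaps only those dwords into a dense layout. The names `pack_push_dwords`, `push_regs_needed`, `MAX_PUSH_DWORDS` and `UNUSED_DWORD` are hypothetical, and the 8 dwords per register figure is an assumption for a 32-byte GRF.

```c
#include <stdint.h>

/* Assumption: illustrative block size, not an actual driver limit. */
#define MAX_PUSH_DWORDS 64
/* Marker for dwords the shader never reads (hypothetical). */
#define UNUSED_DWORD 0xff

/* Hypothetical helper: give each used dword the next dense slot.
 * Bit i of used_mask is set when the shader reads dword i.
 * Returns the number of packed dwords. */
static unsigned
pack_push_dwords(uint64_t used_mask, uint8_t remap[MAX_PUSH_DWORDS])
{
   unsigned next = 0;

   for (unsigned i = 0; i < MAX_PUSH_DWORDS; i++) {
      if (used_mask & (1ull << i))
         remap[i] = (uint8_t)next++;
      else
         remap[i] = UNUSED_DWORD;
   }

   return next;
}

/* Registers needed to hold `dwords` constants, assuming 8 dwords
 * per register (32-byte GRF, an assumption of this sketch). */
static unsigned
push_regs_needed(unsigned dwords)
{
   return (dwords + 7) / 8;
}
```

With dwords 0, 9 and 17 in use, the unpacked layout spans 18 dwords (3 registers under the assumption above), while the packed layout needs 3 dwords (1 register).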

Some positive shader-db stats (DG2) :

Cyberpunk 2077:
Totals from 417 (3.92% of 10636) affected shaders:
Instrs: 330041 -> 329977 (-0.02%); split: -0.04%, +0.02%
Send messages: 18843 -> 18825 (-0.10%)
Cycle count: 17087318 -> 16211286 (-5.13%); split: -6.22%, +1.10%
Spill count: 6 -> 4 (-33.33%)
Fill count: 6 -> 4 (-33.33%)
Max live registers: 29896 -> 29787 (-0.36%); split: -0.39%, +0.03%

Aztec Ruins:
Totals from 17 (6.67% of 255) affected shaders:
Instrs: 29109 -> 29105 (-0.01%); split: -0.05%, +0.03%
Send messages: 5644 -> 5637 (-0.12%)
Cycle count: 1223278 -> 1221973 (-0.11%); split: -0.12%, +0.01%

Age Of Wonders III:
Totals from 17 (0.93% of 1822) affected shaders:
Instrs: 702 -> 668 (-4.84%)
Send messages: 113 -> 96 (-15.04%)
Cycle count: 8998 -> 7752 (-13.85%)
Max live registers: 269 -> 251 (-6.69%)

Dark Souls 3:
Totals from 17 (1.25% of 1364) affected shaders:
Instrs: 1316 -> 1291 (-1.90%); split: -2.13%, +0.23%
Send messages: 177 -> 163 (-7.91%)
Cycle count: 184756 -> 183926 (-0.45%); split: -0.61%, +0.16%
Max live registers: 346 -> 328 (-5.20%)

Strange Brigade:
Totals from 2765 (67.27% of 4110) affected shaders:
Instrs: 1447772 -> 1447555 (-0.01%); split: -0.12%, +0.10%
Send messages: 90461 -> 90443 (-0.02%)
Cycle count: 23303752 -> 23278494 (-0.11%); split: -0.35%, +0.25%
Max live registers: 164410 -> 164386 (-0.01%); split: -0.02%, +0.00%

Some regressions (DG2) :

Total WarHammer 2:
Totals from 118 (25.71% of 459) affected shaders:
Instrs: 29720 -> 29740 (+0.07%); split: -0.02%, +0.08%
Send messages: 1403 -> 1435 (+2.28%)
Cycle count: 493439 -> 499017 (+1.13%); split: -0.10%, +1.23%
Max live registers: 6415 -> 6396 (-0.30%); split: -0.34%, +0.05%

Rise of the Tomb Raider:
Totals from 41 (23.03% of 178) affected shaders:
Instrs: 12176 -> 12215 (+0.32%); split: -0.01%, +0.33%
Send messages: 1006 -> 1017 (+1.09%)
Cycle count: 5130581 -> 5148766 (+0.35%); split: -0.02%, +0.38%
Max live registers: 2356 -> 2352 (-0.17%); split: -0.21%, +0.04%

Dota 2:
Totals from 162 (10.76% of 1505) affected shaders:
Instrs: 20451 -> 20654 (+0.99%)
Send messages: 593 -> 694 (+17.03%)
Cycle count: 290815 -> 297683 (+2.36%); split: -0.06%, +2.42%
Max live registers: 5977 -> 5900 (-1.29%)

TGL is mostly untouched with a couple of regressions (similar to DG2) :

Total WarHammer 2:
Totals from 99 (21.34% of 464) affected shaders:
Instrs: 23463 -> 23502 (+0.17%); split: -0.02%, +0.19%
Send messages: 1063 -> 1095 (+3.01%)
Cycle count: 228253 -> 230334 (+0.91%); split: -0.79%, +1.70%
Max live registers: 5593 -> 5571 (-0.39%)

Rise of the Tomb Raider:
Totals from 11 (6.18% of 178) affected shaders:
Instrs: 900 -> 922 (+2.44%)
Send messages: 89 -> 100 (+12.36%)
Cycle count: 4370 -> 4363 (-0.16%); split: -0.87%, +0.71%
Max live registers: 343 -> 339 (-1.17%); split: -1.46%, +0.29%

No changes on GL shader-db (on DG2).

Most of the positively affected shaders are compute shaders, because packing push constants reduces the amount of constant register space used and also leads to fewer send messages. There are a few shaders where this helps the register allocator, dropping a few spills.

The regressions appear in titles that do not use bindless descriptors and where we could promote more UBOs to push constants. Because we've split push constants in 2 (app/driver), we now have one fewer slot for promoted UBOs. I don't expect DX12 titles to be affected because they use bindless.
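
The slot accounting behind these regressions can be sketched as follows. This is an illustration under stated assumptions, not driver code: `TOTAL_PUSH_RANGES` is assumed to be 4 per stage, and `ubo_promotion_slots` and `ubo_ranges_left_pulled` are hypothetical helpers.

```c
/* Assumption for illustration: 4 push ranges are available per stage. */
#define TOTAL_PUSH_RANGES 4

/* Ranges left for promoted UBOs once push constants have taken theirs. */
static unsigned
ubo_promotion_slots(unsigned push_constant_ranges)
{
   return push_constant_ranges >= TOTAL_PUSH_RANGES ?
          0 : TOTAL_PUSH_RANGES - push_constant_ranges;
}

/* UBO ranges that were candidates for promotion but no longer fit,
 * and so have to be fetched with send messages instead. */
static unsigned
ubo_ranges_left_pulled(unsigned candidates, unsigned push_constant_ranges)
{
   unsigned slots = ubo_promotion_slots(push_constant_ranges);

   return candidates > slots ? candidates - slots : 0;
}
```

Under these assumptions, a shader with 3 promotable UBO ranges fits entirely when push constants use 1 range, but leaves 1 range pulled once push constants use 2 ranges (app + driver), which is consistent with the higher send message counts in the regressing titles.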

Perf A/B testing : results

Edited by Lionel Landwerlin
