intel/brw: Add and use SHADER_OPCODE_SEND_GATHER in Xe3
On top of intel/brw: Add encoding support for ARF scalar ... (!32236 - merged).
Xe3 adds a variant of SEND that can take several non-contiguous GRFs instead of one or two contiguous spans. This is encoded by writing the register numbers into the ARF scalar register and passing that register as a source to the SEND.
This allows the payload to be "broken" into smaller pieces that can be optimized further, which may result in
- less register pressure (no need for contiguous space), and
- fewer instructions (no need to MOV into such space).
This MR adds a new opcode and applies it to existing SENDs that can potentially
benefit from it (brw_fs_opt_send_to_send_gather()). After register allocation
is done, they are lowered (brw_lower_send_gather_inst()) by either
- adding the move to the ARF scalar register with the correct register numbers; or
- falling back to a split send, if the registers ended up in one or two contiguous spans.
The "fallback" is important since there's a cost associated with GATHER, which is the extra instruction(s) to write the ARF scalar register.
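The lowering choice described above can be sketched as follows. This is a minimal, standalone illustration, not the code in brw_lower_send_gather_inst(): the names `Lowering`, `count_contiguous_spans` and `choose_lowering` are hypothetical, and the payload is modeled simply as the list of physical GRF numbers assigned by register allocation, in payload order. Any extra constraints a real split send has are ignored here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result of the lowering decision; not a Mesa type.
enum class Lowering { SplitSend, SendGather };

// Count maximal runs of consecutive register numbers, in payload order.
// {4, 5, 10, 11} has two spans; {4, 9, 5} has three.
static std::size_t count_contiguous_spans(const std::vector<unsigned> &regs)
{
    if (regs.empty())
        return 0;

    std::size_t spans = 1;
    for (std::size_t i = 1; i < regs.size(); i++) {
        // A new span starts whenever a register does not directly
        // follow the previous one.
        if (regs[i] != regs[i - 1] + 1)
            spans++;
    }
    return spans;
}

// One or two contiguous spans fit a regular split send, which avoids the
// extra instruction(s) needed to write the ARF scalar register. Anything
// more scattered keeps the gather form.
static Lowering choose_lowering(const std::vector<unsigned> &regs)
{
    return count_contiguous_spans(regs) <= 2 ? Lowering::SplitSend
                                             : Lowering::SendGather;
}
```

For example, under these assumptions a payload allocated to r4, r5, r10, r11 would fall back to a split send, while one allocated to r2, r7, r12, r20 would stay a gather and pay for the scalar register write.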
An early attempt at this MR focused on adding a single MULTI_SEND that would also
perform some of the roles of LOAD_PAYLOAD. That didn't work well, in short because it was
trying to do everything at the same time. I think it is still a good idea to have some sort
of MULTI_SEND in the future, but maybe this time keeping the LOAD_PAYLOAD around.
Fossil-DB numbers (2025-01-23)
*** Shaders only in 'after' results are ignored:
steam-dxvk/hogwarts_legacy/2e8a19f87d3ac31d/fs.32/0, steam-dxvk/hogwarts_legacy/dcf5ea29d77c7997/fs.32/0, steam-dxvk/batman_arkham_origins/849a667532e71382/fs.32/0, steam-dxvk/hogwarts_legacy/754977f13ae6182b/fs.32/0, steam-dxvk/dying_light_2/11c8a9d6c9ab0c26/fs.32/0, and 1855 more
from 26 apps: benchmarks/3dmark_disco, benchmarks/gravity_mark, demos/shooter_demo_g2, demos/spaceship_demo_ue5, steam-dxvk/CivilizationVI, steam-dxvk/alan_wake, steam-dxvk/assassins_creed_odyssey...
*** Shaders only in 'before' results are ignored:
steam-dxvk/batman_arkham_origins/3e10858b8ee14eb9/fs.32/0, steam-dxvk/batman_arkham_origins/01a12e62954eba77/fs.32/0, steam-dxvk/f1_22_abu_dhabi/05b05c652030b307/fs.32/0, steam-dxvk/batman_arkham_city_goty/5d4ce53dfd3a5e0d/fs.32/0, steam-dxvk/batman_arkham_origins/160f706af6a2e7e7/fs.32/0, and 442 more
from 22 apps: demos/shooter_demo_g2, demos/spaceship_demo_ue5, steam-dxvk/age_of_wonders_III, steam-dxvk/alan_wake, steam-dxvk/assassins_creed_odyssey, steam-dxvk/batman_arkham_city_goty, steam-dxvk/batman_arkham_origins...
Totals:
Instrs: 205496497 -> 205318964 (-0.09%); split: -1.70%, +1.61%
Subgroup size: 14191200 -> 14244240 (+0.37%)
Cycle count: 26206544670 -> 26379226418 (+0.66%); split: -1.36%, +2.02%
Spill count: 380038 -> 113285 (-70.19%); split: -70.54%, +0.35%
Fill count: 479128 -> 217651 (-54.57%); split: -56.98%, +2.41%
Scratch Memory Size: 28083200 -> 9758720 (-65.25%); split: -65.35%, +0.10%
Max live registers: 64431778 -> 59613435 (-7.48%); split: -7.48%, +0.00%
Totals from 704058 (99.87% of 704981) affected shaders:
Instrs: 205451151 -> 205273618 (-0.09%); split: -1.70%, +1.61%
Subgroup size: 14173248 -> 14226288 (+0.37%)
Cycle count: 26204798762 -> 26377480510 (+0.66%); split: -1.36%, +2.02%
Spill count: 380038 -> 113285 (-70.19%); split: -70.54%, +0.35%
Fill count: 479128 -> 217651 (-54.57%); split: -56.98%, +2.41%
Scratch Memory Size: 28083200 -> 9758720 (-65.25%); split: -65.35%, +0.10%
Max live registers: 64406362 -> 59588019 (-7.48%); split: -7.48%, +0.00%
PERCENTAGE DELTAS Shaders Instrs Subgroup size Cycle count Spill count Fill count Scratch Memory Size Max live registers
benchmarks/3dmark_disco 1020 -0.82% . -2.79% -10.85% -21.08% -73.47% -10.91%
benchmarks/aztec_ruins_high 224 +3.15% +0.35% +5.71% -1.05% -11.00% -1.52% -5.87%
benchmarks/geekbench5 768 +2.45% . +0.09% . . . .
benchmarks/gravity_mark 271 +4.62% +1.38% +5.79% -0.62% -1.81% +3.23% -5.80%
demos/shooter_demo_g2 274 +2.12% . +2.60% . . . -6.69%
demos/spaceship_demo_ue5 952 +1.25% . +0.72% -1.52% -0.72% -4.58% -5.81%
steam-dxvk/CivilizationVI 405 +1.44% . +2.15% . . . -7.46%
steam-dxvk/age_of_empires_II_2013 128 +4.23% . +1.16% . . . -14.29%
steam-dxvk/age_of_wonders_III 1485 +1.87% . +1.81% . . . -11.13%
steam-dxvk/alan_wake 10198 +1.13% . +10.92% . . . -12.91%
steam-dxvk/assassins_creed_odyssey 1972 +1.02% +0.31% +3.81% -18.31% -21.35% -39.50% -6.32%
steam-dxvk/batman_arkham_city_goty 164585 +2.97% . +11.57% -98.90% -98.09% -98.72% -7.91%
steam-dxvk/batman_arkham_origins 239029 +0.85% . +0.46% . . . -6.07%
steam-dxvk/borderlands_3 1864 +1.29% . +0.42% -86.25% -83.18% -71.43% -6.66%
steam-dxvk/cyberpunk_2077 563 +1.89% . +0.83% +3.87% +3.68% -1.62% -0.89%
steam-dxvk/dark_souls_3_dxvk_g2 1160 +1.36% . +0.34% . . . -6.49%
steam-dxvk/dying_light_2 11523 +3.16% . +18.44% -9.43% -52.05% -11.11% -14.05%
steam-dxvk/f1_22_abu_dhabi 20626 +1.02% +0.03% +1.01% -51.89% -58.69% -37.63% -7.37%
steam-dxvk/fallout_4_dxvk_g2 1465 +1.95% . +0.93% -1.39% +7.36% . -7.50%
steam-dxvk/far_cry_new_dawn 2073 +1.26% . +0.78% -24.05% -19.27% -33.33% -5.49%
steam-dxvk/hitman_3 1790 +0.77% +0.36% +1.77% -9.93% -0.07% -9.84% -8.77%
steam-dxvk/hogwarts_legacy 109230 -4.64% +2.31% +4.10% -77.66% -64.35% -74.02% -10.07%
steam-dxvk/octopath_traveler 17401 +1.57% +0.00% +1.89% . . . -8.93%
steam-dxvk/strange_brigade 3988 +1.26% +0.04% +1.72% . . . -4.65%
steam-dxvk/total_war_warhammer3 7919 -1.07% . +0.63% -38.21% -32.13% -46.39% -6.56%
steam-dxvk/witcher_3_dxvk_g2 908 -0.02% . +6.11% -47.58% -33.66% -44.44% -6.03%
steam-native/doom_2016_g2 940 +1.46% . -1.41% . . . -5.34%
steam-native/dota2_g2 1345 +1.39% . +6.53% . . . -11.91%
steam-native/red_dead_redemption2 5283 -0.14% +0.06% -1.42% -45.03% -30.41% -23.99% -7.20%
steam-native/rise_of_the_tomb_raider_g2 146 +2.06% +0.52% -3.77% +37.50% +22.50% +66.67% -4.61%
steam-native/shadow_of_the_tomb_raider 90813 +0.93% +0.00% +0.99% +8.70% +8.70% . -4.34%
steam-native/strange_brigade 2082 +1.44% +0.04% +1.28% . . . -3.67%
steam-native/talos_g2 1041 +0.43% . +5.63% . . . -5.70%
steam-native/total_war_warhammer2 425 +0.81% +0.18% +0.03% -86.11% -77.14% -66.67% -5.62%
steam-native/wolfenstein_youngblood 613 +1.32% . +0.61% . . . -3.00%
unicom 472 +0.97% +0.15% +2.48% -18.24% -29.34% -75.00% -5.76%
---------------------------------------------------------------------------------------------------------------------------------------------------
All affected 704058 -0.09% +0.37% +0.66% -70.19% -54.57% -65.25% -7.48%
---------------------------------------------------------------------------------------------------------------------------------------------------
Total 704981 -0.09% +0.37% +0.66% -70.19% -54.57% -65.25% -7.48%
Fossil-DB numbers (2025-01-30). VRT has already landed, so the improvements also show up in the 'GRF registers' count.
*** Shaders only in 'after' results are ignored:
steam-dxvk/total_war_warhammer3/d13b630560a17c9b/fs.32/0, steam-dxvk/far_cry_new_dawn/2cc94a6a55b82e7f/fs.32/0, steam-dxvk/hogwarts_legacy/2cadc7ce785bc7ed/fs.32/0, steam-dxvk/total_war_warhammer3/dbc85aaa69b38545/fs.32/0, steam-dxvk/hogwarts_legacy/e9724f69caf8472e/fs.32/0, and 83 more
from 7 apps: demos/spaceship_demo_ue5, steam-dxvk/borderlands_3, steam-dxvk/f1_22_abu_dhabi, steam-dxvk/far_cry_new_dawn, steam-dxvk/hogwarts_legacy, steam-dxvk/total_war_warhammer3, steam-native/red_dead_redemption2
*** Shaders only in 'before' results are ignored:
steam-dxvk/hogwarts_legacy/c72645c68d5b4dac/fs.16/0, steam-dxvk/total_war_warhammer3/8304d64f4606073d/fs.16/0, steam-dxvk/hogwarts_legacy/24ae9f27eed0ae73/fs.32/0, steam-dxvk/total_war_warhammer3/339323212a878c76/fs.16/0, steam-dxvk/hogwarts_legacy/66e05b66768ae110/fs.16/0, and 83 more
from 7 apps: demos/spaceship_demo_ue5, steam-dxvk/borderlands_3, steam-dxvk/f1_22_abu_dhabi, steam-dxvk/far_cry_new_dawn, steam-dxvk/hogwarts_legacy, steam-dxvk/total_war_warhammer3, steam-native/red_dead_redemption2
Totals:
Instrs: 166479853 -> 167926267 (+0.87%); split: -0.89%, +1.76%
Subgroup size: 12684400 -> 12745184 (+0.48%)
Cycle count: 28311565622 -> 28829826724 (+1.83%); split: -0.51%, +2.34%
Spill count: 105086 -> 14637 (-86.07%); split: -86.96%, +0.89%
Fill count: 121341 -> 18441 (-84.80%); split: -85.98%, +1.18%
Scratch Memory Size: 8421376 -> 2099200 (-75.07%); split: -75.46%, +0.39%
Max live registers: 64786181 -> 60120005 (-7.20%); split: -7.22%, +0.01%
GRF registers: 43509397 -> 40371436 (-7.21%); split: -7.35%, +0.14%
GRF blocks: 1139573 -> 1027132 (-9.87%); split: -10.03%, +0.17%
Totals from 531093 (99.91% of 531559) affected shaders:
Instrs: 166468857 -> 167915271 (+0.87%); split: -0.89%, +1.76%
Subgroup size: 12669744 -> 12730528 (+0.48%)
Cycle count: 28310754008 -> 28829015110 (+1.83%); split: -0.51%, +2.34%
Spill count: 105086 -> 14637 (-86.07%); split: -86.96%, +0.89%
Fill count: 121341 -> 18441 (-84.80%); split: -85.98%, +1.18%
Scratch Memory Size: 8421376 -> 2099200 (-75.07%); split: -75.46%, +0.39%
Max live registers: 64773322 -> 60107146 (-7.20%); split: -7.22%, +0.01%
GRF registers: 43501405 -> 40363444 (-7.21%); split: -7.35%, +0.14%
GRF blocks: 1139557 -> 1027116 (-9.87%); split: -10.03%, +0.17%
PERCENTAGE DELTAS Shaders Instrs Subgroup size Cycle count Spill count Fill count Scratch Memory Size Max live registers GRF registers GRF blocks
benchmarks/3dmark_disco 754 +0.84% . +4.89% . . . -12.08% -5.99% -3.98%
benchmarks/aztec_ruins_high 172 +6.72% . +0.55% . . . -7.30% -3.69% -3.33%
benchmarks/geekbench5 768 +2.44% . +0.01% . . . . -0.16% .
benchmarks/gravity_mark 254 +5.35% . +9.10% . . . -6.77% -6.95% -10.77%
demos/shooter_demo_g2 216 +2.21% . +1.48% . . . -7.00% -5.67% -9.26%
demos/spaceship_demo_ue5 820 +1.42% +0.22% -1.54% . . . -5.70% -4.08% -3.61%
steam-dxvk/CivilizationVI 322 +1.55% . +1.29% . . . -7.02% -7.01% -10.54%
steam-dxvk/age_of_empires_II_2013 113 +4.74% . +3.50% . . . -14.99% -9.94% -40.00%
steam-dxvk/age_of_wonders_III 1135 +1.96% . +0.54% . . . -12.41% -8.84% -16.38%
steam-dxvk/alan_wake 7759 +1.41% . +5.69% . . . -14.50% -9.83% -14.76%
steam-dxvk/assassins_creed_odyssey 1603 +1.39% . +0.56% . . . -5.72% -3.42% -3.70%
steam-dxvk/batman_arkham_city_goty 135686 +3.19% . +3.99% . . . -6.38% -9.42% -13.98%
steam-dxvk/batman_arkham_origins 164193 +0.99% . +0.55% . . . -7.94% -5.13% -3.59%
steam-dxvk/borderlands_3 1456 +1.50% . -0.32% . . . -6.98% -5.30% -4.15%
steam-dxvk/cyberpunk_2077 558 +2.47% . +0.89% +9.00% +8.30% +1.82% -0.88% -0.81% -0.21%
steam-dxvk/dark_souls_3_dxvk_g2 917 +1.29% . +0.70% . . . -6.17% -4.96% -6.94%
steam-dxvk/dying_light_2 7898 +2.45% . +3.17% . . . -14.96% -12.42% -13.94%
steam-dxvk/f1_22_abu_dhabi 14967 +0.12% +0.01% +2.14% . . . -8.18% -9.24% -10.90%
steam-dxvk/fallout_4_dxvk_g2 1056 +1.82% . +2.79% . . . -9.08% -7.29% -13.55%
steam-dxvk/far_cry_new_dawn 1570 +1.06% . -0.48% . . . -5.74% -6.52% -8.78%
steam-dxvk/hitman_3 1455 +1.05% +0.08% +1.87% . . . -8.45% -7.23% -9.90%
steam-dxvk/hogwarts_legacy 81619 -1.85% +3.11% +4.22% -99.49% -99.69% -99.42% -9.78% -8.88% -10.36%
steam-dxvk/octopath_traveler 13071 +1.58% . +1.13% . . . -9.16% -6.02% -14.19%
steam-dxvk/strange_brigade 2971 +1.52% . +0.43% . . . -5.22% -2.70% -6.60%
steam-dxvk/total_war_warhammer3 6336 +0.41% . -8.51% . . . -6.36% -3.52% -4.96%
steam-dxvk/witcher_3_dxvk_g2 674 +0.78% . +17.72% . . . -6.49% -4.69% -5.65%
steam-native/doom_2016_g2 706 +1.01% . +1.73% . . . -5.49% -4.39% -5.68%
steam-native/dota2_g2 976 +1.62% . +1.47% . . . -12.14% -10.79% -13.34%
steam-native/red_dead_redemption2 4448 +1.17% . +2.45% . . . -7.16% -5.67% -7.59%
steam-native/rise_of_the_tomb_raider_g2 128 +1.59% . -0.78% . . . -4.26% -2.11% -1.78%
steam-native/shadow_of_the_tomb_raider 73847 +0.96% . +0.69% . . . -3.96% -3.51% -7.67%
steam-native/strange_brigade 1547 +1.67% . +0.54% . . . -4.12% -2.01% -1.32%
steam-native/talos_g2 760 +0.74% . +0.90% . . . -6.32% -5.85% -6.65%
steam-native/total_war_warhammer2 306 +1.05% . +0.07% . . . -6.66% -3.07% -4.14%
steam-native/wolfenstein_youngblood 498 +0.74% . -2.65% . . . -2.77% -3.03% -2.87%
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
All affected 531093 +0.87% +0.48% +1.83% -86.07% -84.80% -75.07% -7.20% -7.21% -9.87%
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 531559 +0.87% +0.48% +1.83% -86.07% -84.80% -75.07% -7.20% -7.21% -9.87%