radv, nir: Lower local invocation ID Y, Z components based on subgroup ID.

Based on !24005 (merged)

Add a new option to nir_lower_subgroups that lowers the Y and Z components of local_invocation_id so that they are computed from the subgroup ID where possible. Furthermore, add a few new bitfield-extract patterns to nir_opt_algebraic that let it optimize the generated code.
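To illustrate the idea, here is a minimal standalone sketch in plain C (not the NIR code from this MR). It assumes that invocations are assigned to subgroups linearly (local index = subgroup ID * subgroup size + subgroup invocation ID) and that the workgroup X size is a multiple of the subgroup size; whether these are exactly the conditions the pass checks is an assumption. All names below (`uvec3`, `local_id_from_index`, `local_id_yz_from_subgroup`, `ubfe_sketch`, `count_mismatches`) are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t x, y, z; } uvec3;

/* Reference: derive the local invocation ID from the per-invocation
 * linear index. Every component is computed per invocation. */
static uvec3 local_id_from_index(uint32_t index, uint32_t size_x, uint32_t size_y)
{
   uvec3 id = { index % size_x, (index / size_x) % size_y,
                index / (size_x * size_y) };
   return id;
}

/* Sketch of the lowered form: Y and Z depend only on the subgroup ID,
 * so they are uniform across the subgroup.
 * Assumption: size_x is a multiple of subgroup_size. */
static void local_id_yz_from_subgroup(uint32_t subgroup_id, uint32_t subgroup_size,
                                      uint32_t size_x, uint32_t size_y,
                                      uint32_t *y, uint32_t *z)
{
   assert(size_x % subgroup_size == 0);
   uint32_t subgroups_per_row = size_x / subgroup_size;
   *y = (subgroup_id / subgroups_per_row) % size_y;
   *z = subgroup_id / (subgroups_per_row * size_y);
}

/* Unsigned bitfield extract, valid for bits < 32. With power-of-two
 * sizes, the divide-and-modulo above collapses to one such extract. */
static uint32_t ubfe_sketch(uint32_t value, uint32_t offset, uint32_t bits)
{
   return (value >> offset) & ((1u << bits) - 1u);
}

/* Compare the lowered Y/Z against the reference for every invocation
 * in the workgroup and return the number of mismatches. */
static unsigned count_mismatches(uint32_t subgroup_size, uint32_t size_x,
                                 uint32_t size_y, uint32_t size_z)
{
   unsigned mismatches = 0;
   uint32_t total = size_x * size_y * size_z;
   for (uint32_t index = 0; index < total; index++) {
      uvec3 ref = local_id_from_index(index, size_x, size_y);
      uint32_t y, z;
      local_id_yz_from_subgroup(index / subgroup_size, subgroup_size,
                                size_x, size_y, &y, &z);
      if (y != ref.y || z != ref.z)
         mismatches++;
   }
   return mismatches;
}
```

Under these assumptions, a 64x4x2 workgroup with a subgroup size of 32 has two subgroups per row, so Y is bits 1..2 of the subgroup ID and Z is the bits above them. Because both values are the same for every invocation in a subgroup, a backend can keep them in scalar registers, which is consistent with the shift from vector to scalar loads reported below.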

The main benefits of this MR are:

  • Fewer vector loads, and more scalar loads, are emitted
  • Slightly reduced VGPR use
  • Slightly fewer instructions overall

Fossil DB stats on GFX10.3 compared to current main:

Totals from 64408 (48.26% of 133461) affected shaders:
MaxWaves: 1796969 -> 1797237 (+0.01%); split: +0.02%, -0.00%
Instrs: 33585782 -> 33576655 (-0.03%); split: -0.04%, +0.01%
CodeSize: 176767328 -> 176777028 (+0.01%); split: -0.01%, +0.02%
VGPRs: 2330120 -> 2329400 (-0.03%); split: -0.03%, +0.00%
SpillSGPRs: 2786 -> 2838 (+1.87%)
SpillVGPRs: 831 -> 821 (-1.20%); split: -1.68%, +0.48%
Latency: 209316001 -> 208731408 (-0.28%); split: -0.28%, +0.00%
InvThroughput: 35531731 -> 35429071 (-0.29%); split: -0.30%, +0.01%
VClause: 635890 -> 634842 (-0.16%); split: -0.18%, +0.01%
SClause: 792124 -> 793050 (+0.12%); split: -0.01%, +0.12%
Copies: 2800354 -> 2801235 (+0.03%); split: -0.03%, +0.06%
Branches: 1032576 -> 1032708 (+0.01%); split: -0.01%, +0.02%
PreSGPRs: 2838454 -> 2838533 (+0.00%); split: -0.01%, +0.01%
PreVGPRs: 1897585 -> 1896913 (-0.04%); split: -0.04%, +0.00%
Edited by Timur Kristóf
