radv: Refactor TCS-TES linking to be independent of driver locations.
Based on !28487 (merged)
This MR achieves two things:
- Keep track of which TCS outputs really need LDS, which allows allocating less LDS space
- Always map fixed output locations (the same way as in the unlinked case) to TCS outputs (and TES inputs) in VRAM
As a result, TCS and TES no longer depend on assigned driver locations, which will unblock better link-time optimizations (such as nir_opt_varyings) for these stages in the future.
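To illustrate the two points above, here is a rough sketch and not the actual radv code. All names are hypothetical, and it assumes for illustration that each output slot is a 16-byte vec4 and that the set of outputs needing LDS is already known as a bitmask. It shows how sizing LDS from only that mask shrinks the allocation, and how a fixed slot depends only on the output's own location while a packed, linker-assigned slot depends on which other outputs are in use.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of set bits in a 64-bit mask. */
static unsigned popcount64(uint64_t v)
{
   unsigned n = 0;
   while (v) {
      v &= v - 1; /* clear the lowest set bit */
      n++;
   }
   return n;
}

/* Hypothetical: LDS bytes per patch when only the outputs in lds_mask
 * are kept in LDS. Assumes one 16-byte vec4 slot per output per vertex. */
static unsigned lds_bytes_per_patch(uint64_t lds_mask, unsigned out_vertices)
{
   assert(out_vertices > 0);
   return popcount64(lds_mask) * 16u * out_vertices;
}

/* Hypothetical packed slot: depends on which other outputs are used,
 * so both stages must agree on used_mask (i.e. they must be linked). */
static unsigned packed_slot(uint64_t used_mask, unsigned location)
{
   assert(location < 64);
   return popcount64(used_mask & ((UINT64_C(1) << location) - 1));
}

/* Hypothetical fixed slot: a pure function of the output's own location,
 * so TCS and TES can each compute it without knowing about the other. */
static unsigned fixed_slot(unsigned location)
{
   assert(location < 64);
   return location;
}
```

For example, with outputs at locations 0, 1, 2 and 5 written (mask 0x27), 3 output vertices, and only location 1 needing LDS (mask 0x02), the LDS size per patch drops from 192 to 48 bytes in this model. Location 5 lands in packed slot 3 but always in fixed slot 5.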
Fossil stats on Navi 21, thanks to Georg:
Totals from 2634 (3.32% of 79395) affected shaders:
MaxWaves: 56356 -> 56132 (-0.40%); split: +0.09%, -0.49%
Instrs: 1657940 -> 1661280 (+0.20%); split: -0.05%, +0.25%
CodeSize: 8592940 -> 8621512 (+0.33%); split: -0.01%, +0.34%
VGPRs: 127464 -> 127728 (+0.21%); split: -0.15%, +0.36%
LDS: 11250176 -> 8556544 (-23.94%)
Latency: 9380180 -> 9406915 (+0.29%); split: -0.21%, +0.50%
InvThroughput: 2169439 -> 2132590 (-1.70%); split: -2.09%, +0.39%
VClause: 34298 -> 33468 (-2.42%); split: -2.92%, +0.50%
SClause: 27319 -> 27292 (-0.10%); split: -0.28%, +0.18%
Copies: 80546 -> 80100 (-0.55%); split: -0.91%, +0.36%
PreSGPRs: 91340 -> 91354 (+0.02%); split: -0.00%, +0.02%
PreVGPRs: 109001 -> 109047 (+0.04%); split: -0.02%, +0.06%
VALU: 1081498 -> 1082668 (+0.11%); split: -0.13%, +0.24%
SALU: 177506 -> 180751 (+1.83%); split: -0.05%, +1.87%
Edited by Timur Kristóf