aco: improve compile speed

This MR contains some patches to speed up value_numbering, live_var_analysis and register_allocation, mostly through more efficient use of standard containers.

For register_allocation, there is also a small patch to make better decisions when live_range splits are necessary:

Totals from affected shaders:
SGPRS: 298112 -> 298112 (0.00 %)
VGPRS: 245584 -> 245596 (0.00 %)
Spilled SGPRs: 25088 -> 25088 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 25714088 -> 25695092 (-0.07 %) bytes
LDS: 42 -> 42 (0.00 %) blocks
Max Waves: 21136 -> 21133 (-0.01 %)

The patch changing the live_out variables from std::set to std::unordered_set has a slight effect because it changes the order in which live_in variables are handled in RA, but overall there is no real difference. All other patches don't affect the stats.
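To illustrate why the container switch can shift the stats slightly, here is a minimal standalone sketch. It is not ACO code: `TempId` and the helper functions are hypothetical stand-ins. It only shows that std::set iterates in ascending key order, while std::unordered_set holds the same elements in an unspecified order, so any pass that walks the set visits variables in a different sequence.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a temporary's id; not ACO's actual type.
using TempId = std::uint32_t;

// Visit order when the live set is a std::set: always ascending by id.
std::vector<TempId> visit_order_ordered(const std::vector<TempId>& live)
{
    std::set<TempId> s(live.begin(), live.end());
    return std::vector<TempId>(s.begin(), s.end());
}

// Visit order when the live set is a std::unordered_set: same elements,
// but the iteration order is unspecified and depends on the implementation.
std::vector<TempId> visit_order_unordered(const std::vector<TempId>& live)
{
    std::unordered_set<TempId> s(live.begin(), live.end());
    assert(s.size() <= live.size()); // duplicates are dropped, nothing is added
    return std::vector<TempId>(s.begin(), s.end());
}

// Both containers hold the same ids; only the iteration order may differ.
bool same_elements(std::vector<TempId> a, std::vector<TempId> b)
{
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    return a == b;
}
```

In this sketch, `same_elements(visit_order_ordered(v), visit_order_unordered(v))` holds for any input `v`, but the two vectors themselves need not be equal. An allocator that assigns registers greedily in visit order can therefore end up with slightly different assignments.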

The compile time for a pipelinedb collection went down from 241s to 227s (about 6 % faster) using 6 threads.

Edited by Daniel Schürmann