
aco: adjust num_waves for LDS before scheduling

The idea is that certain num_waves targets aren't beneficial, as the same number of workgroups could be executed with fewer waves in flight.

The fossil stats differences come, for example, from shaders with a workgroup size of 576, which requires 9 wave64 waves per workgroup. In this example, it was possible to launch 3 workgroups per WGP (27 waves in total) by using num_waves == 8 across 4 SIMD units (32 wave slots). But the 3 workgroups can already be launched with only 7 waves per SIMD (28 wave slots). So, reduce num_waves in order to use the additional registers for better scheduling.
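
The arithmetic in the example can be sketched as follows. This is only an illustration of the reasoning, not the code in this merge request: the function names `waves_per_workgroup` and `adjust_num_waves` and the `simd_per_wgp` parameter are hypothetical, and the sketch ignores LDS size and any other occupancy limits.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounding up."""
    return -(-a // b)


def waves_per_workgroup(workgroup_size: int, wave_size: int = 64) -> int:
    """Number of waves needed to cover one workgroup."""
    return ceil_div(workgroup_size, wave_size)


def adjust_num_waves(num_waves: int, wg_waves: int, simd_per_wgp: int = 4) -> int:
    """Smallest per-SIMD wave count that still fits the same number of
    workgroups per WGP as `num_waves` does (hypothetical helper)."""
    total_slots = num_waves * simd_per_wgp
    workgroups = total_slots // wg_waves
    if workgroups == 0:
        # Not even one workgroup fits; nothing to gain by lowering the target.
        return num_waves
    needed_waves = workgroups * wg_waves
    return ceil_div(needed_waves, simd_per_wgp)


wg_waves = waves_per_workgroup(576)        # 9 waves for wave64
adjusted = adjust_num_waves(8, wg_waves)   # 3 workgroups need 27 waves -> 7 per SIMD
print(wg_waves, adjusted)
```

With a target of 8, the WGP has 32 wave slots but only 27 are usable by whole workgroups, so rounding 27 / 4 up gives 7 waves per SIMD with the same workgroup count.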

Edited by Daniel Schürmann
