aco: scheduling improvements

Daniel Schürmann requested to merge daniel-schuermann/mesa:aco_scheduler into master

As mentioned in my XDC talk, ACO doesn't really take RAR dependencies into account when scheduling. This series makes a few changes to the scheduler to improve the situation.

  • Restrict scheduling more depending on max_waves. This prevents the scheduler from losing too much parallelism.
  • Better handling of RAR dependencies for VMEM instructions. This greatly improves the VMEM def-use distances, as VMEM instructions are executed in order.
  • Some minor changes.
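
To make "def-use distance" concrete, here is a toy sketch in Python. It is not ACO code: the `Instr` class, the register names and `vmem_def_use_distances` are all made up for illustration. It only counts how many instructions are issued between a VMEM load and the first use of its result, for two orderings of the same four instructions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical toy instruction, not an ACO data structure."""
    name: str
    defs: tuple = ()       # registers written
    uses: tuple = ()       # registers read
    is_vmem: bool = False  # True for vector memory loads


def vmem_def_use_distances(program):
    """Map each VMEM load to the number of instructions issued
    between the load and the first instruction reading its result."""
    distances = {}
    for i, ins in enumerate(program):
        if not ins.is_vmem:
            continue
        for j in range(i + 1, len(program)):
            if any(d in program[j].uses for d in ins.defs):
                distances[ins.name] = j - i - 1
                break
    return distances


load_a = Instr("load_a", defs=("v0",), uses=("s0",), is_vmem=True)
load_b = Instr("load_b", defs=("v1",), uses=("s1",), is_vmem=True)
add_a = Instr("add_a", defs=("v2",), uses=("v0",))
add_b = Instr("add_b", defs=("v3",), uses=("v1",))

# Each load is immediately followed by its use: nothing hides the latency.
interleaved = [load_a, add_a, load_b, add_b]
# Both loads are issued first: each result is needed one instruction later.
grouped = [load_a, load_b, add_a, add_b]

print(vmem_def_use_distances(interleaved))  # {'load_a': 0, 'load_b': 0}
print(vmem_def_use_distances(grouped))      # {'load_a': 1, 'load_b': 1}
```

In the grouped order both loads are in flight before the first wait, which is the kind of improvement the second bullet is after. The real scheduler works on ACO's IR and also has to respect register pressure, which this sketch ignores.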

The small increase in code size is mainly due to an increased number of waitcnt() instructions. As with almost all changes, the results are somewhat mixed, and some games might see a slight loss in performance, but overall I think this series is beneficial. It also makes scheduling slightly faster than before.

Total shader stats changes:

57559 shaders in 28980 tests. Totals:

  • SGPRS: 2895271 -> 2969727 (2.57 %)
  • VGPRS: 1981304 -> 1964604 (-0.84 %)
  • Spilled SGPRs: 868 -> 868 (0.00 %)
  • Spilled VGPRs: 0 -> 0 (0.00 %)
  • Private memory VGPRs: 0 -> 0 (0.00 %)
  • Scratch size: 10348 -> 10348 (0.00 %) dwords per thread
  • Code Size: 114455544 -> 114584072 (0.11 %) bytes
  • LDS: 933 -> 933 (0.00 %) blocks
  • Max Waves: 378759 -> 382668 (1.03 %)
  • Wait states: 0 -> 0 (0.00 %)
