nir/schedule: use rb_tree to avoid linear searching. (!2881) · Merge requests · Mesa / mesa

Alejandro Piñeiro requested to merge apinheiro/mesa:review/nir-schedule-use-rb-trees into main Nov 26, 2019

It adds one rb_tree so when searching for a better node to schedule, we could avoid a linear searching in some cases, and on those cases that we rely on a linear search, improving that as the list is ordered. At some point I experimented with having two rb trees (ready/leader) but I didn't get a real improvement, and the code became somewhat more complex.

Curiously using shader-db to compare with the linear searching, the time got slightly worse on v3d:

Total CPU time (seconds): 8895.72 -> 8907.25 (0.13%)

That can be explained because the outcome are also slightly different. Again focusing on spill/fills stats, and comparing the linear searching with having the rb tree on v3d:

  total spills in shared programs: 1260 -> 1212 (-3.81%)
  spills in affected programs: 1136 -> 1088 (-4.23%)
  helped: 15
  HURT: 33
 
  total fills in shared programs: 2492 -> 2471 (-0.84%)
  fills in affected programs: 1892 -> 1871 (-1.11%)
  helped: 18
  HURT: 33

Having different outcome can be initially surprising. After a brief checking, this is mostly due the cases where we have nodes with the same maximum value of max_delay. For that case, the linear searching return the first head with that repeated value, while the code using a rb tree could chose a different one.

I was tempted to try to get the same outcome, or even investigate a little further, but seemed somewhat unnecessary as the outcome is even slightly better.

BTW, going back to the times, it is worth to note that even the linear searching got the CPU time to go down when using the nir scheduler. So comparing master and the nir scheduler with linear searching:

Total CPU time (seconds): 9471.56 -> 8895.72 (-6.08%)

And comparing master with the nir scheduler with rb tree:

Total CPU time (seconds): 9471.56 -> 8907.25 (-5.96%)

Edited May 05, 2021 by Jordan Justen

nir/schedule: use rb_tree to avoid linear searching.

Merge request reports