freedreno/ir3: scheduler improvements
For instructions that increase the # of live values, apply a threshold to avoid scheduling them too early.
I noticed that we'd tend to schedule silly things like out_color[n].w = 1.0f
really early: because the immed load has no dependencies, it would be picked up to fill a delay slot when nothing else was available. It turns out there is other, similar silliness that is a bit harder to spot in a large shader.
The first attempt was just to blindly apply a depth threshold. A lower depth means an instruction isn't needed until closer to the end of the shader, so filtering out shallow instructions relative to the deepest candidate reduces the tendency to schedule too early. Overall for affected programs it was +37.11% instructions and -11.85% registers. Despite the instruction count increase, the register decrease seemed to bump manhattan by about 10%.
Limiting the threshold to just instructions that increase the number of live values brought instruction count back in line without giving up much of the register usage gain (-10.73%), and actually slightly improved instruction count (-4.17%). End result was ~25-30% fps gain. (So half-precision should be a huge win.)
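The filtering described above can be sketched roughly as follows. This is only an illustration, not the actual ir3 scheduler code: struct cand, its fields, DEPTH_THRESHOLD (and its value of 4), and the function names are all hypothetical, and the real scheduler tracks more state than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified candidate; not the real ir3 instruction type. */
struct cand {
    const char *name;
    unsigned depth;   /* lower = not needed until closer to the end of the shader */
    int live_delta;   /* net change in the number of live values if scheduled now */
    bool ready;       /* true if it could issue now */
};

/* Assumed value, for illustration only. */
#define DEPTH_THRESHOLD 4u

/* Depth of the deepest candidate, whether or not it is ready yet. */
static unsigned deepest_depth(const struct cand *c, size_t n)
{
    unsigned d = 0;
    for (size_t i = 0; i < n; i++)
        if (c[i].depth > d)
            d = c[i].depth;
    return d;
}

/* Only candidates that increase the number of live values are filtered,
 * and only when they are much shallower than the deepest candidate. */
static bool too_early(const struct cand *c, unsigned deepest)
{
    if (c->live_delta <= 0)
        return false;
    return c->depth + DEPTH_THRESHOLD < deepest;
}

/* Pick the deepest ready candidate that passes the filter.  Returning
 * NULL means "schedule nothing now" rather than fill the slot with an
 * instruction that would only extend a live range. */
static const struct cand *pick(const struct cand *c, size_t n)
{
    assert(c != NULL || n == 0);

    unsigned deepest = deepest_depth(c, n);
    const struct cand *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (!c[i].ready || too_early(&c[i], deepest))
            continue;
        if (!best || c[i].depth > best->depth)
            best = &c[i];
    }
    return best;
}
```

In this sketch an immed load with depth 1 and live_delta +1 is skipped while a depth 10 instruction is still pending, but a shallow instruction that does not increase the number of live values remains eligible, which is the difference between the first attempt and the final approach.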
There might be some room to tweak things a bit further, and we should still look at some other sets of shaders.
Maybe in the end we replace the current scheduler, but at this point I'm thinking that taming the current scheduler, and then adding a second post-RA sched pass to try to extract a bit more parallelism, is a good path forward. Once we have that working, we can try replacing the existing pre-RA sched pass.
Full shader-db summary below:
total instructions in shared programs: 27869 -> 26792 (-3.86%)
instructions in affected programs: 25811 -> 24734 (-4.17%)
helped: 150
HURT: 24
helped stats (abs) min: 1 max: 20 x̄: 8.77 x̃: 6
helped stats (rel) min: 1.56% max: 11.76% x̄: 5.88% x̃: 5.29%
HURT stats (abs) min: 1 max: 19 x̄: 9.96 x̃: 10
HURT stats (rel) min: 0.95% max: 13.67% x̄: 7.52% x̃: 6.47%
95% mean confidence interval for instructions value: -7.55 -4.83
95% mean confidence interval for instructions %-change: -4.84% -3.22%
Instructions are helped.
total dwords in shared programs: 47968 -> 47808 (-0.33%)
dwords in affected programs: 480 -> 320 (-33.33%)
helped: 5
HURT: 0
helped stats (abs) min: 32 max: 32 x̄: 32.00 x̃: 32
helped stats (rel) min: 33.33% max: 33.33% x̄: 33.33% x̃: 33.33%
95% mean confidence interval for dwords value: -32.00 -32.00
95% mean confidence interval for dwords %-change: -33.33% -33.33%
Dwords are helped.
total half in shared programs: 0 -> 0
half in affected programs: 0 -> 0
helped: 0
HURT: 0
total full in shared programs: 1903 -> 1750 (-8.04%)
full in affected programs: 1426 -> 1273 (-10.73%)
helped: 112
HURT: 24
helped stats (abs) min: 1 max: 4 x̄: 1.58 x̃: 1
helped stats (rel) min: 7.14% max: 28.57% x̄: 15.63% x̃: 11.11%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 7.14% max: 25.00% x̄: 11.17% x̃: 8.01%
95% mean confidence interval for full value: -1.35 -0.90
95% mean confidence interval for full %-change: -13.11% -8.69%
Full are helped.
total const in shared programs: 6126 -> 6126 (0.00%)
const in affected programs: 0 -> 0
helped: 0
HURT: 0
total constlen in shared programs: 6126 -> 6126 (0.00%)
constlen in affected programs: 0 -> 0
helped: 0
HURT: 0
total (ss) in shared programs: 606 -> 497 (-17.99%)
(ss) in affected programs: 472 -> 363 (-23.09%)
helped: 59
HURT: 4
helped stats (abs) min: 1 max: 4 x̄: 1.92 x̃: 2
helped stats (rel) min: 14.29% max: 33.33% x̄: 24.53% x̃: 30.77%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 16.67% max: 20.00% x̄: 18.33% x̃: 18.33%
95% mean confidence interval for (ss) value: -2.04 -1.42
95% mean confidence interval for (ss) %-change: -25.15% -18.47%
(ss) are helped.
total (sy) in shared programs: 335 -> 447 (33.43%)
(sy) in affected programs: 245 -> 357 (45.71%)
helped: 4
HURT: 66
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 20.00% max: 25.00% x̄: 22.50% x̃: 22.50%
HURT stats (abs) min: 1 max: 2 x̄: 1.76 x̃: 2
HURT stats (rel) min: 25.00% max: 100.00% x̄: 54.42% x̃: 50.00%
95% mean confidence interval for (sy) value: 1.42 1.78
95% mean confidence interval for (sy) %-change: 43.61% 56.43%
(sy) are HURT.
total max_sun in shared programs: 6329 -> 6329 (0.00%)
max_sun in affected programs: 0 -> 0
helped: 0
HURT: 0
total loops in shared programs: 1 -> 1 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
LOST: 0
GAINED: 0
Total CPU time (seconds): 22.92 -> 22.88 (-0.17%)