
NIR: Do better for flrp on platforms that lack flrp instruction

Ian Romanick requested to merge idr/mesa:review/lower-flrp into master

This is basically a resend of an earlier series that I sent to the mailing list.

Rather than trying to do everything with only local information in nir_opt_algebraic, this series adds a new optimization pass. This new pass looks at how various parameters of a nir_op_flrp are used in other nir_op_flrp instructions to make better choices.
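To illustrate why non-local information can matter, here is a minimal sketch in plain C, not code from this series. It assumes the usual definition flrp(a, b, t) = a*(1-t) + b*t; the function names are hypothetical and chosen only for this example. When several flrp instructions share the same t, the (1-t) term can be computed once and reused, whereas a lone flrp may be cheaper as a + t*(b-a). Which form the pass actually selects in a given case is not shown here.

```c
/* Reference semantics assumed for flrp: a*(1-t) + b*t. */
static float flrp_ref(float a, float b, float t)
{
   return a * (1.0f - t) + b * t;
}

/* One possible lowering for an isolated flrp: sub, mul, add.
 * Hypothetical helper, named only for this sketch. */
static float flrp_sub_mul_add(float a, float b, float t)
{
   return a + t * (b - a);
}

/* Another possible lowering: the caller supplies (1-t), so several
 * flrps with the same t can share a single subtraction.
 * Hypothetical helper, named only for this sketch. */
static float flrp_with_shared_weight(float a, float b, float t,
                                     float one_minus_t)
{
   return a * one_minus_t + b * t;
}
```

With the second form, N flrps on the same t cost one subtract plus 2N multiply/add operations, which is the kind of saving that only shows up when other flrp instructions are taken into account.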

The results across the whole series are shown below. At least for Intel GPUs, this series is a significant improvement. There are a couple of extra loops and a few lost SIMD16 shaders on Iron Lake, but the trade-off is an extra 52 (+47 net) SIMD16 shaders on both Iron Lake and GM45. NOTE: On GM45 each shader is either SIMD8 or SIMD16, so the lost and gained counts reported below will always be the same. I had to look at the actual output from report.py to get the real count.

It also shouldn't break the whole universe for non-Intel GPUs. 😄 Neither of the tested Intel GPUs has an FMA instruction, so I don't know how this series will affect GPUs that have FMA but lack LRP. I did some testing on Ice Lake, which lacks LRP but has FMA, and the results were mixed but generally positive. There are some changes to FMA on Ice Lake, and the driver does not currently take those into account. As a result, the Ice Lake results are unlikely to be representative of the final result.

I have another series waiting to go out that improves LRP and FMA generation for all of the Intel platforms that support LRP and FMA. That series caused a bunch of regressions on the non-LRP Intel platforms, so this series needs to land first.

Iron Lake
total instructions in shared programs: 8200374 -> 8099710 (-1.23%)
instructions in affected programs: 4586432 -> 4485768 (-2.19%)
helped: 20118
HURT: 1064
helped stats (abs) min: 1 max: 155 x̄: 5.07 x̃: 3
helped stats (rel) min: 0.11% max: 86.96% x̄: 2.83% x̃: 1.89%
HURT stats (abs)   min: 1 max: 8 x̄: 1.25 x̃: 1
HURT stats (rel)   min: 0.08% max: 6.52% x̄: 1.17% x̃: 0.97%
95% mean confidence interval for instructions value: -4.83 -4.67
95% mean confidence interval for instructions %-change: -2.67% -2.58%
Instructions are helped.

total cycles in shared programs: 187531554 -> 187056200 (-0.25%)
cycles in affected programs: 104021478 -> 103546124 (-0.46%)
helped: 19884
HURT: 1517
helped stats (abs) min: 2 max: 930 x̄: 24.60 x̃: 12
helped stats (rel) min: <.01% max: 94.58% x̄: 1.22% x̃: 0.56%
HURT stats (abs)   min: 2 max: 208 x̄: 9.14 x̃: 6
HURT stats (rel)   min: <.01% max: 4.11% x̄: 0.34% x̃: 0.15%
95% mean confidence interval for cycles value: -22.67 -21.76
95% mean confidence interval for cycles %-change: -1.15% -1.08%
Cycles are helped.

total loops in shared programs: 856 -> 860 (0.47%)
loops in affected programs: 0 -> 4
helped: 0
HURT: 4
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.00% max: 0.00% x̄: 0.00% x̃: 0.00%
95% mean confidence interval for loops value: 1.00 1.00
95% mean confidence interval for loops %-change: 0.00% 0.00%
Loops are HURT.

LOST:   5
GAINED: 52


GM45
total instructions in shared programs: 5004357 -> 4951782 (-1.05%)
instructions in affected programs: 2440172 -> 2387597 (-2.15%)
helped: 10341
HURT: 537
helped stats (abs) min: 1 max: 155 x̄: 5.15 x̃: 3
helped stats (rel) min: 0.11% max: 86.96% x̄: 2.78% x̃: 1.85%
HURT stats (abs)   min: 1 max: 8 x̄: 1.25 x̃: 1
HURT stats (rel)   min: 0.08% max: 6.25% x̄: 1.15% x̃: 0.96%
95% mean confidence interval for instructions value: -4.94 -4.72
95% mean confidence interval for instructions %-change: -2.65% -2.52%
Instructions are helped.

total cycles in shared programs: 127487388 -> 127192972 (-0.23%)
cycles in affected programs: 63823238 -> 63528822 (-0.46%)
helped: 10260
HURT: 825
helped stats (abs) min: 2 max: 930 x̄: 29.50 x̃: 12
helped stats (rel) min: <.01% max: 93.33% x̄: 1.30% x̃: 0.59%
HURT stats (abs)   min: 2 max: 206 x̄: 10.03 x̃: 6
HURT stats (rel)   min: <.01% max: 4.11% x̄: 0.34% x̃: 0.14%
95% mean confidence interval for cycles value: -27.29 -25.83
95% mean confidence interval for cycles %-change: -1.22% -1.13%
Cycles are helped.

total loops in shared programs: 635 -> 637 (0.31%)
loops in affected programs: 0 -> 2
helped: 0
HURT: 2

total fills in shared programs: 93 -> 94 (1.08%)
fills in affected programs: 81 -> 82 (1.23%)
helped: 0
HURT: 1

LOST:   57
GAINED: 57
Edited by Ian Romanick
