broadcom/compiler: misc improvements

Iago Toral requested to merge itoral/mesa:v3d_more_misc_compiler_opt into main

The main change in this series is that we update the NIR scheduler to add latency to UBO, SSBO, shared and scratch loads, making it more consistent (before, the scheduler would only treat texture reads as having a large latency). On V3D (currently the only user of the NIR scheduler), however, this causes register pressure to blow up, so the later patches fix this by making latency configuration driver-specific and adjusting the parameters to make more sense on our platform, based on shader-db results and empirical testing.
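As a rough illustration of what "driver-specific latency configuration" can look like, here is a minimal standalone sketch. It does not use Mesa's real types: every name prefixed with `sketch_` is hypothetical, and the latency numbers are placeholders rather than the values chosen in this series. The idea is only that the scheduler asks an optional driver callback first and falls back to a generic default.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction classes; these are not Mesa's NIR types. */
enum sketch_instr_kind {
   SKETCH_ALU,
   SKETCH_TEX,
   SKETCH_LOAD_UBO,
   SKETCH_LOAD_SSBO,
   SKETCH_LOAD_SHARED,
   SKETCH_LOAD_SCRATCH,
};

/* Driver callback: returns a latency for the instruction, or 0 to
 * request the scheduler's generic default. */
typedef unsigned (*sketch_delay_cb)(enum sketch_instr_kind kind, void *data);

struct sketch_schedule_options {
   sketch_delay_cb instr_delay_cb; /* may be NULL */
   void *instr_delay_cb_data;
};

/* Generic default: all memory-like loads get one large latency.
 * The value 100 is a placeholder, not taken from the series. */
static unsigned
sketch_default_delay(enum sketch_instr_kind kind)
{
   switch (kind) {
   case SKETCH_TEX:
   case SKETCH_LOAD_UBO:
   case SKETCH_LOAD_SSBO:
   case SKETCH_LOAD_SHARED:
   case SKETCH_LOAD_SCRATCH:
      return 100;
   default:
      return 1;
   }
}

/* Latency lookup: the driver override wins when it returns non-zero. */
static unsigned
sketch_instr_delay(const struct sketch_schedule_options *opts,
                   enum sketch_instr_kind kind)
{
   assert(opts != NULL);
   if (opts->instr_delay_cb) {
      unsigned delay = opts->instr_delay_cb(kind, opts->instr_delay_cb_data);
      if (delay > 0)
         return delay;
   }
   return sketch_default_delay(kind);
}

/* Example driver override with much smaller (placeholder) latencies,
 * the kind of tuning that keeps register pressure in check. */
static unsigned
sketch_driver_delay(enum sketch_instr_kind kind, void *data)
{
   (void)data;
   switch (kind) {
   case SKETCH_TEX:
      return 5;
   case SKETCH_LOAD_UBO:
   case SKETCH_LOAD_SSBO:
      return 3;
   default:
      return 0; /* use the generic default */
   }
}
```

With no callback set, `sketch_instr_delay()` returns the large generic latency for a UBO load; with `sketch_driver_delay` installed it returns the smaller driver value, while instruction classes the driver does not handle still get the default.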

Relevant shader-db changes are included for each patch in the series, but for the whole series we get:

total instructions in shared programs: 12663186 -> 12628339 (-0.28%)
instructions in affected programs: 6649833 -> 6614986 (-0.52%)
helped: 20498
HURT: 12451

total threads in shared programs: 415870 -> 416322 (0.11%)
threads in affected programs: 896 -> 1348 (50.45%)
helped: 300
HURT: 74

total uniforms in shared programs: 3711629 -> 3704863 (-0.18%)
uniforms in affected programs: 359478 -> 352712 (-1.88%)
helped: 2662
HURT: 1717

total max-temps in shared programs: 2138857 -> 2152684 (0.65%)
max-temps in affected programs: 288951 -> 302778 (4.79%)
helped: 3693
HURT: 3797

total spills in shared programs: 3860 -> 3274 (-15.18%)
spills in affected programs: 2708 -> 2122 (-21.64%)
helped: 77
HURT: 25

total fills in shared programs: 5573 -> 4657 (-16.44%)
fills in affected programs: 3987 -> 3071 (-22.97%)
helped: 83
HURT: 29

total sfu-stalls in shared programs: 39583 -> 34403 (-13.09%)
sfu-stalls in affected programs: 15037 -> 9857 (-34.45%)
helped: 3816
HURT: 1354

total inst-and-stalls in shared programs: 12702769 -> 12662742 (-0.32%)
inst-and-stalls in affected programs: 6725546 -> 6685519 (-0.60%)
helped: 20651
HURT: 12409

total nops in shared programs: 324894 -> 320636 (-1.31%)
nops in affected programs: 56542 -> 52284 (-7.53%)
helped: 4718
HURT: 3122
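For readers unfamiliar with the report format, the percentages above are consistent with a plain relative change, (after - before) / before. The helper below is a small sketch for checking them; `pct_change` is a hypothetical name and not part of shader-db.

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent.
 * Negative means the metric went down. */
static double
pct_change(long before, long after)
{
   assert(before != 0);
   return 100.0 * (double)(after - before) / (double)before;
}
```

For example, `pct_change(12663186, 12628339)` gives roughly -0.28, which matches the total instructions line, and `pct_change(896, 1348)` gives roughly 50.45, which matches the threads in affected programs line.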

This also seems to slightly improve fps in some UE4 demos, by between 3% and 5%.
