mesa: Lower mediump temps and CS shared when the driver supports FP16+INT16.

Emma Anholt requested to merge anholt/mesa:mediump-vars-gl into main

Typically, GLSL mediump lowering will already have lowered all the ALU ops generating the values to 16-bit, and once vars_to_ssa runs, the mediump temps disappear. However, if they don't disappear (for example, the var is indirectly indexed and eventually gets lowered to scratch or goes through indirect lowering), then you don't want the storage upconverted to 32-bit.

Also, if a CS shared var is declared mediump, then storing it as 16-bit avoids conversions around the loads and stores, assuming the ALU ops that use those values are 16-bit. For gfxbench aztec ruins, the CS shared var sizes are cut in half, improving overall perf by 0.805549% +/- 0.0953482% (n=6) on gl-5-normal.

freedreno shader-db:

total instructions in shared programs: 2917577 -> 2917743 (<.01%)
instructions in affected programs: 46141 -> 46307 (0.36%)
total last-baryf in shared programs: 109712 -> 109492 (-0.20%)
last-baryf in affected programs: 638 -> 418 (-34.48%)
total full in shared programs: 190275 -> 190218 (-0.03%)
full in affected programs: 156 -> 99 (-36.54%)
total constlen in shared programs: 492596 -> 492600 (<.01%)
constlen in affected programs: 8 -> 12 (50.00%)

total cat6 in shared programs: 33019 -> 33107 (0.27%)
cat6 in affected programs: 3604 -> 3692 (2.44%)
total stp in shared programs: 3626 -> 3670 (1.21%)
stp in affected programs: 3336 -> 3380 (1.32%)
total ldp in shared programs: 1718 -> 1762 (2.56%)
ldp in affected programs: 1680 -> 1724 (2.62%)
(this is all in aztec ruins)

total sstall in shared programs: 195656 -> 195182 (-0.24%)
sstall in affected programs: 3249 -> 2775 (-14.59%)
total (ss) in shared programs: 52823 -> 52966 (0.27%)
(ss) in affected programs: 1733 -> 1876 (8.25%)
total systall in shared programs: 507928 -> 508687 (0.15%)
systall in affected programs: 103010 -> 103769 (0.74%)
total (sy) in shared programs: 23185 -> 23196 (0.05%)
(sy) in affected programs: 1276 -> 1287 (0.86%)
total waves in shared programs: 435290 -> 435302 (<.01%)
waves in affected programs: 12 -> 24 (100.00%)
total loops in shared programs: 407 -> 405 (-0.49%)
loops in affected programs: 9 -> 7 (-22.22%)

Reviewed-by: Marek Olšák marek.olsak@amd.com
Reviewed-by: Matt Turner mattst88@gmail.com

This pulls in !18449 (merged) to avoid regressions
