WIP: mesa/st: Turn on OptimizeForAOS on non-scalar NIR backends. (!14277)

Emma Anholt requested to merge anholt/mesa:optimize-for-aos into main Dec 20, 2021

Based on !14200 (merged), alternative to !14247 (closed). I think we should land this instead.

    On i965 vec4 hardware (most of crocus), this lets the VS matrix multiplies
    happen in parallel as independent DP4s to each dest channel, rather than a
    serialized set of MADs with approximately the same instruction count.
    This should fix a perf regression from the crocus transition. The
    original commit reported: "Improves performance in Lightsmark by
    1.01131% +/- 0.162069% (n = 10) on a Haswell GT2 system."

    
    i915g:
    total instructions in shared programs: 396828 -> 396831 (<.01%)
    instructions in affected programs: 159 -> 162 (1.89%)
    
    r300:
    total instructions in shared programs: 1226783 -> 1228308 (0.12%)
    instructions in affected programs: 61920 -> 63445 (2.46%)
    total temps in shared programs: 195902 -> 195850 (-0.03%)
    temps in affected programs: 2393 -> 2341 (-2.17%)
    
    hsw:
    total instructions in shared programs: 8163635 -> 8154150 (-0.12%)
    instructions in affected programs: 174076 -> 164591 (-5.45%)
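
To illustrate the DP4-versus-MAD distinction the commit message describes, here is a minimal sketch in plain C of the two ways to evaluate a 4x4 matrix times vec4 product. This is not Mesa code: the `vec4` type and the `dp4`, `transpose4`, `mat_mul_dp4`, and `mat_mul_mad` helpers are hypothetical names made up for this example. In the DP4 form each destination channel depends only on the inputs, so the four dot products are independent; in the MAD form every step consumes the previous accumulator, so the chain is serialized.

```c
/* Illustrative only: hypothetical helpers, not taken from Mesa. */
typedef struct { float x, y, z, w; } vec4;

/* One DP4: a four-component dot product. */
static float dp4(vec4 a, vec4 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* Build the row vectors of a matrix stored as four column vectors. */
static void transpose4(const vec4 cols[4], vec4 rows[4])
{
    rows[0] = (vec4){ cols[0].x, cols[1].x, cols[2].x, cols[3].x };
    rows[1] = (vec4){ cols[0].y, cols[1].y, cols[2].y, cols[3].y };
    rows[2] = (vec4){ cols[0].z, cols[1].z, cols[2].z, cols[3].z };
    rows[3] = (vec4){ cols[0].w, cols[1].w, cols[2].w, cols[3].w };
}

/* DP4 form: each destination channel is its own dot product of a
 * matrix row with v. No channel depends on another channel's result. */
static vec4 mat_mul_dp4(const vec4 rows[4], vec4 v)
{
    vec4 r;
    r.x = dp4(rows[0], v);
    r.y = dp4(rows[1], v);
    r.z = dp4(rows[2], v);
    r.w = dp4(rows[3], v);
    return r;
}

/* One vec4 MAD: a * s + acc, applied per channel. */
static vec4 mad4(vec4 a, float s, vec4 acc)
{
    return (vec4){ a.x * s + acc.x, a.y * s + acc.y,
                   a.z * s + acc.z, a.w * s + acc.w };
}

/* MAD form: scale each column by a component of v and accumulate.
 * Every step reads the previous accumulator, so the chain is serial. */
static vec4 mat_mul_mad(const vec4 cols[4], vec4 v)
{
    vec4 r = { cols[0].x * v.x, cols[0].y * v.x,
               cols[0].z * v.x, cols[0].w * v.x };   /* MUL */
    r = mad4(cols[1], v.y, r);                       /* MAD */
    r = mad4(cols[2], v.z, r);                       /* MAD */
    r = mad4(cols[3], v.w, r);                       /* MAD */
    return r;
}
```

Both forms issue four vec4 operations, which fits the commit message's "approximately the same instruction count"; the difference is in the dependency structure, not the amount of work.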

Edited Dec 20, 2021 by Emma Anholt
