WIP: mesa/st: Turn on OptimizeForAOS on non-scalar NIR backends.

Emma Anholt requested to merge anholt/mesa:optimize-for-aos into main

Based on !14200 (merged), alternative to !14247 (closed). I think we should land this instead.

    On i965 vec4 hardware (most of crocus), this lets the VS matrix multiplies
    run in parallel as independent DP4s, one per dest channel, rather than as
    a serialized chain of MADs with approximately the same instruction count.
    This should fix a perf regression from the crocus transition. The
    original commit reported: "Improves performance in Lightsmark by
    1.01131% +/- 0.162069% (n = 10) on a Haswell GT2 system."

    i915g:
    total instructions in shared programs: 396828 -> 396831 (<.01%)
    instructions in affected programs: 159 -> 162 (1.89%)
    
    r300:
    total instructions in shared programs: 1226783 -> 1228308 (0.12%)
    instructions in affected programs: 61920 -> 63445 (2.46%)
    total temps in shared programs: 195902 -> 195850 (-0.03%)
    temps in affected programs: 2393 -> 2341 (-2.17%)
    
    hsw:
    total instructions in shared programs: 8163635 -> 8154150 (-0.12%)
    instructions in affected programs: 174076 -> 164591 (-5.45%)
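
For readers unfamiliar with the DP4 versus MAD distinction in the commit message, here is a minimal sketch in plain C. It is not Mesa code: `vec4`, `dp4`, `mad4`, `mat4_mul_dp4`, and `mat4_mul_mad` are hypothetical names, and each helper call stands in for one vec4 hardware instruction. Both forms compute the same matrix-vector product in four instructions, but only the MAD form carries a dependency from each instruction to the next.

```c
/* Illustrative only: plain C standing in for vec4 GPU instructions.
 * All names here are hypothetical and do not come from Mesa. */

typedef struct { float x, y, z, w; } vec4;

/* Stand-in for one DP4: a 4-component dot product yielding one scalar. */
static float dp4(vec4 a, vec4 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* DP4 form: the matrix is supplied as four rows. Each destination
 * channel reads only the inputs, so no DP4 waits on another. */
static vec4 mat4_mul_dp4(const vec4 rows[4], vec4 v)
{
    vec4 r;
    r.x = dp4(rows[0], v);
    r.y = dp4(rows[1], v);
    r.z = dp4(rows[2], v);
    r.w = dp4(rows[3], v);
    return r;
}

/* Stand-in for one vec4 MAD: a * s + c, applied per channel. */
static vec4 mad4(vec4 a, float s, vec4 c)
{
    vec4 r = { a.x * s + c.x, a.y * s + c.y, a.z * s + c.z, a.w * s + c.w };
    return r;
}

/* MAD form: the matrix is supplied as four columns. Every step after
 * the first consumes the previous result, so the chain is serialized. */
static vec4 mat4_mul_mad(const vec4 cols[4], vec4 v)
{
    vec4 zero = { 0.0f, 0.0f, 0.0f, 0.0f };
    vec4 r = mad4(cols[0], v.x, zero); /* effectively a MUL */
    r = mad4(cols[1], v.y, r);
    r = mad4(cols[2], v.z, r);
    r = mad4(cols[3], v.w, r);
    return r;
}
```

The instruction count is the same in both functions, which matches the "approximately the same instruction count" remark above; the difference is only in how much of the work can overlap.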
