
broadcom/compiler: emit instructions producing flags earlier

Iago Toral requested to merge itoral/mesa:v3d_schedule_flags into main

We usually emit flags right before consuming them, but this is suboptimal for register pressure: if an instruction is only used to generate flags, waiting to emit it until right before the flags are read extends the live ranges of its sources for no gain. This pass looks for such instructions and tries to move them as early as possible.
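
The idea can be illustrated with a minimal sketch on a toy IR. This is not the code in this merge request: `struct toy_inst`, `hoist_flag_producers` and `max_live` are hypothetical names, and the model assumes a straight-line block, SSA-like registers and a single flags resource. A flags-only instruction is hoisted until it reaches the definition of one of its sources or another instruction that reads or writes flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NO_REG   (-1)
#define MAX_REGS 16

/* Hypothetical toy instruction; not the real V3D compiler IR. */
struct toy_inst {
    int dst;            /* register written, or NO_REG */
    int src[2];         /* registers read, or NO_REG */
    bool writes_flags;
    bool reads_flags;
};

/* An instruction that exists only to produce flags: no register result. */
static bool is_flags_only(const struct toy_inst *inst)
{
    return inst->writes_flags && !inst->reads_flags && inst->dst == NO_REG;
}

/* True if `mover` must stay below `prev`. */
static bool blocks_hoist(const struct toy_inst *prev,
                         const struct toy_inst *mover)
{
    /* Assumption: flags are one shared resource, so never cross
     * another flags reader or writer. */
    if (prev->writes_flags || prev->reads_flags)
        return true;

    /* Never cross the definition of one of our sources. */
    for (int s = 0; s < 2; s++) {
        if (mover->src[s] != NO_REG && prev->dst == mover->src[s])
            return true;
    }
    return false;
}

/* Move every flags-only instruction as early as allowed.
 * Returns the number of instructions that were moved. */
static int hoist_flag_producers(struct toy_inst *insts, int count)
{
    int moved = 0;
    for (int i = 1; i < count; i++) {
        if (!is_flags_only(&insts[i]))
            continue;

        int pos = i;
        while (pos > 0 && !blocks_hoist(&insts[pos - 1], &insts[i]))
            pos--;
        if (pos == i)
            continue;

        /* Shift insts[pos..i-1] down by one slot and reinsert at pos. */
        struct toy_inst tmp = insts[i];
        memmove(&insts[pos + 1], &insts[pos],
                (size_t)(i - pos) * sizeof(*insts));
        insts[pos] = tmp;
        moved++;
    }
    return moved;
}

/* Maximum number of registers live after any instruction, assuming
 * each register is defined once (SSA-like). */
static int max_live(const struct toy_inst *insts, int count)
{
    int def[MAX_REGS], last_use[MAX_REGS];
    for (int r = 0; r < MAX_REGS; r++)
        def[r] = last_use[r] = -1;

    for (int i = 0; i < count; i++) {
        for (int s = 0; s < 2; s++) {
            int reg = insts[i].src[s];
            if (reg != NO_REG) {
                assert(reg >= 0 && reg < MAX_REGS);
                last_use[reg] = i;
            }
        }
        int dst = insts[i].dst;
        if (dst != NO_REG) {
            assert(dst >= 0 && dst < MAX_REGS);
            if (def[dst] < 0)
                def[dst] = i;
        }
    }

    int max = 0;
    for (int p = 0; p < count; p++) {
        int live = 0;
        for (int r = 0; r < MAX_REGS; r++) {
            if (def[r] >= 0 && def[r] <= p && last_use[r] > p)
                live++;
        }
        if (live > max)
            max = live;
    }
    return max;
}
```

In a seven-instruction block where two loads feed a compare that is emitted just before the instruction reading the flags, with three unrelated ALU instructions in between, hoisting the compare to right after the second load lowers `max_live` in this toy model from 3 to 2, because the two loaded values die immediately instead of staying live across the unrelated instructions.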

Shader-db results below show this is effective at reducing register pressure, allowing a few shaders to increase thread counts and/or reduce spilling:

    total instructions in shared programs: 11057173 -> 11057076 (<.01%)
    instructions in affected programs: 1955543 -> 1955446 (<.01%)
    helped: 4214
    HURT: 3905
    Inconclusive result (value mean confidence interval includes 0).
    
    total threads in shared programs: 425096 -> 425170 (0.02%)
    threads in affected programs: 74 -> 148 (100.00%)
    helped: 37
    HURT: 0
    Threads are helped.
    
    total uniforms in shared programs: 3846275 -> 3845674 (-0.02%)
    uniforms in affected programs: 23574 -> 22973 (-2.55%)
    helped: 217
    HURT: 30
    Uniforms are helped.
    
    total max-temps in shared programs: 2222910 -> 2220488 (-0.11%)
    max-temps in affected programs: 61904 -> 59482 (-3.91%)
    helped: 2145
    HURT: 113
    Max-temps are helped.
    
    total spills in shared programs: 4294 -> 4280 (-0.33%)
    spills in affected programs: 148 -> 134 (-9.46%)
    helped: 8
    HURT: 0
    
    total fills in shared programs: 6497 -> 6468 (-0.45%)
    fills in affected programs: 291 -> 262 (-9.97%)
    helped: 8
    HURT: 0
    
    total sfu-stalls in shared programs: 14344 -> 14611 (1.86%)
    sfu-stalls in affected programs: 1308 -> 1575 (20.41%)
    helped: 217
    HURT: 335
    Inconclusive result (%-change mean confidence interval includes 0).
    
    total inst-and-stalls in shared programs: 11071517 -> 11071687 (<.01%)
    inst-and-stalls in affected programs: 1946767 -> 1946937 (<.01%)
    helped: 4191
    HURT: 3909
    Inconclusive result (value mean confidence interval includes 0).
    
    total nops in shared programs: 270628 -> 269829 (-0.30%)
    nops in affected programs: 22032 -> 21233 (-3.63%)
    helped: 1213
    HURT: 571
    Inconclusive result (%-change mean confidence interval includes 0).
Edited by Iago Toral
