    vc4: Do instruction scheduling on the QIR to hide texture fetch latency. · f1fb85e5
    Emma Anholt authored
    This is a rewrite of vc4_opt_qpu_schedule.c to operate on QIR.  A texture
    fetch can probably take as many cycles as the rest of the program, so it's
    important to overlap our other instructions with it (which is hard to do
    after register allocation).  Also, we can queue up multiple texture
    requests before collecting the resulting samples, so that we keep the
    texture unit busy more of the time.
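
To illustrate the idea of hiding texture latency, here is a minimal Python sketch of latency-aware list scheduling on a toy program. It is not the vc4 QIR scheduler: the instruction names, the single-issue in-order machine model, and the 8-cycle texture latency are all assumptions made up for the example.

```python
# Toy model (all names and latencies are hypothetical, not taken from vc4):
# each instruction maps to (dependencies, latency in cycles until its result is usable).
TEX_LATENCY = 8  # assumed texture fetch latency for illustration only

insts = {
    "tex0_req": ((), TEX_LATENCY),
    "tex0_res": (("tex0_req",), 1),
    "tex1_req": ((), TEX_LATENCY),
    "tex1_res": (("tex1_req",), 1),
    "a0": ((), 1),
    "a1": (("a0",), 1),
    "a2": (("a1",), 1),
    "a3": (("a2",), 1),
    "mix": (("tex0_res", "tex1_res", "a3"), 1),
}
# Original order: each texture result is collected right after its request.
program = ["tex0_req", "tex0_res", "tex1_req", "tex1_res",
           "a0", "a1", "a2", "a3", "mix"]


def estimate_cycles(order, insts):
    """Cycle at which the last result is ready on an in-order, single-issue machine."""
    ready_at = {}
    cycle = 0
    for name in order:
        deps, latency = insts[name]
        # Stall until every operand is available.
        start = max([cycle] + [ready_at[d] for d in deps])
        ready_at[name] = start + latency
        cycle = start + 1
    return max(ready_at.values())


def schedule(program, insts):
    """Greedy list scheduling, prioritising the longest latency path to the end."""
    users = {n: [] for n in program}
    for n in program:
        for d in insts[n][0]:
            users[d].append(n)
    prio = {}
    for n in reversed(program):  # program order is already a topological order
        prio[n] = insts[n][1] + max((prio[u] for u in users[n]), default=0)

    order, ready_at, cycle = [], {}, 0
    remaining = list(program)
    while remaining:
        # Instructions whose operands have all been issued and are ready this cycle.
        ready = [n for n in remaining
                 if all(d in ready_at and ready_at[d] <= cycle for d in insts[n][0])]
        if ready:
            best = max(ready, key=lambda n: prio[n])
            order.append(best)
            ready_at[best] = cycle + insts[best][1]
            remaining.remove(best)
        cycle += 1  # advance even when nothing was ready (a stall)
    return order


if __name__ == "__main__":
    new_order = schedule(program, insts)
    print(new_order)
    print(estimate_cycles(program, insts), "->", estimate_cycles(new_order, insts))
```

In this toy model both texture requests are issued first, the independent ALU chain runs while they are in flight, and the estimate drops from 23 to 11 cycles. The sketch ignores register pressure, which the message above notes is harder to handle once more values are live at the same time.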
    
    High-settings openarena performance +2.35849% +/- 0.221154% (n=7).  Also
    about 2-3% on the multiarb demo.  8 piglit tests
    (ext_framebuffer_multisample accuracy depthstencil) go from failing in
    rendering to failing in register allocation, but hopefully I can fix that
    up with some better register pressure handling here.
    
    total instructions in shared programs: 87723 -> 88448 (0.83%)
    instructions in affected programs:     78411 -> 79136 (0.92%)
    total estimated cycles in shared programs: 276583 -> 246306 (-10.95%)
    estimated cycles in affected programs:     265691 -> 235414 (-11.40%)
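
The percentages in the statistics above are plain relative changes. A short sketch to check them (the `pct_change` helper is a hypothetical name, not part of any Mesa tooling):

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Figures copied from the statistics in the commit message.
stats = {
    "total instructions in shared programs": (87723, 88448),
    "instructions in affected programs": (78411, 79136),
    "total estimated cycles in shared programs": (276583, 246306),
    "estimated cycles in affected programs": (265691, 235414),
}

if __name__ == "__main__":
    for label, (before, after) in stats.items():
        print(f"{label}: {before} -> {after} ({pct_change(before, after):+.2f}%)")
```

The instruction count rises by 725 in both rows while the estimated cycles fall by 30277, so the scheduler trades a slightly longer program for fewer estimated cycles.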