1. 03 Jan, 2019 12 commits
    • virgl/vtest: Use default socket name from protocol header · 6a9be6fc
      Jakob Bornecrantz authored
      
      
      No functional change as the socket name is the same,
      just removing the double definition of the path.
      Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
      Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
    • freedreno: fix staging resource size for arrays · e869481e
      Rob Clark authored
      
      
      For a 2D array texture (for example), the number of array elements
      should come from box->depth rather than depth0, which is minified.
      
      Fixes dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darray_bias_float_fragment
      with tiled textures.
      Reported-by: Kristian H. Kristensen <hoegsberg@chromium.org>
      Signed-off-by: Rob Clark <robdclark@gmail.com>
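The sizing issue in the commit above can be illustrated with a toy model in Python. This is only a sketch, not freedreno code: `minify`, `Box`, and both `extent_*` helpers are hypothetical names introduced here, and the assumption is simply that each mip level halves a dimension (never below 1) while the number of array layers stays constant across levels.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical stand-in for a transfer box."""
    width: int
    height: int
    depth: int  # for array textures: number of layers being transferred


def minify(size: int, level: int) -> int:
    """Size of one dimension at a mip level: halved per level, never below 1."""
    return max(1, size >> level)


def extent_3d(width0: int, height0: int, depth0: int, level: int):
    """For a 3D texture, all three dimensions shrink with the mip level."""
    return (minify(width0, level), minify(height0, level), minify(depth0, level))


def extent_2d_array(width0: int, height0: int, level: int, box: Box):
    """For a 2D array, width and height shrink, but the layer count does not.

    Assumption for this sketch: the layer count is taken from the box.
    """
    return (minify(width0, level), minify(height0, level), box.depth)
```

In this model, minifying a 6-layer array at level 3 would collapse the layer count to 1, whereas taking it from the box keeps all 6 layers.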
    • freedreno: remove blit_via_copy_region() · 67a7f6f2
      Rob Clark authored
      
      
      If we hit the memcpy() path for copy_region(), that will try to do a
      transfer_map(), which goes badly for blits to/from staging triggered
      by transfer_map() or transfer_unmap().
      
      We could possibly add fd_blit2() which has allow_transfer_map param,
      and call that for staging blits.  But I'm not really sure if trying
      the blit via copy_region() is very useful.  At least for newer gens
      that implement fd_context::blit(), it probably isn't.
      Signed-off-by: Rob Clark <robdclark@gmail.com>
    • freedreno/a6xx: rework blitter API · 2fc17e16
      Rob Clark authored
      
      
      Switch over to using fd_context::blit(), in the same way that a5xx does.
      The previous patch wires fd_resource_copy_region() up to the blitter so
      a6xx no longer needs to bypass the core layer to accelerate this.
      Signed-off-by: Rob Clark <robdclark@gmail.com>
    • 53b8eb78
      Rob Clark authored
    • freedreno: rework blit API · 228eddd7
      Rob Clark authored
      
      
      First step to unify the way the fd5 and fd6 blitters work.  Currently a6xx
      bypasses the blit API in order to also accelerate resource_copy_region().
      
      But this approach can lead to infinite recursion:
      
        #0  fd_alloc_staging (ctx=0x5555936480, rsc=0x7fac485f90, level=0, box=0x7fbab29220) at ../src/gallium/drivers/freedreno/freedreno_resource.c:291
        #1  0x0000007fbdebed04 in fd_resource_transfer_map (pctx=0x5555936480, prsc=0x7fac485f90, level=0, usage=258, box=0x7fbab29220, pptrans=0x7fbab29240) at ../src/gallium/drivers/freedreno/freedreno_resource.c:479
        #2  0x0000007fbe5c5068 in u_transfer_helper_transfer_map (pctx=0x5555936480, prsc=0x7fac485f90, level=0, usage=258, box=0x7fbab29220, pptrans=0x7fbab29240) at ../src/gallium/auxiliary/util/u_transfer_helper.c:243
        #3  0x0000007fbde2dcb8 in util_resource_copy_region (pipe=0x5555936480, dst=0x7fac485f90, dst_level=0, dst_x=0, dst_y=0, dst_z=0, src=0x7fac47c780, src_level=0, src_box_in=0x7fbab2945c) at ../src/gallium/auxiliary/util/u_surface.c:350
        #4  0x0000007fbdf2282c in fd_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47c780, src_level=0, src_box=0x7fbab2945c) at ../src/gallium/drivers/freedreno/freedreno_blitter.c:173
        #5  0x0000007fbdf085d4 in fd6_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47c780, src_level=0, src_box=0x7fbab2945c) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:587
        #6  0x0000007fbde2f3d0 in util_try_blit_via_copy_region (ctx=0x5555936480, blit=0x7fbab29430) at ../src/gallium/auxiliary/util/u_surface.c:864
        #7  0x0000007fbdec02c4 in fd_blit (pctx=0x5555936480, blit_info=0x7fbab29588) at ../src/gallium/drivers/freedreno/freedreno_resource.c:993
        #8  0x0000007fbdf08408 in fd6_blit (pctx=0x5555936480, info=0x7fbab29588) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:546
        #9  0x0000007fbdebdc74 in do_blit (ctx=0x5555936480, blit=0x7fbab29588, fallback=false) at ../src/gallium/drivers/freedreno/freedreno_resource.c:129
        #10 0x0000007fbdebe58c in fd_blit_from_staging (ctx=0x5555936480, trans=0x7fac47b7e8) at ../src/gallium/drivers/freedreno/freedreno_resource.c:326
        #11 0x0000007fbdebea38 in fd_resource_transfer_unmap (pctx=0x5555936480, ptrans=0x7fac47b7e8) at ../src/gallium/drivers/freedreno/freedreno_resource.c:416
        #12 0x0000007fbe5c5c68 in u_transfer_helper_transfer_unmap (pctx=0x5555936480, ptrans=0x7fac47b7e8) at ../src/gallium/auxiliary/util/u_transfer_helper.c:516
        #13 0x0000007fbde2de24 in util_resource_copy_region (pipe=0x5555936480, dst=0x7fac485f90, dst_level=0, dst_x=0, dst_y=0, dst_z=0, src=0x7fac47b8e0, src_level=0, src_box_in=0x7fbab2997c) at ../src/gallium/auxiliary/util/u_surface.c:376
        #14 0x0000007fbdf2282c in fd_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47b8e0, src_level=0, src_box=0x7fbab2997c) at ../src/gallium/drivers/freedreno/freedreno_blitter.c:173
        #15 0x0000007fbdf085d4 in fd6_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47b8e0, src_level=0, src_box=0x7fbab2997c) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:587
        ...
      
      Instead, rework the API to push the fallback back to core code, so that
      we can rework resource_copy_region() to have its own fallback path,
      and then finally convert fd6 over to work in the same way.
      
      This also makes ctx->blit() optional, and cleans up some unnecessary
      callers.
      Signed-off-by: Rob Clark <robdclark@gmail.com>
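The idea of pushing the fallback back into core code can be sketched with a small Python model. None of this is the driver's real API: `Ctx`, `hw_blit`, `staging_blit`, `cpu_copy`, and `copy_region` are hypothetical names. The assumption is that the hardware hook only reports success or failure, and that only the core entry point decides to take the CPU path, so a staging blit issued during map/unmap can never re-enter that path.

```python
class Ctx:
    """Hypothetical context; `supported` lists what the hardware hook can do."""

    def __init__(self, supported):
        self.supported = set(supported)
        self.trace = []


def hw_blit(ctx, kind):
    """Generation-specific hook: try the hardware, report success, never fall back."""
    if kind in ctx.supported:
        ctx.trace.append("hw:" + kind)
        return True
    return False


def staging_blit(ctx):
    """Blit to or from a staging buffer during map/unmap.

    It calls the hardware hook directly and never the CPU fallback,
    which is what keeps the call chain from looping.
    """
    if not hw_blit(ctx, "staging"):
        raise RuntimeError("staging blit needs a working hardware path")


def cpu_copy(ctx):
    """CPU fallback: map (staging blit), memcpy, unmap (staging blit)."""
    staging_blit(ctx)
    ctx.trace.append("memcpy")
    staging_blit(ctx)


def copy_region(ctx):
    """Core entry point: the only place that decides to fall back."""
    if not hw_blit(ctx, "copy"):
        cpu_copy(ctx)
```

With a context that can accelerate staging blits but not the requested copy, the trace is a bounded sequence (staging blit, memcpy, staging blit) rather than the repeating frames shown in the backtrace.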
    • freedreno: skip depth resolve if not written · f1c88336
      Rob Clark authored
      
      
      For multi-pass rendering, it is common to keep the same depth buffer
      from previous pass, to discard geometry that would be hidden by later
      draws.  In the later passes with depth-test enabled, but depth-write
      disabled, there is no reason to do gmem2mem resolve.
      
      TODO: probably do something similar for stencil, although the stencil
      buffer isn't used as commonly these days.
      Signed-off-by: Rob Clark <robdclark@gmail.com>
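The decision described in the commit above can be written as a small predicate. This is a sketch in Python, not the driver's logic: `needs_depth_resolve` is a hypothetical name, and treating a depth clear as a write is an assumption added here rather than something the commit message states.

```python
def needs_depth_resolve(depth_test: bool, depth_write: bool,
                        depth_cleared: bool = False) -> bool:
    """Return True if the depth buffer contents may have changed this pass.

    Depth values are only written by draws when the depth test is enabled
    and depth writes are not masked off. A clear is assumed (for this
    sketch) to count as a write as well.
    """
    return depth_cleared or (depth_test and depth_write)
```

A later pass that tests against an existing depth buffer without writing it gives `needs_depth_resolve(True, False) == False`, so the resolve of depth from tile memory back to system memory can be skipped.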
    • nir: merge some basic consecutive ifs · 4d3f6cb9
      Timothy Arceri authored
      
      
      After trying multiple times to merge if-statements with phis
      between them I've come to the conclusion that it cannot be done
      without regressions. The problem is for some shaders we end up
      with a whole bunch of phis for the merged ifs resulting in
      increased register pressure.
      
      So this patch just merges ifs that have no phis between them.
      This seems to be consistent with what LLVM does, so for radeonsi
      we only see a change (although it's a large change) in a single
      shader.
      
      Shader-db results i965 (SKL):
      
      total instructions in shared programs: 13098176 -> 13098152 (<.01%)
      instructions in affected programs: 1326 -> 1302 (-1.81%)
      helped: 4
      HURT: 0
      
      total cycles in shared programs: 332032989 -> 332037583 (<.01%)
      cycles in affected programs: 60665 -> 65259 (7.57%)
      helped: 0
      HURT: 4
      
      The cycle estimates reported by shader-db for i965 seem inaccurate,
      as the only difference in the final code is the removal of the
      redundant condition evaluations and jumps.
      
      Also, the biggest code reduction (~7%) for radeonsi was in a Tomb
      Raider TressFX shader, but for some reason this does not get merged
      for i965.
      
      Shader-db results radeonsi (VEGA):
      
      Totals from affected shaders:
      SGPRS: 232 -> 232 (0.00 %)
      VGPRS: 164 -> 164 (0.00 %)
      Spilled SGPRs: 59 -> 59 (0.00 %)
      Spilled VGPRs: 0 -> 0 (0.00 %)
      Private memory VGPRs: 0 -> 0 (0.00 %)
      Scratch size: 0 -> 0 (0.00 %) dwords per thread
      Code Size: 14584 -> 13520 (-7.30 %) bytes
      LDS: 0 -> 0 (0.00 %) blocks
      Max Waves: 13 -> 13 (0.00 %)
      Wait states: 0 -> 0 (0.00 %)
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
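The transformation described in the commit above can be modelled on a toy IR. The Python sketch below is not the NIR pass: `If` and `merge_consecutive_ifs` are hypothetical names, instructions are plain strings, and the sketch assumes two ifs are mergeable when they are directly adjacent, test the same condition value, and have no phis between them.

```python
from dataclasses import dataclass, field


@dataclass
class If:
    """Toy if-statement: a condition name plus then/else instruction lists."""
    cond: str
    then: list
    els: list
    phis: list = field(default_factory=list)  # phis placed right after this if


def merge_consecutive_ifs(nodes):
    """Merge directly adjacent ifs on the same condition with no phis between.

    Returns a new list; the input nodes are not modified.
    """
    merged = []
    for node in nodes:
        prev = merged[-1] if merged else None
        if (isinstance(prev, If) and isinstance(node, If)
                and prev.cond == node.cond and not prev.phis):
            # Concatenate both branches; phis after the second if are kept.
            merged[-1] = If(prev.cond, prev.then + node.then,
                            prev.els + node.els, list(node.phis))
        else:
            merged.append(node)
    return merged
```

In this model the merged code evaluates the condition and branches once instead of twice, which corresponds to the removed condition evaluations and jumps mentioned in the commit message. Ifs separated by phis are left alone, mirroring the restriction the commit settles on.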
    • nir: add rewrite_phi_predecessor_blocks() helper · 19cafe80
      Timothy Arceri authored
      
      
      This will also be used by the if merge pass in the following commit.
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    • 5122fbc4
      Timothy Arceri authored
    • Timothy Arceri authored
    • Timothy Arceri authored
  2. 02 Jan, 2019 28 commits