1. 01 Jul, 2019 1 commit
  2. 24 Jun, 2019 1 commit
  3. 18 Jun, 2019 1 commit
    • v3d: implement simultaneous peripheral access exceptions for V3D 4.1+ · 79a30543
      Iago Toral authored
      Shader-db results:
      
      total instructions in shared programs: 9117550 -> 9102719 (-0.16%)
      instructions in affected programs: 1752873 -> 1738042 (-0.85%)
      helped: 7076
      HURT: 478
      helped stats (abs) min: 1 max: 22 x̄: 2.19 x̃: 2
      helped stats (rel) min: 0.07% max: 13.89% x̄: 1.70% x̃: 1.07%
      HURT stats (abs)   min: 1 max: 7 x̄: 1.41 x̃: 1
      HURT stats (rel)   min: 0.09% max: 10.17% x̄: 0.86% x̃: 0.54%
      95% mean confidence interval for instructions value: -2.00 -1.92
      95% mean confidence interval for instructions %-change: -1.58% -1.50%
      Instructions are helped.
      
      total max-temps in shared programs: 1327774 -> 1327728 (<.01%)
      max-temps in affected programs: 1025 -> 979 (-4.49%)
      helped: 47
      HURT: 2
      helped stats (abs) min: 1 max: 2 x̄: 1.02 x̃: 1
      helped stats (rel) min: 2.63% max: 20.00% x̄: 7.67% x̃: 5.26%
      HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
      HURT stats (rel)   min: 4.17% max: 4.17% x̄: 4.17% x̃: 4.17%
      95% mean confidence interval for max-temps value: -1.06 -0.82
      95% mean confidence interval for max-temps %-change: -8.89% -5.49%
      Max-temps are helped.
      Reviewed-by: Eric Anholt <eric@anholt.net>
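The percentages in the shader-db summary above are relative changes against the "before" total. A minimal sketch of that arithmetic in C; `percent_change` is a hypothetical helper written for illustration and is not part of shader-db:

```c
/* Hypothetical helper: relative change from `before` to `after`, in percent.
 * A negative result means the count went down (the shader was "helped"). */
static double percent_change(long before, long after)
{
    return 100.0 * (double)(after - before) / (double)before;
}
```

With the totals quoted above, `percent_change(9117550, 9102719)` is roughly -0.16 and `percent_change(1752873, 1738042)` is roughly -0.85, matching the reported figures after rounding.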
  4. 14 Jun, 2019 1 commit
    • v3d: do not setup execute flags for else block in uniform control flow · 360b832c
      Iago Toral authored
      Either all channels executed the 'then' block, in which case all
      channels will directly jump to the 'endif' block at the end of the
      'then' block, or all channels execute the 'else' block (so no
      execution masking is necessary).
      
      Shader-db results:
      
      total instructions in shared programs: 9119238 -> 9117550 (-0.02%)
      instructions in affected programs: 401252 -> 399564 (-0.42%)
      helped: 855
      HURT: 77
      
      total uniforms in shared programs: 3022622 -> 3022605 (<.01%)
      uniforms in affected programs: 3566 -> 3549 (-0.48%)
      helped: 17
      HURT: 0
      
      total max-temps in shared programs: 1327762 -> 1327774 (<.01%)
      max-temps in affected programs: 619 -> 631 (1.94%)
      helped: 2
      HURT: 15
      Reviewed-by: Eric Anholt <eric@anholt.net>
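The reasoning in this commit message can be illustrated with a small scalar simulation in C. This is only a sketch of the idea, not the driver's code: `CHANNELS`, `run_if_masked` and `run_if_uniform` are hypothetical names, and the channel count is an illustrative value.

```c
#include <stdbool.h>
#include <stdint.h>

#define CHANNELS 16 /* illustrative SIMD width, an assumption */

/* Non-uniform 'if': both blocks run, and every write is masked per channel. */
static void run_if_masked(const bool cond[CHANNELS], int32_t out[CHANNELS])
{
    for (int ch = 0; ch < CHANNELS; ch++)   /* 'then' block, mask = cond */
        if (cond[ch])
            out[ch] = 1;
    for (int ch = 0; ch < CHANNELS; ch++)   /* 'else' block, mask = !cond */
        if (!cond[ch])
            out[ch] = 2;
}

/* Uniform 'if': the condition is the same on all channels, so one branch
 * selects a single block and neither block needs per-channel masking. */
static void run_if_uniform(bool cond, int32_t out[CHANNELS])
{
    if (cond) {
        for (int ch = 0; ch < CHANNELS; ch++)
            out[ch] = 1;                    /* then jump straight to 'endif' */
    } else {
        for (int ch = 0; ch < CHANNELS; ch++)
            out[ch] = 2;                    /* all channels are active here */
    }
}
```

When every channel holds the same condition value, both functions produce the same result, which is why the masking setup for the 'else' block can be dropped in that case.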
  5. 13 Jun, 2019 1 commit
  6. 07 Jun, 2019 2 commits
  7. 06 Jun, 2019 1 commit
    • v3d: fix scheduling dependency tracking for ALU with small immediates · 09d230c6
      Iago Toral authored
      We were not accounting for small immediates in the B mux, so the scheduler
      was interpreting these as regular register file accesses, which could
      lead to additional (incorrect) write-read dependencies.
      
      Shader-db changes:
      
      total instructions in shared programs: 9163664 -> 9137263 (-0.29%)
      instructions in affected programs: 3931035 -> 3904634 (-0.67%)
      helped: 12457
      HURT: 2563
      
      total max-temps in shared programs: 1325787 -> 1325597 (-0.01%)
      max-temps in affected programs: 5746 -> 5556 (-3.31%)
      helped: 186
      HURT: 16
      helped stats (abs) min: 1 max: 4 x̄: 1.12 x̃: 1
      helped stats (rel) min: 1.45% max: 22.22% x̄: 4.42% x̃: 3.28%
      HURT stats (abs)   min: 1 max: 3 x̄: 1.12 x̃: 1
      HURT stats (rel)   min: 2.86% max: 10.00% x̄: 5.76% x̃: 5.88%
      95% mean confidence interval for max-temps value: -1.04 -0.84
      95% mean confidence interval for max-temps %-change: -4.16% -3.07%
      Max-temps are helped.
      Reviewed-by: Eric Anholt <eric@anholt.net>
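The dependency problem described in this commit can be sketched as follows. The types and names here (`enum mux`, `struct alu_inst`, `regfile_read`, `needs_write_read_dep`) are hypothetical and heavily simplified; they are not the Mesa scheduler's data structures, only an illustration of why a small immediate must not be treated as a register read.

```c
#include <stdbool.h>

/* Hypothetical, simplified source selectors. */
enum mux { MUX_ACC, MUX_A, MUX_B };

struct alu_inst {
    enum mux src_mux;
    int raddr_a;
    int raddr_b;
    bool small_imm; /* when set, raddr_b encodes an immediate, not a register */
};

/* Returns the register-file index this source reads, or -1 if it reads none. */
static int regfile_read(const struct alu_inst *inst)
{
    switch (inst->src_mux) {
    case MUX_A:
        return inst->raddr_a;
    case MUX_B:
        /* A small immediate in the B mux is not a register file access. */
        return inst->small_imm ? -1 : inst->raddr_b;
    default:
        return -1;
    }
}

/* True if `inst` must be ordered after a write to `last_written_reg`. */
static bool needs_write_read_dep(int last_written_reg,
                                 const struct alu_inst *inst)
{
    int reg = regfile_read(inst);
    return reg >= 0 && reg == last_written_reg;
}
```

Without the `small_imm` check, an immediate whose encoding happens to equal a recently written register index would create a false dependency and restrict scheduling for no reason.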
  8. 05 Jun, 2019 1 commit
  9. 24 May, 2019 1 commit
  10. 10 May, 2019 1 commit
  11. 09 May, 2019 1 commit
  12. 07 May, 2019 2 commits
    • nir: Use the flrp lowering pass instead of nir_opt_algebraic · d41cdef2
      Ian Romanick authored
      I tried to be very careful while updating all the various drivers, but I
      don't have any of that hardware for testing. :(
      
      i965 is the only platform that sets always_precise = true, and it is
      only set true for fragment shaders.  Gen4 and Gen5 both set lower_flrp32
      only for vertex shaders.  For fragment shaders, nir_op_flrp is lowered
      during code generation as a(1-c)+bc.  On all other platforms 64-bit
      nir_op_flrp and on Gen11 32-bit nir_op_flrp are lowered using the old
      nir_opt_algebraic method.
      
      No changes on any other Intel platforms.
      
      v2: Add panfrost changes.
      
      Iron Lake and GM45 had similar results. (Iron Lake shown)
      total cycles in shared programs: 188647754 -> 188647748 (<.01%)
      cycles in affected programs: 5096 -> 5090 (-0.12%)
      helped: 3
      HURT: 0
      helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
      helped stats (rel) min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12%
      Reviewed-by: Matt Turner <mattst88@gmail.com>
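For reference, the `a(1-c)+bc` form mentioned in this commit message can be written out in C as below. This is a plain scalar sketch, not the NIR pass itself; the function names are hypothetical, and the second form is shown only as another common algebraic rewrite of a linear interpolation.

```c
/* flrp(a, b, c) written as a(1-c) + bc, the form cited in the commit message. */
static float flrp_two_mul(float a, float b, float c)
{
    return a * (1.0f - c) + b * c;
}

/* An alternative algebraic form, a + c(b-a). It is mathematically equal,
 * but the two forms can round differently in floating point. */
static float flrp_one_mul(float a, float b, float c)
{
    return a + c * (b - a);
}
```

Because the two forms can differ in the last bits of the result, which one a driver uses can matter for shaders that require precise results.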
    • nir: nir_shader_compiler_options: drop native_integers · 4e110eca
      Christian Gmeiner authored
      Drivers which do not support native integers should use a lowering
      pass to go from integers to floats.
      Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
      Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
  13. 29 Apr, 2019 1 commit
  14. 26 Apr, 2019 5 commits
  15. 18 Apr, 2019 2 commits
  16. 16 Apr, 2019 2 commits
    • v3d: Always set up the qregs for CSD payload. · 697e2e1f
      Eric Anholt authored
      We were failing to set up payload[1] for use by LocalInvocationIndex/ID
      and shared variable accesses if gl_WorkGroupID/gl_GlobalInvocationID
      wasn't used (possibly because you only have one workgroup).  You're always
      going to use payload[1], and payload[0] is common enough. We also have DCE
      in the backend to clean it up if it happens to not be used.
    • v3d: Only look up the 3rd texture gather offset for non-arrays. · 1bc71e8b
      Eric Anholt authored
      Fixes assertion failures in the CTS since Karol's cleanup when NIR started
      noticing that we were reading an invalid component.
      
      Fixes: 5450f1c9 ("v3d: prefer using nir_src_comp_as_int over nir_src_as_const_value")
  17. 15 Apr, 2019 1 commit
  18. 14 Apr, 2019 1 commit
  19. 12 Apr, 2019 9 commits
  20. 11 Apr, 2019 1 commit
    • v3d: Add an optimization pass for redundant flags updates. · 8f065596
      Eric Anholt authored
      Our exec masking introduces lots of redundant flags updates, and even
      without that there will be cases where NIR comparisons on the same sources
      for different reasons may generate the same comparison instruction before
      the selection.
      
      total instructions in shared programs: 6492930 -> 6460934 (-0.49%)
      total uniforms in shared programs: 2117460 -> 2115106 (-0.11%)
      total spills in shared programs: 4983 -> 4987 (0.08%)
      total fills in shared programs: 6408 -> 6416 (0.12%)
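The kind of pass this commit describes can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the actual v3d pass: `struct inst` and `opt_redundant_flags` are hypothetical, the instruction list is a single basic block, and a flags update is considered redundant only if it repeats the previous flags-setting operation on sources that have not been overwritten since.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction representation. */
struct inst {
    int op;          /* opcode */
    int src[2];      /* source temp indices */
    int dst;         /* destination temp, or -1 if none */
    bool sets_flags; /* instruction updates the condition flags */
};

static bool same_flags_update(const struct inst *a, const struct inst *b)
{
    return a->op == b->op &&
           a->src[0] == b->src[0] &&
           a->src[1] == b->src[1];
}

/* Clears sets_flags on updates that repeat the last one within one block.
 * Returns the number of flags updates removed. */
static int opt_redundant_flags(struct inst *insts, size_t count)
{
    const struct inst *last = NULL; /* last flags update still known valid */
    int removed = 0;

    for (size_t i = 0; i < count; i++) {
        struct inst *cur = &insts[i];

        if (cur->sets_flags) {
            if (last && same_flags_update(last, cur)) {
                cur->sets_flags = false; /* flags already hold this result */
                removed++;
            } else {
                last = cur;
            }
        }

        /* Overwriting a source of the remembered comparison invalidates it. */
        if (last && cur->dst >= 0 &&
            (cur->dst == last->src[0] || cur->dst == last->src[1]))
            last = NULL;
    }
    return removed;
}
```

A real pass would also have to reset its state at block boundaries and account for the different ways the hardware can update flags; those details are left out here.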
  21. 09 Apr, 2019 1 commit
  22. 07 Apr, 2019 1 commit
  23. 05 Apr, 2019 2 commits