1. 10 Mar, 2018 1 commit
  2. 08 Mar, 2018 4 commits
      i965/vec4: Allow CSE on subset VF constant loads · 1583f49e
      Ian Romanick authored
      v2: Rewrite the code that generates the VF mask.  Suggested by Ken.
      
      No changes on other platforms.
      
      Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown)
      total instructions in shared programs: 13059891 -> 13059884 (<.01%)
      instructions in affected programs: 431 -> 424 (-1.62%)
      helped: 7
      HURT: 0
      helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
      helped stats (rel) min: 1.19% max: 5.26% x̄: 2.05% x̃: 1.49%
      95% mean confidence interval for instructions value: -1.00 -1.00
      95% mean confidence interval for instructions %-change: -3.39% -0.71%
      Instructions are helped.
      
      total cycles in shared programs: 409260032 -> 409260018 (<.01%)
      cycles in affected programs: 4228 -> 4214 (-0.33%)
      helped: 7
      HURT: 0
      helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
      helped stats (rel) min: 0.28% max: 2.04% x̄: 0.54% x̃: 0.28%
      95% mean confidence interval for cycles value: -2.00 -2.00
      95% mean confidence interval for cycles %-change: -1.15% 0.07%
      Inconclusive result (%-change mean confidence interval includes 0).
      Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
      i965/vec4: Relax writemask condition in CSE · 360899d4
      Ian Romanick authored
      If the previously seen instruction generates more fields than the new
      instruction, still allow CSE to happen.  This doesn't do much, but it
      also enables a couple more shaders in the next patch.  It helped quite a
      bit in another change series that I have (at least for now) abandoned.
      
      v2: Add some extra commentary about the parameters to instructions_match.
      Suggested by Ken.
      
      No changes on Skylake, Broadwell, Iron Lake or GM45.
      
      Ivy Bridge and Haswell had similar results. (Ivy Bridge shown)
      total instructions in shared programs: 11780295 -> 11780294 (<.01%)
      instructions in affected programs: 302 -> 301 (-0.33%)
      helped: 1
      HURT: 0
      
      total cycles in shared programs: 257308315 -> 257308313 (<.01%)
      cycles in affected programs: 2074 -> 2072 (-0.10%)
      helped: 1
      HURT: 0
      
      Sandy Bridge
      total instructions in shared programs: 10506687 -> 10506686 (<.01%)
      instructions in affected programs: 335 -> 334 (-0.30%)
      helped: 1
      HURT: 0
      Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
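
The relaxed writemask condition described in 360899d4 can be pictured as a subset test on channel masks. What follows is a minimal C sketch under stated assumptions, not the Mesa code: the mask constants and both function names are hypothetical, and the real comparison is done inside the instructions_match function that the commit message mentions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical 4-bit writemask: one bit per destination channel. */
enum { MASK_X = 1, MASK_Y = 2, MASK_Z = 4, MASK_W = 8, MASK_XYZW = 15 };

/* Strict condition (assumed earlier behaviour): both instructions must
 * write exactly the same channels. */
static bool writemask_matches_strict(unsigned prev, unsigned next)
{
   assert(prev <= MASK_XYZW && next <= MASK_XYZW);
   return prev == next;
}

/* Relaxed condition: the previously seen instruction may write more
 * channels than the new one, as long as it writes every channel the
 * new instruction needs. */
static bool writemask_matches_relaxed(unsigned prev, unsigned next)
{
   assert(prev <= MASK_XYZW && next <= MASK_XYZW);
   return (prev & next) == next;
}
```

With these definitions, a previous instruction writing .xyzw and a new one writing .xy fail the strict test but pass the relaxed one, so the new instruction could reuse the earlier result. The reverse case (previous .x, new .xy) fails both tests.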
      i965/fs: Merge CMP and SEL into CSEL on Gen8+ · 52c7df16
      Ian Romanick authored
      v2: Fix several problems handling inverted predicates.  Add a much
      bigger comment around the BRW_CONDITIONAL_NZ case.
      
      v3: Allow uniforms and shader inputs as sources for the original SEL and
      CMP instructions.  This enables a LOT more shaders to receive CSEL
      merging (5816 vs 8564 on SKL).
      
      v4: Report progress.
      
      Broadwell and Skylake had similar results. (Broadwell shown)
      helped: 8527
      HURT: 0
      helped stats (abs) min: 1 max: 27 x̄: 2.44 x̃: 1
      helped stats (rel) min: 0.03% max: 17.80% x̄: 1.12% x̃: 0.70%
      95% mean confidence interval for instructions value: -2.51 -2.36
      95% mean confidence interval for instructions %-change: -1.15% -1.10%
      Instructions are helped.
      
      total cycles in shared programs: 559442317 -> 558288357 (-0.21%)
      cycles in affected programs: 372699860 -> 371545900 (-0.31%)
      helped: 6748
      HURT: 1450
      helped stats (abs) min: 1 max: 32000 x̄: 182.41 x̃: 12
      helped stats (rel) min: <.01% max: 66.08% x̄: 3.42% x̃: 0.70%
      HURT stats (abs)   min: 1 max: 2538 x̄: 53.08 x̃: 14
      HURT stats (rel)   min: <.01% max: 96.72% x̄: 3.32% x̃: 0.90%
      95% mean confidence interval for cycles value: -179.01 -102.51
      95% mean confidence interval for cycles %-change: -2.37% -2.08%
      Cycles are helped.
      
      LOST:   0
      GAINED: 6
      
      No changes on earlier platforms.
      Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
      Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3]
      Reviewed-by: Matt Turner <mattst88@gmail.com>
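
The CMP plus SEL merge in 52c7df16 can be illustrated with a small scalar model in C. This is a sketch under stated assumptions, not the compiler pass: the enum, the function names, and the operand order of csel() are hypothetical. It only models a comparison against zero, ignores NaN handling, and shows why an inverted predicate on the SEL corresponds to swapping the two selected sources in the merged form.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of conditional modifiers, all compared against 0. */
enum cond { COND_Z, COND_NZ, COND_L, COND_GE };

static bool eval_cond(enum cond c, float v)
{
   switch (c) {
   case COND_Z:  return v == 0.0f;
   case COND_NZ: return v != 0.0f;
   case COND_L:  return v <  0.0f;
   case COND_GE: return v >= 0.0f;
   }
   assert(!"unknown condition");
   return false;
}

/* Two-instruction form: CMP sets a flag, then a predicated SEL reads it.
 * 'inverted' models a SEL whose predicate is negated. */
static float cmp_then_sel(enum cond c, float a, float x, float y, bool inverted)
{
   bool flag = eval_cond(c, a);
   return (flag != inverted) ? x : y;
}

/* One-instruction form: select x when (a cond 0) holds, else y. */
static float csel(enum cond c, float x, float y, float a)
{
   return eval_cond(c, a) ? x : y;
}
```

In this model, cmp_then_sel(c, a, x, y, false) always equals csel(c, x, y, a), and the inverted case cmp_then_sel(c, a, x, y, true) equals csel(c, y, x, a).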
      i965/fs: Add infrastructure for generating CSEL instructions. · 70de6159
      Kenneth Graunke authored
      v2 (idr): Don't allow CSEL with a non-float src2.
      
      v3 (idr): Add CSEL to fs_inst::flags_written.  Suggested by Matt.
      
      v4 (idr): Only set BRW_ALIGN_16 on Gen < 10 (suggested by Matt).  Don't
      reset the access mode afterwards (suggested by Samuel and Matt).  Add
      support for CSEL not modifying the flags to more places (requested by
      Matt).
      Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
      Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
      Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v3]
      Reviewed-by: Matt Turner <mattst88@gmail.com>
  3. 07 Mar, 2018 34 commits
  4. 06 Mar, 2018 1 commit