1. 06 Jun, 2019 12 commits
    • lima/ppir: fix crash when program uses no registers at all · 5980565a
      Vasily Khoruzhick authored
      A program may need no regalloc at all, e.g. when it consists of a
      single discard op.
      Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Reviewed-by: Qiang Yu <yuq825@gmail.com>
    • util/hash_table: Assert that keys are not reserved pointers · b38dab10
      Jason Ekstrand authored
      If we insert a NULL key, it will appear to succeed but will mess up
      entry counting.  Similar errors can occur if someone accidentally
      inserts the deleted key.  The latter is highly unlikely but technically
      possible so we should guard against it too.
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Eric Anholt <eric@anholt.net>
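
The failure mode described in this commit can be sketched with a toy open-addressing table. This is only an illustration under stated assumptions: an empty slot is marked by a NULL key and a removed slot by a sentinel "deleted" key, and every name here (`toy_table`, `toy_insert`, `toy_key_is_reserved`, `toy_deleted_key`) is hypothetical rather than Mesa's actual API. Without the assertion, inserting NULL would bump the entry count while leaving the slot looking empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CAPACITY 8  /* hypothetical fixed size; this sketch never resizes */

struct toy_table {
   const void *keys[TOY_CAPACITY];  /* NULL means "never used" */
   size_t entries;
};

/* The address of this object serves as the "deleted slot" marker. */
static const char toy_deleted_marker = 0;
static const void *const toy_deleted_key = &toy_deleted_marker;

/* A key is reserved if the table itself uses it to describe slot state. */
static bool
toy_key_is_reserved(const void *key)
{
   return key == NULL || key == toy_deleted_key;
}

/* Insert with linear probing.  Returns false only when the table is full. */
static bool
toy_insert(struct toy_table *t, const void *key)
{
   /* The guard: reserved pointers would corrupt the entry count. */
   assert(!toy_key_is_reserved(key));

   size_t start = ((uintptr_t)key >> 3) % TOY_CAPACITY;
   size_t free_slot = TOY_CAPACITY;  /* TOY_CAPACITY means "none found yet" */

   for (size_t i = 0; i < TOY_CAPACITY; i++) {
      size_t slot = (start + i) % TOY_CAPACITY;
      const void *k = t->keys[slot];

      if (k == key)
         return true;                /* already present, count unchanged */
      if (toy_key_is_reserved(k) && free_slot == TOY_CAPACITY)
         free_slot = slot;           /* remember the first reusable slot */
      if (k == NULL)
         break;                      /* a never-used slot ends the probe */
   }

   if (free_slot == TOY_CAPACITY)
      return false;

   t->keys[free_slot] = key;
   t->entries++;
   return true;
}
```

Inserting the same valid pointer twice leaves `entries` at 1, while passing NULL or `toy_deleted_key` trips the assertion in a debug build instead of silently miscounting.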
    • util/set: Assert that keys are not reserved pointers · 8306dabc
      Jason Ekstrand authored
      If we insert a NULL key, it will appear to succeed but will mess up
      entry counting.  Similar errors can occur if someone accidentally
      inserts the deleted key.  The latter is highly unlikely but technically
      possible so we should guard against it too.
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Eric Anholt <eric@anholt.net>
    • nir/propagate_invariant: Don't add NULL vars to the hash table · d96878a6
      Jason Ekstrand authored
      Fixes: 8410cf66 "nir/propagate_invariant: Skip unknown vars"
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Eric Anholt <eric@anholt.net>
    • intel/compiler: Treat b32csel as potentially producing a Boolean result for resolve analysis · 1c30d26d
      Ian Romanick authored
      If the 2nd and 3rd sources are both Boolean values, we can potentially
      avoid a resolve by only resolving the result of the b32csel.
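
A rough sketch of why one resolve on the result can stand in for a resolve on each source. It assumes a model that the commit message does not spell out: an unresolved Boolean has only bit 0 defined, and a "resolve" widens that bit to 0 or ~0. All names (`toy_resolve`, `toy_b32csel`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed model of a resolve: keep bit 0 and widen it to 0 or ~0. */
static uint32_t
toy_resolve(uint32_t raw)
{
   return (raw & 1u) ? UINT32_MAX : 0u;
}

/* Select the 2nd source when the condition is non-zero, else the 3rd. */
static uint32_t
toy_b32csel(uint32_t cond, uint32_t a, uint32_t b)
{
   return cond != 0 ? a : b;
}

/* Two resolves: clean up both Boolean sources before selecting. */
static uint32_t
toy_resolve_sources(uint32_t cond, uint32_t a, uint32_t b)
{
   return toy_b32csel(cond, toy_resolve(a), toy_resolve(b));
}

/* One resolve: select the raw values, then clean up the result. */
static uint32_t
toy_resolve_result(uint32_t cond, uint32_t a, uint32_t b)
{
   return toy_resolve(toy_b32csel(cond, a, b));
}

/* Check both orderings agree, including values with garbage upper bits. */
static void
toy_check_equivalence(void)
{
   const uint32_t samples[] = { 0u, 1u, 0xdeadbeefu, 0x12345670u };
   const size_t n = sizeof(samples) / sizeof(samples[0]);

   for (uint32_t cond = 0; cond <= 1; cond++) {
      for (size_t i = 0; i < n; i++) {
         for (size_t j = 0; j < n; j++) {
            assert(toy_resolve_sources(cond, samples[i], samples[j]) ==
                   toy_resolve_result(cond, samples[i], samples[j]));
         }
      }
   }
}
```

Because the select only forwards one of its sources unchanged, resolving after the select yields the same value as resolving both sources first, at the cost of a single resolve.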
      
      No changes on any Gen6+ Intel platform.
      
      v2: Use ?: instead of cast from bool to unsigned.  Suggested by Caio.
      
      Iron Lake
      total instructions in shared programs: 8142729 -> 8142677 (<.01%)
      instructions in affected programs: 12890 -> 12838 (-0.40%)
      helped: 26
      HURT: 0
      helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
      helped stats (rel) min: 0.25% max: 0.74% x̄: 0.45% x̃: 0.38%
      95% mean confidence interval for instructions value: -2.00 -2.00
      95% mean confidence interval for instructions %-change: -0.52% -0.39%
      Instructions are helped.
      
      total cycles in shared programs: 188549632 -> 188549394 (<.01%)
      cycles in affected programs: 60754 -> 60516 (-0.39%)
      helped: 25
      HURT: 1
      helped stats (abs) min: 2 max: 26 x̄: 9.92 x̃: 8
      helped stats (rel) min: 0.07% max: 2.23% x̄: 0.59% x̃: 0.27%
      HURT stats (abs)   min: 10 max: 10 x̄: 10.00 x̃: 10
      HURT stats (rel)   min: 0.70% max: 0.70% x̄: 0.70% x̃: 0.70%
      95% mean confidence interval for cycles value: -12.91 -5.40
      95% mean confidence interval for cycles %-change: -0.84% -0.23%
      Cycles are helped.
      
      GM45
      total instructions in shared programs: 5013119 -> 5013093 (<.01%)
      instructions in affected programs: 6764 -> 6738 (-0.38%)
      helped: 13
      HURT: 0
      helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
      helped stats (rel) min: 0.24% max: 0.68% x̄: 0.43% x̃: 0.36%
      95% mean confidence interval for instructions value: -2.00 -2.00
      95% mean confidence interval for instructions %-change: -0.52% -0.34%
      Instructions are helped.
      
      total cycles in shared programs: 128977804 -> 128977700 (<.01%)
      cycles in affected programs: 37738 -> 37634 (-0.28%)
      helped: 13
      HURT: 0
      helped stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8
      helped stats (rel) min: 0.18% max: 0.46% x̄: 0.30% x̃: 0.26%
      95% mean confidence interval for cycles value: -8.00 -8.00
      95% mean confidence interval for cycles %-change: -0.36% -0.24%
      Cycles are helped.
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
    • intel/fs: Improve discard_if code generation · 0ba9497e
      Ian Romanick authored
      Previously we would blindly emit a sequence like:
      
              mov(1)          f0.1<1>UW       g1.14<0,1,0>UW
              ...
              cmp.l.f0(16)    g7<1>F          g5<8,8,1>F      0x41700000F  /* 15F */
      (+f0.1) cmp.z.f0.1(16)  null<1>D        g7<8,8,1>D      0D
      
      The first move sets the flags based on the initial execution mask.
      Later discard sequences contain a predicated compare that can only
      remove more SIMD channels.  Often the only user of the result from
      the first compare is the second compare.  Instead, generate a sequence
      like:
      
              mov(1)          f0.1<1>UW       g1.14<0,1,0>UW
              ...
              cmp.l.f0(16)    g7<1>F          g5<8,8,1>F      0x41700000F  /* 15F */
      (+f0.1) cmp.ge.f0.1(8)  null<1>F        g5<8,8,1>F      0x41700000F  /* 15F */
      
      If the results stored in g7 and f0.0 are not used, the comparison will
      be eliminated.  This removes an instruction and potentially reduces
      register pressure.
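
The two sequences can be modeled per channel to see that they leave the same live mask. This is a simplified sketch, not the compiler's code: it assumes 16 channels, treats f0.1 as a 16-bit mask of channels still alive, and does not model NaN inputs. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CHANNELS 16

/* Old sequence: write the Boolean to a register (g7 in the listing),
 * then compare that register with zero under the f0.1 predicate. */
static uint16_t
toy_discard_old(uint16_t live, const float *x, float limit)
{
   assert(x != NULL);
   uint16_t out = 0;

   for (int ch = 0; ch < TOY_CHANNELS; ch++) {
      int32_t g7 = x[ch] < limit ? -1 : 0;   /* cmp.l result, 0 or ~0 */
      bool enabled = (live >> ch) & 1;       /* (+f0.1) predicate */
      if (enabled && g7 == 0)                /* cmp.z against 0D */
         out = (uint16_t)(out | (1u << ch));
   }
   return out;
}

/* New sequence: repeat the compare with the negated condition and let it
 * update f0.1 directly, so no Boolean register is needed for the discard. */
static uint16_t
toy_discard_new(uint16_t live, const float *x, float limit)
{
   assert(x != NULL);
   uint16_t out = 0;

   for (int ch = 0; ch < TOY_CHANNELS; ch++) {
      bool enabled = (live >> ch) & 1;       /* (+f0.1) predicate */
      if (enabled && x[ch] >= limit)         /* cmp.ge replaces cmp.l + cmp.z */
         out = (uint16_t)(out | (1u << ch));
   }
   return out;
}
```

In both versions a channel that was already dead stays dead, which matches the statement that later discards can only remove more channels.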
      
      v2: Major re-write of the commit message (including fixing the assembly
      code).  Suggested by Matt.
      
      All Gen8+ platforms had similar results. (Ice Lake shown)
      total instructions in shared programs: 17224434 -> 17198659 (-0.15%)
      instructions in affected programs: 2908125 -> 2882350 (-0.89%)
      helped: 18891
      HURT: 5
      helped stats (abs) min: 1 max: 12 x̄: 1.38 x̃: 1
      helped stats (rel) min: 0.03% max: 25.00% x̄: 1.76% x̃: 1.02%
      HURT stats (abs)   min: 9 max: 105 x̄: 51.40 x̃: 35
      HURT stats (rel)   min: 0.43% max: 4.92% x̄: 2.34% x̃: 1.56%
      95% mean confidence interval for instructions value: -1.39 -1.34
      95% mean confidence interval for instructions %-change: -1.79% -1.73%
      Instructions are helped.
      
      total cycles in shared programs: 361468458 -> 361170679 (-0.08%)
      cycles in affected programs: 38470116 -> 38172337 (-0.77%)
      helped: 16202
      HURT: 1456
      helped stats (abs) min: 1 max: 4473 x̄: 26.24 x̃: 18
      helped stats (rel) min: <.01% max: 28.44% x̄: 2.90% x̃: 2.18%
      HURT stats (abs)   min: 1 max: 5982 x̄: 87.51 x̃: 28
      HURT stats (rel)   min: <.01% max: 51.29% x̄: 5.48% x̃: 1.64%
      95% mean confidence interval for cycles value: -18.24 -15.49
      95% mean confidence interval for cycles %-change: -2.26% -2.14%
      Cycles are helped.
      
      total spills in shared programs: 12147 -> 12176 (0.24%)
      spills in affected programs: 175 -> 204 (16.57%)
      helped: 8
      HURT: 5
      
      total fills in shared programs: 25262 -> 25292 (0.12%)
      fills in affected programs: 269 -> 299 (11.15%)
      helped: 8
      HURT: 5
      
      Haswell
      total instructions in shared programs: 13530316 -> 13502647 (-0.20%)
      instructions in affected programs: 2507824 -> 2480155 (-1.10%)
      helped: 18859
      HURT: 10
      helped stats (abs) min: 1 max: 12 x̄: 1.48 x̃: 1
      helped stats (rel) min: 0.03% max: 27.78% x̄: 2.38% x̃: 1.41%
      HURT stats (abs)   min: 5 max: 39 x̄: 25.70 x̃: 31
      HURT stats (rel)   min: 0.22% max: 1.66% x̄: 1.09% x̃: 1.31%
      95% mean confidence interval for instructions value: -1.49 -1.44
      95% mean confidence interval for instructions %-change: -2.42% -2.34%
      Instructions are helped.
      
      total cycles in shared programs: 377865412 -> 377639034 (-0.06%)
      cycles in affected programs: 40169572 -> 39943194 (-0.56%)
      helped: 15550
      HURT: 1938
      helped stats (abs) min: 1 max: 2482 x̄: 25.67 x̃: 18
      helped stats (rel) min: <.01% max: 37.77% x̄: 3.00% x̃: 2.25%
      HURT stats (abs)   min: 1 max: 4862 x̄: 89.17 x̃: 35
      HURT stats (rel)   min: <.01% max: 67.67% x̄: 6.16% x̃: 2.75%
      95% mean confidence interval for cycles value: -14.42 -11.47
      95% mean confidence interval for cycles %-change: -2.05% -1.91%
      Cycles are helped.
      
      total spills in shared programs: 26769 -> 26814 (0.17%)
      spills in affected programs: 826 -> 871 (5.45%)
      helped: 9
      HURT: 10
      
      total fills in shared programs: 38383 -> 38425 (0.11%)
      fills in affected programs: 834 -> 876 (5.04%)
      helped: 9
      HURT: 10
      
      LOST:   5
      GAINED: 10
      
      Ivy Bridge
      total instructions in shared programs: 12079250 -> 12044139 (-0.29%)
      instructions in affected programs: 2409680 -> 2374569 (-1.46%)
      helped: 16135
      HURT: 0
      helped stats (abs) min: 1 max: 23 x̄: 2.18 x̃: 2
      helped stats (rel) min: 0.07% max: 37.50% x̄: 2.72% x̃: 1.68%
      95% mean confidence interval for instructions value: -2.21 -2.14
      95% mean confidence interval for instructions %-change: -2.76% -2.67%
      Instructions are helped.
      
      total cycles in shared programs: 180116747 -> 179900405 (-0.12%)
      cycles in affected programs: 25439823 -> 25223481 (-0.85%)
      helped: 13817
      HURT: 1499
      helped stats (abs) min: 1 max: 1886 x̄: 26.40 x̃: 18
      helped stats (rel) min: <.01% max: 38.84% x̄: 2.57% x̃: 1.97%
      HURT stats (abs)   min: 1 max: 3684 x̄: 98.99 x̃: 52
      HURT stats (rel)   min: <.01% max: 97.01% x̄: 6.37% x̃: 3.42%
      95% mean confidence interval for cycles value: -15.68 -12.57
      95% mean confidence interval for cycles %-change: -1.77% -1.63%
      Cycles are helped.
      
      LOST:   8
      GAINED: 10
      
      Sandy Bridge
      total instructions in shared programs: 10878990 -> 10863659 (-0.14%)
      instructions in affected programs: 1806702 -> 1791371 (-0.85%)
      helped: 13023
      HURT: 0
      helped stats (abs) min: 1 max: 5 x̄: 1.18 x̃: 1
      helped stats (rel) min: 0.07% max: 13.79% x̄: 1.65% x̃: 1.10%
      95% mean confidence interval for instructions value: -1.18 -1.17
      95% mean confidence interval for instructions %-change: -1.68% -1.62%
      Instructions are helped.
      
      total cycles in shared programs: 154082878 -> 153862810 (-0.14%)
      cycles in affected programs: 20199374 -> 19979306 (-1.09%)
      helped: 12048
      HURT: 510
      helped stats (abs) min: 1 max: 323 x̄: 20.57 x̃: 18
      helped stats (rel) min: 0.03% max: 17.78% x̄: 2.05% x̃: 1.52%
      HURT stats (abs)   min: 1 max: 448 x̄: 54.39 x̃: 16
      HURT stats (rel)   min: 0.02% max: 37.98% x̄: 4.13% x̃: 1.17%
      95% mean confidence interval for cycles value: -17.97 -17.08
      95% mean confidence interval for cycles %-change: -1.84% -1.75%
      Cycles are helped.
      
      LOST:   1
      GAINED: 0
      
      Iron Lake
      total instructions in shared programs: 8155075 -> 8142729 (-0.15%)
      instructions in affected programs: 949495 -> 937149 (-1.30%)
      helped: 5810
      HURT: 0
      helped stats (abs) min: 1 max: 8 x̄: 2.12 x̃: 2
      helped stats (rel) min: 0.10% max: 16.67% x̄: 2.53% x̃: 1.85%
      95% mean confidence interval for instructions value: -2.14 -2.11
      95% mean confidence interval for instructions %-change: -2.59% -2.48%
      Instructions are helped.
      
      total cycles in shared programs: 188584610 -> 188549632 (-0.02%)
      cycles in affected programs: 17274446 -> 17239468 (-0.20%)
      helped: 3881
      HURT: 90
      helped stats (abs) min: 2 max: 168 x̄: 9.08 x̃: 6
      helped stats (rel) min: <.01% max: 23.53% x̄: 0.83% x̃: 0.30%
      HURT stats (abs)   min: 2 max: 10 x̄: 2.80 x̃: 2
      HURT stats (rel)   min: <.01% max: 0.60% x̄: 0.10% x̃: 0.07%
      95% mean confidence interval for cycles value: -9.35 -8.27
      95% mean confidence interval for cycles %-change: -0.85% -0.77%
      Cycles are helped.
      
      GM45
      total instructions in shared programs: 5019308 -> 5013119 (-0.12%)
      instructions in affected programs: 489028 -> 482839 (-1.27%)
      helped: 2912
      HURT: 0
      helped stats (abs) min: 1 max: 8 x̄: 2.13 x̃: 2
      helped stats (rel) min: 0.10% max: 16.67% x̄: 2.46% x̃: 1.81%
      95% mean confidence interval for instructions value: -2.14 -2.11
      95% mean confidence interval for instructions %-change: -2.54% -2.39%
      Instructions are helped.
      
      total cycles in shared programs: 129002592 -> 128977804 (-0.02%)
      cycles in affected programs: 12669152 -> 12644364 (-0.20%)
      helped: 2759
      HURT: 37
      helped stats (abs) min: 2 max: 168 x̄: 9.03 x̃: 4
      helped stats (rel) min: <.01% max: 21.43% x̄: 0.75% x̃: 0.31%
      HURT stats (abs)   min: 2 max: 10 x̄: 3.62 x̃: 4
      HURT stats (rel)   min: <.01% max: 0.41% x̄: 0.10% x̃: 0.04%
      95% mean confidence interval for cycles value: -9.53 -8.20
      95% mean confidence interval for cycles %-change: -0.79% -0.70%
      Cycles are helped.
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
    • intel/fs: Add need_dest parameter to fs_visitor::nir_emit_alu · a2887085
      Ian Romanick authored
      This is the same as the need_dest parameter to
      prepare_alu_destination_and_sources.  This allows us to not change the
      register that is expected to hold a result if an instruction is
      re-emitted.  This is particularly a problem if the re-emitted
      instruction is a partial write.  A later patch will use this feature.
      
      No shader-db changes on any Intel platform.
      
      v2: Don't do the Boolean resolve when there is no destination.  If the
      ALU instruction didn't write a register, there's nothing to resolve.
      This replaces an earlier patch "intel/fs: Allocate dummy destination
      register when need_dest is false".
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
    • intel/fs: Allow cmod propagation across reads and writes of different flags · e13a5c7d
      Ian Romanick authored
      This also helps a later patch (intel/fs: Improve discard_if code
      generation) on about 200 shaders.
      
      v2: Document that other instruction sequences are also valid in
      subtract_merge_with_compare_intervening_mismatch_flag_write.  Suggested
      by Caio.
      
      All Intel platforms had similar results. (Ice Lake shown)
      total instructions in shared programs: 17224438 -> 17224434 (<.01%)
      instructions in affected programs: 296 -> 292 (-1.35%)
      helped: 4
      HURT: 0
      helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
      helped stats (rel) min: 0.99% max: 1.92% x̄: 1.43% x̃: 1.40%
      95% mean confidence interval for instructions value: -1.00 -1.00
      95% mean confidence interval for instructions %-change: -2.04% -0.81%
      Instructions are helped.
      
      total cycles in shared programs: 361468455 -> 361468458 (<.01%)
      cycles in affected programs: 2862 -> 2865 (0.10%)
      helped: 2
      HURT: 2
      helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
      helped stats (rel) min: 0.24% max: 0.39% x̄: 0.31% x̃: 0.31%
      HURT stats (abs)   min: 3 max: 4 x̄: 3.50 x̃: 3
      HURT stats (rel)   min: 0.32% max: 0.70% x̄: 0.51% x̃: 0.51%
      95% mean confidence interval for cycles value: -4.34 5.84
      95% mean confidence interval for cycles %-change: -0.70% 0.90%
      Inconclusive result (value mean confidence interval includes 0).
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
    • intel/fs: Fix flag_subreg handling in cmod propagation · 8030cb75
      Ian Romanick authored
      There were two errors.  First, the pass could propagate conditional
      modifiers from an instruction that writes one flag register to an
      instruction that writes a different flag register.  For example,
      
          cmp.nz.f0.0(16) null:F, vgrf6:F, vgrf5:F
          cmp.nz.f0.1(16) null:F, vgrf6:F, vgrf5:F
      
      could become
      
          cmp.nz.f0.0(16) null:F, vgrf6:F, vgrf5:F
      
      Second, if an instruction that writes f0.1 has its condition propagated, the
      modified instruction will incorrectly write flag f0.0.  For example,
      
          linterp(16) vgrf6:F, g2:F, attr0:F
          cmp.z.f0.1(16) null:F, vgrf6:F, vgrf5:F
          (-f0.1) discard_jump(16) (null):UD
      
      could become
      
          linterp.z.f0.0(16) vgrf6:F, g2:F, attr0:F
          (-f0.1) discard_jump(16) (null):UD
      
      None of these cases will occur currently.  The only time we use f0.1 is
      for generating discard intrinsics.  In all those cases, we generate a
      sequence like:
      
          cmp.nz.f0.0(16) vgrf7:F, vgrf6:F, vgrf5:F
          (+f0.1) cmp.z(16) null:D, vgrf7:D, 0d
          (-f0.1) discard_jump(16) (null):UD
      
      Due to the mixed types and incompatible conditions, this sequence would
      never see any cmod propagation.  The next patch will change this.
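
The two errors can be illustrated with a toy instruction record. This is a sketch under assumptions, not the pass itself: it ignores operand matching (assumed to be checked elsewhere), and the type and function names (`toy_inst`, `toy_is_redundant`, `toy_fold_compare`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_cmod { TOY_CMOD_NONE, TOY_CMOD_Z, TOY_CMOD_NZ };

struct toy_inst {
   enum toy_cmod cmod;   /* conditional modifier, if any */
   int flag_subreg;      /* 0 for f0.0, 1 for f0.1 */
   bool removed;         /* set once the pass deletes the instruction */
};

/* First error: a later compare is redundant only if the earlier one wrote
 * the same flag subregister, not merely the same condition. */
static bool
toy_is_redundant(const struct toy_inst *earlier, const struct toy_inst *later)
{
   assert(earlier != NULL && later != NULL);
   return earlier->cmod == later->cmod &&
          earlier->flag_subreg == later->flag_subreg;
}

/* Second error: when a compare is folded into the instruction that produced
 * its source, the flag subregister has to travel with the condition. */
static void
toy_fold_compare(struct toy_inst *producer, struct toy_inst *cmp)
{
   assert(producer != NULL && cmp != NULL);
   producer->cmod = cmp->cmod;
   producer->flag_subreg = cmp->flag_subreg;  /* keeps f0.1 from turning into f0.0 */
   cmp->removed = true;
}
```

With these checks, the pair of `cmp.nz` instructions writing f0.0 and f0.1 is no longer collapsed, and folding `cmp.z.f0.1` into the `linterp` leaves the result in f0.1, where the predicated `discard_jump` reads it.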
      
      No shader-db changes on any Intel platform.
      
      v2: Fix typo in comment in test case subtract_delete_compare_other_flag.
      Noticed by Caio.
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
    • intel/fs: Add missing tests for cmod_propagate_not · 2dd60139
      Ian Romanick authored
      Tests like this should have been added in 4467040c ("i965/fs:
      Propagate conditional modifiers from not instructions").
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
  2. 05 Jun, 2019 28 commits