  1. 20 Jun, 2019 1 commit
  2. 14 Apr, 2019 2 commits
  3. 22 Mar, 2019 1 commit
  4. 21 Mar, 2019 1 commit
  5. 12 Mar, 2019 8 commits
    • nir: silence a couple new compiler warnings · 02c2863d
      Brian Paul authored
      [33/630] Compiling C object 'src/compiler/nir/nir@sta/nir_loop_analyze.c.o'.
      ../src/compiler/nir/nir_loop_analyze.c: In function ‘try_find_trip_count_vars_in_iand’:
      ../src/compiler/nir/nir_loop_analyze.c:846:29: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
          if (*ind == NULL || *ind && (*ind)->type != basic_induction ||
                                   ^
      [85/630] Compiling C object 'src/compiler/nir/nir@sta/nir_opt_loop_unroll.c.o'.
      ../src/compiler/nir/nir_opt_loop_unroll.c: In function ‘complex_unroll_single_terminator’:
      ../src/compiler/nir/nir_opt_loop_unroll.c:494:17: warning: unused variable ‘unroll_loc’ [-Wunused-variable]
          nir_cf_node *unroll_loc =
                       ^
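
The first warning is about operator precedence: in C, `&&` binds tighter than `||`, so `p || q && r` is parsed as `p || (q && r)`. The sketch below is only an illustration of why GCC asks for explicit parentheses; the function names are hypothetical and the commit's actual diff is not shown here.

```c
#include <stdbool.h>

/* How C parses `p || q && r`: && binds tighter than ||. */
bool as_parsed(bool p, bool q, bool r)
{
   return p || (q && r);
}

/* The other grouping a reader might wrongly assume. */
bool misread(bool p, bool q, bool r)
{
   return (p || q) && r;
}
```

The two groupings disagree whenever `p` is true and `r` is false, which is why writing the parentheses out removes the ambiguity for readers.
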
      Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
    • nir: find induction/limit vars in iand instructions · 3235a942
      Timothy Arceri authored
      This will be used to help find the trip count of loops that look
      like the following:
      
         while (a < x && i < 8) {
            ...
            i++;
         }
      
      Where the NIR will end up looking something like this:
      
         vec1 32 ssa_1 = load_const (0x00000004 /* 0.000000 */)
         loop {
            ...
            vec1 1 ssa_12 = ilt ssa_225, ssa_11
            vec1 1 ssa_17 = ilt ssa_226, ssa_1
            vec1 1 ssa_18 = iand ssa_12, ssa_17
            vec1 1 ssa_19 = inot ssa_18
      
            if ssa_19 {
               ...
               break
            } else {
               ...
            }
         }
      
      On RADV this unrolls a bunch of loops in F1-2017 shaders.
      
      Totals from affected shaders:
      SGPRS: 4112 -> 4136 (0.58 %)
      VGPRS: 4132 -> 4052 (-1.94 %)
      Spilled SGPRs: 0 -> 0 (0.00 %)
      Spilled VGPRs: 0 -> 0 (0.00 %)
      Private memory VGPRs: 0 -> 0 (0.00 %)
      Scratch size: 0 -> 0 (0.00 %) dwords per thread
      Code Size: 515444 -> 587720 (14.02 %) bytes
      LDS: 2 -> 2 (0.00 %) blocks
      Max Waves: 194 -> 196 (1.03 %)
      Wait states: 0 -> 0 (0.00 %)
      
      It also unrolls a couple of loops in shader-db on radeonsi.
      
      Totals from affected shaders:
      SGPRS: 128 -> 128 (0.00 %)
      VGPRS: 64 -> 64 (0.00 %)
      Spilled SGPRs: 0 -> 0 (0.00 %)
      Spilled VGPRs: 0 -> 0 (0.00 %)
      Private memory VGPRs: 0 -> 0 (0.00 %)
      Scratch size: 0 -> 0 (0.00 %) dwords per thread
      Code Size: 6880 -> 9504 (38.14 %) bytes
      LDS: 0 -> 0 (0.00 %) blocks
      Max Waves: 16 -> 16 (0.00 %)
      Wait states: 0 -> 0 (0.00 %)
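
A minimal C sketch of the loop shape described above, assuming the elided loop body only advances `a` by a fixed amount. `run_loop` and `a_step` are hypothetical names used for illustration. It mirrors the NIR form, where both comparisons are combined with a logical and, negated, and used as the break condition.

```c
#include <stdbool.h>

/* Counts iterations of: while (a < x && i < 8) { a += a_step; i++; } */
int run_loop(int a, int x, int a_step)
{
   int i = 0;
   int trips = 0;

   for (;;) {
      bool c1 = a < x;      /* ilt on the non-induction operands */
      bool c2 = i < 8;      /* ilt on the induction variable */
      if (!(c1 && c2))      /* inot (iand c1, c2) */
         break;
      a += a_step;
      i++;
      trips++;
   }
   return trips;
}
```

Whatever `a` and `x` do, the `i < 8` half of the condition caps the trip count at 8, which is the kind of bound an unroller can work with.
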
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    • nir: pass nir_op to calculate_iterations() · 67c34784
      Timothy Arceri authored
      Passing the op in, rather than reading it from the alu instruction,
      gives us some flexibility. In a following patch we instead pass the
      inverse op.
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    • nir: add get_induction_and_limit_vars() helper to loop analysis · 11e8f8a1
      Timothy Arceri authored
      This helps make find_trip_count() a little easier to follow but
      will also be used by a following patch.
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    • nir: add helper to return inversion op of a comparison · f219f611
      Timothy Arceri authored
      This will be used to help find the trip count of loops that look
      like the following:
      
         while (a < x && i < 8) {
            ...
            i++;
         }
      
      Where the NIR will end up looking something like this:
      
         vec1 32 ssa_1 = load_const (0x00000004 /* 0.000000 */)
         loop {
            ...
            vec1 1 ssa_12 = ilt ssa_225, ssa_11
            vec1 1 ssa_17 = ilt ssa_226, ssa_1
            vec1 1 ssa_18 = iand ssa_12, ssa_17
            vec1 1 ssa_19 = inot ssa_18
      
            if ssa_19 {
               ...
               break
            } else {
               ...
            }
         }
      
      So in order to find the trip count we need to find the inverse of
      ilt.
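
As a rough sketch of what an inversion helper can look like, the following uses a hypothetical `cmp_op` enum rather than the real `nir_op` values, and covers integer comparisons only (float comparisons need extra care because of NaN). `invert_cmp` and `eval_cmp` are illustrative names, not the helper added by this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for integer comparison opcodes. */
typedef enum { CMP_ILT, CMP_IGE, CMP_IEQ, CMP_INE, CMP_ULT, CMP_UGE } cmp_op;

/* Return the comparison that is true exactly when `op` is false. */
cmp_op invert_cmp(cmp_op op)
{
   switch (op) {
   case CMP_ILT: return CMP_IGE;
   case CMP_IGE: return CMP_ILT;
   case CMP_IEQ: return CMP_INE;
   case CMP_INE: return CMP_IEQ;
   case CMP_ULT: return CMP_UGE;
   case CMP_UGE: return CMP_ULT;
   }
   assert(!"unknown comparison");
   return op;
}

/* Evaluate a comparison on 32-bit operands. */
bool eval_cmp(cmp_op op, int32_t a, int32_t b)
{
   switch (op) {
   case CMP_ILT: return a < b;
   case CMP_IGE: return a >= b;
   case CMP_IEQ: return a == b;
   case CMP_INE: return a != b;
   case CMP_ULT: return (uint32_t)a < (uint32_t)b;
   case CMP_UGE: return (uint32_t)a >= (uint32_t)b;
   }
   assert(!"unknown comparison");
   return false;
}
```

With the loop above, the exit condition is the negation of `i < limit`, so the analysis can reason about `i >= limit` by inverting `ilt`.
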
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    • nir: simplify the loop analysis trip count code a little · 090feaac
      Timothy Arceri authored
      Here we create a helper is_supported_terminator_condition()
      and use that rather than embedding all the trip count code
      inside a switch.
      
      The new helper will also be used in a following patch.
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    • nir: calculate trip count for more loops · 68ce0ec2
      Timothy Arceri authored
      This adds support to loop analysis for loops where the induction
      variable is compared to the result of min(variable, constant).
      
      For example:
      
         for (int i = 0; i < imin(x, 4); i++)
            ...
      
      We add a new bool to the loop terminator struct in order to
      differentiate terminators with this exit condition.
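
To illustrate why such a loop is analysable even though `x` is unknown, here is a small C sketch of the example loop, with `count_trips` as a hypothetical helper name. The constant operand of the min bounds the trip count from above; how the pass records this is not shown in this log.

```c
/* Integer minimum, matching the imin() used in the example. */
int imin(int a, int b)
{
   return a < b ? a : b;
}

/* Counts iterations of: for (int i = 0; i < imin(x, 4); i++) */
int count_trips(int x)
{
   int trips = 0;
   for (int i = 0; i < imin(x, 4); i++)
      trips++;
   return trips;
}
```

For any `x`, the result is between 0 and 4, so 4 can serve as a maximum trip count.
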
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    • nir: add guess trip count support to loop analysis · 03a452b7
      Timothy Arceri authored
      This detects an induction variable used as an array index to guess
      the trip count of the loop. This enables us to do a partial
      unroll of the loop, which can eventually result in the loop being
      eliminated.
      
      v2: check if the induction var is used to index more than a single
          array and if so get the size of the smallest array.
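
The v2 note can be sketched as follows, assuming the guess is simply the length of the smallest array indexed by the induction variable. `guess_trip_count` is a hypothetical name, and returning 0 for "no guess" is an assumption made for this illustration.

```c
#include <stddef.h>

/* Given the lengths of all arrays indexed by the induction variable,
 * guess the trip count as the smallest length. Returns 0 when there is
 * nothing to base a guess on. */
unsigned guess_trip_count(const unsigned *array_lengths, size_t count)
{
   if (count == 0)
      return 0;

   unsigned guess = array_lengths[0];
   for (size_t i = 1; i < count; i++) {
      if (array_lengths[i] < guess)
         guess = array_lengths[i];
   }
   return guess;
}
```

Taking the smallest array keeps the guess conservative: an in-bounds loop cannot index past the end of its shortest array.
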
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
  6. 06 Mar, 2019 1 commit
  7. 06 Jan, 2019 1 commit
  8. 16 Dec, 2018 2 commits
    • nir: Switch to using 1-bit Booleans for almost everything · 44227453
      Jason Ekstrand authored
      This is a squash of a few distinct changes:
      
          glsl,spirv: Generate 1-bit Booleans
      
          Revert "Use 32-bit opcodes in the NIR producers and optimizations"
      
          Revert "nir/builder: Generate 32-bit bool opcodes transparently"
      
          nir/builder: Generate 1-bit Booleans in nir_build_imm_bool
      Reviewed-by: Eric Anholt <eric@anholt.net>
      Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
      Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
    • nir: Rename Boolean-related opcodes to include 32 in the name · 80e8dfe9
      Jason Ekstrand authored
      This is a squash of a bunch of individual changes:
      
          nir/builder: Generate 32-bit bool opcodes transparently
      
          nir/algebraic: Remap Boolean opcodes to the 32-bit variant
      
          Use 32-bit opcodes in the NIR producers and optimizations
      
              Generated with a little hand-editing and the following sed commands:
      
              sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c
              sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c
              sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c
              sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c
              sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c
              sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c
              sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c
              sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c
              sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c
              sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c
      
          Use 32-bit opcodes in the NIR back-ends
      
              Generated with a little hand-editing and the following sed commands:
      
              sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c
              sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c
              sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c
              sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c
              sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c
              sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c
              sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c
              sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c
              sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c
              sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c
      Reviewed-by: Eric Anholt <eric@anholt.net>
      Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
      Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
  9. 13 Dec, 2018 1 commit
  10. 12 Dec, 2018 4 commits
    • nir: detect more induction variables · 9e6b39e1
      Timothy Arceri authored
      This allows loop analysis to detect induction variables that
      are incremented in both branches of an if rather than in a main
      loop block. For example:
      
         loop {
            block block_1:
            /* preds: block_0 block_7 */
            vec1 32 ssa_8 = phi block_0: ssa_4, block_7: ssa_20
            vec1 32 ssa_9 = phi block_0: ssa_0, block_7: ssa_4
            vec1 32 ssa_10 = phi block_0: ssa_1, block_7: ssa_4
            vec1 32 ssa_11 = phi block_0: ssa_2, block_7: ssa_21
            vec1 32 ssa_12 = phi block_0: ssa_3, block_7: ssa_22
            vec4 32 ssa_13 = vec4 ssa_12, ssa_11, ssa_10, ssa_9
            vec1 32 ssa_14 = ige ssa_8, ssa_5
            /* succs: block_2 block_3 */
            if ssa_14 {
               block block_2:
               /* preds: block_1 */
               break
               /* succs: block_8 */
            } else {
               block block_3:
               /* preds: block_1 */
               /* succs: block_4 */
            }
            block block_4:
            /* preds: block_3 */
            vec1 32 ssa_15 = ilt ssa_6, ssa_8
            /* succs: block_5 block_6 */
            if ssa_15 {
               block block_5:
               /* preds: block_4 */
               vec1 32 ssa_16 = iadd ssa_8, ssa_7
               vec1 32 ssa_17 = load_const (0x3f800000 /* 1.000000*/)
               /* succs: block_7 */
            } else {
               block block_6:
               /* preds: block_4 */
               vec1 32 ssa_18 = iadd ssa_8, ssa_7
               vec1 32 ssa_19 = load_const (0x3f800000 /* 1.000000*/)
               /* succs: block_7 */
            }
            block block_7:
            /* preds: block_5 block_6 */
            vec1 32 ssa_20 = phi block_5: ssa_16, block_6: ssa_18
            vec1 32 ssa_21 = phi block_5: ssa_17, block_6: ssa_4
            vec1 32 ssa_22 = phi block_5: ssa_4, block_6: ssa_19
            /* succs: block_1 */
         }
      
      GCM could move the addition out of the if for us (making this
      patch unnecessary), but unfortunately we still cannot enable the
      GCM pass without regressions.
      
      This unrolls a loop in Rise of The Tomb Raider.
      
      vkpipeline-db results (VEGA):
      
      Totals from affected shaders:
      SGPRS: 88 -> 96 (9.09 %)
      VGPRS: 56 -> 52 (-7.14 %)
      Spilled SGPRs: 0 -> 0 (0.00 %)
      Spilled VGPRs: 0 -> 0 (0.00 %)
      Private memory VGPRs: 0 -> 0 (0.00 %)
      Scratch size: 0 -> 0 (0.00 %) dwords per thread
      Code Size: 2168 -> 4560 (110.33 %) bytes
      LDS: 0 -> 0 (0.00 %) blocks
      Max Waves: 4 -> 4 (0.00 %)
      Wait states: 0 -> 0 (0.00 %)
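
In source terms, the NIR above corresponds roughly to a loop whose counter is bumped by the same amount in both arms of an if. The C sketch below, with hypothetical function and parameter names, shows that such a counter behaves exactly like one incremented once in the main loop body, which is what makes it a valid induction variable.

```c
#include <assert.h>

/* Counter incremented by `step` in both branches of an if. */
int trips_branchy(int start, int step, int limit, int pivot)
{
   assert(step > 0);
   int i = start;
   int trips = 0;

   while (i < limit) {
      if (pivot < i)
         i = i + step;   /* then-branch increment */
      else
         i = i + step;   /* else-branch increment */
      trips++;
   }
   return trips;
}

/* Same loop with the increment hoisted out of the if. */
int trips_hoisted(int start, int step, int limit)
{
   assert(step > 0);
   int i = start;
   int trips = 0;

   while (i < limit) {
      i = i + step;
      trips++;
   }
   return trips;
}
```

Both functions return the same trip count for any `pivot`, since the branch taken does not change how `i` advances.
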
      Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211
    • nir: reword code comment · c03d6e61
      Timothy Arceri authored
      Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
    • nir: in loop analysis track actual control flow type · 48b40380
      Timothy Arceri authored
      This will allow us to improve analysis to find more induction
      variables.
      Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
    • nir: rework force_unroll_array_access() · 721566bd
      Timothy Arceri authored
      Here we rework force_unroll_array_access() so that we can reuse
      the induction variable detection in a following patch.
      Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
  11. 10 Dec, 2018 2 commits
  12. 29 Aug, 2018 2 commits
  13. 30 Jun, 2018 1 commit
    • nir: fix selection of loop terminator when two or more have the same limit · 463f8490
      Timothy Arceri authored
      We need to add loop terminators to the list in the order we come
      across them, otherwise if two or more have the same exit condition
      we will select the last one rather than the first one, even though
      it's unreachable.
      
      This fix is for simple unrolls where we only have a single exit
      point. When unrolling this type of loop, the unreachable
      terminators and their unreachable branches are removed prior to
      unrolling. Because of the logic change we also switch some
      list access in the complex unrolling logic to avoid breakage.
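
A minimal sketch of the tie-breaking idea, assuming the limiting terminator is the one with the smallest known trip count. The `terminator` struct and `pick_limiting` are hypothetical and much simpler than the real data structures. Keeping the terminators in encounter order and comparing with a strict `<` means the first of several equal candidates wins.

```c
#include <stddef.h>

/* Hypothetical terminator record; trip_count < 0 means "unknown". */
typedef struct {
   int id;
   int trip_count;
} terminator;

/* Return the index of the limiting terminator, or -1 if none has a
 * known trip count. Terminators must be in the order they are
 * encountered in the loop. */
int pick_limiting(const terminator *terms, size_t count)
{
   int best = -1;

   for (size_t i = 0; i < count; i++) {
      if (terms[i].trip_count < 0)
         continue;
      /* Strict '<' keeps the earliest terminator on a tie. */
      if (best < 0 || terms[i].trip_count < terms[best].trip_count)
         best = (int)i;
   }
   return best;
}
```

If the list were built in reverse order instead, the same comparison would pick the later, unreachable terminator on a tie.
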
      
      Fixes: 6772a17a ("nir: Add a loop analysis pass")
      Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
  14. 23 Jun, 2018 3 commits
  15. 07 Jun, 2018 1 commit
  16. 23 Mar, 2017 1 commit
  17. 27 Feb, 2017 1 commit
  18. 04 Jan, 2017 1 commit
  19. 22 Dec, 2016 1 commit
    • nir: Add a loop analysis pass · 6772a17a
      Thomas Helland authored
      This pass detects induction variables and calculates the
      trip count of loops to be used for loop unrolling.
      
      V2: Rebase, adapt to removal of function overloads
      
      V3: (Timothy Arceri)
       - don't try to find trip count if loop terminator conditional is a phi
       - fix trip count for do-while loops
       - replace conditional type != alu assert with return
       - disable unrolling of loops with continues
       - multiple fixes to memory allocation, stop leaking and don't destroy
         structs we want to use for unrolling.
       - fix iteration count bugs when induction var not on RHS of condition
       - add FIXME for && conditions
       - calculate trip count for unsigned induction/limit vars
      
      V4: (Timothy Arceri)
      - count instructions in a loop
      - set the limiting_terminator even if we can't find the trip count for
       all terminators. This is needed for complex unrolling where we handle
       2 terminators and the trip count is unknown for one of them.
      - restructure structs so we don't keep information not required after
       analysis and remove dead fields.
      - force unrolling in some cases as per the rules in the GLSL IR pass
      
      V5: (Timothy Arceri)
      - fix metadata mask value 0x10 vs 0x16
      
      V6: (Timothy Arceri)
      - merge loop_variable and nir_loop_variable structs and lists suggested by Jason
      - remove induction var hash table and store pointer to induction information in
        the loop_variable suggested by Jason.
      - use lowercase list_addtail() suggested by Jason.
      - tidy up init_loop_block() as per Jason's suggestions.
      - replace switch with nir_op_infos[alu->op].num_inputs == 2 in
        is_var_basic_induction_var() as suggested by Jason.
      - use nir_block_last_instr() in and rename foreach_cf_node_ex_loop() as suggested
        by Jason.
      - fix else check for is_trivial_loop_terminator() as per Connor's suggestions.
      - simplify offset for induction variables incremented before the exit condition is
        checked.
      - replace nir_op_isub check with assert() as it should have been lowered away.
      
      V7: (Timothy Arceri)
      - use rzalloc() on nir_loop struct creation. Worked previously because ralloc()
        was broken and always zeroed the struct.
      - fix cf_node_find_loop_jumps() to find jumps when loops contain
        nested if statements. Code is tidier as a result.
      
      V8: (Timothy Arceri)
      - move is_trivial_loop_terminator() to nir.h so we can use it to assert in
        the loop unroll pass
      - fix analysis to not bail when looking for terminator when the break is in the else
        rather than the if
      - added new loop terminator fields: break_block, continue_from_block and
        continue_from_then so we don't have to gather these when doing unrolling.
      - get correct array length when forcing unrolling of variable
        indexed arrays that are the same size as the iteration count
      - add support for induction variables of type float
      - update trivial loop terminator check to allow an if containing
        instructions as long as both branches contain only a single
        block.
      
      V9: (Timothy)
       - bunch of tidy ups and simplifications suggested by Jason.
       - rewrote trivial terminator detection, now the only restriction is there
         must be no nested jumps, anything else goes.
       - rewrote the iteration test to use nir_eval_const_opcode().
       - count instructions properly even when forcing an unroll.
       - bunch of other tidy ups and simplifications.
      
      V10: (Timothy)
       - some trivial tidy ups suggested by Jason.
       - conditional fix for break inside continue branch by Jason.
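
For the simplest case the pass description covers, a signed induction variable with a constant start, a positive constant step and a `<` exit test, the trip count has a closed form. The helper below is only an illustrative sketch with a hypothetical name; per the V9 note the pass itself evaluates the condition with nir_eval_const_opcode(), which is not reproduced here. It assumes values small enough that the arithmetic does not overflow.

```c
#include <assert.h>

/* Trip count of: for (i = init; i < limit; i += step), with step > 0.
 * Assumes (limit - init + step) does not overflow an int. */
int trip_count_ilt(int init, int step, int limit)
{
   assert(step > 0);
   if (init >= limit)
      return 0;
   /* Ceiling division of the distance by the step. */
   return (limit - init + step - 1) / step;
}
```

For example, `init = 0`, `step = 3`, `limit = 8` visits 0, 3 and 6, giving a trip count of 3.
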
      Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>