1. 14 Jun, 2021 2 commits
    • aco: adjust the condition for expanding vertex fetch data format · 1d50ef9c
      Rhys Perry authored
      
      
      Instead of avoiding out-of-bounds access, avoid creating a load larger
      than the original attribute. This should work just as well, since the only
      situations where expanding a load helped were ones where we had shrunk it first.
      
      Also fixes a bug where a 3 component load (4 components with the first
      component skipped) would be incorrectly expanded to 4 components because
      the stride check would never be performed. Maybe we should avoid skipping
      the first component in some situations, but I'm not sure if it's worth
      the VGPR cost.
      
      fossil-db (vega10):
      Totals from 583 (0.39% of 149974) affected shaders:
      CodeSize: 1496848 -> 1500868 (+0.27%); split: -0.03%, +0.30%
      Instrs: 286155 -> 286575 (+0.15%); split: -0.07%, +0.22%
      Latency: 2947101 -> 2946865 (-0.01%); split: -0.23%, +0.22%
      InvThroughput: 797396 -> 797127 (-0.03%); split: -0.08%, +0.04%
      
      fossil-db (polaris10):
      Totals from 583 (0.39% of 151365) affected shaders:
      SGPRs: 38880 -> 39216 (+0.86%)
      VGPRs: 24440 -> 24356 (-0.34%)
      CodeSize: 1506808 -> 1510876 (+0.27%); split: -0.01%, +0.28%
      Instrs: 288735 -> 289167 (+0.15%); split: -0.06%, +0.21%
      Latency: 2963263 -> 2961884 (-0.05%); split: -0.24%, +0.19%
      InvThroughput: 802351 -> 801665 (-0.09%); split: -0.12%, +0.04%
      Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
      Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
      Part-of: <mesa/mesa!9007>
    • radv,aco: use all attributes in a binding to obtain an alignment for fetch · 91f8f828
      Rhys Perry authored
      
      
      Instead of assuming scalar alignment for an attribute, we can use the
      required alignment of other attributes in a binding to expect a higher
      one.
      
      This uses the alignment of all attributes in the pipeline, not just the
      ones loaded. This can create slightly better code, but could break
      pipelines which relied on unused (and unaligned) attributes not being
      loaded. I don't think such pipelines are allowed by the spec.
      
      fossil-db (Sienna Cichlid):
      Totals from 44350 (30.32% of 146267) affected shaders:
      VGPRs: 1694464 -> 1700616 (+0.36%); split: -0.08%, +0.44%
      CodeSize: 60207184 -> 58093836 (-3.51%); split: -3.51%, +0.00%
      MaxWaves: 1175998 -> 1174948 (-0.09%); split: +0.02%, -0.11%
      Instrs: 11763444 -> 11458952 (-2.59%); split: -2.60%, +0.01%
      Latency: 70679612 -> 67062215 (-5.12%); split: -5.27%, +0.15%
      InvThroughput: 11482495 -> 11362911 (-1.04%); split: -1.20%, +0.16%
      VClause: 359459 -> 343248 (-4.51%); split: -6.36%, +1.85%
      SClause: 422404 -> 419229 (-0.75%); split: -1.17%, +0.42%
      Copies: 754384 -> 764368 (+1.32%); split: -1.74%, +3.06%
      Branches: 197472 -> 197474 (+0.00%); split: -0.03%, +0.03%
      PreVGPRs: 1215348 -> 1215503 (+0.01%)
      Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
      Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
      Part-of: <mesa/mesa!9007>
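
The alignment idea in 91f8f828 can be illustrated with a minimal sketch. This is not the radv/aco code: the `Attribute` struct and both function names are hypothetical. It also assumes that the binding's base address and stride are multiples of the strictest component size among the binding's attributes, and that component sizes are powers of two.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one vertex attribute (not a radv type).
struct Attribute {
    uint32_t binding;         // vertex buffer binding index
    uint32_t offset;          // byte offset inside one vertex
    uint32_t component_bytes; // size of one component: 1, 2, 4 or 8
};

// Strictest component alignment required by ANY attribute in the binding,
// including attributes the shader never loads.
uint32_t binding_alignment(const std::vector<Attribute>& attrs, uint32_t binding)
{
    uint32_t align = 1;
    for (const Attribute& a : attrs) {
        if (a.binding == binding)
            align = std::max(align, a.component_bytes);
    }
    return align;
}

// Alignment we may expect for the fetch address of one attribute:
// the binding alignment, limited by the lowest set bit of the offset.
uint32_t fetch_alignment(const std::vector<Attribute>& attrs, const Attribute& a)
{
    uint32_t align = binding_alignment(attrs, a.binding);
    if (a.offset != 0)
        align = std::min(align, a.offset & (~a.offset + 1u));
    return align;
}
```

With a byte-sized attribute at offset 0 and a 4-byte attribute at offset 4 in the same binding, the byte-sized attribute is treated as 4-byte aligned instead of only scalar (1-byte) aligned, which is what would allow a wider fetch.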
  2. 10 Jun, 2021 2 commits
  3. 09 Jun, 2021 7 commits
  4. 08 Jun, 2021 2 commits
  5. 07 Jun, 2021 1 commit
  6. 04 Jun, 2021 1 commit
    • radv,aco: scalarize all phis via nir_lower_phis_to_scalar() · dc807dff
      Daniel Schürmann authored
      
      
      This allows removing some ACO code which previously did this.
      
      Totals from 93 (0.06% of 149839) affected shaders (Navi2):
      CodeSize: 582424 -> 582348 (-0.01%); split: -0.10%, +0.08%
      Instrs: 107083 -> 107011 (-0.07%); split: -0.08%, +0.01%
      Latency: 483338 -> 484881 (+0.32%); split: -0.09%, +0.40%
      InvThroughput: 101129 -> 101532 (+0.40%); split: -0.03%, +0.42%
      Copies: 9893 -> 9774 (-1.20%); split: -1.28%, +0.08%
      Branches: 2862 -> 2858 (-0.14%)
      PreSGPRs: 3342 -> 3339 (-0.09%)
      PreVGPRs: 4567 -> 4565 (-0.04%)
      Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
      Part-of: <mesa/mesa!11181>
  7. 03 Jun, 2021 1 commit
    • aco: don't create 4 and 5 dword NSA instructions on GFX10 · 903f814b
      Rhys Perry authored
      "stability issues", apparently: https://reviews.llvm.org/D103348
      
      
      
      fossil-db (Navi10):
      Totals from 4512 (3.01% of 149839) affected shaders:
      VGPRs: 221516 -> 223308 (+0.81%); split: -0.07%, +0.88%
      CodeSize: 23000080 -> 23070672 (+0.31%); split: -0.08%, +0.39%
      MaxWaves: 107718 -> 107496 (-0.21%); split: +0.11%, -0.32%
      Instrs: 4321890 -> 4362822 (+0.95%); split: -0.00%, +0.95%
      Latency: 71495710 -> 71581476 (+0.12%); split: -0.07%, +0.19%
      InvThroughput: 11858568 -> 11938960 (+0.68%); split: -0.00%, +0.68%
      VClause: 76575 -> 76585 (+0.01%); split: -0.05%, +0.07%
      SClause: 168771 -> 168709 (-0.04%); split: -0.06%, +0.02%
      Copies: 182305 -> 221948 (+21.75%); split: -0.00%, +21.75%
      PreVGPRs: 194657 -> 195635 (+0.50%); split: -0.00%, +0.50%
      Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
      Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
      Fixes: c353895c ("aco: use non-sequential addressing")
      Part-of: <mesa/mesa!10898>
  8. 28 May, 2021 1 commit
    • aco: Use s_cbranch_vccz/nz in post-RA optimization. · a93092d0
      Timur Kristóf authored
      
      
      A simple post-RA optimization which takes advantage of the
      s_cbranch_vccz and s_cbranch_vccnz instructions.
      
      It works on the following pattern:
      
      vcc = v_cmp ...
      scc = s_and vcc, exec
      p_cbranch scc
      
      The result looks like this:
      
      vcc = v_cmp ...
      p_cbranch vcc
      
      Fossil DB results on Sienna Cichlid:
      
      Totals from 4814 (3.21% of 149839) affected shaders:
      CodeSize: 15371176 -> 15345964 (-0.16%)
      Instrs: 3028557 -> 3022254 (-0.21%)
      Latency: 21872753 -> 21823476 (-0.23%); split: -0.23%, +0.00%
      InvThroughput: 4470282 -> 4468691 (-0.04%); split: -0.04%, +0.00%
      Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
      Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
      Part-of: <mesa/mesa!7779>
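
The rewrite described in a93092d0 can be sketched as a peephole over a toy instruction list. This is only an illustration: the `Instr` struct and `fold_vcc_branch` are hypothetical and unrelated to ACO's real IR. The sketch matches only three directly adjacent instructions and assumes nothing else reads `scc`. A real pass would also have to verify that `vcc` and `exec` are not overwritten in between.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction: an opcode, one definition and a list of operands.
// Hypothetical type for illustration, not part of ACO.
struct Instr {
    std::string op;
    std::string def;
    std::vector<std::string> operands;
};

// True for "scc = s_and vcc, exec" (operands in either order).
static bool is_vcc_and_exec(const Instr& i)
{
    if (i.op != "s_and" || i.def != "scc" || i.operands.size() != 2)
        return false;
    return (i.operands[0] == "vcc" && i.operands[1] == "exec") ||
           (i.operands[0] == "exec" && i.operands[1] == "vcc");
}

// Replaces
//   vcc = v_cmp ... ; scc = s_and vcc, exec ; p_cbranch scc
// with
//   vcc = v_cmp ... ; p_cbranch vcc
std::vector<Instr> fold_vcc_branch(const std::vector<Instr>& in)
{
    std::vector<Instr> out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        bool match = i + 2 < in.size() &&
                     in[i].op.rfind("v_cmp", 0) == 0 && in[i].def == "vcc" &&
                     is_vcc_and_exec(in[i + 1]) &&
                     in[i + 2].op == "p_cbranch" &&
                     in[i + 2].operands.size() == 1 &&
                     in[i + 2].operands[0] == "scc";
        out.push_back(in[i]);
        if (match) {
            // Branch directly on vcc and drop the s_and.
            out.push_back(Instr{"p_cbranch", "", {"vcc"}});
            i += 2;
        }
    }
    return out;
}
```

Feeding the three-instruction pattern from the commit message through `fold_vcc_branch` yields two instructions, with the branch reading `vcc`. Sequences that do not match are copied through unchanged.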
  9. 26 May, 2021 1 commit
  10. 20 May, 2021 2 commits
  11. 18 May, 2021 2 commits
  12. 12 May, 2021 5 commits
  13. 10 May, 2021 4 commits
  14. 27 Apr, 2021 2 commits
    • radv,aco: use nir_address_format_vec2_index_32bit_offset · ee9b744c
      Rhys Perry authored
      The vec2 index helps the compiler make use of SMEM's SOFFSET field when
      loading descriptors.
      
      fossil-db (GFX10.3):
      Totals from 126326 (86.37% of 146267) affected shaders:
      VGPRs: 4898704 -> 4899088 (+0.01%); split: -0.02%, +0.03%
      SpillSGPRs: 13490 -> 14404 (+6.78%); split: -1.10%, +7.87%
      CodeSize: 306442996 -> 302277700 (-1.36%); split: -1.36%, +0.01%
      MaxWaves: 3277108 -> 3276624 (-0.01%); split: +0.01%, -0.02%
      Instrs: 58301101 -> 57469370 (-1.43%); split: -1.43%, +0.01%
      VClause: 1208270 -> 1199264 (-0.75%); split: -1.02%, +0.28%
      SClause: 2517691 -> 2432744 (-3.37%); split: -3.75%, +0.38%
      Copies: 3518643 -> 3161097 (-10.16%); split: -10.45%, +0.29%
      Branches: 1228383 -> 1228254 (-0.01%); split: -0.12%, +0.11%
      PreSGPRs: 3973880 -> 4031099 (+1.44%); split: -0.19%, +1.63%
      PreVGPRs: 3831599 -> 3831707 (+0.00%)
      Cycles: 1785250712 -> 1778222316 (-0.39%); split: -0.42%, +0.03%
      VMEM: 52873776 -> 50663317 (-4.18%); split: +0.18%, -4.36%
      SMEM: 8534270 -> 836166...
  15. 23 Apr, 2021 1 commit
  16. 22 Apr, 2021 1 commit
  17. 20 Apr, 2021 2 commits
    • aco: remove image parameter from get_sampler_desc() · 0eaa5dfa
      Rhys Perry authored
      
      
      We can just check whether tex_instr is NULL instead.
      Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
      Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
      Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
      Part-of: <mesa/mesa!10036>
    • aco: set TRUNC_COORD=0 for nir_texop_tg4 · 3cbe9894
      Rhys Perry authored
      
      
      Fixes black squares in Assassin's Creed: Valhalla and rendering of
      FidelityFX-CACAO demo.
      
      fossil-db (sienna cichlid):
      Totals from 3052 (2.09% of 146267) affected shaders:
      SpillSGPRs: 8437 -> 8646 (+2.48%)
      CodeSize: 30993832 -> 31116916 (+0.40%); split: -0.00%, +0.40%
      Instrs: 5869934 -> 5886783 (+0.29%); split: -0.00%, +0.29%
      Latency: 250330521 -> 250463770 (+0.05%); split: -0.00%, +0.05%
      InvThroughput: 59797617 -> 59814584 (+0.03%); split: -0.00%, +0.03%
      VClause: 92114 -> 92132 (+0.02%)
      SClause: 197373 -> 197338 (-0.02%); split: -0.02%, +0.01%
      Copies: 479482 -> 482394 (+0.61%); split: -0.01%, +0.61%
      Branches: 219629 -> 219635 (+0.00%)
      PreSGPRs: 248970 -> 249366 (+0.16%)
      
      fossil-db (polaris10):
      Totals from 3050 (2.06% of 147787) affected shaders:
      SGPRs: 282864 -> 282912 (+0.02%); split: -0.01%, +0.02%
      VGPRs: 242572 -> 242612 (+0.02%)
      SpillSGPRs: 10387 -> 10675 (+2.77%)
      CodeSize: 31872460 -> 31996128 (+0.39%)
      MaxWaves: 10924 -> 10925 (+0.01%)
      Instrs: 6222217 -> 6239072 (+0.27%)
      Latency: 317482545 -> 317773685 (+0.09%); split: -0.00%, +0.09%
      InvThroughput: 156149624 -> 156242072 (+0.06%); split: -0.00%, +0.06%
      VClause: 92295 -> 92254 (-0.04%); split: -0.05%, +0.01%
      SClause: 243342 -> 243321 (-0.01%); split: -0.01%, +0.00%
      Copies: 678902 -> 681700 (+0.41%); split: -0.00%, +0.41%
      Branches: 219698 -> 219703 (+0.00%)
      PreSGPRs: 244251 -> 244644 (+0.16%)
      Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
      Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
      Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
      Fixes: 58f25098 ("radv: Use TRUNC_COORD on samplers")
      Closes: mesa/mesa#3110
      Part-of: <mesa/mesa!10036>
  18. 19 Apr, 2021 1 commit
  19. 17 Apr, 2021 1 commit
  20. 14 Apr, 2021 1 commit