1. 13 May, 2020 1 commit
  2. 01 May, 2020 2 commits
  3. 29 Apr, 2020 1 commit
  4. 10 Apr, 2020 2 commits
  5. 07 Apr, 2020 1 commit
    • intel/fs: Allow multiple slots for position · 395de69b
      Caio Marcelo de Oliveira Filho authored
      Change brw_compute_vue_map() to also take the number of pos slots.  If
      more than one slot is used, the VARYING_SLOT_POS is treated as an
      array.
      
      When using Primitive Replication, instead of a single position, the
      VUE must contain an array of positions.  Padding might be
      necessary (after clip distance) to ensure the rest of the attributes
      start aligned.
      
      v2: Add note about array in the commit message and assert that
          pos_slots >= 1 to make clear 0 is invalid. (Jason)
          Move padding to be after the clip distance.
      
      v3: Apply the correct offset when gathering the sources from outputs.
      
      Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v2]
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
      Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
      Part-of: <mesa/mesa!2313>
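A minimal sketch of the slot layout this commit message describes, not the actual brw_compute_vue_map() code. All names here (sketch_vue_map, sketch_compute_vue_map, the SLOT_* values) are hypothetical, and the rule that the header must end on an even slot count is an assumption used to illustrate the padding step.

```c
#include <assert.h>

/* Hypothetical slot kinds; these are not Mesa's VARYING_SLOT_* values. */
enum sketch_slot {
    SLOT_HEADER,
    SLOT_POS,
    SLOT_CLIP_DIST0,
    SLOT_CLIP_DIST1,
    SLOT_PAD,
    SLOT_GENERIC
};

#define SKETCH_MAX_SLOTS 32

struct sketch_vue_map {
    int slot_to_varying[SKETCH_MAX_SLOTS];
    int num_slots;
};

/* Lays out: header, pos_slots positions, optional clip distances,
 * optional padding, then generic attributes.
 * Returns the index of the first generic attribute slot. */
static int sketch_compute_vue_map(struct sketch_vue_map *map, int pos_slots,
                                  int has_clip_dist, int num_generic)
{
    int slot = 0;

    /* Zero position slots is invalid, as the v2 note points out. */
    assert(pos_slots >= 1);
    assert(1 + pos_slots + 2 + 1 + num_generic <= SKETCH_MAX_SLOTS);

    map->slot_to_varying[slot++] = SLOT_HEADER;

    /* With more than one slot, the position is treated as an array. */
    for (int i = 0; i < pos_slots; i++)
        map->slot_to_varying[slot++] = SLOT_POS;

    if (has_clip_dist) {
        map->slot_to_varying[slot++] = SLOT_CLIP_DIST0;
        map->slot_to_varying[slot++] = SLOT_CLIP_DIST1;
    }

    /* Assumed alignment rule: pad after the clip distances so the
     * remaining attributes start on an even slot. */
    if (slot % 2)
        map->slot_to_varying[slot++] = SLOT_PAD;

    int first_generic = slot;
    for (int i = 0; i < num_generic; i++)
        map->slot_to_varying[slot++] = SLOT_GENERIC;

    map->num_slots = slot;
    return first_generic;
}
```

With one position slot and no clip distances, no padding is needed in this sketch; with two position slots plus clip distances, one padding slot is inserted before the generic attributes.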
  6. 03 Apr, 2020 1 commit
  7. 16 Mar, 2020 1 commit
  8. 03 Mar, 2020 2 commits
  9. 22 Feb, 2020 1 commit
  10. 03 Jan, 2020 1 commit
  11. 23 Dec, 2019 2 commits
  12. 11 Dec, 2019 1 commit
    • iris: Create smaller program keys without legacy features · 2e654db2
      Kenneth Graunke authored
      A lot of the brw_*_prog_key fields are for emulating features on legacy
      hardware that iris doesn't support.  In particular, all of the texture
      swizzle fields take up a lot of space.  These dead fields make hashing
      the shader keys more expensive than it ought to be.
      
      We introduce iris-specific keys with only the information we need, and
      translate them to brw keys when actually compiling new variants.  This
      way, key comparisons can use the small keys.  The size reductions are:
      
         VS:  328 bytes ->  8 bytes
         TCS: 312 bytes -> 24 bytes
         TES: 304 bytes -> 24 bytes
         GS:  284 bytes ->  8 bytes
         FS:  304 bytes -> 16 bytes
         CS:  280 bytes ->  4 bytes
      
      Scores for the Piglit drawoverhead microbenchmark case with a shader
      program change improve by roughly 30%.
      Reviewed-by: Eric Anholt <eric@anholt.net>
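The small-key idea can be sketched roughly as follows. The struct layouts and function names are hypothetical stand-ins, not the real brw_vs_prog_key or iris key definitions: a compact key is used for comparisons, and it is expanded into the large compiler key only when a new variant is compiled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical large compiler key carrying legacy-only fields. */
struct big_vs_key {
    uint32_t program_string_id;
    uint8_t  legacy_tex_swizzles[128]; /* unused by the newer driver */
    uint8_t  legacy_workarounds[64];   /* unused by the newer driver */
    uint32_t nr_userclip_plane_consts;
};

/* Hypothetical driver-specific key holding only what is needed. */
struct small_vs_key {
    uint32_t program_string_id;
    uint32_t nr_userclip_plane_consts;
};

/* Expand a small key into a big one; legacy fields stay zeroed.
 * This runs only when a new shader variant is actually compiled. */
static void small_to_big_vs_key(const struct small_vs_key *small,
                                struct big_vs_key *big)
{
    assert(small != NULL && big != NULL);
    memset(big, 0, sizeof(*big));
    big->program_string_id = small->program_string_id;
    big->nr_userclip_plane_consts = small->nr_userclip_plane_consts;
}

/* Variant lookups compare only the small key's bytes. The struct has
 * two 32-bit members and no padding, so memcmp is safe here. */
static bool small_vs_keys_equal(const struct small_vs_key *a,
                                const struct small_vs_key *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

The trade-off in this sketch is one extra translation step per compile in exchange for cheaper comparisons on every lookup, which matches the motivation given in the commit message.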
  13. 04 Dec, 2019 1 commit
    • iris: Stop setting up fake params · 0604768a
      Jason Ekstrand authored
      In d1c4e64a, we added a parameter to tell the back-end compiler to
      ignore the param array and just push however many constants you ask it
      to push.  Iris doesn't want to push anything so it gives a bogus number
      of parameters and trusts the back-end compiler to dead-code all of them.
      Now that we can tell the back-end compiler to stop re-arranging things,
      delete the hack and enable the new simpler code path.
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  14. 14 Nov, 2019 2 commits
  15. 12 Nov, 2019 1 commit
  16. 23 Oct, 2019 1 commit
    • iris: Rework edgeflag handling · 8dadef2e
      Kenneth Graunke authored
      We were relying on specific pass ordering in st to avoid setting
      inputs_read/outputs_written for edge flags.  Instead, just assume
      that it happens and throw out the results we don't want.
      
      We should probably revisit this and try and add a vertex element
      property like I originally wanted so we can avoid having it be
      associated with the VS altogether.
  17. 17 Oct, 2019 2 commits
  18. 14 Oct, 2019 1 commit
  19. 10 Oct, 2019 1 commit
  20. 09 Oct, 2019 1 commit
    • iris: Implement the Broadwell NP Z PMA Stall Fix · 0b7ecfdd
      Kenneth Graunke authored
      This should help avoid stalls in the pixel mask array in certain
      non-promoted depth cases.  It especially helps for Z16, as each bit
      in the PMA corresponds to two pixels when using Z16, as opposed to
      the usual one pixel.
      
      Improves performance in GFXBench5 TRex by 22% (n=1).
  21. 23 Sep, 2019 1 commit
    • intel: Increase Gen11 compute shader scratch IDs to 64. · b9e93db2
      Kenneth Graunke authored
      From the MEDIA_VFE_STATE docs:
      
         "Starting with this configuration, the Maximum Number of Threads must
          be set to (#EU * 8) for GPGPU dispatches.
      
          Although there are only 7 threads per EU in the configuration, the
          FFTID is calculated as if there are 8 threads per EU, which in turn
          requires a larger amount of Scratch Space to be allocated by the
          driver."
      
      It's pretty clear that we need to increase this for scratch address
      calculations, because the FFTID has a certain bit-pattern.  The quote
      above seems to indicate that we should increase the actual thread count
      programmed in MEDIA_VFE_STATE as well, but we think the intention is to
      only bump the scratch space.
      
      Fixes GPU hangs in Bioshock Infinite and Synmark's CSDof on Icelake 8x8.
      
      Fixes: 5ac804bd ("intel: Add a preliminary device for Ice Lake")
      Reviewed-by: Matt Turner <mattst88@gmail.com>
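The arithmetic behind the scratch sizing can be sketched as below. The function names are hypothetical, and the figure of 8 EUs per subslice used in the example is an assumption chosen so that the result matches the 64 scratch IDs in the commit title; only the 7-versus-8 threads per EU distinction comes from the quoted documentation.

```c
#include <assert.h>
#include <stdint.h>

enum {
    HW_THREADS_PER_EU    = 7, /* threads that actually exist per EU */
    FFTID_THREADS_PER_EU = 8  /* thread count the FFTID assumes per EU */
};

/* Scratch IDs if sized by the real hardware thread count (too few). */
static uint32_t sketch_naive_scratch_ids(uint32_t eus_per_subslice)
{
    return eus_per_subslice * HW_THREADS_PER_EU;
}

/* Scratch IDs sized the way the FFTID is calculated. */
static uint32_t sketch_scratch_ids(uint32_t eus_per_subslice)
{
    return eus_per_subslice * FFTID_THREADS_PER_EU;
}

/* Total scratch bytes to allocate across all subslices. */
static uint64_t sketch_scratch_bytes(uint32_t subslices,
                                     uint32_t eus_per_subslice,
                                     uint32_t bytes_per_thread)
{
    assert(bytes_per_thread > 0);
    return (uint64_t)subslices *
           sketch_scratch_ids(eus_per_subslice) *
           bytes_per_thread;
}
```

Under the assumed 8 EUs per subslice, the naive count gives 56 IDs while the FFTID-based count gives 64, so a scratch address derived from the FFTID could fall outside a buffer sized with the naive count.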
  22. 18 Sep, 2019 1 commit
    • iris: Avoid uploading SURFACE_STATE descriptors for UBOs if possible · 3da8a8a3
      Kenneth Graunke authored
      If we can entirely push uniform data, we don't need a SURFACE_STATE
      descriptor for pulling data.  Since constant uploads are a very common
      operation, and being able to push all data is also very common, we would
      like to avoid the overhead in this case.
      
      This patch defers uploading new descriptors.  Instead of handling that
      at iris_set_constant_buffer, we do it at iris_update_compiled_shaders,
      where we can see the currently bound shader variants.  If any need pull
      descriptors, and descriptors are missing, we update them and flag that
      the binding table also needs to be refreshed.
      
      Improves performance in GFXBench5 gl_driver2 on an i7-6770HQ by
      31.9774% +/- 1.12947% (n=15).
      Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
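The deferral described above might look something like this sketch. The struct, field names, and bitmask scheme are hypothetical and much simpler than the real iris state tracking: binding a constant buffer only invalidates its descriptor, and the upload happens later, once the bound shader variants are known and only if one of them pulls from that buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_CBUFS 16

struct sketch_state {
    uint32_t bound_cbufs;        /* bit i: constant buffer i is bound */
    uint32_t valid_descriptors;  /* bit i: descriptor i is uploaded */
    bool     binding_table_dirty;
    unsigned uploads;            /* counts descriptor uploads */
};

/* Binding a buffer no longer uploads a descriptor; it only marks
 * any existing descriptor for that slot as stale. */
static void sketch_set_constant_buffer(struct sketch_state *s, unsigned index)
{
    assert(index < SKETCH_MAX_CBUFS);
    s->bound_cbufs |= 1u << index;
    s->valid_descriptors &= ~(1u << index);
}

/* pull_mask: buffers the currently bound shader variants read via
 * pull loads. Descriptors are uploaded only for those that are stale. */
static void sketch_update_compiled_shaders(struct sketch_state *s,
                                           uint32_t pull_mask)
{
    uint32_t missing = pull_mask & s->bound_cbufs & ~s->valid_descriptors;

    for (unsigned i = 0; i < SKETCH_MAX_CBUFS; i++) {
        if (missing & (1u << i)) {
            s->valid_descriptors |= 1u << i; /* stands in for the upload */
            s->uploads++;
        }
    }

    /* New descriptors mean the binding table must be refreshed. */
    if (missing)
        s->binding_table_dirty = true;
}
```

When every bound shader pushes all of its uniform data, pull_mask is zero and the sketch performs no uploads at all, which is the common case the commit aims to make cheap.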
  23. 17 Sep, 2019 1 commit
  24. 12 Sep, 2019 1 commit
  25. 03 Sep, 2019 1 commit
    • nir: Fix num_ssbos when lowering atomic counters · dcc64fcf
      Connor Abbott authored
      Otherwise it's impossible to know the maximum SSBO index for both
      internal TGSI shaders from TTN (which don't have any notion of atomic
      counters and no offset) as well as shaders from GLSL.
      
      I fixed everything I could find while grepping for num_ssbos and
      num_abos, which hopefully is everything (iris was the only user I could
      find that uses it in a meaningful way).
      Reviewed-by: Marek Olšák <marek.olsak@amd.com>
  26. 25 Aug, 2019 1 commit
  27. 21 Aug, 2019 1 commit
  28. 20 Aug, 2019 2 commits
  29. 12 Aug, 2019 1 commit
  30. 08 Aug, 2019 1 commit
  31. 06 Aug, 2019 1 commit
  32. 01 Aug, 2019 2 commits