  1. 19 Oct, 2020 1 commit
    • isl, anv, iris: Add a centralized helper to select MOCS based on usage · 02fe825a
      Kenneth Graunke authored
      On Gen12+, we can enable additional caches in certain usage situations.
      This routes that decision making to a central place in ISL, based on
      surface usage flags, and updates both drivers to use it.  (i965 doesn't
      need to change because it doesn't support Gen12.)
      We continue handling the "external" decision via an anv_mocs() wrapper
      for now, since we store that flag in anv_bo, which isl doesn't know
      about.  (We could introduce an ISL_SURF_USAGE_EXTERNAL, but I'm not
      actually sure that would be cleaner.)
      This patch should not have any functional or performance effects, as
      we continue selecting the exact same MOCS values for now.
      Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
      Part-of: <!7104>
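A minimal sketch of the centralized-helper idea described in this commit, not the actual ISL code. Every name here (`sketch_device`, `SKETCH_USAGE_*`, `sketch_mocs`, `sketch_driver_mocs`) and the MOCS values used with it are hypothetical; the point is only that usage flags flow into one function, while the "external" decision stays in a driver-side wrapper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical usage flags; the real ISL flags are not reproduced here. */
enum sketch_usage {
   SKETCH_USAGE_RENDER_TARGET = 1u << 0,
   SKETCH_USAGE_TEXTURE       = 1u << 1,
   SKETCH_USAGE_VERTEX_BUFFER = 1u << 2,
};

/* Hypothetical per-device MOCS values, filled in once at device init. */
struct sketch_device {
   int gen;
   uint32_t mocs_internal;  /* default for driver-owned buffers */
   uint32_t mocs_external;  /* for buffers shared outside the driver */
};

/* Central decision point: every caller passes its usage flags here. */
static uint32_t
sketch_mocs(const struct sketch_device *dev, uint32_t usage)
{
   assert(dev != NULL);

   /* A usage-specific choice for newer hardware would be added here.
    * Like the commit, this sketch still returns the same value for
    * every usage.
    */
   (void)usage;
   return dev->mocs_internal;
}

/* Driver-side wrapper: the "external" flag lives on the driver's buffer
 * object, which the central helper does not know about.
 */
static uint32_t
sketch_driver_mocs(const struct sketch_device *dev, bool bo_is_external,
                   uint32_t usage)
{
   return bo_is_external ? dev->mocs_external : sketch_mocs(dev, usage);
}
```

Keeping the wrapper thin means a later usage-based special case only has to be written once, inside `sketch_mocs`.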
  2. 09 Sep, 2020 1 commit
  3. 04 Sep, 2020 1 commit
  4. 19 Jun, 2020 2 commits
    • iris: Disable sRGB fast-clears for non-0/1 values · f8961ea0
      Nanley Chery authored
      For texturing and draw calls, HW expects the clear color to be in two
      different color spaces after sRGB fast-clears - sRGB in the former and
      linear in the latter. Up until now, iris has stored the clear color in
      the sRGB color space. Limit the allowable clear colors for sRGB
      fast-clears to 0/1 so that both color space requirements are satisfied.
      Makes iris pass the sRGB -> sRGB subtest of the fcc-write-after-clear
      piglit test on gen9+.
      * Drop iris_context::blend_enables. (Ken)
      * Drop some more resolve-related blend-state-tracking code.
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
      Part-of: <!4972>
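The 0/1 restriction works because 0 and 1 encode to the same values in both the linear and sRGB color spaces, so a single stored clear color satisfies both interpretations. A minimal sketch of such a check follows; the function name is hypothetical, and checking only the RGB channels is an assumption of this sketch (alpha is not sRGB-encoded), not something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if every color channel is exactly 0 or 1, i.e. the
 * clear color is identical whether read as sRGB or as linear.
 */
static bool
sketch_srgb_clear_color_ok(const float rgb[3])
{
   assert(rgb != NULL);

   for (int i = 0; i < 3; i++) {
      if (rgb[i] != 0.0f && rgb[i] != 1.0f)
         return false;
   }
   return true;
}
```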
    • iris: Avoid fast-clear with incompatible view · 48a3f4c4
      Nanley Chery authored
      For rendering operations, avoid adding or using fast-cleared blocks if
      the render format is incompatible with the clear color interpretation.
      Note that the clear color is currently interpreted through the
      resource's surface format.
      Makes iris pass subtests of the fcc-write-after-clear piglit test:
      * UNORM -> SNORM, partial block on gen8+.
      * linear -> sRGB, partial block on gen9+.
      * UNORM -> SNORM, full block on gen12.
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
      Part-of: <!4972>
  5. 03 Jun, 2020 1 commit
  6. 16 Mar, 2020 2 commits
  7. 22 Feb, 2020 1 commit
  8. 04 Jan, 2020 2 commits
  9. 06 Dec, 2019 1 commit
  10. 25 Nov, 2019 2 commits
    • iris: Maintain CPU-side SURFACE_STATE copies for views and surfaces. · 060a2c52
      Kenneth Graunke authored
      When replacing the backing storage for texture buffers, image buffers,
      and so on, we may need to update the "Surface Base Address" field in
      any corresponding SURFACE_STATE.  This is easier to accomplish if we
      have a copy on the CPU - we can just compare the current field, update
      it, and re-upload.
      This patch adds a CPU-side copy to the new iris_surface_state wrapper
      struct, and reworks allocation and upload to fill things out on the
      CPU copy first, then upload that to the GPU when finished.
      This will be necessary to fix iris_invalidate_resource bugs shortly.
      Technically, we never replace the backing storage for pipe_surfaces
      (render targets), so we don't need to make this change there.  However,
      it's nice to have surfaces, sampler views, and image views handled
      similarly.  Plus, if we ever wanted to swap out backing storage for
      busy textures, we'd need this infrastructure.
      v2: Properly free memory (caught by Andrii Simiklit)
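The "compare the current field, update it, and re-upload" step can be sketched as below. This is an illustration only: the struct, the state size, and the dword position of the address field are hypothetical placeholders, and a plain memory copy stands in for the real GPU upload path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_STATE_DWORDS 16  /* hypothetical state size */
#define SKETCH_ADDR_DWORD    8  /* hypothetical position of the 64-bit address */

struct sketch_surface_state {
   uint32_t cpu[SKETCH_STATE_DWORDS];  /* CPU-side copy of the packed state */
   uint32_t *gpu;                      /* stand-in for the uploaded GPU copy */
};

/* Patch the base address in the CPU copy and re-upload only if it
 * actually changed.  Returns true when a re-upload happened.
 */
static bool
sketch_update_base_address(struct sketch_surface_state *s, uint64_t new_addr)
{
   assert(s != NULL && s->gpu != NULL);

   uint64_t old_addr;
   memcpy(&old_addr, &s->cpu[SKETCH_ADDR_DWORD], sizeof(old_addr));
   if (old_addr == new_addr)
      return false;

   memcpy(&s->cpu[SKETCH_ADDR_DWORD], &new_addr, sizeof(new_addr));
   memcpy(s->gpu, s->cpu, sizeof(s->cpu));  /* "re-upload" */
   return true;
}
```

Without the CPU copy, the comparison would require reading the state back from GPU memory or rebuilding it from scratch.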
    • iris: Create an "iris_surface_state" wrapper struct · 2b09e818
      Kenneth Graunke authored
      Today, we only have a state reference to the GPU buffer containing our
      uploaded SURFACE_STATEs.  However, we're going to want a CPU-side copy
      soon.  Making a wrapper struct means we can talk about both together,
      and also put both in the field called "surface_state".
  11. 29 Oct, 2019 1 commit
  12. 28 Oct, 2019 1 commit
  13. 27 Sep, 2019 1 commit
  14. 18 Sep, 2019 1 commit
  15. 09 Sep, 2019 1 commit
    • iris: Avoid flushing for cache history on transfer range flushes · 410894c6
      Kenneth Graunke authored
      The VBO module maps a buffer with GL_MAP_FLUSH_EXPLICIT_BIT, then keeps
      appending data and calling glFlushMappedBufferRange().  We were
      invalidating the VF cache each time it flushed a new range, which
      resulted in a ton of VF flushes.
      If the contents of the destination in the target range are undefined
      (never even possibly written), this patch makes us assume that it's
      likely not in the cache and so cache invalidations are not required.
      If the destination range is defined, we continue cache flushing as we
      may need to expunge stale data.
      This eliminates 88% of the VF cache invalidates on Manhattan 3.0.
      Improves performance in Manhattan 3.0 on my Icelake 8x8 with the GPU
      frequency locked to 700MHz by 0.376724% +/- 0.0989183% (n=10).
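A minimal sketch of the decision this commit describes, assuming the driver tracks one "possibly written" byte range per buffer. The type and function names are hypothetical: a flushed range that does not touch the tracked range cannot be stale in a cache, so the invalidation is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open byte range [start, end); empty when start >= end. */
struct sketch_range {
   uint32_t start, end;
};

static bool
sketch_ranges_intersect(struct sketch_range a, uint32_t start, uint32_t end)
{
   return a.start < a.end && start < end && start < a.end && a.start < end;
}

/* True if flushing [start, end) must invalidate caches: only when the
 * range overlaps data that may already have been written (and cached).
 */
static bool
sketch_flush_needs_invalidate(struct sketch_range valid,
                              uint32_t start, uint32_t end)
{
   assert(start <= end);
   return sketch_ranges_intersect(valid, start, end);
}
```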
  16. 23 Aug, 2019 1 commit
  17. 20 Aug, 2019 3 commits
  18. 13 Aug, 2019 1 commit
  19. 17 Jul, 2019 1 commit
  20. 20 Jun, 2019 2 commits
  21. 07 May, 2019 1 commit
  22. 24 Apr, 2019 1 commit
    • iris: Split iris_flush_and_dirty_for_history into two helpers. · 21688a30
      Kenneth Graunke authored
      We create two new helpers, iris_flush_bits_for_history, and
      iris_dirty_for_history, then use them in the existing function.
      The first accumulates flush bits based on res->bind_history, but doesn't
      actually perform a flush.  This allows us to accumulate flush bits by
      looping over multiple resources, but ultimately emit a single flush for
      all of them.
      The latter flags dirty bits without flushing, which again allows us to
      handle multiple resources, but also is more convenient when writing from
      the CPU where we don't need a flush (as in commit 4d122360).
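The benefit of separating "compute flush bits" from "emit the flush" can be sketched as follows. The bind and flush flags and all function names are hypothetical stand-ins, not the driver's real definitions; the sketch only shows bits being OR-ed together across several resources so that a single flush can cover all of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bind-history and flush flags. */
enum {
   SKETCH_BIND_VERTEX   = 1 << 0,
   SKETCH_BIND_CONSTANT = 1 << 1,
   SKETCH_BIND_SAMPLER  = 1 << 2,
};
enum {
   SKETCH_FLUSH_VF_INVALIDATE      = 1 << 0,
   SKETCH_FLUSH_CONST_INVALIDATE   = 1 << 1,
   SKETCH_FLUSH_TEXTURE_INVALIDATE = 1 << 2,
};

struct sketch_resource {
   uint32_t bind_history;  /* every way this resource was ever bound */
};

/* Compute, but do not emit, the flush bits one resource needs. */
static uint32_t
sketch_flush_bits_for_history(const struct sketch_resource *res)
{
   assert(res != NULL);

   uint32_t bits = 0;
   if (res->bind_history & SKETCH_BIND_VERTEX)
      bits |= SKETCH_FLUSH_VF_INVALIDATE;
   if (res->bind_history & SKETCH_BIND_CONSTANT)
      bits |= SKETCH_FLUSH_CONST_INVALIDATE;
   if (res->bind_history & SKETCH_BIND_SAMPLER)
      bits |= SKETCH_FLUSH_TEXTURE_INVALIDATE;
   return bits;
}

/* Accumulate bits over many resources; the caller then emits a single
 * flush with the combined result.
 */
static uint32_t
sketch_flush_bits_for_all(const struct sketch_resource *res, size_t count)
{
   uint32_t bits = 0;
   for (size_t i = 0; i < count; i++)
      bits |= sketch_flush_bits_for_history(&res[i]);
   return bits;
}
```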
  23. 23 Apr, 2019 2 commits
    • iris: Track valid data range and infer unsynchronized mappings. · 77449d7c
      Kenneth Graunke authored
      Applications frequently call glBufferSubData() to consecutive regions
      of a VBO to append new vertex data.  If no data exists there yet, we
      can promote these to unsynchronized writes, even if the buffer is busy,
      since the GPU can't be doing anything useful with undefined content.
      This can avoid a bunch of unnecessary blitting on the GPU.
      u_threaded_context would do this for us, and in fact prohibits us from
      doing so (see TC_TRANSFER_MAP_NO_INFER_UNSYNCHRONIZED).  But we haven't
      hooked that up yet, and it may be useful to disable u_threaded_context
      when debugging...at which point we'd still want this optimization.  At
      the very least, it would let us measure the benefit of threading
      independently from this optimization.  And it's not a lot of code.
      Removes most stall avoidance blits in "Total War: WARHAMMER."
      On my Skylake GT4e at 1920x1080, this appears to improve performance
      in games by the following (but I did not do many runs for proper
      statistics gathering):
         | DiRT Rally        | +2% (avg) | + 2% (max) |
         | Bioshock Infinite | +3% (avg) | + 9% (max) |
         | Shadow of Mordor  | +7% (avg) | +20% (max) |
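A minimal sketch of valid-range tracking, under the assumption that a single byte range per buffer is enough. The struct and function names are hypothetical. A write to a busy buffer has to synchronize only if it overlaps data that may already be defined; appends past the valid range can proceed unsynchronized, after which the range grows to cover them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_buffer {
   /* Half-open range [valid_start, valid_end) that may hold defined
    * data; empty when valid_start >= valid_end.
    */
   uint32_t valid_start, valid_end;
   bool busy;  /* the GPU may still be using this buffer */
};

/* True if a write to [offset, offset + size) has to wait for the GPU.
 * Overflow of offset + size is ignored in this sketch.
 */
static bool
sketch_write_must_sync(const struct sketch_buffer *buf,
                       uint32_t offset, uint32_t size)
{
   assert(buf != NULL);

   uint32_t end = offset + size;
   bool has_valid = buf->valid_start < buf->valid_end;
   bool overlaps = has_valid && size > 0 &&
                   offset < buf->valid_end && buf->valid_start < end;
   return buf->busy && overlaps;
}

/* Grow the valid range to include a completed write. */
static void
sketch_mark_written(struct sketch_buffer *buf, uint32_t offset, uint32_t size)
{
   assert(buf != NULL);
   if (size == 0)
      return;

   uint32_t end = offset + size;
   if (buf->valid_start >= buf->valid_end) {
      buf->valid_start = offset;
      buf->valid_end = end;
   } else {
      if (offset < buf->valid_start)
         buf->valid_start = offset;
      if (end > buf->valid_end)
         buf->valid_end = end;
   }
}
```

A single range is conservative: a gap between two written regions is treated as defined, which costs some missed promotions but never skips a required synchronization.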
    • iris: Rework image views to store pipe_image_view. · b45dff1d
      Kenneth Graunke authored
      This will be useful when rebinding images.
  24. 02 Apr, 2019 1 commit
    • iris: Add aux.sampler_usages. · 7339660e
      Rafael Antognolli authored
      We want to skip some types of aux usages (for instance,
      ISL_AUX_USAGE_HIZ when the hardware doesn't support it, or when we have
      multisampling) when sampling from the surface.
      Instead of checking for those cases while filling the surface state and
      leaving it blank, let's have a version of aux.possible_usages for
      sampling. This way we can also avoid allocating surface state for the
      cases we don't use.
      Fixes: a8b5ea8e "iris: Add function to update clear color in surface state."
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  25. 20 Mar, 2019 3 commits
  26. 19 Mar, 2019 1 commit
  27. 08 Mar, 2019 1 commit
    • iris: Use copy_region and staging resources to avoid transfer stalls · 9d1334d2
      Kenneth Graunke authored
      This is similar to intel_miptree_map_blit and intel_buffer_object.c's
      temporary blits in i965.
      Improves performance of DiRT Rally by 20-25% by eliminating stalls.
      Breaks piglit's spec/arb_shader_image_load_store/host-mem-barrier,
      by using the GPU to do uploads, exposing a st/mesa issue where it
      doesn't give us memory_barrier() calls.  This is a pre-existing issue
      and will be fixed by a later patch (currently out for review).
  28. 21 Feb, 2019 3 commits