1. 03 Jun, 2021 28 commits
  2. 02 Jun, 2021 12 commits
    • turnip: fix register_index calculations of xfb outputs · b71e27ea
      Danylo Piliaiev authored
      nir_assign_io_var_locations() does not use outputs_written when
      assigning driver locations. Use driver_location to avoid incorrectly
      guessing what locations it assigned.
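As a minimal sketch of the idea (not the turnip code itself), the map from an output slot to its register index can be filled from each variable's own driver_location instead of being guessed from the outputs_written bitmask. The struct, the function name, and the 64-slot limit below are hypothetical stand-ins for the data NIR provides.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_SLOTS 64 /* assumed upper bound, for illustration only */

/* Hypothetical stand-in for the per-variable data NIR provides. */
struct sketch_out_var {
   unsigned location;        /* varying slot of the first component */
   unsigned driver_location; /* index assigned by the location pass */
   unsigned slots;           /* number of consecutive slots occupied */
};

/* Fill map[slot] with the driver-assigned index of every output slot. */
static void
sketch_build_output_map(const struct sketch_out_var *vars, size_t count,
                        uint8_t map[SKETCH_MAX_SLOTS])
{
   memset(map, 0, SKETCH_MAX_SLOTS);
   for (size_t v = 0; v < count; v++) {
      for (unsigned i = 0; i < vars[v].slots; i++) {
         unsigned slot = vars[v].location + i;
         if (slot < SKETCH_MAX_SLOTS) /* ignore slots outside the sketch's range */
            map[slot] = (uint8_t)(vars[v].driver_location + i);
      }
   }
}
```

Because the map is read directly from driver_location, it does not depend on the order in which the location pass happened to number the outputs.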
      
      Copied from lavapipe 8731a1be
      
      This will fix the provoking vertex transform feedback tests once
      VK_EXT_provoking_vertex is enabled:
       dEQP-VK.rasterization.provoking_vertex.transform_feedback.*
      Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
      Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
      Part-of: <mesa/mesa!11111>
    • turnip: emit vb stride dynamic state when it is dirty · 551d7fdd
      Danylo Piliaiev authored
      Due to an incorrect condition, we never emitted the vb stride
      when the state was set dynamically.
      
      Fixes vertex explosion with Zink.
      
      See mesa/mesa#4738
      
      Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
      Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
      Part-of: <mesa/mesa!11133>
    • iris: Use bo->mmap_mode in transfer map read check · ccfde508
      Kenneth Graunke authored
      The scenario we want to avoid is reading from WC or UC mappings,
      so this check is easier to follow.
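A check of this kind might look like the sketch below. It is only an illustration: the enum, struct, and function names are hypothetical and do not reproduce the driver's definitions. The point is that the question "would a CPU read be slow?" reduces to one comparison on the BO's mmap mode.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; not the driver's actual definitions. */
enum sketch_mmap_mode {
   SKETCH_MMAP_UC, /* uncached */
   SKETCH_MMAP_WC, /* write-combining (streaming) */
   SKETCH_MMAP_WB, /* write-back, CPU caches enabled */
};

struct sketch_bo {
   enum sketch_mmap_mode mmap_mode;
};

/* True when a CPU read through this BO's mapping would bypass the caches. */
static bool
sketch_read_would_be_slow(const struct sketch_bo *bo)
{
   return bo->mmap_mode != SKETCH_MMAP_WB;
}
```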
      Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Part-of: <mesa/mesa!10941>
    • iris: Pick a single mmap mode (WB/WC) at BO allocation time · f62724cc
      Kenneth Graunke authored
      Previously, iris_bufmgr had the ability to maintain multiple
      simultaneous memory mappings for a BO, one in WB mode (with CPU caches),
      and another in WC (streaming) mode.  Depending on the flags passed to
      iris_bo_map(), we would select one mode or the other.
      
      The rules for deciding which to use were:
      
      - Systems with LLC always use WB mode because it's basically free
      - Non-LLC systems used...
        - WB maps for all BOs where snooping is enabled (which translates to
          when BO_ALLOC_COHERENT is set at allocation time)
        - WB maps for reads unless persistent, coherent, async, or raw.
        - WC maps for everything else.
      
      This patch simplifies the system by selecting a single mmap mode at
      BO allocation time, and always using that.  Each BO now has at most one
      map at a time, rather than up to two (or three before we deleted GTT
      map support in recent patches).
      
      In practical terms, this eliminates the capability to use WB maps for
      reads of non-snooped BOs on non-LLC systems.  Such reads would now be
      slow, uncached reads.  However, iris_transfer_map recently began using
      staging blits for such reads - so the GPU copies the data to a snooped
      buffer which will be mapped WB.  So, rather than incurring slow UC
      reads, we really just take the hit of a blit, and retain fast reads.
      
      The rest of the rules remain the same.
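The remaining rules can be sketched as a single decision made at allocation time. This is a simplified illustration that covers only the cases listed in this message; the function name, the enum, and the flag value are hypothetical, with the flag standing in for BO_ALLOC_COHERENT.

```c
#include <stdbool.h>

/* Hypothetical flag value standing in for BO_ALLOC_COHERENT. */
#define SKETCH_BO_ALLOC_COHERENT (1u << 0)

enum sketch_mmap_mode { SKETCH_MMAP_WC, SKETCH_MMAP_WB };

/* Choose the one mmap mode a BO will use for its whole lifetime. */
static enum sketch_mmap_mode
sketch_pick_mmap_mode(bool has_llc, unsigned alloc_flags)
{
   if (has_llc)
      return SKETCH_MMAP_WB; /* WB is basically free with LLC */
   if (alloc_flags & SKETCH_BO_ALLOC_COHERENT)
      return SKETCH_MMAP_WB; /* snooped BO on a non-LLC system */
   return SKETCH_MMAP_WC;    /* everything else on non-LLC */
}
```

The old "WB maps for reads" case has no branch here, which is exactly the capability this patch gives up in exchange for one map per BO.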
      
      There are a few reasons for this:
      
      1. TTM doesn't support mapping an object as both WB and WC.  The
         cacheability is treated as a property of the object, not the map.
         The kernel is moving to use TTM as part of adding discrete local
         memory support.  So it makes sense to centralize on that model.
      
      2. Mapping the same BO as both WB and WC is impossible to support on
         some CPUs.  It works on x86/x86_64, which was fine for integrated
         GPUs, but it may become an issue for discrete graphics paired with
         other CPUs (should that ever be a thing we need to support).
      
      3. It's overall much simpler.  We only have one bo->map field, and
         manage to drop a significant amount of boilerplate.
      
      One issue that arises is the interaction with the BO cache: BOs with
      WB maps and WC maps will be lumped together into the same cache.  This
      means that a cached BO may have the wrong mmap mode.  We check that,
      and if it doesn't match, we unmap it, waiting until iris_bo_map is
      called to restore one with the desired mode.  This may underutilize
      cache mappings slightly on non-LLC systems, but I don't expect it to
      have a large impact.
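The cache interaction can be sketched as follows, assuming a BO that records its current map pointer and mode. All names are hypothetical, and the unmap function is a placeholder for whatever the driver really does to tear down a mapping.

```c
#include <stddef.h>

enum sketch_mmap_mode { SKETCH_MMAP_WC, SKETCH_MMAP_WB };

struct sketch_bo {
   void *map;                       /* NULL when not currently mapped */
   enum sketch_mmap_mode mmap_mode; /* mode the map was, or will be, made with */
};

/* Placeholder for the real unmap; a driver would release the mapping here. */
static void
sketch_bo_unmap(struct sketch_bo *bo)
{
   bo->map = NULL;
}

/* Prepare a BO taken from the cache for reuse with the wanted mode. */
static void
sketch_reuse_cached_bo(struct sketch_bo *bo, enum sketch_mmap_mode wanted)
{
   if (bo->map != NULL && bo->mmap_mode != wanted)
      sketch_bo_unmap(bo); /* stale map: drop it and remap lazily later */
   bo->mmap_mode = wanted;
}
```

A cached map is kept only when its mode already matches, so the cost of a mismatch is one extra mmap the next time the BO is mapped.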
      
      Closes: #4747
      
      Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Part-of: <!10941>
    • iris: Delete GTT mapping support · 22bfb535
      Kenneth Graunke authored
      In the bad old days, i965 used GTT mapping for detiling maps.  iris
      never has, however.  The only reason it used GTT maps was in weird
      fallback cases for dealing with BO imports from foreign memory.  We
      now do staging blits for those, and never mmap them.
      
      There are no more users of GTT mapping, so we can delete it.
      Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Part-of: <mesa/mesa!10941>
    • iris: Drop fallback GEM_MMAP_GTT if GEM_MMAP with I915_MMAP_WC fails · 2f30cf4a
      Kenneth Graunke authored
      XXX: This is actually wrong.  The dmabuf imported case can be mapped via
      GEM_MMAP_GTT if the iommu is working, according to Joonas, but GEM_MMAP
      would fall over and fail.  So we would need this fallback.
      ALTERNATIVELY...we would need to flag such imported dmabufs as
      unmappable, and then make iris_transfer_map/unmap always do blits
      instead of direct mappings.  That seems like the saner approach.
      
      We never want to use GEM_MMAP_GTT, as it does detiling maps, and iris
      always wants direct maps.  There were originally two cases that this
      fallback path was attempting to handle:
      
      1. The BO was allocated from stolen memory that we can't GEM_MMAP.
      
         At one point, kernel patches were being proposed to use stolen
         memory for userspace buffers, but these never landed.  The kernel
         has never given us stolen memory, so we cannot hit this case.
      
      2. Imported objects may be from memory we can't GEM_MMAP.
      
         For example, a DMABUF from a discrete AMD/NVIDIA GPU in a PRIME
         setup would be backed by memory that we can't GEM_MMAP.  We could
         try and mmap these directly with GEM_MMAP_GTT, but that relies on
         the IOMMU working.  We could mmap the DMABUF fd directly (but have
         never tried to do so), but there are complex rules there.  Instead,
         we now flag those imports and rely on the iris_transfer_map code to
         perform staging blits on the GPU, so we never even try to map them
         directly.  So this case won't reach us here any longer.
      
      With both of those out of the way, there is no need for a fallback.
      Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Part-of: <mesa/mesa!10941>
    • iris: Assert on mapping a tiled buffer without MAP_RAW · 05a43d42
      Kenneth Graunke authored
      iris has never relied on detiled maps using hardware fences.
      This code is a remnant of i965, where that was actually used.
      
      We can just assert that callers don't do such a thing.
      Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Part-of: <mesa/mesa!10941>
    • iris: Use staging blits for transfers involving imported BOs · 3319ab0d
      Kenneth Graunke authored
      Direct mappings of imported DMABUFs can be tricky.  If they're allocated
      from our own device, then we can probably mmap them and it'd be fine.
      But they may come from a different device (such as a discrete GPU), in
      which case I915_GEM_MMAP wouldn't work, I915_GEM_MMAP_GTT would require
      a working IOMMU, and directly mmap'ing the DMABUF fd would come with a
      bunch of rules and restrictions which are hard to get right.
      
      CPU mapping an imported DMABUF image for writes seems very uncommon,
      solidly in the "what are you even doing?" realm.  Mapping an imported
      DMABUF for reading might be a thing, in case someone wanted to do
      glReadPixels on it.  But in that case, the cost of doing a staging
      blit is probably acceptable.
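The general shape of a staging transfer can be sketched in plain C, with memcpy standing in for the GPU blit and malloc for the CPU-friendly temporary. This is an illustration of the technique only; the struct and function names are hypothetical and the real transfer code handles boxes, strides, and synchronization that are omitted here.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct sketch_transfer {
   unsigned char *resource; /* stands in for a BO we must not map directly */
   unsigned char *staging;  /* temporary the CPU reads and writes instead */
   size_t size;
   bool write;              /* whether changes must be copied back */
};

/* Map: copy the resource into a temporary and hand that to the CPU. */
static bool
sketch_transfer_map(struct sketch_transfer *t, unsigned char *resource,
                    size_t size, bool write)
{
   t->resource = resource;
   t->size = size;
   t->write = write;
   t->staging = malloc(size);
   if (t->staging == NULL)
      return false;
   /* Stand-in for the GPU blit. Always copying in keeps untouched bytes
    * intact if the caller only writes part of the range. */
   memcpy(t->staging, resource, size);
   return true;
}

/* Unmap: copy changes back if the map was writable, then free the temporary. */
static void
sketch_transfer_unmap(struct sketch_transfer *t)
{
   if (t->write)
      memcpy(t->resource, t->staging, t->size);
   free(t->staging);
   t->staging = NULL;
}
```

The CPU never touches the original resource, which is why the origin of an imported DMABUF stops mattering to the mapping code.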
      Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Part-of: <mesa/mesa!10941>
    • iris: Use staging blits for reads from uncached buffers. · 643c4ade
      Kenneth Graunke authored
      If we're doing CPU reads of a resource that doesn't have CPU caches
      enabled for the mapping (say, in device local memory, or WC mapped),
      then blit it to a temporary that does have those caches enabled.
      Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Part-of: <mesa/mesa!10941>
    • iris: Track imported vs. exported status separately · 49070038
      Kenneth Graunke authored
      Not all external objects are the same.  Imported buffers may be from
      other devices (say a dmabuf from an AMD or NVIDIA discrete card) which
      are backed by memory that we can't use with I915_GEM_MMAP.  However,
      exported buffers are ones that we know we allocated ourselves from our
      own device.  We may not know what other clients are doing with them,
      but we can assume a bit more about where they came from.
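A minimal sketch of the distinction, assuming two separate booleans on the BO: the struct layout and both function names are hypothetical illustrations, not the driver's code.

```c
#include <stdbool.h>

/* Hypothetical BO with the two states tracked separately. */
struct sketch_bo {
   bool imported; /* came from elsewhere, possibly another device */
   bool exported; /* allocated by us, then shared with another client */
};

/* "External" in the old blanket sense: shared in either direction. */
static bool
sketch_bo_is_external(const struct sketch_bo *bo)
{
   return bo->imported || bo->exported;
}

/* Only imports may be backed by memory our own device did not allocate. */
static bool
sketch_bo_may_be_foreign_memory(const struct sketch_bo *bo)
{
   return bo->imported;
}
```

Callers that only care whether other clients can see the buffer keep using the combined check, while mapping decisions can look at the import state alone.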
      Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Part-of: <!10941>
    • iris: Make an iris_bo_is_external() helper and use it in a few places · 1a395e10
      Kenneth Graunke authored
      I'd like to start tracking "imported" vs. "exported" for objects,
      rather than a blanket "external" flag.  Instead of directly checking
      bo->external, use a new helper that will eventually be "imported or
      exported".
      Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Part-of: <!10941>
    • iris: Delete a comment suggesting we use tiled staging buffers · 1c73445d
      Kenneth Graunke authored
      We basically tried this, and it performed worse, so delete the
      suggestion in the comments that we may want to do it someday.
      Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Part-of: <mesa/mesa!10941>