1. 29 Apr, 2021 2 commits
  2. 17 Mar, 2021 2 commits
  3. 15 Mar, 2021 1 commit
  4. 12 Feb, 2021 1 commit
  5. 05 Feb, 2021 3 commits
    • v3dv: allow a component swizzle in copy_buffer_to_image_shader · c72d9955
      Iago Toral authored
      
      
      This is trivial because this path relies on our blit_shader interface
      which supports this already, so it just needs to pass it along.
      
      I don't think this is ever triggered in practice, since we should be
      able to handle any case that could require this with the texel buffer
      path, but at least it allows us to simplify the code a bit.
      
      Tested by manually disabling the priority paths to ensure we exercise
      component swizzles with this path.
      Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
      Part-of: <!8875>
    • v3dv: batch copies in the copy_buffer_to_image_blit path · 4d4a0797
      Iago Toral authored
      
      
      This path is very memory hungry, and batching allows us to reduce
      this by allocating memory just once and reusing it for all regions
      in the batch instead of allocating once per region.
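
The memory saving described above can be illustrated with a minimal sketch. It assumes that a single scratch allocation sized for the largest region in the batch is enough to serve every region in turn; the driver's actual sizing policy is not stated in the commit message, and `blit_region`, `region_bytes`, `batch_scratch_bytes` and `per_region_total_bytes` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one copy region (not a driver struct). */
struct blit_region {
   uint32_t width;
   uint32_t height;
   uint32_t bytes_per_texel;
};

/* Bytes needed to hold one region's texels. */
static size_t
region_bytes(const struct blit_region *r)
{
   return (size_t)r->width * r->height * r->bytes_per_texel;
}

/* Batched strategy (assumption): one allocation, sized for the largest
 * region, reused for every region in the batch. */
static size_t
batch_scratch_bytes(const struct blit_region *regions, uint32_t count)
{
   size_t max = 0;
   for (uint32_t i = 0; i < count; i++) {
      size_t s = region_bytes(&regions[i]);
      if (s > max)
         max = s;
   }
   return max;
}

/* Unbatched strategy: one allocation per region, so the sizes add up. */
static size_t
per_region_total_bytes(const struct blit_region *regions, uint32_t count)
{
   size_t total = 0;
   for (uint32_t i = 0; i < count; i++)
      total += region_bytes(&regions[i]);
   return total;
}
```

Under these assumptions the batched footprint is bounded by the largest region, while the per-region footprint grows with the number of regions.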
      
      v2: document return value for this function (apinheiro).
      Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
      Part-of: <!8875>
    • v3dv: handle D/S buffer to image copies with the texel buffer path · 7aa04ad0
      Iago Toral authored
      
      
      We do this by converting them to a compatible color copy and using a
      destination color mask as well as a source component swizzle to handle
      D24 format semantics according to the V3D hardware requirements,
      similar to what we do with our blit shader interface.
      
      This path is faster than the terrible copy_buffer_to_image_blit,
      which requires copying the source buffer to a tiled image first
      and should be avoided as much as possible, since it is slow and
      can also quickly increase device memory usage.
      
      This fixes occasional OOM errors when loading traces in renderdoc.
      Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
      Part-of: <!8875>
  6. 25 Jan, 2021 1 commit
  7. 21 Jan, 2021 1 commit
  8. 06 Jan, 2021 1 commit
  9. 17 Dec, 2020 2 commits
  10. 01 Dec, 2020 5 commits
  11. 30 Nov, 2020 6 commits
  12. 27 Nov, 2020 2 commits
    • v3dv: batch buffer to image copies with the texel buffer path if possible · ca44b3ed
      Iago Toral authored
      
      
      When copying multiple regions that have the same image subresource we are
      effectively copying various rects across the same layer range, so we can
      batch together all the rects to copy for each layer in a single job.
      
      This allows us to significantly reduce CPU overhead when recording the
      command, as we need to produce fewer jobs and allocate fewer descriptor
      sets. It also offers smaller gains in execution time due to the reduced
      job count.
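
The batching idea can be sketched as follows. This is a simplified model, not the driver code: it assumes that only consecutive regions with an identical image subresource are merged, and that each batch then needs one job per layer. The structs and the helpers `same_subresource`, `batch_length` and `count_jobs_batched` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins for the Vulkan structures. */
struct subresource {
   uint32_t mip_level;
   uint32_t base_layer;
   uint32_t layer_count;
};

struct copy_region {
   struct subresource image_subresource;
   /* The destination rect is omitted: it does not affect batching. */
};

static bool
same_subresource(const struct subresource *a, const struct subresource *b)
{
   return a->mip_level == b->mip_level &&
          a->base_layer == b->base_layer &&
          a->layer_count == b->layer_count;
}

/* Number of consecutive regions, starting at 'first', that target the
 * same image subresource as regions[first]. */
static uint32_t
batch_length(const struct copy_region *regions, uint32_t region_count,
             uint32_t first)
{
   assert(first < region_count);
   uint32_t n = 1;
   while (first + n < region_count &&
          same_subresource(&regions[first].image_subresource,
                           &regions[first + n].image_subresource)) {
      n++;
   }
   return n;
}

/* Jobs needed when every batch emits a single job per layer. */
static uint32_t
count_jobs_batched(const struct copy_region *regions, uint32_t region_count)
{
   uint32_t jobs = 0;
   for (uint32_t i = 0; i < region_count;) {
      jobs += regions[i].image_subresource.layer_count;
      i += batch_length(regions, region_count, i);
   }
   return jobs;
}
```

With three regions on the same two-layer subresource followed by one region on a different single-layer subresource, this model emits three jobs, whereas emitting one job per layer for every region would take seven.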
      
      In a stress test where we copy 10 subrects of an image in a loop 100 times,
      choosing regions that will involve the texel buffer path, we get these
      results:
      
                        | Recording Time | Execution Time |
              ----------|----------------|----------------|
              master    |     3.021s     |    0.112s      |
              ----------|----------------|----------------|
              patch     |     0.163s     |    0.080s      |
              ----------|----------------|----------------|
      Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
      Part-of: <!7782>
    • v3dv: fix leak in the buffer to image copy via texel buffer · 2809e2e8
      Iago Toral authored
      Fixes: ba69c36a ("v3dv: add a buffer to image copy path using a texel buffer")
      Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
      Part-of: <!7782>
  13. 19 Nov, 2020 1 commit
    • v3dv: remove box check from texel buffer copy fragment shader · 01e3f430
      Iago Toral authored
      
      
      We are already ensuring that we only copy the appropriate pixel
      rect via the scissor and viewport state, so there is no need to
      do this check in the shader.
      
      Using a stress test with 100 buffer to image copies of a single
      layered image with 10 miplevels recorded into a command buffer and
      measuring the time it takes to execute the command buffer, we get
      these results:
      
                    | Execution Time |
          ----------|----------------|
          master    |     0.142s     |
          ----------|----------------|
          patch     |     0.071s     |
      Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
      Part-of: <!7671>
  14. 17 Nov, 2020 5 commits
    • v3dv: use the common base object type and struct · 30b6fbc4
      Alejandro Piñeiro authored
      Used Hyujun's commit 5d3fdbc5, which does the same for turnip, as a
      reference.
      
      This commit also replaces alloc with zalloc in several cases, and adds
      checks in more Destroy methods for whether the object to be freed is
      NULL. Most of them were needed to avoid crashes or weird behaviour due
      to trying to use uninitialized data. Note that vk_object_free now
      iterates over an array, making it more sensitive to uninitialized or
      NULL data.
      
      Additionally, using zalloc we can also remove some memsets to 0. In
      fact we needed to remove them because otherwise they would overwrite
      the vk_object_base object with 0 (the alternative would be doing a
      memset computing a pointer offset, but that is not needed as we can
      just use zalloc).
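
The interaction between zero-initialization and an embedded base object can be sketched in plain C. This is an illustration only: `base_object` is a stand-in for vk_object_base, `calloc` stands in for the driver's zalloc helper, and `my_object`, `my_object_create` and `my_object_clear_payload` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for vk_object_base (hypothetical, simplified). */
struct base_object {
   int type;
};

/* A driver object that embeds the base as its first member. */
struct my_object {
   struct base_object base;
   int a;
   int b;
};

/* Zero-allocating up front means no later memset is required, so nothing
 * can wipe the base after it has been initialized. */
static struct my_object *
my_object_create(int type)
{
   struct my_object *obj = calloc(1, sizeof(*obj));
   if (!obj)
      return NULL;
   obj->base.type = type;
   return obj;
}

/* The alternative mentioned in the commit message: clear only the bytes
 * that follow the base, using a pointer offset. A plain
 * memset(obj, 0, sizeof(*obj)) here would also zero obj->base. */
static void
my_object_clear_payload(struct my_object *obj)
{
   const size_t offset = offsetof(struct my_object, a);
   memset((char *)obj + offset, 0, sizeof(*obj) - offset);
}
```

The offset-based memset works, but it has to be kept in sync with the struct layout, which is one reason simply relying on zero-allocation is the less error-prone choice.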
      
      v2:
         * Call memset(0) on reused descriptor sets when calling
           ResetDescriptorPool, not when reallocating them (Iago)
         * Add null check when calling DestroyImageView (detected by a full CTS run)
      
      v3: Fixed rebase conflicts after last meta copy/clear changes
      Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
      Part-of: <!7627>
    • v3dv: rename playout and dslayout fields to use underscores. · 249aed1f
      Iago Toral authored
      
      
      Following a suggestion from Alejandro, since playout is a word on its own
      and can be confusing. It also makes it more consistent with other
      variable names that use an underscore.
      Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
      Part-of: <!7651>
    • v3dv: blit shader clean-ups · ba2e979b
      Iago Toral authored
      
      
      This avoids redundant per-layer operations that are the same across
      layers or that only need to be done once. Namely:
      
      - The sampler for the blit source is the same for all layers.
      - The decision about whether we need to load TLB contents or not only
        needs to be done once.
      - Some command buffer state such as the pipeline, the viewport and the
        scissor is the same for all layers and should only be bound once.
      Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
      Part-of: <!7651>
    • v3dv: initialize pipeline layouts for meta operations at driver initialization · 840ba251
      Iago Toral authored
      
      
      This removes the need to lock just to check if we have created them
      due to the lazy allocation strategy we had in place.
      Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
      Part-of: <!7651>
    • v3dv: add a buffer to image copy path using a texel buffer · ba69c36a
      Iago Toral authored
      
      
      This is much faster than the blit fallback (which requires uploading
      the linear buffer to a tiled image) and the CPU path.
      
      A simple stress test involving 100 buffer to image copies of a
      single layer image with 10 mipmap levels provides the following
      results:
      
      Path           | Recording Time | Execution Time |
      ---------------|----------------|----------------|
      Texel Buffer   |     2.954s     |     0.137s     |
      ---------------|----------------|----------------|
      Blit           |    10.732s     |     0.148s     |
      ---------------|----------------|----------------|
      CPU            |     0.002s     |     1.453s     |
      ---------------|----------------|----------------|
      
      So generally speaking, this texel buffer copy path is the fastest
      of the paths that can do partial copies. However, the CPU path might
      provide better results in cases where command buffer recording time is
      important to overall performance. This is probably the reason why
      the CPU path seems to provide slightly better results for vkQuake2.
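
The trade-off can be made concrete by summing the two columns of the table above. This is only one way to weigh the numbers: it assumes recording and execution happen back to back and count equally, which may not hold for a real application. The values are copied from the table and converted to milliseconds; `path_timing`, `total_ms` and `fastest_total` are names invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* One row of the stress-test table, in milliseconds. */
struct path_timing {
   const char *name;
   uint32_t recording_ms;
   uint32_t execution_ms;
};

static const struct path_timing timings[] = {
   { "Texel Buffer",  2954,  137 },
   { "Blit",         10732,  148 },
   { "CPU",              2, 1453 },
};

#define TIMING_COUNT (sizeof(timings) / sizeof(timings[0]))

/* Recording plus execution time for one path. */
static uint32_t
total_ms(const struct path_timing *t)
{
   return t->recording_ms + t->execution_ms;
}

/* Index of the path with the lowest combined time. */
static uint32_t
fastest_total(void)
{
   uint32_t best = 0;
   for (uint32_t i = 1; i < TIMING_COUNT; i++) {
      if (total_ms(&timings[i]) < total_ms(&timings[best]))
         best = i;
   }
   return best;
}
```

Under this equal-weight assumption the combined times are 3091 ms for the texel buffer path, 10880 ms for the blit path and 1455 ms for the CPU path, so the CPU path comes out ahead whenever recording cost is counted in full.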
      Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
      Part-of: <!7651>
  15. 11 Nov, 2020 3 commits
  16. 02 Nov, 2020 2 commits
  17. 27 Oct, 2020 1 commit
    • v3dv: grow meta descriptor pool dynamically · 666817ce
      Iago Toral authored
      
      
      Our blit shader path allocates a descriptor pool to create
      combined image sampler descriptors for blit source images. So
      far, we had sized this pool statically and the driver would
      fail if we ever needed to allocate more descriptors than that.
      
      With this change, we switch to using a dynamic allocation
      mechanism instead where we allocate as many pools as we need to
      meet descriptor set allocation requirements for the command buffer.
      
      Also, every time a new pool needs to be created, we double its
      size (up to a limit), so we can start small and avoid wasting
      memory for command buffers that only have a small number of blits,
      while trying to keep allocation overhead low for command buffers
      that record a lot of blits.
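
The growth policy described above can be sketched as a pure sizing function. The initial size and the upper limit below are made-up values for illustration, since the commit message does not state the real ones, and `next_pool_size` and `pools_needed` are hypothetical helpers rather than driver functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limits, chosen only for this example. */
#define META_POOL_INITIAL_SIZE 4u
#define META_POOL_MAX_SIZE     1024u

/* Size of the next pool to create, given the size of the previous one
 * (0 if no pool exists yet): start small, double, clamp to the limit. */
static uint32_t
next_pool_size(uint32_t prev_size)
{
   if (prev_size == 0)
      return META_POOL_INITIAL_SIZE;
   if (prev_size >= META_POOL_MAX_SIZE / 2)
      return META_POOL_MAX_SIZE;
   return prev_size * 2;
}

/* Number of pools that must be created to serve 'descriptor_count'
 * descriptor set allocations under this policy. */
static uint32_t
pools_needed(uint32_t descriptor_count)
{
   uint32_t pools = 0;
   uint32_t size = 0;
   uint32_t capacity = 0;
   while (capacity < descriptor_count) {
      size = next_pool_size(size);
      capacity += size;
      pools++;
   }
   return pools;
}
```

With these example limits, a command buffer that records 4 blits needs a single small pool, while one that records 13 blits needs three pools of 4, 8 and 16 descriptors, so the number of pool allocations grows only logarithmically until the limit is reached.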
      
      v2: use existing framework for automatic destruction of private
          driver objects to free allocated pools.
      Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
      Part-of: <!7311>
  18. 22 Oct, 2020 1 commit