1. 10 Mar, 2018 1 commit
  2. 04 Jan, 2018 1 commit
  3. 30 Nov, 2017 1 commit
  4. 22 Nov, 2017 1 commit
  5. 09 Nov, 2017 2 commits
    • anv: fix build failure · ffc20606
      Nicolai Hähnle authored
      Fixes: e3a8013d ("util/u_queue: add util_queue_fence_wait_timeout")
    • mesa: Add new fast mtx_t mutex type for basic use cases · f98a2768
      Timothy Arceri authored
      While modern pthread mutexes are very fast, they still incur a call into an
      external DSO and the overhead of the generality and features of pthread
      mutexes.  Most mutexes in mesa only need lock/unlock, and the idea here is
      that we can inline the atomic operation and make the fast case just two
      instructions.  Mutexes are subtle and finicky to implement, so we carefully
      copy the implementation from Ulrich Drepper's well-written and
      well-reviewed paper:
      
        "Futexes Are Tricky"
        http://www.akkadia.org/drepper/futex.pdf
      
      We implement "mutex3", which gives us a mutex that has no syscalls on
      uncontended lock or unlock.  Further, the uncontended lock boils down to
      a cmpxchg and an untaken branch, and the uncontended unlock is just a
      locked decrement and an untaken branch.  We use __builtin_expect() to
      indicate that contention is unlikely so that gcc will put the contention
      code out of the main code flow.
      
      A fast mutex only supports lock/unlock; it can't be recursive or used with
      condition variables.  We keep the pthread mutex implementation around for
      the few places where we need condition variables or recursive locking.
      For platforms or compilers where futex and atomics aren't available,
      simple_mtx_t falls back to the pthread mutex.
      
      The pthread mutex lock/unlock overhead shows up on benchmarks for CPU-bound
      applications.  Most CPU-bound cases are helped and some of our internal
      bind_buffer_object-heavy benchmarks gain up to 10%.  A sketch of the
      lock/unlock fast paths follows this entry.
      Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
      Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
      Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
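      Below is a minimal, hedged sketch of Drepper's "mutex3" scheme that this
      commit describes, assuming Linux futexes and GCC/Clang atomic builtins.
      The state encoding is 0 = unlocked, 1 = locked with no waiters, and
      2 = locked with possible waiters; mesa's real simple_mtx_t may differ in
      detail.

        #include <stdint.h>
        #include <unistd.h>
        #include <linux/futex.h>
        #include <sys/syscall.h>

        typedef struct { uint32_t val; } simple_mtx_t;

        static inline long futex_wake(uint32_t *addr, int count)
        {
           return syscall(SYS_futex, addr, FUTEX_WAKE, count, NULL, NULL, 0);
        }

        static inline long futex_wait(uint32_t *addr, uint32_t value)
        {
           return syscall(SYS_futex, addr, FUTEX_WAIT, value, NULL, NULL, 0);
        }

        static inline void simple_mtx_lock(simple_mtx_t *mtx)
        {
           uint32_t c = 0;

           /* Uncontended fast path: one cmpxchg and an untaken branch. */
           if (__builtin_expect(
                  !__atomic_compare_exchange_n(&mtx->val, &c, 1, 0,
                                               __ATOMIC_ACQUIRE,
                                               __ATOMIC_RELAXED), 0)) {
              /* Contended: mark the mutex as having waiters, then sleep in
               * the kernel until we manage to take it (exchange sees 0). */
              if (c != 2)
                 c = __atomic_exchange_n(&mtx->val, 2, __ATOMIC_ACQUIRE);
              while (c != 0) {
                 futex_wait(&mtx->val, 2);
                 c = __atomic_exchange_n(&mtx->val, 2, __ATOMIC_ACQUIRE);
              }
           }
        }

        static inline void simple_mtx_unlock(simple_mtx_t *mtx)
        {
           /* Uncontended fast path: a locked decrement and an untaken
            * branch (old value 1 means there were no waiters). */
           if (__builtin_expect(
                  __atomic_fetch_sub(&mtx->val, 1, __ATOMIC_RELEASE) != 1, 0)) {
              /* There may be waiters: fully release and wake one. */
              __atomic_store_n(&mtx->val, 0, __ATOMIC_RELEASE);
              futex_wake(&mtx->val, 1);
           }
        }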
  6. 18 Oct, 2017 1 commit
    • anv: Move size check from anv_bo_cache_import() to caller (v2) · 9775894f
      chadversary authored
      This change prepares for VK_ANDROID_native_buffer. When the user imports
      a gralloc handle into a VkImage using VK_ANDROID_native_buffer, the user
      provides no size. The driver must infer the size from the internals of
      the gralloc buffer.
      
      This is essentially a refactor patch, but it does change behavior
      in some edge cases, described below. In what follows, the "nominal size"
      of the bo refers to anv_bo::size, which may not match the bo's "actual
      size" according to the kernel.
      
      Post-patch, the nominal size of the bo returned from
      anv_bo_cache_import() is always the size of the imported dma-buf
      according to lseek(), as sketched after this entry.  Pre-patch, the bo's
      nominal size was difficult to predict.
      If the imported dma-buf's gem handle was not resident in the cache, then
      the bo's nominal size was align(VkMemoryAllocateInfo::allocationSize,
      4096).  If it *was* resident, then the bo's nominal size was whatever
      the cache returned. As a consequence, the first cache insert decided the
      bo's nominal size, which could be significantly smaller compared to the
      dma-buf's actual size, as the nominal size was determined by
      VkMemoryAllocateInfo::allocationSize and not lseek().
      
      I believe this patch cleans up that messy behavior. For an imported or
      exported VkDeviceMemory, anv_bo::size should now be the true size of the
      bo, if I correctly understand the problem (which I possibly don't).
      
      v2:
        - Preserve behavior of aligning size to 4096 before checking. [for
          jekstrand]
        - Check size with < instead of <=, to match behavior of commit c0a4f56f
          "anv: bo_cache: allow importing a BO larger than needed". [for
          chadv]
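      Below is a hedged sketch of the size check described above, under the
      assumption that the caller learns the dma-buf's actual size via lseek()
      and aligns the requested size to 4096 before comparing.  The name
      check_import_size is illustrative, not the driver's actual API.

        #include <stdint.h>
        #include <unistd.h>

        static int
        check_import_size(int dma_buf_fd, uint64_t allocation_size)
        {
           /* The bo's actual size according to the kernel. */
           off_t actual_size = lseek(dma_buf_fd, 0, SEEK_END);
           if (actual_size < 0)
              return -1;

           /* Align the requested size to 4096, as the allocator does. */
           uint64_t aligned = (allocation_size + 4095) & ~(uint64_t)4095;

           /* Use '<', not '<=': importing a BO larger than needed is
            * allowed (commit c0a4f56f); only a too-small BO is an error. */
           if ((uint64_t)actual_size < aligned)
              return -1;

           return 0;
        }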
  7. 17 Oct, 2017 1 commit
  8. 11 Oct, 2017 1 commit
  9. 12 Sep, 2017 1 commit
  10. 29 Aug, 2017 1 commit
  11. 15 Jul, 2017 1 commit
  12. 23 May, 2017 1 commit
    • anv: Stop setting BO flags in bo_init_new · 00df1cd9
      Jason Ekstrand authored
      
      The idea behind doing this was to make it easier to set various flags.
      However, we have enough custom flag settings floating around the driver
      that this is more of a nuisance than a help.  This commit has the
      following functional changes:
      
       1) The workaround_bo created in anv_CreateDevice loses both flags.
          This shouldn't matter because it's very small and entirely internal
          to the driver.
      
       2) The bo created in anv_CreateDmaBufImageINTEL loses the
          EXEC_OBJECT_ASYNC flag.  In retrospect, it never should have gotten
          EXEC_OBJECT_ASYNC in the first place.
      Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
      Cc: "17.1" <mesa-stable@lists.freedesktop.org>
  13. 05 May, 2017 20 commits
  14. 04 May, 2017 1 commit
  15. 28 Apr, 2017 2 commits
  16. 27 Apr, 2017 1 commit
  17. 11 Apr, 2017 1 commit
  18. 05 Apr, 2017 1 commit
    • anv: Add support for 48-bit addresses · 651ec926
      Jason Ekstrand authored
      
      This commit adds support for using the full 48-bit address space on
      Broadwell and newer hardware.  Due to certain limitations, not all
      objects can be placed above the 32-bit boundary.  In particular, general
      and state base address need to live within 32 bits.  (See also
      Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.)  In order
      to handle this, we add a supports_48bit_address field to anv_bo and only
      set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set.  We set the bit
      for all client-allocated memory objects but leave it false for
      driver-allocated objects.  While this is more conservative than needed,
      all driver allocations should easily fit in the first 32 bits of address
      space, and it keeps things simple because we don't have to think about
      whether or not any given one of our allocation data structures will be
      used in a 48-bit-unsafe way.  A sketch of the flag-setting scheme follows
      this entry.
      Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
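      Below is a minimal sketch of the scheme described above, assuming the
      i915 execbuf interface from <drm/i915_drm.h>.  The anv_bo layout shown is
      illustrative; only the supports_48bit_address field comes from the commit
      message.

        #include <stdbool.h>
        #include <stdint.h>
        #include <drm/i915_drm.h>

        struct anv_bo_sketch {
           uint32_t gem_handle;
           uint64_t size;
           /* Set for client-allocated memory objects; left false for
            * driver-internal BOs, which must stay below 4 GiB so general
            * and state base addresses remain 32-bit safe. */
           bool supports_48bit_address;
        };

        static void
        fill_exec_object(struct drm_i915_gem_exec_object2 *obj,
                         const struct anv_bo_sketch *bo)
        {
           /* Zero everything, then fill in the handle. */
           *obj = (struct drm_i915_gem_exec_object2) {
              .handle = bo->gem_handle,
           };

           /* Only objects marked 48-bit safe may be placed above the
            * 32-bit boundary by the kernel. */
           if (bo->supports_48bit_address)
              obj->flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
        }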
  19. 28 Nov, 2016 1 commit