1. 24 Jun, 2021 3 commits
  2. 31 May, 2021 1 commit
    • turnip: place a limit on the growth of BOs · f38fd3c5
      Danylo Piliaiev authored
      
      
      There is a limit on IB size, which on freedreno is set to 0x100000.
      Going beyond it results in hangs. However, I found that the last
      (0x100000th) packet just doesn't get executed, so the real limit is
      0x0FFFFF.
      
      This can be tested by appending nops to the cmdstream and placing
      e.g. CP_INTERRUPT at the end: at any position other than the
      0x100000th packet, it results in a hang.
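
      A minimal sketch of the capping idea, not the actual turnip code.
      The helper name next_bo_size, the doubling growth policy and the
      assumption that sizes are counted in dwords are all illustrative;
      only the 0x0FFFFF limit comes from the message above.

```c
#include <stdint.h>

/* Largest IB size that still executes fully, per the observation above.
 * The unit (dwords) is an assumption made for this sketch. */
#define MAX_IB_DWORDS 0x0FFFFFu

/* Hypothetical helper: grow a command-stream BO by doubling, but never
 * past MAX_IB_DWORDS. Also returns the cap if the doubling wraps. */
static uint32_t
next_bo_size(uint32_t current_dwords)
{
   uint32_t doubled = current_dwords * 2;

   if (doubled < current_dwords || doubled > MAX_IB_DWORDS)
      return MAX_IB_DWORDS;
   return doubled;
}
```

      With this policy a BO of 0x80000 dwords grows to 0x0FFFFF rather
      than 0x100000, so the last packet is never placed at the position
      that does not execute.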
      
      Fixes:
        dEQP-VK.api.command_buffers.record_many_draws_secondary_2
        dEQP-VK.api.command_buffers.record_many_draws_primary_2
      
      However these tests could trigger hangcheck timeouts.
      
      Also this fixes hangs when opening captures of games in RenderDoc.
      Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
      Part-of: <!10786>
  3. 27 May, 2021 1 commit
  4. 20 May, 2021 2 commits
  5. 19 May, 2021 1 commit
    • ir3/cf: Rewrite pass · e894e83e
      Connor Abbott authored
      The old pass had a few bugs:
      - It tried to avoid folding f2f32 into f2f16, but didn't consider
        conversions that were already folded in.
      - It didn't prevent folding an f2f16 or f2f32 into a non-floating-point
        op.
      
      In addition it wasn't written in a manner which made handling integer
      conversions practical. This rewrites the pass to instead calculate the
      "type" of the conversion source and then check whether folding the
      conversion is allowed. This allows us to cleanly separate the
      declarative part where we describe how the HW works from the policy part
      where we decide whether the transform is allowed, and makes it simple to
      add support for folding integer conversions.
      
      Closes: #3208
      Part-of: <mesa/mesa!10859>
  6. 17 May, 2021 1 commit
  7. 14 May, 2021 1 commit
  8. 13 May, 2021 2 commits
  9. 11 May, 2021 3 commits
  10. 05 May, 2021 2 commits
    • turnip: Demote API version to 1.1. · 7bcda214
      Emma Anholt authored
      We don't support major extensions required by 1.2, like timeline semaphores.
      Fixes many complaints in the dEQP-VK.info.vulkan1p2.* group.
      
      We were originally bumped to 1.2 in 75755e0e ("turnip: Pretend to
      support Vulkan 1.2") but hopefully that build issue has been fixed in the
      entrypoint reworks since then.
      
      Part-of: <!10471>
    • ir3: memory_barrier also controls shared memory access order · cb8a0079
      Danylo Piliaiev authored
      nir_intrinsic_memory_barrier has the same semantics as memoryBarrier()
      in GLSL, which is:
      
      GLSL 4.60, 4.10. "Memory Qualifiers":
       "The built-in function memoryBarrier() can be used if needed to
       guarantee the completion and relative ordering of memory accesses
       performed by a single shader invocation."
      
      GLSL 4.60, 8.17. "Shader Memory Control Functions":
       "The built-in functions memoryBarrier() and groupMemoryBarrier() wait
       for the completion of accesses to all of the above variable types."
      
      Fixes tests:
       dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_local.image.guard_nonlocal.workgroup.comp
       dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_nonlocal.workgroup.guard_local.image.comp
      
      Fixes: 819a613a ("freedreno/ir3: moar better scheduler")
      Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
      Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
      Part-of: <!9054>
  11. 21 Apr, 2021 1 commit
    • ir3: make possible to specify branchstack up to 64 · 9402d5a6
      Danylo Piliaiev authored
      
      
      On a6xx/a5xx the blob shows the following dependency between the
      branchstack bitfield and the number of nested ifs:
      
      IFs   BRANCHSTACK
      0     0
      1     1
      2     2
      3     2
      4     3
      5     3
      6     4
      ...
      59    30
      60    31
      61    31
      62    32
      63    32
      64    32
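
      The listed rows are consistent with a simple closed form: zero
      stays zero, otherwise IFs / 2 + 1, clamped to 32. The sketch below
      encodes that reading; the function name is hypothetical, and the
      rows elided by "..." are only assumed to follow the same pattern.

```c
#include <stdint.h>

/* Upper bound of the bitfield value seen in the table above. */
#define BRANCHSTACK_HW_MAX 32u

/* Hypothetical helper: map the number of nested ifs to the value
 * written into the branchstack bitfield, as inferred from the table. */
static uint32_t
branchstack_hw(uint32_t nested_ifs)
{
   uint32_t value;

   if (nested_ifs == 0)
      return 0;

   value = nested_ifs / 2 + 1;
   return value < BRANCHSTACK_HW_MAX ? value : BRANCHSTACK_HW_MAX;
}
```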
      
      Remove open-coded branchstack for a5xx compute along the way.
      
      Fixes tests:
       dEQP-VK.spirv_assembly.instruction.compute.float16.opvectorshuffle.344
       dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.344_vert
       dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.444_geom
       dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.244_tessc
       dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.344_frag
      Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
      Part-of: <!9859>
  12. 15 Apr, 2021 1 commit
  13. 14 Apr, 2021 3 commits
    • ci: Move docker images from Debian buster to bullseye · af0fde95
      Michel Dänzer authored
      Among other things, this gets us GCC 10 (was 8).
      
      Requires some changes to third party components we use:
      
      * Install apitrace (& waffle) from Debian; was hitting issues with the
        local build, and it's the same version 9.0 anyway.
      * Update Fossilize to a newer commit which builds with GCC 10.
      * apt.llvm.org repositories are no longer needed.
      * Use an SPIRV-LLVM-Translator commit which builds with LLVM 11.0.1.
      * Install XCB packages from Debian, 1.13 fails to build with Python 3.9.
      * Install wayland-protocols from Debian, 1.12 is too old for
        libgtk-3-dev in bullseye.
      
      LLVM 7/8 packages are no longer available.
      
      Also adapt expected test results to Xvfb now exposing multi-sample
      GLXFBConfigs.
      
      v2:
      * Install clang instead of clang-11.
      
      Closes: #3124
      Reviewed-by: Eric Anholt <eric@anholt.net> # v1
      Part-of: <!9833>
    • tu: Expose VK_KHR_spirv_1_4 and VK_EXT_scalar_block_layout · 765c3b85
      Connor Abbott authored
      VK_KHR_spirv_1_4 is trivial because vtn already supports all the added
      SPIR-V features that aren't gated behind Vulkan extensions. I've
      observed some robustness2 CTS tests requiring this. However there are
      a few tests currently failing due to lacking spilling.
      
      VK_EXT_scalar_block_layout should also be trivial, since support for
      "straddling" UBO loads was added recently for other reasons. This is
      used by every robustness2 CTS test.
      
      Part-of: <!8695>
    • ci: Update VK-GL-CTS to 1.2.6.0 · 9e5762c3
      Juan A. Suárez authored
      
      
      v2:
       - Bump up MESA_ROOTFS_TAG instead of arm_build (Michel)
      Acked-by: Michel Dänzer <mdaenzer@redhat.com>
      Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
      Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
      Part-of: <!10136>
  14. 01 Apr, 2021 1 commit
  15. 29 Mar, 2021 1 commit
  16. 12 Mar, 2021 1 commit
    • turnip: fill VkMemoryDedicatedRequirements · 1a2f1e3f
      Danylo Piliaiev authored
      
      
      We support VK_KHR_dedicated_allocation so we must fill
      VkMemoryDedicatedRequirements.
      
      Vulkan spec states:
      
       "[...] requiresDedicatedAllocation may be VK_TRUE under one of the
       following conditions:
      
       The pNext chain of VkImageCreateInfo for the call to vkCreateImage used
       to create the image being queried included a VkExternalMemoryImageCreateInfo
       structure, and any of the handle types specified in
       VkExternalMemoryImageCreateInfo::handleTypes requires dedicated allocation,
       as reported by vkGetPhysicalDeviceImageFormatProperties2 in
       VkExternalImageFormatProperties::externalMemoryProperties.externalMemoryFeatures,
       the requiresDedicatedAllocation field will be set to VK_TRUE."
      
      All handle types require dedicated allocation at the moment.
      
      Fixes:
       dEQP-VK.api.external.memory.opaque_fd.dedicated.image.info
       dEQP-VK.memory.requirements.dedicated_allocation.buffer.regular
       dEQP-VK.memory.requirements.dedicated_allocation.image.transient_tiling_optimal
      Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
      Part-of: <!9086>
  17. 10 Mar, 2021 1 commit
  18. 04 Mar, 2021 1 commit
  19. 22 Feb, 2021 1 commit
  20. 19 Feb, 2021 3 commits
    • turnip,freedreno/a6xx: tell hw the size of shared mem used by CS · 0fa7ec14
      Danylo Piliaiev authored
      
      
      Before, we only used 2k of shared memory.
      
      It was found that the 5 lower bits of SP_CS_UNKNOWN_A9B1 control
      the available size of shared memory for compute shaders, with
      AVAILABLE_SIZE = (SP_CS_UNKNOWN_A9B1_SHARED_SIZE + 1) * 1k
      up to 32k. Setting SP_CS_UNKNOWN_A9B1_SHARED_SIZE to zero enables
      all 32k of shared memory.
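
      A rough sketch of how a field value could be derived from a
      shader's shared memory usage under the formula above. The helper
      names are hypothetical, and avoiding a field value of zero is a
      choice made for this sketch because zero is described as a special
      case; it is not taken from the driver source.

```c
#include <stdint.h>

/* The field occupies the 5 lower bits of the register. */
#define SHARED_SIZE_FIELD_MAX 31u

/* Hypothetical helper: smallest non-zero field value whose
 * AVAILABLE_SIZE = (field + 1) * 1k covers shared_bytes. */
static uint32_t
shared_size_field(uint32_t shared_bytes)
{
   /* ceil(shared_bytes / 1024) - 1, without underflow for 0 bytes */
   uint32_t field = shared_bytes ? (shared_bytes - 1) / 1024 : 0;

   if (field < 1)
      field = 1;                      /* keep clear of the special zero value */
   if (field > SHARED_SIZE_FIELD_MAX)
      field = SHARED_SIZE_FIELD_MAX;  /* 32k is the stated upper bound */
   return field;
}

/* Inverse direction, straight from the formula in the message. */
static uint32_t
shared_available_bytes(uint32_t field)
{
   return (field + 1) * 1024;
}
```

      For example, a shader using 2049 bytes gets a field value of 2,
      which makes 3k available.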
      
      Fixes tests:
       dEQP-VK.rasterization.line_continuity.line-strip
       dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_local.buffer.guard_nonlocal.workgroup.comp
       dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_nonlocal.workgroup.guard_local.buffer.comp
       dEQP-VK.memory_model.write_after_read.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_local.image.guard_nonlocal.workgroup.comp
      Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
      Part-of: <!9157>
    • turnip: consider tile_max_h when calculating tiling config · 14a00042
      Danylo Piliaiev authored
      
      
      Otherwise we may get a tile height exceeding the maximum.
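
      As an illustration only, one way to respect a maximum tile height
      is to pick the row count from it first. The function names are
      hypothetical and the sketch ignores the tile alignment rules a
      real tiling config has to follow.

```c
#include <stdint.h>

static uint32_t
div_round_up(uint32_t n, uint32_t d)
{
   return (n + d - 1) / d;
}

/* Hypothetical helper: fewest tile rows such that no tile is taller
 * than tile_max_h. tile_max_h must be non-zero. */
static uint32_t
tile_rows_for_height(uint32_t fb_height, uint32_t tile_max_h)
{
   return div_round_up(fb_height, tile_max_h);
}

/* Resulting tile height when the framebuffer is split evenly. */
static uint32_t
tile_height(uint32_t fb_height, uint32_t tile_max_h)
{
   uint32_t rows = tile_rows_for_height(fb_height, tile_max_h);

   return rows ? div_round_up(fb_height, rows) : 0;
}
```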
      
      Fixes tests:
       dEQP-VK.pipeline.render_to_image.core.2d.huge.height.r8g8b8a8_unorm
       dEQP-VK.pipeline.render_to_image.core.2d.huge.height.r8g8b8a8_unorm_d16_unorm
       dEQP-VK.pipeline.render_to_image.core.2d.huge.height.r8g8b8a8_unorm_s8_uint
      Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
      Part-of: <!9159>
    • turnip: consider HW limit on number of views when apply multipos opt · b6b3b384
      Danylo Piliaiev authored
      Blob doesn't apply multipos optimization starting from 11 views
      even on a650, however in practice, with the limit of 16 views,
      tests pass on a640/a650 and fail on a630.
      
      Fixes tests:
       dEQP-VK.multiview.draw_indexed.max_multi_view_view_count
       dEQP-VK.multiview.input_attachments.max_multi_view_view_count
       dEQP-VK.multiview.masks.max_multi_view_view_count
       dEQP-VK.multiview.multisample.max_multi_view_view_count
       dEQP-VK.multiview.queries.max_multi_view_view_count
       dEQP-VK.multiview.renderpass2.index.fragment_shader.max_multi_view_view_count
       dEQP-VK.multiview.secondary_cmd_buffer.max_multi_view_view_count
      
      Fixes: 8d275778 ("tu: Enable multi-position output")
      Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
      Part-of: <!9135>
  21. 08 Feb, 2021 2 commits
  22. 04 Feb, 2021 2 commits
  23. 20 Jan, 2021 1 commit
    • turnip: don't emit tess consts if they are not used · fa743894
      Danylo Piliaiev authored
      If tess consts aren't used they don't get included in constlen,
      and we risk overrunning consts of the next stage.
      
      Fixes:
       dEQP-VK.tessellation.invariance.outer_edge_index_independence.quads_fractional_even_spacing_ccw
       dEQP-VK.tessellation.invariance.outer_triangle_set.quads_fractional_odd_spacing
       dEQP-VK.tessellation.invariance.primitive_set.isolines_fractional_odd_spacing_ccw
       dEQP-VK.tessellation.invariance.primitive_set.quads_fractional_odd_spacing_cw
      
      Closes: #4117
      
      Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
      Part-of: <!8578>
  24. 18 Jan, 2021 1 commit
  25. 14 Jan, 2021 2 commits
    • turnip: make GS use correct varyings size from previous stage · cea4d850
      Danylo Piliaiev authored
      
      
      Fixes:
       dEQP-VK.tessellation.invariance.primitive_set.triangles_fractional_even_spacing_ccw
       dEQP-VK.tessellation.invariance.outer_edge_division.triangles_fractional_even_spacing
       dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_fractional_odd_spacing_cw
       dEQP-VK.tessellation.invariance.outer_edge_symmetry.quads_fractional_odd_spacing_ccw
       dEQP-VK.tessellation.invariance.outer_edge_symmetry.isolines_equal_spacing_cw
       dEQP-VK.tessellation.invariance.outer_edge_index_independence.triangles_equal_spacing_ccw
       dEQP-VK.tessellation.invariance.outer_edge_index_independence.triangles_fractional_even_spacing_cw
       dEQP-VK.tessellation.invariance.inner_triangle_set.triangles_equal_spacing
      Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
      Part-of: <!8497>
    • turnip/ir3: handle image load/stores produced by AtomicLoad/Store · ad098553
      Danylo Piliaiev authored
      
      
      SpvOpAtomicLoad and SpvOpAtomicStore are translated into
      nir_intrinsic_image_deref_store/load instead of some separate
      atomic intrinsics, however they don't have a src or dest type
      specified. Turnip doesn't support shaderImageFloat32Atomics,
      so the type is just integer.
      
      Fixes:
      dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_local.image.guard_local.image.frag
      dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_local.buffer.guard_local.image.comp
      dEQP-VK.memory_model.write_after_read.core11.u32.coherent.fence_fence.atomicwrite.device.payload_local.buffer.guard_local.image.comp
      dEQP-VK.memory_model.write_after_read.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_local.image.guard_local.image.comp
      dEQP-VK.memory_model.write_after_read.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_nonlocal.workgroup.guard_local.image.comp
      Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
      Part-of: <!8476>
  26. 13 Jan, 2021 1 commit
    • turnip: implement indirect dispatch · 5331b1d9
      Danylo Piliaiev authored
      
      
      Vulkan guarantees only 4 byte alignment of offset for vkCmdDrawIndirect,
      while CP_LOAD_STATE.EXT_SRC_ADDR requires 16 byte alignment which
      makes us copy indirect parameters to a correctly aligned buffer.
      
      Blob does essentially the same but emits indirect CP_LOAD_STATE
      with src = SS6_UBO and EXT_SRC_ADDR = 0xe0000, and only for the
      first dispatch.
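
      The alignment mismatch can be shown with plain address arithmetic.
      This is only a sketch: is_aligned and align_up are hypothetical
      helpers, and the example address is made up. An offset that
      satisfies the 4 byte guarantee is not necessarily usable where 16
      byte alignment is required.

```c
#include <stdbool.h>
#include <stdint.h>

/* Alignment guaranteed by Vulkan for the indirect offset, and the one
 * required by CP_LOAD_STATE.EXT_SRC_ADDR, as stated in the message. */
#define INDIRECT_OFFSET_ALIGN 4u
#define EXT_SRC_ADDR_ALIGN    16u

/* Both helpers assume align is a power of two. */
static bool
is_aligned(uint64_t addr, uint64_t align)
{
   return (addr & (align - 1)) == 0;
}

static uint64_t
align_up(uint64_t addr, uint64_t align)
{
   return (addr + align - 1) & ~(align - 1);
}
```

      For instance, address 0x1004 passes the 4 byte check but fails the
      16 byte one, so the parameters cannot be read in place and have to
      be copied to a buffer whose address is 16 byte aligned.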
      
      Fixes:
      dEQP-VK.compute.indirect_dispatch.*
      Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
      Part-of: <!8444>