- Apr 14, 2021
-
-
Lionel Landwerlin authored
Prior to supporting VK_EXT_descriptor_indexing, all of our descriptor limits were below 64k, which fit in a uint16_t. Now all of those can go up to 2^20 entries, so we need 32-bit indexes to keep track of them. This change leaves the dynamic indexes at 16 bits. We could arguably bump them too, depending on the reviewer's taste. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 6e230d76 ("anv: Implement VK_EXT_descriptor_indexing") Closes: mesa/mesa#4636 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <mesa/mesa!10228>
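To illustrate why 16-bit indexes stop being enough once a limit reaches 2^20, here is a minimal sketch. The helper names and the `MAX_DESCRIPTORS` constant are hypothetical and are not taken from the anv sources; they only show the truncation that a too-narrow field causes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit: 2^20 entries, the figure quoted in the commit message. */
#define MAX_DESCRIPTORS (UINT32_C(1) << 20)

/* Storing an index in 16 bits silently keeps only the low 16 bits. */
static uint16_t store_index_16(uint32_t index)
{
   return (uint16_t)index;
}

/* A 32-bit field holds every index below MAX_DESCRIPTORS unchanged. */
static uint32_t store_index_32(uint32_t index)
{
   assert(index < MAX_DESCRIPTORS);
   return index;
}
```

With these helpers, index 65536 stored in 16 bits wraps to 0, so two different descriptors would alias; the 32-bit variant returns the index unchanged.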
-
Samuel Pitoiset authored
This format is supported by the driver. Fixes vertex explosion in Dirt 5. Closes: mesa/mesa#4635 Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <mesa/mesa!10226>
-
Connor Abbott authored
Consider a simple loop that does a series of texture instructions and then reduces the results:

    vec4 sum = vec4(0);
    for (int i = 0; i < N; i++) {
        sum += texture(...);
    }

Assume that the loop is unrolled and we schedule the resulting basic block. Right now, after we schedule the first texture instruction, the only instructions available to schedule that don't incur a sync are the instructions to set up the second texture instruction. So we keep picking the texture instructions, no matter how large N is, resulting in a pathological schedule for register pressure when N is very large:

    sum1 = texture(...);
    sum2 = texture(...);
    sum3 = texture(...);
    ...
    sum = sum1 + sum2 + sum3 + ...;

In particular this happens with some CTS tests for VK_EXT_robustness2, where a loop like that with many iterations is marked as [[unroll]], forcing NIR to unroll it.

This solution is a balance between the current approach and always scheduling for register pressure (and ignoring syncs). We only allow a certain number of texture fetches to be in flight before considering textures to "sync", even though they don't really, both because they likely *will* sync in reality (overflowing the internal queue of waiting texture instructions) and because at some point we need the normal algorithm to kick in and start lowering register pressure.

Part-of: <mesa/mesa!7571>
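The effect of capping the number of in-flight texture fetches can be sketched with a toy model. This is not the scheduler from the commit: the function name, the parameters, and the rule "once the cap is hit, consume the oldest result" are simplifying assumptions chosen only to show how the cap bounds the number of live texture results.

```c
#include <assert.h>

/*
 * Toy model (hypothetical): n texture fetches, each later consumed by one
 * add into "sum". Returns the peak number of texture results that are
 * live at the same time when at most max_in_flight fetches may be pending.
 */
static int peak_live_results(int n, int max_in_flight)
{
   assert(n >= 0 && max_in_flight > 0);

   int remaining = n;   /* fetches not yet scheduled */
   int in_flight = 0;   /* fetched but not yet consumed */
   int peak = 0;

   while (remaining > 0 || in_flight > 0) {
      if (remaining > 0 && in_flight < max_in_flight) {
         /* Below the cap: keep issuing texture fetches. */
         remaining--;
         in_flight++;
         if (in_flight > peak)
            peak = in_flight;
      } else {
         /* Cap reached (or nothing left to issue): schedule a use,
          * which frees the register of the oldest result. */
         in_flight--;
      }
   }
   return peak;
}
```

In this model, 100 fetches with a cap of 8 never keep more than 8 results live, whereas a cap of 100 (effectively no cap) reproduces the pathological schedule with all 100 results live before the first add.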
-
Connor Abbott authored
Once we insert a use of a given tex or SFU instruction, then we must wait for that tex/SFU instruction (as well as all earlier ones) to complete, so we shouldn't penalize further uses, even if a subsequent tex/SFU instruction gets scheduled after the first use. This especially matters after the next commit when we start forcibly breaking up long sequences of texture instructions, since if we schedule a group of 8 texture instructions then we want to schedule the uses of those instructions in parallel with the next 8 texture instructions to reduce register pressure. Part-of: <mesa/mesa!7571>
-
Erik Faye-Lund authored
Similar to the previous commit, we should also verify that the source format supports linear filtering if we try to blit with it. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <mesa/mesa!10234>
-
Erik Faye-Lund authored
Some Vulkan drivers don't support blitting between all formats and layouts. So let's verify this while blitting, and fall back to the normal rendering code path if it isn't supported. This fixes a crash on start-up in OpenArena on V3DV. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <mesa/mesa!10234>
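A minimal sketch of such a capability check follows, covering both the blit-support test and the linear-filter test from the follow-up commit. The enum and the `can_blit` helper are hypothetical stand-ins; the constants mirror the Vulkan format feature bits for blit source, blit destination, and linear filtering, but the code deliberately does not include the Vulkan headers and is not zink's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the corresponding VkFormatFeatureFlagBits. */
enum {
   FEAT_BLIT_SRC      = 0x400,
   FEAT_BLIT_DST      = 0x800,
   FEAT_FILTER_LINEAR = 0x1000,
};

/*
 * Returns true if a blit is allowed given the feature flags the driver
 * reports for the source and destination formats. When this returns
 * false, the caller would fall back to a regular draw-based path.
 */
static bool can_blit(uint32_t src_features, uint32_t dst_features,
                     bool linear_filter)
{
   if (!(src_features & FEAT_BLIT_SRC))
      return false;
   if (!(dst_features & FEAT_BLIT_DST))
      return false;
   /* Linear filtering needs explicit support on the source format. */
   if (linear_filter && !(src_features & FEAT_FILTER_LINEAR))
      return false;
   return true;
}
```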
-
Bas Nieuwenhuizen authored
Now that perftest is stored in the winsys. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!10198>
-
Bas Nieuwenhuizen authored
In most games I tested we use 32 MiB of cmdbuffers+cmd upload buffers at most. Especially since we have mutable descriptors it seems somewhat unlikely anything else will eat it up so be a bit more aggressive allocating them in VRAM. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!10198>
-
Bas Nieuwenhuizen authored
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!10198>
-
Erik Faye-Lund authored
If first_frame_done isn't set, but fence is NULL, we end up dereferencing that NULL pointer. This can happen in the case where the first submitted batch has no work, and pfence was passed as a NULL pointer. While we're at it, simplify the check with the surrounding code, which also checks for a NULL pointer here. Fixes: e93ca92d ("zink: force explicit fence only on first frame flush") Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <mesa/mesa!10235>
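The shape of the bug and of the guard can be sketched as below. The struct and function are hypothetical and do not reproduce zink's flush code; they only show the pointer being checked before the first-frame condition leads to a dereference.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fence type, for illustration only. */
struct toy_fence {
   bool is_explicit;
};

/*
 * Marks the fence as explicit on the first frame. Returns true if the
 * fence was modified. The NULL check comes first, because a batch with
 * no work may not produce a fence at all.
 */
static bool mark_first_frame_fence(bool first_frame_done,
                                   struct toy_fence *fence)
{
   if (fence == NULL || first_frame_done)
      return false;

   fence->is_explicit = true;
   return true;
}
```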
-
Timur Kristóf authored
Late export is theoretically better if used with LATE_ALLOC, but in practice, the early export has the advantage of lower register usage, and therefore more concurrent waves. The idea of this commit is that "small" shaders benefit more from early primitive export, due to being able to launch many more waves. Let's consider a NIR shader "small" when it has only 1 block. This yields both better performance and better stats than always using late export.

Fossil DB on Sienna:
Totals from 12807 (8.76% of 146265) affected shaders:
VGPRs: 609128 -> 620216 (+1.82%); split: -0.01%, +1.83%
SpillSGPRs: 1458 -> 1538 (+5.49%)
CodeSize: 37028204 -> 37019320 (-0.02%); split: -0.17%, +0.14%
MaxWaves: 282902 -> 278516 (-1.55%)
Instrs: 7163142 -> 7162925 (-0.00%); split: -0.18%, +0.18%
VClause: 169285 -> 169547 (+0.15%); split: -1.15%, +1.30%
SClause: 267373 -> 267151 (-0.08%); split: -0.24%, +0.16%
Copies: 446442 -> 444567 (-0.42%); split: -2.68%, +2.26%
Branches: 156245 -> 156195 (-0.03%); split: -0.30%, +0.26%
PreSGPRs: 434701 -> 447396 (+2.92%)
PreVGPRs: 527783 -> 540527 (+2.41%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!10106>
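The heuristic described above boils down to a single comparison. The following sketch uses hypothetical names (`export_mode`, `choose_prim_export`) and takes the block count as a plain integer; it is an illustration of the rule, not the code from the commit.

```c
/* Hypothetical names; only the "one block means small" rule comes from
 * the commit message. */
enum export_mode {
   EXPORT_EARLY,
   EXPORT_LATE,
};

/* A shader with a single block counts as "small" and exports early. */
static enum export_mode choose_prim_export(unsigned num_blocks)
{
   return num_blocks == 1 ? EXPORT_EARLY : EXPORT_LATE;
}
```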
-
Timur Kristóf authored
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!10106>
-
Timur Kristóf authored
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!10106>
-
Timur Kristóf authored
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!10106>
-
Timur Kristóf authored
The user-set priority of shaders matters very little, but we hope this might still help speed up VS input loads especially. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!10106>
-
Timur Kristóf authored
We learned that the gs_alloc_req is not actually when the export space allocation happens. So it makes no sense to prioritize it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!10106>
-
Erik Faye-Lund authored
This fixes basic rendering on top of V3DV, which doesn't seem to expose the cached memory we expect and love. Fixes: 598dc3dc ("zink: use cached memory for all resources when possible") Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <mesa/mesa!10230>
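One way to cope with a driver that does not expose cached memory is to treat the cached property as a preference rather than a requirement. The sketch below assumes that approach; it is not necessarily what the commit does. The flag names are hypothetical stand-ins whose values mirror the Vulkan memory property bits, and the memory types are modeled as a plain array of flag words.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the Vulkan memory property flag bits. */
enum {
   MEM_DEVICE_LOCAL  = 0x1,
   MEM_HOST_VISIBLE  = 0x2,
   MEM_HOST_COHERENT = 0x4,
   MEM_HOST_CACHED   = 0x8,
};

/* Index of the first memory type that has all wanted flags, or -1. */
static int first_type_with(const uint32_t *type_flags, int count,
                           uint32_t wanted)
{
   for (int i = 0; i < count; i++) {
      if ((type_flags[i] & wanted) == wanted)
         return i;
   }
   return -1;
}

/*
 * Try required + preferred flags first; if no memory type offers the
 * preferred ones, fall back to the required flags alone.
 */
static int pick_memory_type(const uint32_t *type_flags, int count,
                            uint32_t required, uint32_t preferred)
{
   int idx = first_type_with(type_flags, count, required | preferred);
   if (idx >= 0)
      return idx;
   return first_type_with(type_flags, count, required);
}
```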
-
Timur Kristóf authored
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!10155>
-
Timur Kristóf authored
Previously, every wave had multiple active lanes read the LDS, and the data was processed by VALU DPP instructions. Now, only the first lane reads the LDS in order to avoid bank conflicts, and the results are processed by SALU. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!10155>
-
Boris Brezillon authored
The key passed to _mesa_hash_table_search() is wrong, fix it. Fixes: 8ba2f9f6 ("panfrost: Create a blitter library to replace the existing preload helpers") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!10232>
-
Erik Faye-Lund authored
This seems to simply be a mix-up of which utility function to use. util_clear_render_target clears on the CPU, whereas util_blitter_clear_render_target clears on the GPU. Because we do the zink_blit_begin dance, it seems reasonable to assume the latter was intended. Fixes: 622f8f6e ("zink: add a pipe_context::clear_texture hook") Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <mesa/mesa!10211>
-
Michel Dänzer authored
This is possible again thanks to mesa/mesa!9955, and this MR requires rebuilding all template-based Docker images anyway, so we can pull in the latest templates for free. We need to exclude /dev/* when unpacking rootfs tarballs for the arm_test image, since x86 container build jobs do not allow mknod anymore with current templates. The baremetal test jobs have another filesystem mounted on /dev anyway. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <mesa/mesa!9833>
-
Michel Dänzer authored
We're not using the templates for the Windows image. Fixes needless rebuild of the Windows image when the ci-templates commit is changed. Part-of: <mesa/mesa!9833>
-
Michel Dänzer authored
Also build deqp-runner once in x86_test-base instead of separately in x86_test-{gl,vk}. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <mesa/mesa!9833>
-
Michel Dänzer authored
Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <mesa/mesa!9833>
-
Michel Dänzer authored
While we're at it, use a tag instead of whatever happens to be the current main branch for building libclc. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <mesa/mesa!9833>
-
Michel Dänzer authored
v2:
* Drop local build from x86_test-gl image as well (Eric Anholt)

Reviewed-by: Eric Anholt <eric@anholt.net> # v1 Part-of: <mesa/mesa!9833>
-
Michel Dänzer authored
Debian bullseye has a separate command-line-only renderdoc package, so no need to install Qt packages and build renderdoc anymore. Closes: mesa/mesa#3125 Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <mesa/mesa!9833>
-
Michel Dänzer authored
Among other things, this gets us GCC 10 (was 6). Requires some changes to third party components we use:
* Install apitrace (& waffle) from Debian; was hitting issues with the local build, and it's the same version 9.0 anyway.
* Update Fossilize to a newer commit which builds with GCC 10.
* apt.llvm.org repositories are no longer needed.
* Use an SPIRV-LLVM-Translator commit which builds with LLVM 11.0.1.
* Install XCB packages from Debian, 1.13 fails to build with Python 3.9.
* Install wayland-protocols from Debian, 1.12 is too old for libgtk-3-dev in bullseye.

LLVM 7/8 packages are no longer available. Also adapt expected test results to Xvfb now exposing multisample GLXFBConfigs.

v2:
* Install clang instead of clang-11.

Closes: mesa/mesa#3124 Reviewed-by: Eric Anholt <eric@anholt.net> # v1 Part-of: <mesa/mesa!9833>
-
Michel Dänzer authored
Preparation for moving to Debian bullseye, which has packages for LLVM 9 & 11, but not 10. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <mesa/mesa!9833>
-
Michel Dänzer authored
LLVM support has been disabled in the meson-armhf job for some time, so they were unused. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <mesa/mesa!9833>
-
Michel Dänzer authored
Avoids warning with GCC 10:

    ../src/intel/blorp/blorp_blit.c: In function 'blorp_nir_combine_samples':
    ../src/intel/blorp/blorp_blit.c:702:25: error: 'texture_data[0]' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      702 | texture_data[0] = nir_fmul(b, texture_data[0],
          | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
      703 | nir_imm_float(b, 1.0 / tex_samples));
          | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <mesa/mesa!9833>
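One common way to silence -Wmaybe-uninitialized for a local array is to zero-initialize it at its declaration. The sketch below shows that pattern on a hypothetical helper; the commit message does not state how blorp_blit.c was actually changed, so this should not be read as the applied fix.

```c
#include <assert.h>

/* Hypothetical upper bound for the illustration. */
#define MAX_SAMPLES 4

/*
 * Averages up to MAX_SAMPLES values. The local array is zero-initialized,
 * so the compiler can prove that every element has a defined value even
 * on paths where the copy loop runs fewer than MAX_SAMPLES times.
 */
static float average_samples(const float *samples, int count)
{
   assert(count >= 1 && count <= MAX_SAMPLES);

   float data[MAX_SAMPLES] = { 0 };
   for (int i = 0; i < count; i++)
      data[i] = samples[i];

   float sum = 0.0f;
   for (int i = 0; i < count; i++)
      sum += data[i];

   return sum / (float)count;
}
```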
-
Pierre-Eric Pelloux-Prayer authored
Avoids warning with newer GCC:

    ../src/gallium/drivers/r600/sb/sb_sched.cpp: In member function 'void r600_sb::literal_tracker::reset()':
    ../src/gallium/drivers/r600/sb/sb_sched.cpp:1953:26: error: 'void* memset(void*, int, size_t)' clearing an object of non-trivial type 'struct r600_sb::literal'; use assignment or value-initialization instead [-Werror=class-memaccess]
     1953 | memset(lt, 0, sizeof(lt));
          | ^
    In file included from ../src/gallium/drivers/r600/sb/sb_sched.cpp:35:
    ../src/gallium/drivers/r600/sb/sb_bc.h:409:8: note: 'struct r600_sb::literal' declared here
      409 | struct literal {
          | ^~~~~~~

[ Michel Dänzer:
* Expanded commit log

v2:
* Clear all 4 members of lt[4] (Eric Anholt) ]

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Part-of: <mesa/mesa!9833>
-
Michel Dänzer authored
Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <mesa/mesa!9833>
-
Juan A. Suárez authored
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <mesa/mesa!10231>
-
Rhys Perry authored
No fossil-db changes, probably because all fp16 shaders have at least one 16-bit mov or vec2 somewhere. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!10227>
-
Connor Abbott authored
This was absorbed into Vulkan 1.1, but we forgot to expose it separately. It's a subset of what's allowed by VK_EXT_scalar_block_layout. Part-of: <mesa/mesa!8695>
-
Connor Abbott authored
VK_KHR_spirv_1_4 is trivial because vtn already supports all the added SPIR-V features that aren't gated behind Vulkan extensions. I've observed some robustness2 CTS tests requiring this. However there are a few tests currently failing due to lacking spilling. VK_EXT_scalar_block_layout should also be trivial, since support for "straddling" UBO loads was added recently for other reasons. This is used by every robustness2 CTS test. Part-of: <mesa/mesa!8695>
-
Juan A. Suárez authored
When emitting the GL shader state, verify the attribute has a resource bound; otherwise just skip it.

v2 (chema):
- Move comment
- Set num_elements_to_emit = 1 if it is 0

Cc: mesa-stable Closes: mesa/mesa#4205 Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <mesa/mesa!8826>
-
Alejandro Piñeiro authored
64 was a temporary and conservative "big enough" value, but we can do better. Note that as mentioned on the FIXME, we could be even more detailed, adding a descriptor map allocate method based on the descriptor type. That would mean more individual allocations, and slightly more complexity. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <mesa/mesa!10207>
-