Commits on Source (58)
-
Marek Olšák authored
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!13015>
-
Marek Olšák authored
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!13015>
-
Marek Olšák authored
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!13015>
-
Marek Olšák authored
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!13015>
-
Marek Olšák authored
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!13015>
-
Marek Olšák authored
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!13015>
-
Marek Olšák authored
We don't have to use the special DCC settings for lower resolutions. This will cause corruption if X and a windowed app use different Mesa versions. The fix is to restart the X server. I expect to get false bug reports due to this. Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!13013>
-
Marek Olšák authored
Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!13013>
-
Mike Blumenkrantz authored
this should only happen in zink_descriptors_update_lazy_masked Reviewed-by:
Dave Airlie <airlied@redhat.com> Part-of: <!12931>
-
Mike Blumenkrantz authored
if the pipeline layout changes, it's technically illegal to not rebind the descriptor set even if it hasn't changed Reviewed-by:
Dave Airlie <airlied@redhat.com> Part-of: <!12931>
-
Mike Blumenkrantz authored
make this reusable like the others Reviewed-by:
Dave Airlie <airlied@redhat.com> Part-of: <!12855>
-
Mike Blumenkrantz authored
Reviewed-by:
Dave Airlie <airlied@redhat.com> Part-of: <!12855>
-
Mike Blumenkrantz authored
Reviewed-by:
Dave Airlie <airlied@redhat.com> Part-of: <!12855>
-
Mike Blumenkrantz authored
these are going to come through as direct variable derefs, so it's simple to handle the functionality by reusing the same codepath to generate image types Reviewed-by:
Dave Airlie <airlied@redhat.com> Part-of: <!12855>
-
Mike Blumenkrantz authored
Reviewed-by:
Dave Airlie <airlied@redhat.com> Part-of: <!12855>
-
Mike Blumenkrantz authored
this works by tracking 1024-member arrays of images and textures using idalloc for indexing. each idalloc id is an index into the array that is returned as a handle, and this handle is used to index into the array in shaders.
in the driver, VK_EXT_descriptor_indexing features are used to enable updates on the live bindless descriptor set and leave unused members of the arrays unbound, which works as long as no member is updated while it is in use. to avoid this, idalloc ids must cycle through a batch once the image/texture handle is destroyed before being returned to the available pool.
in shaders, bindless ops come in one of two types:
- i/o variables
- bindless instructions
for i/o, the image/texture variables have to be rewritten back to the integer handles which represent them so that the successive shader stage utilizing them can perform the indexing.
for instructions, the src representing the image/texture has to be rewritten as a deref into the bindless image/texture array.
Reviewed-by:
Dave Airlie <airlied@redhat.com> Part-of: <!12855>
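The handle recycling rule described in this commit can be sketched in C roughly as follows. This is a minimal illustration, not zink's code: `handle_pool`, `pool_alloc`, `pool_destroy` and `pool_batch_done` are hypothetical names, and a plain boolean array stands in for idalloc. Only the 1024-slot size and the "destroyed ids wait for a batch before reuse" rule come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define BINDLESS_SLOTS 1024 /* array size taken from the commit message */

/* Hypothetical pool; a stand-in for idalloc, not zink's data structure. */
struct handle_pool {
   bool in_use[BINDLESS_SLOTS];      /* slot is handed out or awaiting a batch */
   uint32_t pending[BINDLESS_SLOTS]; /* destroyed ids that may still be in flight */
   uint32_t num_pending;
};

/* Return the lowest free slot index as the handle, or -1 if the pool is full. */
static int32_t
pool_alloc(struct handle_pool *pool)
{
   for (uint32_t i = 0; i < BINDLESS_SLOTS; i++) {
      if (!pool->in_use[i]) {
         pool->in_use[i] = true;
         return (int32_t)i;
      }
   }
   return -1;
}

/* Destroying a handle only queues its id; the slot stays reserved.
 * Assumes each live id is destroyed at most once, so pending cannot overflow. */
static void
pool_destroy(struct handle_pool *pool, uint32_t id)
{
   pool->pending[pool->num_pending++] = id;
}

/* Called once the batch that may still reference the queued ids has finished. */
static void
pool_batch_done(struct handle_pool *pool)
{
   for (uint32_t i = 0; i < pool->num_pending; i++)
      pool->in_use[pool->pending[i]] = false;
   pool->num_pending = 0;
}
```

With this shape, a slot that was just destroyed cannot be handed out again until `pool_batch_done` runs, so the descriptor at that index is never rewritten while a submitted batch might still read it.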
-
Mike Blumenkrantz authored
this is the 6th descriptor set being bound, so don't even advertise it if 6 sets can't be bound Reviewed-by:
Dave Airlie <airlied@redhat.com> Part-of: <!12855>
-
Mike Blumenkrantz authored
Part-of: <!12855>
-
Mike Blumenkrantz authored
These pass all the CTS tests, though not sure how useful they are. [airlied: these may need some work in the future depending on app expectations] Reviewed-by:
Dave Airlie <airlied@redhat.com> Part-of: <!12953>
-
Dave Airlie authored
This just adds all the wrappers in the right places hopefully Reviewed-by:
Roland Scheidegger <sroland@vmware.com> Acked-by:
Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <!12953>
-
Dave Airlie authored
Again, if an invoc is passed but the exec mask has its active lane somewhere other than lane 0, we should find the active lane and extract the value from invoc there rather than using the idx. This fixes a bunch of VK 1.2 subgroup tests once 1.2 is enabled: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst* Reviewed-by:
Roland Scheidegger <sroland@vmware.com> Acked-by:
Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <!12953>
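One reading of this fix, written as a scalar C sketch rather than the driver's LLVM IR generation: the source index for a non-constant broadcast is read from the first lane that is set in the exec mask, because lane 0 may be inactive and hold an arbitrary value. `SUBGROUP_SIZE`, `first_active_lane` and `broadcast_value` are hypothetical names chosen for the illustration.

```c
#include <stdint.h>

#define SUBGROUP_SIZE 8 /* hypothetical vector width for the sketch */

/* Index of the lowest set bit in exec_mask, or -1 if no lane is active. */
static int
first_active_lane(uint32_t exec_mask)
{
   for (int lane = 0; lane < SUBGROUP_SIZE; lane++) {
      if (exec_mask & (1u << lane))
         return lane;
   }
   return -1;
}

/* Broadcast: take the source index from the first active lane's invoc
 * entry instead of lane 0, which may be inactive. */
static uint32_t
broadcast_value(const uint32_t values[SUBGROUP_SIZE],
                const uint32_t invoc[SUBGROUP_SIZE],
                uint32_t exec_mask)
{
   int lane = first_active_lane(exec_mask);
   if (lane < 0)
      return 0; /* no active lanes: the result is never consumed */
   /* Clamp the index into range so the sketch cannot read out of bounds. */
   return values[invoc[lane] % SUBGROUP_SIZE];
}
```

For example, with an exec mask of `0xc` the first active lane is 2, so the index stored in `invoc[2]` selects the value to broadcast, whatever lane 0 happens to contain.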
-
Dave Airlie authored
The remaining extensions are optional features, just turn on vk 1.2 with them reporting as off. Reviewed-by:
Roland Scheidegger <sroland@vmware.com> Acked-by:
Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <!12953>
-
Dave Airlie authored
Acked-by:
Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <!12953>
-
Andreas Baierl authored
Bit 12 of render->aux1 is GL_CCW/GL_CW. For GL_CCW (default of glFrontFace) we have to set that bit active. This is not what the blob does and what the original reverse engineering documentation says. The blob sets this value inverted and does some bogus negation of the fragment shaders gl_FrontFacing variable instead. Anyway, doing it this way does not cause regressions but fixes dEQP-GLES2.functional.shaders.builtin_variable.frontfacing and 4 piglit tests. Reviewed-by:
Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by:
Andreas Baierl <ichgeh@imkreisrum.de> Part-of: <!7690>
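As a small C sketch of the bit handling this commit describes: only the bit position (12) and its meaning come from the commit message. The macro name `AUX1_FRONT_FACE_CCW` and the helper `set_front_face` are hypothetical and are not taken from the lima driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for bit 12 of render->aux1. */
#define AUX1_FRONT_FACE_CCW (1u << 12)

/* Set the bit for GL_CCW (the glFrontFace default), clear it for GL_CW. */
static uint32_t
set_front_face(uint32_t aux1, bool front_ccw)
{
   if (front_ccw)
      return aux1 | AUX1_FRONT_FACE_CCW;
   return aux1 & ~AUX1_FRONT_FACE_CCW;
}
```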
-
Boris Brezillon authored
Since we have no guarantee that start < end, we can't really tell which one the offset applies to. Let the caller take care of that. Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Acked-by:
Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by:
Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <!12961>
-
Boris Brezillon authored
Fixes several problems in the pan_blit() logic:
1. We actually need the reciprocal of the depth scaling in z_scale (maybe we should rename this field z_scale_rcp to make it clear)
2. When Z end < Z start we should subtract one from cur_layer/layer_offset instead of doing it on the last_layer field, otherwise there's an off-by-one error
3. The Z src offset should be adjusted to account for scaling. If we don't do that we won't sample from the right layer when upscaling.
Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <!12961>
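Points 1 and 3 of this commit can be illustrated with a generic linear mapping from a destination layer to a source Z coordinate. This is a sketch under stated assumptions, not the pan_blit() code: the helper name and its parameters are hypothetical, it assumes start < end on both sides, and sampling at the centre of each destination layer is an assumption made for the example.

```c
/* Hypothetical helper, not taken from pan_blit(): maps a destination layer
 * to the source Z coordinate to sample, assuming start < end on both sides. */
static float
src_z_for_dst_layer(int dst_layer, int dst_start, int dst_end,
                    float src_start, float src_end)
{
   /* Reciprocal of the depth scaling: source layers per destination layer. */
   float z_scale_rcp = (src_end - src_start) / (float)(dst_end - dst_start);

   /* Sample at the centre of the destination layer, scaled into source space. */
   return src_start + ((float)(dst_layer - dst_start) + 0.5f) * z_scale_rcp;
}
```

When upscaling 2 source layers to 4 destination layers the factor is 0.5, so destination layers 0 and 1 both land inside source layer 0. Using the scale itself (2.0) instead of its reciprocal would step past the source range, which is the kind of wrong-layer sampling the commit message describes.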
-
Boris Brezillon authored
Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <!12961>
-
Tomeu Vizoso authored
In preparation for testing panvk. Signed-off-by:
Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by:
Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by:
Boris Brezillon <boris.brezillon@collabora.com> Part-of: <!13016>
-
Tomeu Vizoso authored
Just run some selected tests for now because we miss a lot of functionality, which would cause so many crashes that the runs aren't practical. Once the core functionality is implemented, we can switch to the master case list with skips. Signed-off-by:
Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by:
Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by:
Boris Brezillon <boris.brezillon@collabora.com> Part-of: <mesa/mesa!13016>
-
Samuel Pitoiset authored
To match the pipeline key. Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Timur Kristóf <timur.kristof@gmail.com> Part-of: <!13032>
-
Samuel Pitoiset authored
Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Timur Kristóf <timur.kristof@gmail.com> Part-of: <!13032>
-
Samuel Pitoiset authored
To match radv_shader_variant_key. Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Timur Kristóf <timur.kristof@gmail.com> Part-of: <!13032>
-
Samuel Pitoiset authored
It exactly matches the shader keys now. Everything was copied from the pipeline key to the shader keys. There is still some work to completely remove radv_shader_variant_key. Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Timur Kristóf <timur.kristof@gmail.com> Part-of: <!13032>
-
Daniel Schürmann authored
Fixes: 6ed18749 ('aco: allow live-range splits of linear vgprs in top-level blocks') Reviewed-by:
Rhys Perry <pendingchaos02@gmail.com> Part-of: <!13058>
-
Adam Jackson authored
Reviewed-by:
Michel Dänzer <mdaenzer@redhat.com> Reviewed-by:
Emil Velikov <emil.velikov@collabora.com> Part-of: <!13002>
-
Adam Jackson authored
Reviewed-by:
Michel Dänzer <mdaenzer@redhat.com> Reviewed-by:
Emil Velikov <emil.velikov@collabora.com> Part-of: <!13002>
-
Adam Jackson authored
In GLX a "tag" usually means a context tag, "fbconfig attribute" is a bit more obvious. Reviewed-by:
Michel Dänzer <mdaenzer@redhat.com> Reviewed-by:
Emil Velikov <emil.velikov@collabora.com> Part-of: <!13002>
-
Adam Jackson authored
The X server doesn't get this wrong. It's not the client's job to correct what the server says here. And if anyone ever implements HDR for X11, you might in fact want to be able to use floats with a window. Reviewed-by:
Michel Dänzer <mdaenzer@redhat.com> Reviewed-by:
Emil Velikov <emil.velikov@collabora.com> Part-of: <!13002>
-
Daniel Schürmann authored
This avoids unintended reordering of VMEM instructions. It is also highly unlikely that we find more independent instructions before previous clause-related instructions.
Totals from 1921 (1.28% of 150170) affected shaders: (GFX10.3)
VGPRs: 103832 -> 103736 (-0.09%); split: -0.10%, +0.01%
CodeSize: 8695560 -> 8706000 (+0.12%); split: -0.03%, +0.15%
Instrs: 1643752 -> 1646349 (+0.16%); split: -0.04%, +0.20%
Latency: 26755527 -> 26614645 (-0.53%); split: -0.67%, +0.14%
InvThroughput: 7226604 -> 72048094 (-0.30%); split: -0.39%, +0.08%
VClause: 46536 -> 46201 (-0.72%); split: -0.81%, +0.09%
SClause: 47910 -> 47769 (-0.29%); split: -0.43%, +0.14%
Copies: 94647 -> 94558 (-0.09%); split: -0.26%, +0.17%
Branches: 36843 -> 36847 (+0.01%); split: -0.00%, +0.01%
Reviewed-by:
Rhys Perry <pendingchaos02@gmail.com> Part-of: <!10896>
-
Daniel Schürmann authored
This allows more aggressive clause-forming in presence of larger def-use distances. To compensate for the effect, VMEM_CLAUSE_MAX_GRAB_DIST was decreased.
Totals from 5788 (3.85% of 150170) affected shaders: (GFX10.3)
VGPRs: 483960 -> 475272 (-1.80%); split: -1.82%, +0.02%
CodeSize: 59661240 -> 59669084 (+0.01%); split: -0.01%, +0.02%
MaxWaves: 70408 -> 71450 (+1.48%); split: +1.51%, -0.03%
Instrs: 11222417 -> 11224479 (+0.02%); split: -0.01%, +0.03%
Latency: 349397104 -> 349298602 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 88584832 -> 87762262 (-0.93%); split: -0.93%, +0.00%
VClause: 168905 -> 177089 (+4.85%); split: -0.48%, +5.32%
SClause: 375795 -> 375767 (-0.01%); split: -0.01%, +0.01%
Copies: 840298 -> 840231 (-0.01%); split: -0.04%, +0.03%
Branches: 373265 -> 373278 (+0.00%); split: -0.00%, +0.00%
Reviewed-by:
Rhys Perry <pendingchaos02@gmail.com> Part-of: <!10896>
-
Daniel Schürmann authored
This patch allows forming clauses even if the register pressure is at the limit, with the effect that VMEM instructions are less scattered after the first clause in a block. It respects the previous clause size to avoid excessive moving of VMEM instructions. VMEM_CLAUSE_MAX_GRAB_DIST is further reduced to compensate for some of the effects.
Totals from 28922 (19.26% of 150170) affected shaders: (GFX10.3)
VGPRs: 1546568 -> 1523072 (-1.52%); split: -1.52%, +0.00%
CodeSize: 117524892 -> 117510288 (-0.01%); split: -0.08%, +0.07%
MaxWaves: 605554 -> 611120 (+0.92%)
Instrs: 22292568 -> 22291927 (-0.00%); split: -0.10%, +0.09%
Latency: 488975399 -> 490230904 (+0.26%); split: -0.06%, +0.32%
InvThroughput: 117842300 -> 116521653 (-1.12%); split: -1.15%, +0.03%
VClause: 541550 -> 522464 (-3.52%); split: -9.73%, +6.20%
SClause: 718185 -> 718298 (+0.02%); split: -0.00%, +0.02%
Copies: 1420603 -> 1386949 (-2.37%); split: -2.64%, +0.27%
Branches: 559559 -> 559278 (-0.05%); split: -0.06%, +0.01%
Reviewed-by:
Rhys Perry <pendingchaos02@gmail.com> Part-of: <!10896>
-
Emma Anholt authored
The common code fails dEQP-VK.wsi.display_control.register_device_event due to having a stub NOT_IMPLEMENTED return, and thus fails the CTS. This is one of our last failures, so disable the extension until it can get finished off, so we can unblock passing the CTS. Part-of: <!13010>
-
Juan A. Suárez authored
Acked-by:
Emma Anholt <emma@anholt.net> Signed-off-by:
Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <!13065>
-
Emma Anholt authored
Much more useful info for dEQP-GLES2.functional.buffer.write.random.0 than "i915_vbuf_render_draw_elements: Assertion `0' failed." Part-of: <!13052>
-
Emma Anholt authored
Not defined anywhere, and the members it's setting up don't exist. Part-of: <!13052>
-
Emma Anholt authored
You do want to stream the vertices out to the WC mapping, as the code has been doing, rather than writing into malloc and doing a memcpy later and wasting cache space. Part-of: <!13052>
-
Emma Anholt authored
We were assertion failing on some large draws due to indices >16bits, despite asking draw to limit the max indices. I haven't managed to track it down, so flip us back to the older, non-index drawing path that doesn't hit this bug until it can get fixed. Leave an I915_DEBUG=vbuf flag around so we can look into this later.
This is a pretty big performance hit for vertex shaders. Using glmark2 -b build:use-vbo=true:
i915g-vbuf: 211 fps
i915g-nonvbuf: 185 fps
i915c: 41 fps
Given how massively better i915g still is than i915c (llvmpipe VS instead of the classic swrast interpreter), I think it's still worth it to get i915g correct before we fix this perf regression.
Fixes: #4971 Part-of: <!13052>
-
Jesse Natalie authored
Reviewed-by:
Karol Herbst <kherbst@redhat.com> Reviewed-by:
Francisco Jerez <currojerez@riseup.net> Part-of: <!12273>
-
Jesse Natalie authored
Reviewed-by:
Karol Herbst <kherbst@redhat.com> Reviewed-by:
Francisco Jerez <currojerez@riseup.net> Part-of: <!12273>
-
Jesse Natalie authored
Reviewed-by:
Karol Herbst <kherbst@redhat.com> Reviewed-by:
Francisco Jerez <currojerez@riseup.net> Part-of: <!12273>
-
Lionel Landwerlin authored
v2: rename s/eval/elem_val/ (Caio) Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Jesse Natalie <jenatali@microsoft.com> Reviewed-by:
Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <!13030>
-
Lionel Landwerlin authored
The LLVM-SPIRV translator creates variables with initializers, but most of those are actually undef initializers. We can just skip composites that are entirely made of undefs, but for partially undef composites, we will still zero initialize.
v2: Rename wa_llvm_spirv_undef_initializer to wa_llvm_spirv_ignore_workgroup_initializer (Caio)
    Limit workaround to OpenCL (Caio)
    Make workaround clearer (Caio)
v3: Only apply workaround on workgroup storage (Caio)
Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Jesse Natalie <jenatali@microsoft.com> Reviewed-by:
Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <!13030>
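The "entirely made of undefs" check from this commit can be sketched as a recursive walk over a constant tree. The types below are hypothetical stand-ins written for the illustration; they are not the vtn or NIR structures, and the real workaround is additionally limited to OpenCL workgroup storage as the changelog notes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical constant tree; not the spirv_to_nir data structures. */
enum const_kind { CONST_UNDEF, CONST_SCALAR, CONST_COMPOSITE };

struct const_node {
   enum const_kind kind;
   const struct const_node *const *elems; /* children, composites only */
   size_t num_elems;
};

/* True if the initializer contains nothing but undef values, in which
 * case the variable needs no initialization at all. */
static bool
is_all_undef(const struct const_node *c)
{
   switch (c->kind) {
   case CONST_UNDEF:
      return true;
   case CONST_SCALAR:
      return false;
   case CONST_COMPOSITE:
      for (size_t i = 0; i < c->num_elems; i++) {
         if (!is_all_undef(c->elems[i]))
            return false;
      }
      return true;
   }
   return false;
}
```

A composite with even one defined member returns false here, which corresponds to the partially undef case where the commit still zero-initializes.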
-
Lionel Landwerlin authored
If an OpVariable's initializer is undef, there is no need to initialize the variable. v2: Comment the code (Caio) Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <!13030>
-
Caio Oliveira authored
This knowledge was repeated in multiple places so move the values to intel_device_info struct. Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by:
Jason Ekstrand <jason@jlekstrand.net> Part-of: <!13014>
-
Caio Oliveira authored
Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <!13014>
-
Mike Blumenkrantz authored
these should be mutually exclusive Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <!12919>
-
Roland Scheidegger authored
llvmpipe expects a valid size parameter, and when just VK_WHOLE_SIZE is passed very bad things can happen. This was handled specially before, but got dropped when lavapipe was converted to use the generated command queue. Fixes: eb7eccc7 ("lavapipe: Use generated command queue code") Reviewed-By:
Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <!13036>
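The special handling this commit restores amounts to resolving the sentinel into a concrete byte count before the size reaches llvmpipe. A minimal C sketch, assuming a hypothetical helper `resolve_range` and a locally defined `WHOLE_SIZE` that mirrors the value of Vulkan's VK_WHOLE_SIZE (all bits set) so the example builds without the Vulkan headers:

```c
#include <stdint.h>

/* Same value as Vulkan's VK_WHOLE_SIZE; redefined so the sketch is standalone. */
#define WHOLE_SIZE (~0ULL)

/* Hypothetical helper: turn a possibly-sentinel size into a real byte count.
 * Assumes offset <= buffer_size, as the API requires of valid callers. */
static uint64_t
resolve_range(uint64_t buffer_size, uint64_t offset, uint64_t size)
{
   if (size == WHOLE_SIZE)
      return buffer_size - offset; /* from offset to the end of the buffer */
   return size;
}
```

Passing the sentinel through unchanged would hand the backend a size of 2^64 - 1 bytes, which is presumably the source of the "very bad things" mentioned above.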
-
Ella-0 authored
Fixes the following piglit failures:
spec@ext_framebuffer_object@fbo-blending-formats
spec@ext_framebuffer_object@fbo-blending-formats@GL_RGB10
Cc: mesa-stable
Reviewed-by:
Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <!13051>
Showing
- .gitlab-ci.yml 1 addition, 1 deletion
- .gitlab-ci/common/generate-env.sh 1 addition, 0 deletions
- .gitlab-ci/test-source-dep.yml 1 addition, 0 deletions
- docs/features.txt 1 addition, 1 deletion
- src/amd/common/ac_surface.c 13 additions, 4 deletions
- src/amd/compiler/aco_instruction_selection.cpp 9 additions, 9 deletions
- src/amd/compiler/aco_instruction_selection_setup.cpp 3 additions, 3 deletions
- src/amd/compiler/aco_register_allocation.cpp 11 additions, 7 deletions
- src/amd/compiler/aco_scheduler.cpp 31 additions, 4 deletions
- src/amd/vulkan/radv_nir_to_llvm.c 13 additions, 11 deletions
- src/amd/vulkan/radv_pipeline.c 84 additions, 107 deletions
- src/amd/vulkan/radv_private.h 26 additions, 28 deletions
- src/amd/vulkan/radv_shader.c 6 additions, 6 deletions
- src/amd/vulkan/radv_shader.h 45 additions, 9 deletions
- src/broadcom/ci/piglit-v3d-rpi4-fails.txt 0 additions, 2 deletions
- src/broadcom/ci/piglit-vc4-rpi3-skips.txt 1 addition, 0 deletions
- src/compiler/spirv/spirv_to_nir.c 31 additions, 5 deletions
- src/compiler/spirv/vtn_private.h 6 additions, 0 deletions
- src/compiler/spirv/vtn_variables.c 11 additions, 1 deletion
- src/freedreno/ci/deqp-freedreno-a630-fails.txt 0 additions, 3 deletions