- Jul 14, 2021
-
-
Ian Romanick authored
No difference proven at 95.0% confidence (n=10) in dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13. v2: Only update each block's IP data once instead of once per block. Suggested by Emma. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <mesa/mesa!11632>
-
Ian Romanick authored
Performance improvement in dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 for n=30: release build (w/Fedora build flags): -0.82% ± 0.23% Meson -Dbuildtype=debugoptimized: -0.74% ± 0.27% The difference in the debugoptimized build is that the calls to inst_is_in_block(block, this) still exist on each call to remove(). v2: Only update each block's IP data once instead of once per block. Suggested by Emma. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <mesa/mesa!11632>
-
Ian Romanick authored
Performance improvement in dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 for n=30: release build (w/Fedora build flags): -7.79% ± 0.25% Meson -Dbuildtype=debugoptimized: -5.10% ± 0.40% The difference in the debugoptimized build is that the calls to inst_is_in_block(block, this) still exist on each call to remove(). v2: Only update each block's IP data once instead of once per block. Suggested by Emma. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <mesa/mesa!11632>
-
Ian Romanick authored
Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <mesa/mesa!11632>
-
Ian Romanick authored
Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <mesa/mesa!11632>
-
Daniel Schürmann authored
Reduces overall compile times by ~0.45%. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!11879>
-
Daniel Schürmann authored
Reduces overall compile times by ~0.2%. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!11879>
-
Daniel Schürmann authored
These were responsible for ~20% of the time spent in instruction selection. Reduces overall compile times by ~0.5%. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!11879>
-
Iago Toral authored
OpenGL ES 3.1 specifies that a geometry shader can write to gl_PrimitiveID, which can then be read by a fragment shader. OpenGL ES 3.2 additionally adds the capacity for the fragment shader to read gl_PrimitiveID even if there is no geometry shader. This commit adds support for this feature, which is also implicitly expected by the geometry shader feature in Vulkan 1.0. Fixes: dEQP-VK.pipeline.framebuffer_attachment.no_attachments dEQP-VK.pipeline.framebuffer_attachment.no_attachments_ms Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <mesa/mesa!11874>
-
Iago Toral authored
If all drawing is scissored but we have multiple discontinuous scissor rects, we end up flushing all the tiles in the rect that covers all scissor rects, which can be a waste, particularly for large render targets. The obvious case for this is updates to a mega texture or atlas. This change checks if all rendering happens against scissor rects, in which case it keeps track of the rects and uses this to discard tiles that are not included in any of them. This optimization needs to be disabled if we have any non-scissored rendering, including non-scissored clears. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <mesa/mesa!11875>
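A minimal sketch of the tile test described above, not the driver's actual code. The names `struct rect`, `rects_intersect` and `tile_needs_flush` are hypothetical, and rectangles are assumed to be half-open pixel ranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical half-open rectangle in pixels: [x0, x1) x [y0, y1). */
struct rect {
   int x0, y0, x1, y1;
};

/* True when the two rectangles share at least one pixel. */
static bool
rects_intersect(const struct rect *a, const struct rect *b)
{
   return a->x0 < b->x1 && b->x0 < a->x1 &&
          a->y0 < b->y1 && b->y0 < a->y1;
}

/* Decide whether the tile at tile coordinates (tx, ty) must be flushed.
 * If any rendering (including clears) was not scissored, the
 * optimization is disabled and every tile is kept. */
static bool
tile_needs_flush(int tx, int ty, int tile_w, int tile_h,
                 const struct rect *scissors, size_t count,
                 bool all_scissored)
{
   assert(tile_w > 0 && tile_h > 0);

   if (!all_scissored)
      return true;

   const struct rect tile = {
      tx * tile_w, ty * tile_h, (tx + 1) * tile_w, (ty + 1) * tile_h,
   };

   /* Keep the tile only if some tracked scissor rect touches it. */
   for (size_t i = 0; i < count; i++) {
      if (rects_intersect(&tile, &scissors[i]))
         return true;
   }
   return false;
}
```

With two small scissor rects in opposite corners of a large render target, only the tiles touching either rect are kept, instead of every tile inside their common bounding box.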
-
Erik Faye-Lund authored
This is mostly a theoretical fix for the Nine frontend, which doesn't want rectangular lines even when multisampling. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <mesa/mesa!11841>
-
Erik Faye-Lund authored
I accidentally repeated the rectangular lines test instead of checking for smooth lines. Whoopsie! Fixes: c3b0f439 ("zink: fill in the right line-mode based on state") Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <mesa/mesa!11841>
-
Erik Faye-Lund authored
The Vulkan 1.2 specification, section 11.2.12 ("Host Access to Device Memory Objects") says the following: > memory must have been created with a memory type that reports > VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT Since there's no guarantee that there's any memory that is *both* device-local *and* host-visible, let's just use the latter requirement. Fixes: 8af568e4 ("vulkan: implement wsi_win32 backend") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <mesa/mesa!11848>
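To illustrate why the required flags matter, here is a small sketch of a memory type search. It is not the WSI code itself: `find_memory_type` and the `MEM_*` macros are hypothetical stand-ins (the macro values mirror the Vulkan flag bits) so the example builds without the Vulkan headers.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins with the same values as VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
 * and VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT. */
#define MEM_DEVICE_LOCAL_BIT 0x1u
#define MEM_HOST_VISIBLE_BIT 0x2u

/* Return the index of the first memory type that is allowed by
 * type_bits and whose property flags contain every bit in required,
 * or -1 if no such type exists. */
static int
find_memory_type(const uint32_t *type_flags, uint32_t type_count,
                 uint32_t type_bits, uint32_t required)
{
   assert(type_count <= 32);

   for (uint32_t i = 0; i < type_count; i++) {
      if ((type_bits & (1u << i)) &&
          (type_flags[i] & required) == required)
         return (int)i;
   }
   return -1;
}
```

On a device whose types are either device-local or host-visible but never both, asking for both bits finds nothing, while asking only for host-visible succeeds.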
-
Erik Faye-Lund authored
The Vulkan 1.2 specification, section 11.2.12 ("Host Access to Device Memory Objects") says the following: > If size is not equal to VK_WHOLE_SIZE, size must be greater than 0 So, mapping a zero-sized range is illegal. Let's instead map the reported size of the image, which we already know. Fixes: 8af568e4 ("vulkan: implement wsi_win32 backend") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <mesa/mesa!11848>
-
Faith Ekstrand authored
Since we always modify this structure with atomics on the u64, it seems better to use the u64 here too. I have no idea if this fixes a bug. Part-of: <mesa/mesa!11857>
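As a rough illustration of treating a two-field state as a single 64-bit word for atomic updates, consider the sketch below. The layout and the names `pack_state`, `state_next`, `state_end` and `state_alloc` are assumptions made for this example and are not taken from the driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical state: low 32 bits hold "next", high 32 bits hold "end". */
static inline uint64_t
pack_state(uint32_t next, uint32_t end)
{
   return ((uint64_t)end << 32) | next;
}

static inline uint32_t state_next(uint64_t s) { return (uint32_t)s; }
static inline uint32_t state_end(uint64_t s) { return (uint32_t)(s >> 32); }

/* Atomically reserve size bytes. Returns the offset of the reservation,
 * or UINT32_MAX when the request does not fit. Both fields are read and
 * written together through the single 64-bit word. */
static uint32_t
state_alloc(_Atomic uint64_t *state, uint32_t size)
{
   uint64_t old = atomic_load(state);

   for (;;) {
      const uint32_t next = state_next(old);
      const uint32_t end = state_end(old);
      assert(next <= end);

      if (size > end - next)
         return UINT32_MAX;

      const uint64_t desired = pack_state(next + size, end);

      /* On failure, old is reloaded with the current value and we retry. */
      if (atomic_compare_exchange_weak(state, &old, desired))
         return next;
   }
}
```

Because every reader and writer goes through the same 64-bit value, no code path can observe one field updated without the other.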
-
Autumn Ashton authored
This is the only barrier to lavapipe fully working in RenderDoc. Fixes: 21864bda ("llvmpipe: unmap display target of shader image/sampler") Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <mesa/mesa!11856>
-
Rob Clark authored
This adds support for a660 and a635. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Rob Clark authored
Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Rob Clark authored
Newer a6xx devices seem to drop 8b/pixel UBWC support. The turnip part was adapted from Jonathan's patch on !10892 Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Rob Clark authored
Newer a6xx devices drop this packet from the sqe firmware, and use direct (pkt4) register writes instead for the few cases that previously used CP_REG_WRITE. The turnip part was adapted from Jonathan's patch on !10892 Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Rob Clark authored
Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Rob Clark authored
Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Rob Clark authored
At some point we might want to change this to minimum fw version, but for now it can be a bool. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Rob Clark authored
Removing more gpu_id checks that will become bogus as we add more a6xx. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Rob Clark authored
Unused since d968995c, and this gets rid of one more gpu_id check. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Rob Clark authored
Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Rob Clark authored
A step towards getting rid of checks for gpu_id sprinkled around. Checking major generation is ok, but checking for == or >= a specific gpu_id is going to start getting messy as we add more a6xx. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
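The idea of replacing scattered gpu_id comparisons with per-device capability flags can be sketched as follows. The struct, its field names, the table contents and the GPU ids (100, 200) are made up for illustration and do not describe real hardware or the real freedreno tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-GPU record: capabilities live next to the id. */
struct dev_info {
   uint32_t gpu_id;
   bool has_cp_reg_write;
   bool has_8bpp_ubwc;
};

/* Illustrative table with placeholder ids and values. */
static const struct dev_info dev_table[] = {
   { .gpu_id = 100, .has_cp_reg_write = true,  .has_8bpp_ubwc = true  },
   { .gpu_id = 200, .has_cp_reg_write = false, .has_8bpp_ubwc = false },
};

/* Find the record for a GPU, or NULL if it is not listed. */
static const struct dev_info *
dev_info_lookup(uint32_t gpu_id)
{
   for (size_t i = 0; i < sizeof(dev_table) / sizeof(dev_table[0]); i++) {
      if (dev_table[i].gpu_id == gpu_id)
         return &dev_table[i];
   }
   return NULL;
}

/* Callers ask about a capability instead of comparing gpu_id values. */
static bool
use_cp_reg_write(const struct dev_info *info)
{
   assert(info != NULL);
   return info->has_cp_reg_write;
}
```

Adding a new GPU then means adding one table entry rather than auditing every `==` or `>=` comparison in the drivers.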
-
Rob Clark authored
Split out from earlier patch to reduce churn. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Rob Clark authored
Split out from previous patch to reduce churn. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Rob Clark authored
This way we can make the tables const. At the same time, for a6xx, this introduces a "sub-generation template" to reduce the copy/paste for parameters which are keyed to the sub-generation. It also explicitly lists every supported GPU, to get rid of duplicate lists of supported GPUs between the device-info and drivers. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Rob Clark authored
Everywhere else symbols/types/etc are shortened to "fd_*", so let's do the same here for consistency. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Jonathan Marek authored
Replace it with a calculation which works for all current GPUs. Duplicated the calculation in both drivers because freedreno_dev_info isn't meant for derived parameters (and drivers might want to just calculate on the fly instead). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Jonathan Marek authored
The larger page alignment is directly related to the 96 tile alignment, so check for that instead of a specific gpu id. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Jonathan Marek authored
  - It hurts users with newer firmware who don't need the workaround
  - Kernel now rejects older firmware due to security issues, so this will prevent users from using older firmware anyway.
  - Only whitelisting 650 enables the workaround by default for any new GPUs
Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11790>
-
Kenneth Graunke authored
This avoids having to call out through the PLT just to lock/unlock. Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <mesa/mesa!11858>
-
Icecream95 authored
Valgrind still complains about uninitialised values, but tests don't flake anymore. Fixes flakes in dEQP-GLES3.functional.fragment_out.* Part-of: <mesa/mesa!11862>
-
Dave Airlie authored
anything you can do I can do better^W^Wadapt for crocus Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <mesa/mesa!11859>
-
- Jul 13, 2021
-
-
Timur Kristóf authored
The result is about +5-ish fps in Doom Eternal. It turns out that the location of position exports matters more than we thought, and it's actually better to keep them at the bottom for culling shaders rather than schedule them up to the top. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!10525>
-
Timur Kristóf authored
Uniforms have the same value in all invocations, therefore they can safely be reused by invocations even after repacking. This saves several instructions from culling shaders, mainly UBO loads and such. We exclude uniform floats, because those would harm the VGPR usage of the shaders too much.
Fossil DB results on Sienna Cichlid (with NGG culling on):
Totals from 55379 (43.05% of 128647) affected shaders:
VGPRs: 1926472 -> 1925360 (-0.06%); split: -0.07%, +0.01%
SpillSGPRs: 139 -> 330 (+137.41%)
CodeSize: 159472988 -> 157462856 (-1.26%); split: -1.27%, +0.00%
MaxWaves: 1571492 -> 1571412 (-0.01%)
Instrs: 30665685 -> 30302076 (-1.19%); split: -1.21%, +0.02%
Latency: 127385148 -> 126723891 (-0.52%); split: -0.55%, +0.03%
InvThroughput: 21096298 -> 20773069 (-1.53%); split: -1.53%, +0.00%
VClause: 514792 -> 511231 (-0.69%); split: -0.83%, +0.13%
SClause: 713959 -> 679556 (-4.82%); split: -4.84%, +0.02%
Copies: 2975106 -> 2828185 (-4.94%); split: -5.39%, +0.45%
Branches: 1201921 -> 1152766 (-4.09%)
PreSGPRs: 1753786 -> 1892848 (+7.93%); split: -0.00%, +7.93%
PreVGPRs: 1590522 -> 1583574 (-0.44%); split: -0.44%, +0.00%
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!10525>
-
Timur Kristóf authored
These will be useful for some optimizations. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!10525>
-