- Jun 08, 2022
-
-
Dylan Baker authored
-
Dylan Baker authored
-
- May 09, 2022
-
-
Matt Turner authored
Cuts 119 KiB from iris_dri.so and libvulkan_intel.so.

    text    data     bss      dec    hex filename
  917511       0       0   917511  e0007 meson-generated_.._intel_perf_metrics.c.o (before)
  796986       0       0   796986  c293a meson-generated_.._intel_perf_metrics.c.o (after)

    text    data     bss      dec    hex filename
14130948  365708  210004 14706660 e067e4 iris_dri.so (before)
14009332  365708  210004 14585044 de8cd4 iris_dri.so (after)

    text    data     bss      dec    hex filename
 8124225  214264   22820  8361309 7f955d libvulkan_intel.so (before)
 8002609  214264   22820  8239693 7dba4d libvulkan_intel.so (after)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit 8860ff33) Part-of: <mesa/mesa!16405>
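The headline figure can be re-derived from the dec column of the size output. A minimal sketch, where the helper name saved_kib is only illustrative:

```python
def saved_kib(dec_before, dec_after):
    """Return the size reduction in KiB given two 'dec' totals in bytes."""
    return (dec_before - dec_after) / 1024


# 'dec' totals copied from the size output in this commit message.
iris_saved = saved_kib(14706660, 14585044)
vulkan_saved = saved_kib(8361309, 8239693)

print(f"iris_dri.so:        {iris_saved:.1f} KiB")
print(f"libvulkan_intel.so: {vulkan_saved:.1f} KiB")
```

Both libraries shrink by the same number of bytes, which rounds to the 119 KiB quoted above.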
-
Matt Turner authored
Along with fixing the grammar, this allows it to be deduplicated since the properly worded description exists in later generations' XMLs. Cuts 96 B from iris_dri.so and libvulkan_intel.so.

    text    data     bss      dec    hex filename
  917613       0       0   917613  e006d meson-generated_.._intel_perf_metrics.c.o (before)
  917511       0       0   917511  e0007 meson-generated_.._intel_perf_metrics.c.o (after)

    text    data     bss      dec    hex filename
14131044  365708  210004 14706756 e06844 iris_dri.so (before)
14130948  365708  210004 14706660 e067e4 iris_dri.so (after)

    text    data     bss      dec    hex filename
 8124321  214264   22820  8361405 7f95bd libvulkan_intel.so (before)
 8124225  214264   22820  8361309 7f955d libvulkan_intel.so (after)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit d80d3c67) Part-of: <mesa/mesa!16405>
-
Matt Turner authored
Reduces their sizes from 4 bytes to 1. Cuts 6 KiB from iris_dri.so and libvulkan_intel.so.

    text    data     bss      dec    hex filename
  924401       0       0   924401  e1af1 meson-generated_.._intel_perf_metrics.c.o (before)
  917613       0       0   917613  e006d meson-generated_.._intel_perf_metrics.c.o (after)

    text    data     bss      dec    hex filename
14137732  365708  210004 14713444 e08264 iris_dri.so (before)
14131044  365708  210004 14706756 e06844 iris_dri.so (after)

    text    data     bss      dec    hex filename
 8131009  214264   22820  8368093 7fafdd libvulkan_intel.so (before)
 8124321  214264   22820  8361405 7f95bd libvulkan_intel.so (after)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit 7024b8e0) Part-of: <mesa/mesa!16405>
-
Matt Turner authored
The compiler does a good job of deduplicating strings already, but we can eliminate the pointers to each string by combining the strings into a single char array and storing only an index into that array.

The longest of the char arrays is the descriptions array, which is a little over 45 KiB, so still under MSVC's 64 KiB string literal limit [0]. Because the string length is under 64 KiB we can use uint16_t as the index type, which roughly doubles our savings as compared to an int.

This cuts 77 KiB from iris_dri.so (0.5%) and libvulkan_intel.so (0.9%).

    text    data     bss      dec    hex filename
  926811   25920       0   952731  e899b meson-generated_.._intel_perf_metrics.c.o (before)
  924401       0       0   924401  e1af1 meson-generated_.._intel_perf_metrics.c.o (after)

    text    data     bss      dec    hex filename
14190852  391628  210004 14792484 e1b724 iris_dri.so (before)
14137732  365708  210004 14713444 e08264 iris_dri.so (after)

    text    data     bss      dec    hex filename
 8184097  240184   22820  8447101 80e47d libvulkan_intel.so (before)
 8131009  214264   22820  8368093 7fafdd libvulkan_intel.so (after)

relinfo:
iris_dri.so (before): 17765 relocations, 17545 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
iris_dri.so (after) : 15605 relocations, 15385 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
libvulkan_intel.so (before): 10720 relocations, 6989 relative (65%), 355 PLT entries, 1 for local syms (0%), 0 users
libvulkan_intel.so (after) : 8560 relocations, 4829 relative (56%), 355 PLT entries, 1 for local syms (0%), 0 users

[0] https://docs.microsoft.com/en-us/cpp/cpp/string-and-character-literals-cpp?view=msvc-170&viewFallbackFrom=vs-2019

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit 6c0246dc) Part-of: <!16405>
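The string-table idea can be sketched in a few lines of Python. This is only an illustration of the technique, not the generator's actual code: build_string_table, lookup, and the sample names are hypothetical. The "H" array type stands in for uint16_t and raises OverflowError if an offset would not fit in 16 bits, mirroring the 64 KiB constraint described above.

```python
from array import array


def build_string_table(strings):
    """Pack strings into one NUL-separated blob and return (blob, offsets).

    Each distinct string is stored once; every input string is
    represented by the 16-bit offset of its first occurrence.
    """
    blob = bytearray()
    first_offset = {}          # string -> offset where it starts in blob
    offsets = array("H")       # unsigned 16-bit entries, like uint16_t
    for s in strings:
        if s not in first_offset:
            first_offset[s] = len(blob)
            blob += s.encode("utf-8") + b"\0"
        offsets.append(first_offset[s])
    return bytes(blob), offsets


def lookup(blob, offset):
    """Read the NUL-terminated string starting at the given offset."""
    end = blob.index(b"\0", offset)
    return blob[offset:end].decode("utf-8")


# Sample data for illustration only.
names = ["GPU Time Elapsed", "GPU Core Clocks", "GPU Time Elapsed"]
blob, offsets = build_string_table(names)
print(list(offsets), [lookup(blob, o) for o in offsets])
```

Storing a 2-byte offset instead of a pointer also removes the per-string relocation, which is consistent with the drop in relocation counts reported by relinfo.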
-
Matt Turner authored
intel_perf_query_counter contains fields for things we can't or don't want to store in our static data (like runtime-determined max values) or oa_read_counter function pointers which are dependent on the GPU gen and would make deduplication very ineffective.

Cuts 16 KiB from iris_dri.so and libvulkan_intel.so.

    text    data     bss      dec    hex filename
  926811   43200       0   970011  ecd1b meson-generated_.._intel_perf_metrics.c.o (before)
  926811   25920       0   952731  e899b meson-generated_.._intel_perf_metrics.c.o (after)

    text    data     bss      dec    hex filename
14190852  408908  210004 14809764 e1faa4 iris_dri.so (before)
14190852  391628  210004 14792484 e1b724 iris_dri.so (after)

    text    data     bss      dec    hex filename
 8184097  257464   22820  8464381 8127fd libvulkan_intel.so (before)
 8184097  240184   22820  8447101 80e47d libvulkan_intel.so (after)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit df5e743c) Part-of: <!16405>
-
Matt Turner authored
And specifically mark it with ATTRIBUTE_NOINLINE. Otherwise it will be inlined and actually slightly increase code size.

Cuts 505 KiB from iris_dri.so and libvulkan_intel.so.

    text    data     bss      dec    hex filename
 1538720       0       0  1538720 177aa0 meson-generated_.._intel_perf_metrics.c.o (before)
  926811   43200       0   970011  ecd1b meson-generated_.._intel_perf_metrics.c.o (after)

    text    data     bss      dec    hex filename
14751700  365708  210004 15327412 e9e0b4 iris_dri.so (before)
14190852  408908  210004 14809764 e1faa4 iris_dri.so (after)

    text    data     bss      dec    hex filename
 8744913  214264   22820  8981997 890ded libvulkan_intel.so (before)
 8184097  257464   22820  8464381 8127fd libvulkan_intel.so (after)

Relocations increase because the counter initializations are moved from code (in .text) to pointers (in .text) to .rodata, which require relocations.

relinfo:
iris_dri.so (before): 15605 relocations, 15385 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
iris_dri.so (after) : 17765 relocations, 17545 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
libvulkan_intel.so (before): 8560 relocations, 4829 relative (56%), 355 PLT entries, 1 for local syms (0%), 0 users
libvulkan_intel.so (after) : 10720 relocations, 6989 relative (65%), 355 PLT entries, 1 for local syms (0%), 0 users

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit bbbbb032) Part-of: <!16405>
-
Matt Turner authored
No changes in resulting code (yes, seriously!). GCC constant propagates the static const arrays into the code, yielding bit for bit identical results. This will however enable further cleanups. Before this patch, we emit 11916 different initializations of intel_perf_query_counter. With this patch we emit an array of 539 and initialize the intel_perf_query_counters in terms of those. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit 5e6c7a57) Part-of: <!16405>
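The deduplication step described here (many counter initializations expressed in terms of a much smaller table of unique entries) might look roughly like the following Python sketch. The function name and the sample tuples are hypothetical and do not reflect the real generator or the real counter fields.

```python
def deduplicate(counters):
    """Split a list of counters into (unique_table, indices).

    unique_table holds each distinct counter once, in first-seen order;
    indices holds, for every original counter, its position in the table.
    """
    table = []
    index_of = {}
    indices = []
    for counter in counters:
        if counter not in index_of:
            index_of[counter] = len(table)
            table.append(counter)
        indices.append(index_of[counter])
    return table, indices


# Hypothetical (name, description, units) tuples for illustration.
counters = [
    ("GpuTime", "Time elapsed on the GPU", "ns"),
    ("GpuCoreClocks", "GPU core clock cycles", "cycles"),
    ("GpuTime", "Time elapsed on the GPU", "ns"),
]
table, indices = deduplicate(counters)
print(len(counters), "initializations ->", len(table), "unique entries")
```

Each original counter can then be reconstructed as table[i], so the generated code only needs the unique table plus one small index per use.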
-
Matt Turner authored
Just an annoyance I noticed when I needed to generate the description string in two different places. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit 3172b5bb) Part-of: <!16405>
-
Now my editor can help me format code as I type. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit 12e065dd) Part-of: <!16405>
-
This cuts the compile time for this file on my Ryzen from real 1m4.077s to real 0m30.827s. Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit acc2d08c) Part-of: <!16405>
-
- Mar 19, 2022
-
-
Stops gamescope from recompiling pipelines on every start. Cc: mesa-stable Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <!15454> (cherry picked from commit 4f6c7a60)
-
- Mar 18, 2022
-
-
this is a harmless case, but it's still wrong cc: mesa-stable Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <!15429> (cherry picked from commit 8294d454)
-
this fixes desync+crash when:
1. usage is added for bs A
2. tracking is added for bs B
3. tracking is removed for bs B
4. context is destroyed
5. usage A is now dangling and will crash if accessed

as seen in glmark2

cc: mesa-stable Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <!15429> (cherry picked from commit 7da211e2)
-
This fixes the test_resolve_non_issued_query_data vkd3d-proton test. This change is required on TGL+ (maybe ICL?) because on all platforms 3D pipeline writes are not coherent with CS. On previous platforms we fixed this by flushing the render cache to make sure data is visible to CS before it writes to memory. But on more recent platforms, flushing the render cache leaves the data in the tile cache, which is still not coherent with CS, so we need to flush that one too. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <!14552> (cherry picked from commit 8b71118a)
-
We've run into issues before where PIPE_CONTROL races MI_STORE_* commands. So make sure we emit the availability using the same type of CS so that memory writes are properly ordered. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <!14552> (cherry picked from commit 4e30da78)
-
this needs to be re-set any time the cmdbuf changes cc: mesa-stable Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <!15397> (cherry picked from commit efa72413)
-
Most of the time, this doesn't matter. On the versions with _sat, if the destination type is incorrect, the clamping will not happen correctly.

Fixes the following CTS tests:

dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_ss_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_su_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_us_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_uu_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_ss_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_su_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_us_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_uu_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_ss_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_su_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_us_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_uu_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_ss_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_su_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_us_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_uu_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_ss_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_su_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_us_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_uu_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_ss_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_su_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_us_v4i8_out32
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_uu_v4i8_out32

v2: Update anv-tgl-fails.txt.

Reviewed-by: Ivan Briano <ivan.briano@intel.com> Fixes: 0f809dbf ("intel/compiler: Basic support for DP4A instruction") Part-of: <!15417> (cherry picked from commit 19330eeb)
-
anv_batch_bo has a length field that we use to flush cachelines. Not having that field initialized properly leads us to access out-of-bounds memory. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <!15425> (cherry picked from commit d68b9f0e)
-
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 83fee30e ("anv: allow multiple command buffers in anv_queue_submit") Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <!15425> (cherry picked from commit 78acae38)
-
If you hadn't already called wsi_GetPhysicalDeviceDisplayProperties2KHR or wsi_GetDrmDisplayEXT before calling GetPhysicalDeviceDisplayPlaneProperties2KHR, then the connectors list wouldn't be populated and you'd get no plane properties. Fixes failure of dEQP-VK.wsi.display.get_display_plane_capabilities when run on its own. Fixes: #4575 Cc: mesa-stable Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <mesa/mesa!15353> (cherry picked from commit da834a12)
-
these all need to check for z coord oob to avoid crashing cc: mesa-stable Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <!15388> (cherry picked from commit 6345575f)
-
these become owned and freed by llvmpipe, so ensure that freeing them there won't cause crashes cc: mesa-stable Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <!15281> (cherry picked from commit 2f9976de)
-
Eric Engestrom authored
This reverts commit a27f2d99. As of a0829cf2 ("GL: drop symbols mangling support"), this extra complexity isn't needed anymore. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Part-of: <!2298> (cherry picked from commit 5dbbc0f0)
-
The readthedocs theme now supports Sphinx 4.x, so there's no longer any reason to stick with 3.x. This reverts commit a545b6ed. Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com> Part-of: <!15212> (cherry picked from commit dd9b8881)
-
Handle arrays generically by using the last component of the coordinate source as the array index. That works for both 2D arrays and cube arrays, fixing cube arrays. Cube arrays were already handled correctly in core Panfrost code. This code path is not tested in dEQP-GLES31 without exposing OES_cube_map_array, which depends on OES_geometry_shader, which we don't have. Yet we do expose PIPE_CAP_CUBE_ARRAY, so ARB_cube_map_array is exposed. Disabling PIPE_CAP_CUBE_ARRAY would be an easy band-aid fix, but it's easy enough to handle correctly. dEQP-GLES31 passes with a hack enabling OES_cube_map_array [without geometry shaders]. Also fixes 1D arrays on Bifrost for the same reasons. Fixes: 70d6c567 ("pan/bi: Emit TEXC with builder") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!15254> (cherry picked from commit 53f1e57e)
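The "last component is the array index" rule can be shown with a toy Python helper. This is a sketch of the idea only, not the compiler's code; split_array_index is a hypothetical name and the coordinates are plain tuples rather than NIR sources.

```python
def split_array_index(coords, is_array):
    """Return (spatial_coords, array_index) for a texture coordinate tuple.

    For array textures the array index is simply the last component,
    regardless of how many spatial components precede it.
    """
    if is_array:
        return coords[:-1], coords[-1]
    return coords, None


# 1D array: (s, layer); 2D array: (s, t, layer); cube array: (x, y, z, layer)
print(split_array_index((0.5, 3), True))
print(split_array_index((0.5, 0.25, 3), True))
print(split_array_index((1.0, 0.0, 0.0, 3), True))
```

Because the rule does not depend on the number of spatial components, the same code path covers 1D arrays, 2D arrays, and cube arrays.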
-
Hardware support was removed with Midgard. Use mesa/st to emulate GL_CLAMP with nir_lower_tex automatically (the Zink lowering), and disable GL_MIRROR_CLAMP which isn't lowered correctly. Fixes *texwrap* Piglit tests on G52. Fixes: f9ceab7b ("panfrost: Fix CLAMP wrap mode") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!15253> (cherry picked from commit 1f97819f)
-
Eric Engestrom authored
-
Fix msvc build regression after 0536b691 reported by Prodea Alexandru-Liviu. Closes: #6137 Fixes: 0536b691 ("util: fix build with clang 10 on mips64") Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <!15355> (cherry picked from commit e50eb1ce)
-
The Vulkan spec for VK_KHR_depth_stencil_resolve allows a format mismatch between the primary attachment and the resolve attachment within certain limits. In particular,

VUID-VkSubpassDescriptionDepthStencilResolve-pDepthStencilResolveAttachment-03181
    If pDepthStencilResolveAttachment is not NULL and does not have the value VK_ATTACHMENT_UNUSED and VkFormat of pDepthStencilResolveAttachment has a depth component, then the VkFormat of pDepthStencilAttachment must have a depth component with the same number of bits and numerical type

VUID-VkSubpassDescriptionDepthStencilResolve-pDepthStencilResolveAttachment-03182
    If pDepthStencilResolveAttachment is not NULL and does not have the value VK_ATTACHMENT_UNUSED, and VkFormat of pDepthStencilResolveAttachment has a stencil component, then the VkFormat of pDepthStencilAttachment must have a stencil component with the same number of bits and numerical type

So you can resolve from a depth/stencil format to a depth-only or stencil-only format so long as the number of bits matches. Unfortunately, this has never been tested because the CTS tests which purport to test this are broken and actually test with a destination combined depth/stencil format.

Fixes: 5e4f9ea3 ("anv: Implement VK_KHR_depth_stencil_resolve") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <!15333> (cherry picked from commit d65dbe80)
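The two rules quoted above can be restated as a small check. The Python below is a simplified model for illustration, not driver code: the Aspect and Format types and the function name are hypothetical, and each format is described only by the bit count and numerical type of its depth and stencil components.

```python
from typing import NamedTuple, Optional


class Aspect(NamedTuple):
    bits: int
    numeric_type: str  # e.g. "UNORM", "SFLOAT", "UINT"


class Format(NamedTuple):
    depth: Optional[Aspect]
    stencil: Optional[Aspect]


# Simplified descriptions of a few Vulkan depth/stencil formats.
D32_SFLOAT_S8_UINT = Format(Aspect(32, "SFLOAT"), Aspect(8, "UINT"))
D32_SFLOAT = Format(Aspect(32, "SFLOAT"), None)
D16_UNORM = Format(Aspect(16, "UNORM"), None)
S8_UINT = Format(None, Aspect(8, "UINT"))


def resolve_formats_compatible(attachment, resolve):
    """Check the depth (03181) and stencil (03182) rules quoted above.

    Every component present in the resolve format must also exist in the
    attachment format with the same bit count and numerical type.
    """
    if resolve.depth is not None and attachment.depth != resolve.depth:
        return False
    if resolve.stencil is not None and attachment.stencil != resolve.stencil:
        return False
    return True


print(resolve_formats_compatible(D32_SFLOAT_S8_UINT, D32_SFLOAT))
print(resolve_formats_compatible(D32_SFLOAT_S8_UINT, D16_UNORM))
```

Under this model, resolving a combined depth/stencil attachment into a depth-only or stencil-only image is allowed whenever the shared component matches, which is the case the commit message says was previously untested.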
-
This causes inconsistencies between sctx->framebuffer.state and other sctx->framebuffer properties (like compressed_cb_mask). The point of this code was to fix an issue with vi_separate_dcc_stop_query, which was removed by 804e2924, so we can safely drop it. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Closes: #6099 Cc: mesa-stable Part-of: <!15261> (cherry picked from commit 968d6812)
-
cc: mesa-stable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <!15280> (cherry picked from commit 5ab0e3f0)
-
Fixes: b15bfe92 ("anv: implement VK_EXT_color_write_enable") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <!15280> (cherry picked from commit 1e3e7b3a)
-
in the initial implementation, a stream like:

* CmdBeginTransformFeedbackEXT
* CmdSetRasterizerDiscardEnableEXT
* CmdDraw
* CmdEndTransformFeedbackEXT
* CmdBeginTransformFeedbackEXT
* CmdDraw
* CmdEndTransformFeedbackEXT

would never enable transform feedback, as it only checked for the change in rasterizer_discard state

Fixes: 4d531c67 ("anv: support rasterizer discard dynamic state") Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <!15269> (cherry picked from commit 52f69784)
-
This essentially ports

    64405230
    Author: Keith Packard <keithp@keithp.com>
    Date: Fri Aug 6 16:11:18 2021 -0700

    iris: Map scanout buffers WC instead of WB [v2]

to crocus.

Fixes: f3630548 ("crocus: initial gallium driver for Intel gfx 4-7") Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Part-of: <!15313> (cherry picked from commit e8c3be0e)
-
For genuine early depth tests, the samplecount must be updated after the depth test but before the samplemask is applied. For inferred-early or regular depth tests, the samplemask can be applied before the depth test. Fixes: d9276ae9 ("llvmpipe: handle gl_SampleMask writing.") fixes: dEQP-VK.fragment_operations.early_fragment.sample_count_early_fragment_tests_depth_samples_4 Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <mesa/mesa!15319> (cherry picked from commit 42e78ba1)
-
Eric Engestrom authored
-