- Mar 11, 2025
-
-
Alyssa Rosenzweig authored
fixes supertuxkart without the old hack. Signed-off-by:
Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Alyssa Rosenzweig authored
Signed-off-by:
Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Alyssa Rosenzweig authored
would have exposed the bug fixed. Signed-off-by:
Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Alyssa Rosenzweig authored
Fixes regression in Deus Ex: Human Revolution (DX11) via DXVK reported by James Calligeros. Pending CTS coverage: https://gitlab.khronos.org/Tracker/vk-gl-cts/-/issues/5640 Only the alignment check here is load bearing but I clarified the logic while at it. Signed-off-by:
Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Alyssa Rosenzweig authored
Signed-off-by:
Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Alyssa Rosenzweig authored
oops. - .partial = flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT, Signed-off-by:
Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Alyssa Rosenzweig authored
works on current kernels. Signed-off-by:
Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Alyssa Rosenzweig authored
Signed-off-by:
Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
An even bigger pile of tests... This adds general miptree tests for some compressed formats, and even more comprehensive miptree and tile size tests for more formats and size combinations. Since the tests are based on observing tiling done by the Metal driver, we don't know the actual tile size, but rather we can just identify which tile sizes logically have the same result (since several sizes can be equivalent). This is encoded as a bit mask, split into two halves to handle the case of stride padding for compressed textures (which tile sizes are valid with and without stride padding can vary, and sometimes you can either change the tile size or add padding and end up with the same result). As long as whatever configuration the layout code comes up with has its corresponding bit set in the bit mask, the tiling should be correct. Signed-off-by:
Asahi Lina <lina@asahilina.net>
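The acceptance rule described above can be sketched as a single bit test. This is only an illustration: the 16/16 split of a 32-bit mask, the meaning of `tile_index`, and the name `layout_matches_mask` are assumptions made here, not the layout used by the actual test tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, for illustration only: the low 16 bits describe tile-size
 * candidates without stride padding, the high 16 bits describe the same
 * candidates with stride padding. */
#define TILE_MASK_HALF_BITS 16u

/* Returns true if the configuration chosen by the layout code (a tile-size
 * index plus whether stride padding was applied) has its bit set in the mask
 * of configurations observed to give the same result. */
static bool
layout_matches_mask(uint32_t valid_mask, unsigned tile_index, bool stride_padded)
{
   assert(tile_index < TILE_MASK_HALF_BITS);

   unsigned bit = tile_index + (stride_padded ? TILE_MASK_HALF_BITS : 0u);
   return ((valid_mask >> bit) & 1u) != 0;
}
```

With a mask of this shape, several equivalent tile sizes simply set several bits, and a test passes as long as the layout code lands on any one of them.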
-
Alyssa Rosenzweig authored
Signed-off-by:
Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Alyssa Rosenzweig authored
Signed-off-by:
Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Alyssa Rosenzweig authored
Signed-off-by:
Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
ppMaxPrimitiveCounts also requires the same nested dynamic array special treatment. Fixes: 6bac77b7 ("venus: sync protocol for ray tracing support") Signed-off-by:
Yiwei Zhang <zzyiwei@chromium.org> Part-of: <!33995>
-
make a separate sync mainly to isolate the next fix Signed-off-by:
Yiwei Zhang <zzyiwei@chromium.org> Part-of: <!33995>
-
If a valid primary file descriptor is already set (e.g. from vc4), don't overwrite it with -1. This prevents losing a valid primary fd and resolves issues arising when vc4 is the first node returned by `drmGetDevices2()` and v3d is the second. Closes: #12777 Fixes: 188f1c6c ("v3dv: rewrite device identification") Signed-off-by:
Maíra Canal <mcanal@igalia.com> Reviewed-by:
Iago Toral Quiroga <itoral@igalia.com> Part-of: <!33958>
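A minimal sketch of the guard described above, assuming the device enumeration loop reports one candidate fd per node. The helper name `record_primary_fd` is hypothetical and the real v3dv code is structured differently.

```c
#include <assert.h>

/* Hypothetical helper: found_fd is the primary fd discovered while visiting
 * one DRM node, or -1 if that node did not provide one. */
static void
record_primary_fd(int *primary_fd, int found_fd)
{
   assert(primary_fd);

   /* Only a valid fd may replace the stored value, so a later node that
    * yields -1 cannot clobber an fd found on an earlier node. */
   if (found_fd >= 0)
      *primary_fd = found_fd;
}
```

Under this sketch, visiting vc4 first and v3d second leaves the fd obtained from vc4 in place.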
-
If the kernel supports modifiers and the GPU is a Turing+ then force using zink instead of nvc0. Signed-off-by:
Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by:
Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by:
Karol Herbst <kherbst@redhat.com> Acked-by:
Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by:
Mel Henning <mhenning@darkrefraction.com> Part-of: <!29232>
-
With secondary command buffers, inherited rendering can be used but it's basically impossible to know if the depth/stencil attachment enabled HiZ/HiS. But it's required to disable WALK_ALIGN8 to avoid GPU hangs. This assumes that HiZ/HiS is enabled for inherited rendering as long as a depth/stencil attachment is used. It's not the most optimal approach but it's not supposed to hurt either. This fixes a GPU hang with dEQP-VK.dynamic_rendering.primary_cmd_buff.basic.contents_secondary_cmdbuffers and friends. GFX1200 isn't affected because it doesn't support HiZ/HiS. Cc: mesa-stable Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <!33986>
-
We currently report a deviceName as e.g. "Mali-G610 (Panfrost)", but panfrost has nothing to do with the physical device, and the suffix doesn't belong there at all. So let's remove that suffix from PanVK. This results in output like this from vulkaninfo:
---8<---
VkPhysicalDeviceProperties:
---------------------------
apiVersion = 1.1.305 (4198705)
driverVersion = 25.0.99 (104857699)
vendorID = 0x13b5
deviceID = 0xa8670000
deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
deviceName = Mali-G610
pipelineCacheUUID = <snip>
---8<---
We already sort of namedrop Panfrost in the driver properties:
---8<---
VkPhysicalDeviceDriverPropertiesKHR:
------------------------------------
driverID = DRIVER_ID_MESA_PANVK
driverName = panvk
driverInfo = Mesa 25.1.0-devel (git-136dd9f9)
conformanceVersion:
major = 1
minor = 4
subminor = 1
patch = 2
---8<---
While this might technically speaking be a regression, PanVK has been marked as experimental until Mesa 25.0. But to reduce the risk of people starting to depend on this behavior, let's also backport this change to the 25.0 release. The patch looks a bit funny, because we add the " (Panfrost)"-suffix in common code, and this moves it to the Gallium driver. But effectively, this means PanVK is the only driver that sees a change of behavior. Backport-to: 25.0 Reviewed-by:
John Anthony <john.anthony@arm.com> Reviewed-by:
Boris Brezillon <boris.brezillon@collabora.com> Part-of: <!33972>
-
Signed-off-by:
Daniel Stone <daniels@collabora.com> Part-of: <!33431>
-
These should make G610 properly stable now. Signed-off-by:
Daniel Stone <daniels@collabora.com> Part-of: <!33431>
-
We have a common implementation for this, let's just use that. Similar to the previous commit, this is a bit silly. But if we ever get in a situation where VK_EXT_display actually makes sense, this stuff should "just work", so let's enable it for good measure. Tested-by:
Alexandre ARNOUD <aarnoud@me.com> Acked-by:
Daniel Stone <daniels@collabora.com> Part-of: <!33916>
-
It seems the common WSI code does all that's really needed here for us already. Enabling this lets me run vkmark on PanVK. This is a bit silly, because what actually happens here is that we end up passing -1 as the display_fd to wsi_device_init(). This in turn leads us to returning zero usable displays, which renders the extension somewhat useless. But it is better than not supporting the extension and having applications with a hard dependency on it fail, as is the case with vkmark. Tested-by:
Alexandre ARNOUD <aarnoud@me.com> Acked-by:
Daniel Stone <daniels@collabora.com> Part-of: <!33916>
-
We're currently exposing a bunch of extensions that require Vulkan 1.1, and we'll soon enough do the same for Vulkan 1.2. Instead of having to update each of these extensions separately once we add new Vulkan version support for some gens, let's use a single variable for this. And while we *could* query the exposed Vulkan version and do this a bit more "automatically", that would make it easy to leave some needless checks behind if the baseline version changes. Leaving this as an arch check in this function should make it a bit more obvious when the check can be removed. Reviewed-by:
Boris Brezillon <boris.brezillon@collabora.com> Part-of: <!33971>
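The single-variable approach can be sketched as follows. The struct, the placeholder extension fields, and the `arch >= 10` threshold are assumptions for illustration; only the idea of one arch check feeding every Vulkan 1.1 dependent extension comes from the message above.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down extension table. */
struct ext_table {
   bool KHR_shader_subgroup_extended_types;
   bool KHR_example_needing_vk1_1; /* placeholder, not a real extension */
   bool KHR_example_baseline;      /* placeholder, not a real extension */
};

static struct ext_table
get_extensions(unsigned arch)
{
   /* Assumption: Vulkan 1.1 is exposed only from this arch onwards. When the
    * baseline changes, this is the single line to update or remove. */
   const bool has_vk1_1 = arch >= 10;

   return (struct ext_table){
      .KHR_shader_subgroup_extended_types = has_vk1_1,
      .KHR_example_needing_vk1_1 = has_vk1_1,
      .KHR_example_baseline = true,
   };
}
```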
-
This extension requires Vulkan 1.1, which we don't yet expose on Bifrost GPUs. Fixes: a9592a0c ("panvk: enable subgroupExtendedTypes") Reviewed-by:
Boris Brezillon <boris.brezillon@collabora.com> Part-of: <!33971>
-
Foz-DB Navi21: Totals from 3 (0.00% of 79789) affected shaders: Instrs: 1708 -> 1722 (+0.82%) CodeSize: 9416 -> 9460 (+0.47%) Latency: 12094 -> 12371 (+2.29%); split: -0.02%, +2.31% InvThroughput: 1967 -> 1992 (+1.27%) Copies: 105 -> 106 (+0.95%) PreVGPRs: 131 -> 132 (+0.76%) VALU: 1155 -> 1169 (+1.21%) Reviewed-by:
Rhys Perry <pendingchaos02@gmail.com> Part-of: <!33974>
-
Reviewed-by:
Marek Olšák <marek.olsak@amd.com> Part-of: <!33207>
-
Without these it's impossible to know which application generated the event. Reviewed-by:
Marek Olšák <marek.olsak@amd.com> Part-of: <!33207>
-
Reviewed-by:
Marek Olšák <marek.olsak@amd.com> Part-of: <!33207>
-
u_trace_perfetto_active uses an atomic read, so avoid calling it more often than needed in the hot path. Reviewed-by:
Marek Olšák <marek.olsak@amd.com> Part-of: <!33207>
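One common way to keep an atomic read out of a hot loop is to sample it once and reuse the result. The sketch below uses C11 atomics and made-up names (`tracing_active`, `is_tracing_active`, `process_events`) as stand-ins. It is not the u_trace API, and the actual change may restructure the code differently.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int tracing_active; /* hypothetical stand-in for the flag */
static unsigned flag_reads;       /* counts atomic loads, for demonstration */

static bool
is_tracing_active(void)
{
   flag_reads++;
   return atomic_load_explicit(&tracing_active, memory_order_relaxed) != 0;
}

/* Handles n events and returns how many trace records were emitted. The flag
 * is read once up front instead of once per event. */
static unsigned
process_events(unsigned n)
{
   const bool active = is_tracing_active();
   unsigned emitted = 0;

   for (unsigned i = 0; i < n; i++) {
      if (active)
         emitted++;
   }

   return emitted;
}
```

The trade-off is that a change to the flag in the middle of the loop is only noticed on the next call.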
-
num_components should be 1 as we're loading an offset value. Fixes: ec68f049 ("st/mesa: switch GL_SELECT shader to IO intrinsics") Closes: #12774 Reviewed-by:
Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by:
Marek Olšák <marek.olsak@amd.com> Part-of: <!33982>
-
Signed-off-by:
Valentine Burley <valentine.burley@collabora.com> Part-of: <!33965>
-
Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <!33970>
-
Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <!33970>
-
Kenneth Graunke authored
This was only needed on Sandybridge. We can delete the brw code, and replace the generic devinfo bit with a helper inside the elk compiler itself. Thanks to Iván Briano for noticing we still had dead brw code for this. Reviewed-by:
Ivan Briano <ivan.briano@intel.com> Part-of: <!33764>
-
Kenneth Graunke authored
We were not using the minimum values from devinfo for anything. For tessellation control, the minimum value is 0, so we continue taking MAX2 of that with 1 when tessellation is enabled so we have at least something guaranteed to be present. For geometry, the minimum value is already non-zero (and updated by the previous patch). This will have the side-effect of raising the minimum number of URB entries for geometry stages. This is currently not known to fix anything, but should be more closely following the documentation. Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <!33764>
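The clamping rule described above can be sketched like this. The enum, the `devinfo_min` array, and the function name are illustrative stand-ins rather than the driver's actual types.

```c
#include <stdbool.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

enum stage { STAGE_VS, STAGE_TCS, STAGE_TES, STAGE_GS, STAGE_COUNT };

/* devinfo_min stands in for the per-stage minimum URB entry counts taken from
 * device info. */
static unsigned
min_urb_entries(const unsigned devinfo_min[STAGE_COUNT], enum stage stage,
                bool tess_enabled)
{
   unsigned min = devinfo_min[stage];

   /* The documented tessellation control minimum is 0, so clamp to 1 when
    * tessellation is enabled to guarantee at least one entry. */
   if (stage == STAGE_TCS && tess_enabled)
      min = MAX2(min, 1);

   return min;
}
```

For geometry, the device info value is passed through unchanged, since it is already non-zero.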
-
Kenneth Graunke authored
We've been programming our minimum number of URB entries for geometry shaders to 2, but it appears that we should have been setting 8 on Broadwell and later. Additionally, there's a workaround on Skylake and later that requires us to add flushing (which we haven't) or use a minimum of 16 URB entries. This alone will not fix anything, as nothing reads this devinfo field presently (will be fixed in the next commit). Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <!33764>
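As a rough sketch of the per-generation minimums discussed above, with Broadwell as Gfx8 and Skylake as Gfx9. The function name is hypothetical, and keeping 2 for older platforms is an assumption, since the message only states what was programmed before.

```c
/* ver is the graphics generation: 8 = Broadwell, 9 = Skylake. */
static unsigned
gs_min_urb_entries(unsigned ver)
{
   if (ver >= 9)
      return 16; /* workaround alternative to adding flushing */
   if (ver >= 8)
      return 8;
   return 2;     /* assumption: older platforms keep the previous value */
}
```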
-
Kenneth Graunke authored
As we added new platforms, the device info macros evolved over time. Most platforms had a "FEATURES" macro, some had a "HW_INFO" macro, a few had macros for URB entries - some with min entries only, some with min and max, some including the .urb = { ... } braces, others not. Thread counts or subslice info was sometimes considered FEATURES, sometimes HW_INFO, sometimes inserted only in the final structure. FEATURES macros often inherited from an ancestor platform, but not necessarily the prior platform - many were based on GFX8_FEATURES. Many redundantly set the same feature bits as prior platforms. This patch aims to clean up the situation, so it's a little more organized, especially if you look at multiple generations. Macros are now split into several separate pieces:
1. The FEATURES macro only has architectural features, such as LSC, ray tracing support, 64-bit integers, flat CCS, and so on. Thread counts, subslice info, and URB sizes that may vary by SKU are not included here. This makes it easy for one platform to inherit the features from the previous, while not pulling in that extra data.
2. THREAD_COUNTS macros contain maximum thread counts from the 3DSTATE_VS documentation and so on.
3. URB_MIN_MAX_ENTRIES macros contain the entire URB configuration, including .urb = { ... }.
4. PAT_ENTRIES macros (on modern platforms) contain our choice of which PAT entries to use for various types of resources.
5. CONFIG macros combine all of the above into a tidy bundle for use in defining various structures, and may also include the platform macro or simulator ID for convenience.
On recent platforms where hwconfig tables exist, items #2-3 could potentially be dropped and filled in from there instead. For XEHP+ where we require hwconfig, we instead have a PLACEHOLDER_THREADS_AND_URB macro that makes it clear that these values are updated from hwconfig.
One nice thing is that the bits that could (or do) come from hwconfig tables are now cleanly separate from those that do not (i.e. platform feature support, PAT entry selection, and so on). This patch does not touch GFX7 or earlier macros. We could probably offer a similar treatment there, but they're generally working and not quite as complex. To verify that this commit does not have unintentional changes, I recommend running objdump -s build/src/intel/dev/libintel_dev.a.p/intel_device_info.c.o before and after this commit, and diffing the output. The devinfo structures produced are identical. Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <!33764>
-
Kenneth Graunke authored
intel_device_info_init_common calculates this for Gfx9+ based on max_threads_per_psd and slice information. Mark it as zero in the structures to make clear that the value there isn't useful, and make it easier to diff binaries for the next commit's refactors. Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <!33764>
-
Kenneth Graunke authored
The documentation for 3DSTATE_URB_HS has 0 as the minimum number of HS URB entries for all platforms. See BSpecs 32162, 47137, 56271 for Gfx6-11, Xe, and Xe2-3, respectively. This should silence warnings about our device info field not matching the hwconfig tables. Notably, nothing in our drivers currently uses this value so it cannot have a functional impact. Fixes: 4064b554 ("intel/dev: reduce warning noise from urb settings") Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <!33764>
-