- Aug 03, 2022
-
-
Dylan Baker authored
-
Part-of: <mesa/mesa!17875>
-
Since the CCU only gets used for unaligned attachment stores or resolves with the wrong formats, we can use that space for attachments in many cases. This gets two more of vk-5-normal's main renderpass's attachments to fit in the next gmem_pixels increment, leaving 1 to go. Other renderpasses do get better gmem_pixels, and a few get better tile sizes as a result, but the fps increase from those looks to be <0.2%. Part-of: <!16921>
-
We now choose between two (equal as of this commit) layouts based on whether the renderpass's stores will use the CCU space, and assert that we always know the chosen layout before we use the gmem offsets. This required making vkCmdClearAttachments in a secondary take the 3D path instead of gmem blits, since secondaries only have to be compatible with the primary's renderpass, rather than equal. Part-of: <!16921>
-
These tests are all only run in a full vk run. The removed ones were fixed in !17684 and I'm betting the bypass ones were pre-existing (we hadn't updated 630's full vk run list for these new stencil tests, I believe -- my previous full run update was just from one of the two jobs). Part-of: <!16921>
-
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <!17859>
-
Wa_14010455700 is dependent on the format and sample count, but our code to track whether or not it had been applied was only dependent on the format. As a result, we failed to enable the workaround when an app used a D16 2xMSAA buffer, then a D16 1xMSAA buffer right afterwards. Make the workaround tracking code sample-dependent to fix this. Cc: mesa-stable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <!17859>
-
Wa_14010455700 is dependent on the format and sample count, but our code to track whether or not it had been applied was only dependent on the format. As a result, we failed to enable the workaround when an app used a D16 2xMSAA buffer, then a D16 1xMSAA buffer right afterwards. Make the workaround tracking code sample-dependent to fix this. Cc: mesa-stable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <!17859>
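A minimal sketch of the tracking change described above, assuming a small state struct that remembers what the workaround was last programmed for. The names `wa_tracker`, `wa_tracker_update`, `wa_tracker_update_format_only` and the `depth_format` enum are hypothetical and are not the drivers' real code; the point is only that comparing the format alone misses a D16 2x to D16 1x transition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the drivers' real types. */
enum depth_format { FMT_NONE = 0, FMT_D16, FMT_D24, FMT_D32 };

struct wa_tracker {
   bool valid;                /* false until the first depth buffer is seen */
   enum depth_format format;  /* format the state was last programmed for */
   unsigned samples;          /* sample count it was last programmed for */
};

/* Sample-aware tracking: returns true when the workaround state has to be
 * re-evaluated because the format OR the sample count changed. */
static bool wa_tracker_update(struct wa_tracker *t,
                              enum depth_format format, unsigned samples)
{
   if (t->valid && t->format == format && t->samples == samples)
      return false;

   t->valid = true;
   t->format = format;
   t->samples = samples;
   return true;
}

/* Format-only tracking, shown for comparison: a change in sample count
 * with an unchanged format goes unnoticed. */
static bool wa_tracker_update_format_only(struct wa_tracker *t,
                                          enum depth_format format)
{
   if (t->valid && t->format == format)
      return false;

   t->valid = true;
   t->format = format;
   return true;
}

/* Replays the sequence from the commit message: D16 2xMSAA, then D16 1x. */
static void wa_tracker_self_check(void)
{
   struct wa_tracker fixed = {0}, old = {0};

   assert(wa_tracker_update(&fixed, FMT_D16, 2));
   assert(wa_tracker_update(&fixed, FMT_D16, 1));   /* change is noticed */
   assert(!wa_tracker_update(&fixed, FMT_D16, 1));  /* nothing changed */

   assert(wa_tracker_update_format_only(&old, FMT_D16));
   assert(!wa_tracker_update_format_only(&old, FMT_D16)); /* 2x -> 1x missed */
}
```

Keying the tracked state on both fields keeps the check cheap while covering every input the workaround depends on.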
-
I'm sorry to whoever wrote this, but (x - (int) (x < 0)) ^ -((int) (x < 0)) is not an acceptable way to write iabs.

Shader-db results on Intel Tiger Lake with lower_idiv enabled:

total instructions in shared programs: 21122548 -> 21122570 (<.01%)
instructions in affected programs: 2369 -> 2391 (0.93%)
helped: 2
HURT: 8

total cycles in shared programs: 791609360 -> 791608062 (<.01%)
cycles in affected programs: 114106 -> 112808 (-1.14%)
helped: 9
HURT: 1

If we make the Intel back-end less stupid, we get to 9/1 helped/HURT for instructions as well but that's for a different MR.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <!17845>
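For readers who want to convince themselves that the quoted expression really is an integer absolute value, here is a small standalone C sketch. It is not the NIR lowering code; `iabs_bit_trick`, `iabs_plain` and `iabs_self_check` are illustrative names, and the arithmetic is done on `uint32_t` so that `INT32_MIN` wraps instead of hitting signed overflow. Converting the result back to `int32_t` assumes the usual two's-complement behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* The branch-free form quoted in the commit message:
 *    (x - neg) ^ -neg, where neg = (x < 0).
 * For x < 0 this is ~(x - 1) == -x; for x >= 0 it is x ^ 0 == x. */
static int32_t iabs_bit_trick(int32_t x)
{
   uint32_t neg = (uint32_t)(x < 0);   /* 1 if negative, else 0 */
   return (int32_t)(((uint32_t)x - neg) ^ (0u - neg));
}

/* The direct form: negate when negative. */
static int32_t iabs_plain(int32_t x)
{
   uint32_t ux = (uint32_t)x;
   return (int32_t)(x < 0 ? 0u - ux : ux);
}

/* Checks that both forms agree on a handful of edge cases. */
static void iabs_self_check(void)
{
   const int32_t samples[] = {
      0, 1, -1, 7, -7, INT32_MAX, INT32_MIN + 1, INT32_MIN
   };

   for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
      assert(iabs_bit_trick(samples[i]) == iabs_plain(samples[i]));
}
```

Both functions compute the same value; the second simply states the intent, which is what a dedicated iabs operation lets a back-end see directly.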
-
the bindless and push sets don't have update templates stored to the program, so merging these loops avoids trying to destroy them cc: mesa-stable Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <!17866>
-
this has its own flag cc: mesa-stable Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <!17866>
-
if the tcs was generated, then the program was added to the non-tcs cache, which means deleting it from the tcs+tes cache will fail and then context_destroy will explode Fixes: 4123ee3c ("zink: invoke descriptor_program_deinit for programs on context destroy") Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <!17866>
-
Without a GS header, the geometry shader is never invoked, which may cause issues if it has side effects. Fixes GL CTS tests running via Zink: KHR-GL46.shader_image_load_store.multiple-uniforms KHR-GL46.texture_cube_map_array.image_op_geometry_sh Closes: #6940 Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <!17771>
-
insert_dst checked whether dst is unused; however, for precolored inputs we always want to reserve a reg. An input should be left unused only if we explicitly want that. Suggested-by: Connor Abbott <cwabbott0@gmail.com> Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <!17771>
-
to put it next to its only use and remove the structure fields Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <!17864>
-
This reduces the scalar load from vec4 to vec2. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <!17864>
-
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
Reuse the level parameter to do that, which allows us to keep the pipe_transfer size unchanged. It's kinda hacky, but it's the simplest way to do it. This will be used by the blit test to initialize MSAA textures. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
This is recommended for better performance. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
- move gfx_level checking into the function - rename the function - call it in si_blit later - set the SQTT event Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
options.fp16 can be true even when the hw doesn't support FP16. options.fp16 should only affect the CAP because 16-bit ops can still be used by internal shaders. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
This is required by MSAA image stores for internal compute blits. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
The only change in behavior is that RGBX stores now overwrite X, which is what CB does and it's faster. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
The new blit test discovered that it doesn't always work. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
It does things in the correct order, which isn't easy to get right. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
This fixes a problem where the destination has a DCC-incompatible view format and triggers a DCC decompression using a custom u_blitter path, which is disallowed inside u_blitter because it is a u_blitter recursion that always crashes. This is also better because we'll get the best codepath (u_blitter or compute) instead of just u_blitter. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!17864>
-