Commits · mesa-22.2.0-rc1 · Kingstom / mesa

Aug 03, 2022

VERSION: bump for 22.2.0-rc1 · f8367fc4
Dylan Baker authored 2 years ago

mesa-22.2.0-rc1

f8367fc4
VERSION: bump 22.3.0-devel · 373b2326
Dylan Baker authored 2 years ago and Marge Bot committed 2 years ago
```
Part-of: <mesa/mesa!17875>
```
22.2-branchpoint

373b2326

turnip: Use the GMEM CCU space for attachments when the stores won't. · fcd96ce0

Emma Anholt authored 2 years ago and Marge Bot committed 2 years ago

Since the CCU only gets used for unaligned attachment stores or resolves
with the wrong formats, we can use that space for attachments in many
cases.

This gets two more of vk-5-normal's main renderpass's attachments to fit
in the next gmem_pixels increment, leaving 1 to go.  Other renderpasses do
get better gmem_pixels, and a few get better tile sizes as a result, but
the fps increase from those looks to be <.2% at least.

Part-of: <!16921>

fcd96ce0

turnip: Split the tiling config into separate layouts based on CCU usage. · b8a334b5

Emma Anholt authored 2 years ago and Marge Bot committed 2 years ago

We now choose between two (equal as of this commit) layouts based on
whether the renderpass's stores will use the CCU space, and assert that we
always know the chosen layout when we go using the gmem offsets.

This required making vkCmdClearAttachments in a secondary take the 3D path
instead of gmem blits, since secondaries only have to be compatible with
the primary's renderpass, rather than equal.

Part-of: <!16921>

b8a334b5
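The layout selection described in this commit can be pictured with a minimal C sketch. All names here (`gmem_layout`, `attachment_layout`, `choose_layout`, `attachment_gmem_offset`) are hypothetical and are not taken from the turnip source; the sketch only shows the idea of keeping one set of gmem offsets per layout, picking a layout from whether the renderpass's stores will use the CCU space, and asserting that the layout is known before an offset is read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout identifiers, for illustration only. */
enum gmem_layout {
    GMEM_LAYOUT_FULL,       /* stores will not use the CCU space */
    GMEM_LAYOUT_AVOID_CCU,  /* stores will use the CCU space */
    GMEM_LAYOUT_COUNT,
    GMEM_LAYOUT_UNKNOWN = GMEM_LAYOUT_COUNT,
};

/* One gmem offset per layout for a single attachment. */
struct attachment_layout {
    uint32_t gmem_offset[GMEM_LAYOUT_COUNT];
};

/* Pick the layout from whether the renderpass's stores use the CCU space. */
static enum gmem_layout choose_layout(bool stores_use_ccu)
{
    return stores_use_ccu ? GMEM_LAYOUT_AVOID_CCU : GMEM_LAYOUT_FULL;
}

/* Reading an offset is only valid once the layout has been chosen. */
static uint32_t attachment_gmem_offset(const struct attachment_layout *att,
                                       enum gmem_layout layout)
{
    assert(layout < GMEM_LAYOUT_COUNT);
    return att->gmem_offset[layout];
}
```

In this sketch a secondary command buffer that does not know the primary's renderpass would hold `GMEM_LAYOUT_UNKNOWN`, which is why a path that never reads gmem offsets would be needed there.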

ci/freedreno: Update a630 s8 resolve xfails. · a1db4fca

Emma Anholt authored 2 years ago and Marge Bot committed 2 years ago

These tests are all only run in a full vk run.  These removed ones were
fixed in !17684, and I'm betting the bypass ones were pre-existing (we hadn't
updated 630's full vk run list for these new stencil tests, I believe -- my
previous full run update was just from one of the two jobs).

Part-of: <!16921>

a1db4fca

tu: Restore formatting of tu_clear_blit.c · 19418adf

Connor Abbott authored 2 years ago and Marge Bot committed 2 years ago

Conflict resolution appears to have gone awry.  Use my previous resolution
of that rebase instead.

Fixes: 89263fde ("tu: Use common vk_image struct")
Part-of: <!16921>

19418adf

iris: Dedent enum iris_depth_reg_mode · 6875e075
Nanley Chery authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <!17859>
```
6875e075

iris: Make the D16 reg mode single-sampled · a75cd15b

Nanley Chery authored 2 years ago and Marge Bot committed 2 years ago


Wa_14010455700 is dependent on the format and sample count, but our
code to track whether or not it had been applied was only dependent on
the format.

As a result, we failed to enable the workaround when an app used a D16
2xMSAA buffer, then a D16 1xMSAA buffer right afterwards.

Make the workaround tracking code sample-dependent to fix this.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <!17859>

a75cd15b
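The tracking problem described above can be illustrated with a small C sketch. The names (`depth_reg_mode`, `mode_by_format`, `mode_by_format_and_samples`, `bind_depth`) are hypothetical and do not come from the iris or anv source. The sketch also assumes, as the commit title suggests, that the D16 register mode applies only to single-sampled D16 buffers. It compares tracking keyed on format alone with tracking keyed on format and sample count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register modes, for illustration only. */
enum depth_reg_mode {
    DEPTH_REG_MODE_HW_DEFAULT,
    DEPTH_REG_MODE_D16_1X_MSAA,
    DEPTH_REG_MODE_UNKNOWN,
};

/* Format-only tracking: every D16 buffer maps to the same mode. */
static enum depth_reg_mode mode_by_format(bool is_d16, unsigned samples)
{
    (void)samples;
    return is_d16 ? DEPTH_REG_MODE_D16_1X_MSAA : DEPTH_REG_MODE_HW_DEFAULT;
}

/* Sample-dependent tracking: only single-sampled D16 gets the D16 mode. */
static enum depth_reg_mode mode_by_format_and_samples(bool is_d16,
                                                      unsigned samples)
{
    return (is_d16 && samples == 1) ? DEPTH_REG_MODE_D16_1X_MSAA
                                    : DEPTH_REG_MODE_HW_DEFAULT;
}

typedef enum depth_reg_mode (*mode_fn)(bool is_d16, unsigned samples);

/* Returns true when the tracked mode changes, i.e. when the registers
 * would be reprogrammed for the newly bound depth buffer. */
static bool bind_depth(enum depth_reg_mode *current, mode_fn fn,
                       bool is_d16, unsigned samples)
{
    assert(current);
    enum depth_reg_mode wanted = fn(is_d16, samples);
    if (wanted == *current)
        return false;
    *current = wanted;
    return true;
}
```

With format-only tracking, binding a D16 2xMSAA buffer followed by a D16 1xMSAA buffer reports no change on the second bind, which matches the missed workaround described in the message. With sample-dependent tracking, the second bind reports a change.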

anv: Make the D16 reg mode single-sampled · e7419c11

Nanley Chery authored 2 years ago and Marge Bot committed 2 years ago


Wa_14010455700 is dependent on the format and sample count, but our
code to track whether or not it had been applied was only dependent on
the format.

As a result, we failed to enable the workaround when an app used a D16
2xMSAA buffer, then a D16 1xMSAA buffer right afterwards.

Make the workaround tracking code sample-dependent to fix this.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <!17859>

e7419c11

nir/lower_idiv: Be less creative about signs · a4a15f50

Alyssa Rosenzweig authored 2 years ago and Marge Bot committed 2 years ago


I'm sorry to whoever wrote this, but

   (x - (int) (x < 0)) ^ -((int) (x < 0))

is not an acceptable way to write iabs.

Shader-db results on Intel Tiger Lake with lower_idiv enabled:

    total instructions in shared programs: 21122548 -> 21122570 (<.01%)
    instructions in affected programs: 2369 -> 2391 (0.93%)
    helped: 2
    HURT: 8

    total cycles in shared programs: 791609360 -> 791608062 (<.01%)
    cycles in affected programs: 114106 -> 112808 (-1.14%)
    helped: 9
    HURT: 1

If we make the Intel back-end less stupid, we get to 9/1 helped/HURT for
instructions as well but that's for a different MR.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <!17845>

a4a15f50
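For readers puzzling over the expression quoted in the message above, here is a minimal C sketch that checks it against a plain absolute value. The function names are hypothetical and this is not NIR code. The arithmetic is done on `uint32_t` so that the `INT32_MIN` case wraps instead of hitting signed overflow, and the conversion back to `int32_t` assumes the usual two's complement behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The "creative" form: (x - (x < 0)) ^ -(x < 0).
 * For negative x this is ~(x - 1), which equals -x; otherwise it is x. */
static int32_t iabs_creative(int32_t x)
{
    uint32_t neg = (uint32_t)(x < 0);
    return (int32_t)(((uint32_t)x - neg) ^ (0u - neg));
}

/* The straightforward form, with wrapping negation. */
static int32_t iabs_plain(int32_t x)
{
    return x < 0 ? (int32_t)(0u - (uint32_t)x) : x;
}

/* Compare both forms on a handful of values, including the edge cases. */
static void check_iabs_forms_agree(void)
{
    const int32_t samples[] = { 0, 1, -1, 42, -42,
                                INT32_MAX, INT32_MIN + 1, INT32_MIN };
    for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        assert(iabs_creative(samples[i]) == iabs_plain(samples[i]));
}
```

Both forms leave `INT32_MIN` unchanged, since its magnitude is not representable in 32 bits.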

zink: combine loops for lazy descriptor program deinit · e13c9d21

Mike Blumenkrantz authored 2 years ago and Marge Bot committed 2 years ago


the bindless and push sets don't have update templates stored to
the program, so merging these loops avoids trying to destroy them

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <!17866>

e13c9d21

zink: don't flag lazy push constant set dirty on batch change · 74509905
Mike Blumenkrantz authored 2 years ago and Marge Bot committed 2 years ago
```
this has its own flag

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <!17866>
```
74509905

zink: fix gfx program cache pruning with generated tcs · c7ef4f97

Mike Blumenkrantz authored 2 years ago and Marge Bot committed 2 years ago


if the tcs was generated, then the program was added to the non-tcs cache,
which means deleting it from the tcs+tes cache will fail and then
context_destroy will explode

Fixes: 4123ee3c ("zink: invoke descriptor_program_deinit for programs on context destroy")

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <!17866>

c7ef4f97

ir3: Never remove GS_HEADER_IR3 sysval input · e1c89abd

Danylo Piliaiev authored 2 years ago and Marge Bot committed 2 years ago

Without the GS header, the geometry shader is never invoked, which may cause
issues if it has side effects.

Fixes GL CTS tests running via Zink:
 KHR-GL46.shader_image_load_store.multiple-uniforms
 KHR-GL46.texture_cube_map_array.image_op_geometry_sh

Closes: #6940



Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <!17771>

e1c89abd

ir3/ra: Always insert interval for precolored inputs · ed7814de

Danylo Piliaiev authored 2 years ago and Marge Bot committed 2 years ago


insert_dst checked whether dst is unused; however, for precolored
inputs we always want to reserve a reg. An input can be unused only
if we explicitly want that.

Suggested-by: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <!17771>

ed7814de

radeonsi: move small prim precision computation out of si_emit_cull_state · ff8e5254

Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago


to put it next to its only use and remove the structure fields

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <!17864>

ff8e5254

radeonsi: move the no-AA small prim precision cull constant into an SGPR · fa46f3d4

Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago


This reduces the scalar load from vec4 to vec2.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <!17864>

fa46f3d4

radeonsi: add a randomized blit test · 788dce46
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
788dce46

radeonsi: allow texture_map to upload only 1 sample for MSAA instead of all · a42be1ef

Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago


Reuse the level parameter to do that, which allows us to keep
the pipe_transfer size unchanged. It's kinda hacky, but it's the simplest
way to do it. This will be used by the blit test to initialize MSAA textures.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>

a42be1ef

radeonsi: make various blit functions non-static · 2afaedf1
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
2afaedf1
radeonsi/gfx11: use a better workaround for the export conflict bug · f129db91
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
This is recommended for better performance.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
f129db91
radeonsi/gfx11: enable shader prefetch except for initial chip revisions · 2ed9eb1b
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
2ed9eb1b
radeonsi/gfx11: rename si_calc_inst_pref_size -> si_get_shader_prefetch_size · a09d9710
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
a09d9710
radeonsi/gfx11: skip code in si_update_shaders that has no effect · a791e7f3
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
a791e7f3
radeonsi/gfx11: use better PRIM_GRP_SIZE_GFX11 setting · 34196148
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
34196148
radeonsi/gfx11: set SAMPLE_MASK_TRACKER_WATERMARK = 15 and clean up · 23a1dca8
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
23a1dca8
radeonsi/gfx11: use correct VGT_TESS_DISTRIBUTION settings · b1af3616
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
b1af3616

radeonsi: cosmetic changes around do_hardware_msaa_resolve · 28842d96

Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago


- move gfx_level checking into the function
- rename the function
- call it in si_blit later
- set the SQTT event

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>

28842d96

radeonsi: fold async_copy into the preceding conditional in si_blit · b1b0a860
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
b1b0a860
radeonsi: move compute-related code from si_blit.c to si_compute_blit.c · 7f1485d5
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
7f1485d5

radeonsi: check for 16-bit hw support instead of relying on options.fp16 · 3b7512ca

Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago


options.fp16 can be true even when the hw doesn't support FP16.
options.fp16 should only affect the CAP because 16-bit ops can still be
used by internal shaders.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>

3b7512ca
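The distinction drawn in this message can be sketched in C as follows. The struct and function names are hypothetical, not radeonsi identifiers. The sketch also assumes that the CAP requires both hardware support and the option, which the message implies but does not state outright.

```c
#include <stdbool.h>

/* Hypothetical screen state, for illustration only. */
struct screen_state {
    bool hw_has_16bit_support; /* what the hardware can actually do */
    bool option_fp16;          /* user/driver option; may be set on any hw */
};

/* The CAP reported to the state tracker: the option only matters here,
 * and only when the hardware supports 16-bit ops at all (assumption). */
static bool expose_fp16_cap(const struct screen_state *s)
{
    return s->hw_has_16bit_support && s->option_fp16;
}

/* Internal shaders depend on the hardware alone, not on the option. */
static bool internal_shaders_may_use_16bit(const struct screen_state *s)
{
    return s->hw_has_16bit_support;
}
```

In this sketch, setting the option on hardware without 16-bit support enables nothing, and clearing it on capable hardware hides the CAP without stopping internal shaders from using 16-bit ops.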

radeonsi: add need_fmask_expand parameter into si_decompress_subresource · 2847106b

Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago


This is required by MSAA image stores for internal compute blits.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>

2847106b

radeonsi: follow shader_info.float_controls_execution_mode (mostly) · 9e9cc629
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
9e9cc629

radeonsi: don't do image stores with RGBX, L, LA, I, and SRGB formats · 0482ff31

Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago


The only change in behavior is that RGBX stores now overwrite X, which is
what CB does and it's faster.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>

0482ff31

radeonsi: remove compute-based DCC decompression because it's broken · b42a4a7f

Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago


The new blit test discovered that it doesn't always work.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>

b42a4a7f

radeonsi: add common helper si_launch_grid_internal_images that is more robust · 9da309a7

Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago


It does things in the correct order, which isn't easy to get right.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>

9da309a7

radeonsi: make si_launch_grid_internal static · 2a854647
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
2a854647

radeonsi: call pipe->blit instead of util_blitter_blit after MSAA resolving · 233b4271

Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago


This fixes a problem where the destination has a DCC-incompatible view
format and triggers a DCC decompression using a custom u_blitter path, which
is disallowed inside u_blitter due to it being a u_blitter recursion that
always crashes.

This is also better because we'll get the best codepath (u_blitter or
compute) instead of just u_blitter.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>

233b4271

radeonsi: move SI_MAX_VRAM_MAP_SIZE to si_debug_options.h · 922f54a0
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
922f54a0
radeonsi: unify VGT_TESS_DISTRIBUTION programming · 38cd2a61
Marek Olšák authored 2 years ago and Marge Bot committed 2 years ago
```
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <!17864>
```
38cd2a61
