- Oct 14, 2020
-
-
Dylan Baker authored
-
Dylan Baker authored
-
- Oct 13, 2020
-
-
Fixes artifacts on decals in Path of Exile. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Closes: mesa/mesa#3610 Cc: mesa-stable Part-of: <mesa/mesa!7062> (cherry picked from commit 037d9fb2)
-
VC4 doesn't have support for UMAX and UMIN integer operations. So we should avoid algebraic optimizations that generate umax/umin ops. Fixes: 8e1b75b3 ("nir/algebraic: optimize iand/ior of (n)eq zero") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <mesa/mesa!7083> (cherry picked from commit d5e5f72e)
-
Before 8e1b75b3 ("nir/algebraic: optimize iand/ior of (n)eq zero") this optimization didn't need the use of umax/umin. VC4 HW supports only signed integer max/min operations. lower_umin and lower_umax are added so that the previous optimization behaviour can be enabled for these cases. Fixes: 8e1b75b3 ("nir/algebraic: optimize iand/ior of (n)eq zero") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!7083> (cherry picked from commit e7127b34)
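The identities behind this kind of rewrite can be illustrated in plain C. This is only a sketch: it does not reproduce the actual NIR rules from 8e1b75b3, and all helper names below are hypothetical. It shows that "both zero" can be tested with either an unsigned max or a plain bitwise OR, and why a signed min cannot simply stand in for an unsigned one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers standing in for integer min/max ALU operations. */
static uint32_t umin32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t umax32(uint32_t a, uint32_t b) { return a > b ? a : b; }
static int32_t imin32(int32_t a, int32_t b) { return a < b ? a : b; }

/* (a != 0) && (b != 0) is equivalent to umin(a, b) != 0. */
static bool both_nonzero_umin(uint32_t a, uint32_t b)
{
    return umin32(a, b) != 0;
}

/* (a == 0) && (b == 0) is equivalent to umax(a, b) == 0 ... */
static bool both_zero_umax(uint32_t a, uint32_t b)
{
    return umax32(a, b) == 0;
}

/* ... and also to (a | b) == 0, which needs no min/max at all. */
static bool both_zero_ior(uint32_t a, uint32_t b)
{
    return (a | b) == 0;
}

/* A signed min is not a drop-in replacement: with a = 0 and b = -1
 * (all bits set) it returns -1, so this wrongly reports "both nonzero". */
static bool both_nonzero_imin_wrong(int32_t a, int32_t b)
{
    return imin32(a, b) != 0;
}
```

On hardware with only signed min/max, forms like the bitwise OR variant avoid the problem entirely, which is the kind of fallback a lowering flag can select.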
-
==12920== Conditional jump or move depends on uninitialised value(s)
==12920==    at 0x8F39391: util_fast_urem32 (fast_urem_by_const.h:71)
==12920==    by 0x8F39391: hash_table_search (hash_table.c:285)
==12920==    by 0x8B06D5D: ac_compute_dcc_retile_tile_indices (ac_surface.c:136)
Fixes: a37aeb12 "amd/common: Cache intra-tile addresses for retile map." Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <mesa/mesa!7055> (cherry picked from commit a4e4644e)
-
With arrays we really have to use the correct size for the base mipmap to get the right array pitch. In particular, using surf_pitch results in pitch that is bigger than the base mipmap and hence results in wrong pitches computed by the HW. It seems that on GFX9 this has mostly been hidden by the epitch provided in the descriptor but this is not something we do on GFX10 anymore.
Now this has some drawbacks:
1. normalized coordinates don't work
2. Bounds checking uses slightly bigger bounds.
2 mostly is not an issue as we still ensure that they're within the texture memory and not overlapping other layers/mips, but we can't properly ignore writes.
1 is kinda dead in the water ... On the other hand I'd argue that using normalized coords & a filter for sampling a block view of a compressed format is extraordinarily useless.
The old method we employed already had these drawbacks for everything except the base miplevel of the imageview. AFAICT this is the same tradeoff AMDVLK makes and no CTS test hits this. (once it does I think the HW is dead in the water ... Only workaround I can think of is shader processing which is hard because we don't know texture formats at compile time.)
I also removed the extra calculations when the image has only 1 mip level because they ended up being a no-op in that case.
CC: mesa-stable Closes: mesa/mesa#2292 Closes: mesa/mesa#2266 Closes: mesa/mesa#2483 Closes: mesa/mesa#2906 Gitlab: mesa/mesa#3607 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!7090> (cherry picked from commit 1fb3e1fb)
-
Dylan Baker authored
-
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Fixes: 18f9fc91 ('spirv: add and use a generator id enum') Part-of: <mesa/mesa!7096> (cherry picked from commit 044d2130)
-
SPIRV->NIR emits nir_op_unpack_half_2x16_flush_to_zero instead of nir_op_unpack_half_2x16 if the shader enables denorm flush to zero for 16-bit floating point. This doesn't fix anything known and CTS doesn't have tests. Fixes: 56d9bcdd ("radv: enable more float_controls features") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <mesa/mesa!6939> (cherry picked from commit b9ca4923)
-
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Mauro Rossi <issor.oruam@gmail.com> Fixes: 18f9fc91 ('spirv: add and use a generator id enum') Part-of: <mesa/mesa!7097> (cherry picked from commit 1070bba1)
-
Fixes: 7568c97d ("radv: Use atomics to read query results.") Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!7050> (cherry picked from commit c02e933d)
-
Dylan Baker authored
-
- Oct 12, 2020
-
-
Scratch stores are lowered to instructions with side effects; however, they should remain enabled in fs helper invocations, since they are produced from operations which don't imply side effects. To fix this, we move the decision of whether sample mask predication is enabled to the point where logical brw instructions are created. GLSL example of the issue:

    int tmp[1024];
    ...
    do {
        // changes to tmp
    } while (some_condition(tmp))

If `tmp` is lowered to scratch memory, `some_condition` would be undefined if the scratch write is predicated on the sample mask, making it possible for the while loop to become infinite and hang the GPU. Closes: mesa/mesa#3256 Fixes: 53bfcdee Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <mesa/mesa!6056> (cherry picked from commit 77486db8)
-
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: mesa-stable Part-of: <mesa/mesa!7062> (cherry picked from commit 18f9fc91)
-
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Fixes: 731dfc60 ("pan/bi: Allow vertex txl with lod=0 as compact") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <mesa/mesa!7081> (cherry picked from commit 93f90529)
-
And fix the bad assertion that let this slip. Like combines, nir_op_vec can be vector, and we need to lower this ourselves. Thankfully, the lowering is simple. Fixes dEQP-GLES2.functional.shaders.loops.for_uniform_iterations.nested_tricky_dataflow_1_* Fixes: b2c6cf2b ("pan/bi: Eliminate writemasks in the IR") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <mesa/mesa!7081> (cherry picked from commit a204eac7)
-
Fixes rendering corruption in the shadowmappingcascade Sascha Willems Vulkan demo. To see the corruption, I adjusted the demo options as follows: 1. Enable "Display depth map" 2. Set "Split lambda" to 0.100 3. Make "Cascade" non-zero. Fixes: 80ffbe91 ("anv: Add support for HiZ+CCS") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <mesa/mesa!7046> (cherry picked from commit cce6fc3b)
-
In 53bfcdee, we added load/store_scratch instructions which deviate a little bit from most memory load/store instructions in that we can't use the normal untyped read/write instructions which can read and write up to a vec4 at a time. Instead, we have to use the DWORD scattered read/write instructions which are scalar. To handle this, we added code to brw_nir_lower_mem_access_bit_sizes to cause them to be scalarized. However, one case was missing: the load-as-larger-vector case. In this case, we take a small bit-sized constant-offset load, replace it with a 32-bit load, and shuffle the result around as needed. For scratch, this case is much trickier to get right because it often emits vec2 or wider which we would then have to lower again. We did this for other load and store ops because, for lower bit-sizes, we have to scalarize thanks to the byte scattered read/write instructions being scalar. However, for scratch we're not losing as much because we can't vectorize 32-bit loads and stores either. It's easier to just disallow it whenever we have to scalarize. Fixes: 53bfcdee "intel/fs: Implement the new load/store_scratch..." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <mesa/mesa!6872> (cherry picked from commit fd04f858)
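The load-as-larger-vector idea described above can be sketched in plain C for the simplest case: a 16-bit read at a constant byte offset served by one 32-bit load plus a shift. This is an illustration only, not the code in brw_nir_lower_mem_access_bit_sizes; the function name is hypothetical and the sketch assumes that the low half of each dword corresponds to the lower byte offset.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 16-bit value at a 2-byte-aligned byte offset by loading the
 * enclosing 32-bit dword and shifting the wanted half into place.
 * Assumption: the low 16 bits of a dword sit at the lower byte offset. */
static uint16_t load_u16_via_dword(const uint32_t *dwords, unsigned byte_offset)
{
    assert(byte_offset % 2 == 0);

    uint32_t dword = dwords[byte_offset / 4]; /* the single 32-bit load */
    unsigned shift = (byte_offset % 4) * 8;   /* 0 or 16 bits */

    return (uint16_t)(dword >> shift);        /* "shuffle" the result */
}
```

A wider small-bit-size load would need several such dwords, which is where the vec2-or-wider results mentioned in the message come from.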
-
Dylan Baker authored
sys and string are unused; os is needed but not imported. Fixes: 412472da ("glsl: Add utility to convert text files to C strings") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <mesa/mesa!7034> (cherry picked from commit 3ff513ee)
-
Free the dummy texture descriptor BO on context destroy. Fixes: eda73d71 (etnaviv: GC7000: Texture descriptors) Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Part-of: <mesa/mesa!6986> (cherry picked from commit 9d5ec7f6)
-
Fixes: 24f2b0a8 ("gallium/video: remove pipe_video_buffer.chroma_format") Closes: mesa/mesa#3595 Reviewed-by: Leo Liu <leo.liu@amd.com> Part-of: <mesa/mesa!7026> (cherry picked from commit 8b205402)
-
Discovered by valgrind. Fixes: fd6a5e11 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!6952> (cherry picked from commit 3dc00c33)
-
When support for multi-slice fast-clears was introduced for color surfaces, an existing optimization for skipping fast-clears was not updated (this optimization assumed single-slice fast-clears). As a result, the driver began to skip multi-layer fast-clears if just the first slice was in the CLEAR state (ignoring the state of the others). A Civilization VI trace was the only workload I found to make use of this optimization and it did so for 2D, non-array textures. Therefore, this fix simply checks that the depth of the clear box is 1. It also moves the single-slice aux-state query closer to the optimization to clarify the need for the depth check. Enables iris to pass a case of the fcc-write-after-clear piglit test, [fast-clear tracking across layers 0 -> 1 -> (0,1)]. Fixes: 393f659e ("iris: Enable fast clears on other miplevels and layers than 0.") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <mesa/mesa!6973> (cherry picked from commit 3f3a5f34)
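As a rough sketch of the check described above, the skip decision can be thought of as requiring both a single-slice clear box and a first slice that is already in the CLEAR state. The types and names below are hypothetical and do not match the iris code; other conditions a real driver would check (such as an unchanged clear color) are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's box and aux-state types. */
struct clear_box {
    int x, y, z;
    int width, height, depth;
};

enum aux_state {
    AUX_STATE_CLEAR,
    AUX_STATE_OTHER,
};

/* Only skip when the clear covers exactly one slice; the state of that
 * single slice says nothing about any other layers. */
static bool can_skip_fast_clear(const struct clear_box *box,
                                enum aux_state first_slice_state)
{
    return box->depth == 1 && first_slice_state == AUX_STATE_CLEAR;
}
```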
-
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: ec1fa1d5 ("intel/perf: fix raw query kernel metric selection") Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <mesa/mesa!7024> (cherry picked from commit 79f35444)
-
Dylan Baker authored
-
Dylan Baker authored
-
Dylan Baker authored
-
- Oct 05, 2020
-
-
This fixes 6 tests that were crashing on VC4 since EGL_KHR_swap_buffers_with_damage was enabled. dEQP-EGL.functional.swap_buffers_with_damage.*.buffer_age_render Cc: 20.2 <mesa-stable> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <mesa/mesa!6976> (cherry picked from commit 961a8d71)
-
Fix defect reported by Coverity Scan. Dereference before null check (REVERSE_INULL) check_after_deref: Null-checking rsc suggests that it may be null, but it has already been dereferenced on all paths leading to the check. Fixes: 6173cc19 ("freedreno: gallium driver for adreno") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!6903> (cherry picked from commit 0a7bd14d)
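For readers unfamiliar with this class of defect, a minimal sketch of the pattern follows. The struct and function names are hypothetical and the freedreno code is not reproduced here. The usual resolutions are either to move the check before the first dereference, or to drop the check if the pointer can never be null; which one this commit chose is not stated in the message.

```c
#include <stddef.h>

/* Hypothetical resource type, for illustration only. */
struct resource {
    int width;
};

/* Pattern that gets flagged: rsc is dereferenced before it is checked,
 * so the later null check can never protect anything. */
static int width_flagged(const struct resource *rsc)
{
    int w = rsc->width; /* dereference happens first */
    if (!rsc)           /* check comes too late */
        return 0;
    return w;
}

/* One possible fix: check first, then dereference. */
static int width_checked_first(const struct resource *rsc)
{
    if (!rsc)
        return 0;
    return rsc->width;
}
```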
-
When I copied and pasted the code from MOV_INDIRECT for handling the dependency controls, I missed a subtle difference between MOV_INDIRECT and SHUFFLE. Specifically, MOV_INDIRECT gets lowered to a narrow instruction on Gen7 by the SIMD width lowering whereas SHUFFLE has to split it in the generator. Therefore, the safety check for whether or not we can use dependency control has to be based on the lowered width rather than the width of the original instruction. Fixes: a8ac61b0 "intel/fs: NoMask initialize the address..." Closes: mesa/mesa#3593 Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <mesa/mesa!6989> (cherry picked from commit 8427e560)
-
The volatile pattern gives me flaky results for 32-bit builds on ChromeOS Android. This is because on 32-bit each volatile 64-bit load gets split into two 32-bit loads. So if we read the lower dword first and then the upper dword, it can happen that the upper dword is already changed but the lower dword isn't yet. In particular for occlusion queries this gives false readings, as the upper dword commonly only contains the ready bit. With the GCC atomic intrinsics we get a call to __atomic_load_8 in libatomic.so which does the right thing. An alternative fix would be to explicitly split the 32-bit loads in the right order and do a bunch of retries if things change, though that gets messy quickly and for 32-bit builds only doesn't feel worth it that much. CC: mesa-stable Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!6933> (cherry picked from commit 7568c97d)
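The difference between the two read patterns can be sketched as follows. This is not the radv code: the function names, the memory order, and the position of the ready bit (bit 63 here) are assumptions made for illustration. The atomic variant uses the GCC/Clang `__atomic_load_n` builtin, which on 32-bit targets may require linking against libatomic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Plain volatile read: on a 32-bit target the compiler may emit two
 * separate 32-bit loads, so a concurrent writer can be observed half-way. */
static uint64_t read_result_volatile(const volatile uint64_t *slot)
{
    return *slot;
}

/* Single 64-bit atomic load via the GCC/Clang builtin. The memory order
 * chosen here is an assumption of this sketch. */
static uint64_t read_result_atomic(uint64_t *slot)
{
    return __atomic_load_n(slot, __ATOMIC_ACQUIRE);
}

/* Hypothetical layout: bit 63 marks the result as ready. */
static bool result_is_ready(uint64_t value)
{
    return (value >> 63) != 0;
}
```

With the atomic read, the ready bit and the payload always come from the same 64-bit value, so a set ready bit can be trusted to describe the payload that was read with it.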
-
Without this, it was checking bit size compatibility with bit sizes such as 96, which is clearly invalid. No shader-db changes on Ice Lake. Fixes: ce9205c0 "nir: add a load/store vectorization pass" Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <mesa/mesa!6871> (cherry picked from commit 57e7c5f0)
-
Do not throw a deprecation warning if the power8 option is set to the new 'disabled' value. Instead, warn if it is still set to the legacy value 'false'. Fixes: 138c003d ("meson: deprecated 'true' and 'false' in combo options for 'enabled' and 'disabled'") Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <!6370> (cherry picked from commit 03bea54e)
-
The linker was adding all state vars as uniforms, doubling the storage size for shaders using only builtin uniforms, which increased CPU overhead for constant buffer uploads. When this code was originally ported from the GLSL IR linker we forgot to exclude builtins because the check was not done in the add_uniform_to_shader class but rather a check was done when passing variables to this class for processing. Fixes: 664e4a61 ("glsl/nir: Fill in the Parameters in NIR linker") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Tested-by: Marek Olšák <marek.olsak@amd.com> Part-of: <mesa/mesa!6958> (cherry picked from commit 038fcbca)
-
Cc: mesa-stable@lists.freedesktop.org Closes: mesa/mesa#2979 Tested-by: Iván Briano <ivan.briano@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <mesa/mesa!6825> (cherry picked from commit a8ac61b0)
-
Workaround # 22011374674. Applied to i965, iris and anv drivers. No performance impact is observed with WA. Cc: mesa-stable Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit 545d852a)
-
After disabling SDMA on Arcturus (gfx9), a deadlock on aux_context_lock is detected because si_screen_clear_buffer is called recursively before the lock is released. The call trace is:
si_clear_render_target->si_compute_clear_render_target->
si_launch_grid_internal->si_launch_grid->si_emit_cache_flush->
si_prim_discard_signal_next_compute_ib_start->u_suballocator_alloc->
si_resource_create->si_buffer_create->si_alloc_resource->
si_screen_clear_buffer->simple_mtx_lock->
si_sdma_clear_buffer->si_pipe_clear_buffer->
si_clear_buffer->si_compute_do_clear_or_copy->
si_launch_grid_internal->si_launch_grid->si_emit_cache_flush->
si_prim_discard_signal_next_compute_ib_start->u_suballocator_alloc->
si_resource_create->si_buffer_create->si_alloc_resource->
si_screen_clear_buffer->simple_mtx_lock
Fixes: 07a49bf5 "radeonsi: disable SDMA on gfx9" Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <mesa/mesa!6941> (cherry picked from commit 5e8791a0)
-
Dylan Baker authored
-
- Sep 30, 2020
-
-
If we want to use HTILE correctly we need to communicate extra stuff like clear colors. (Unlike DCC there is no HTILE FCE) CC: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit d78df70c) Part-of: <mesa/mesa!6877>
-