- 20 Jul, 2020 22 commits
-
-
Icecream95 authored
Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!5892>
-
Icecream95 authored
This will be needed for 8x MRT with 128-bit framebuffer formats. Adds support for 256-bit, 1024-bit, and 2048-bit tilebuffer allocations, depending on the amount of data required. v2: Squash commits (Alyssa) Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!5892>
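A minimal sketch of how an allocation size might be chosen from the data required. The three sizes come from the commit message above; the helper name `pick_tilebuffer_bits` and the "smallest size that fits" policy are assumptions for illustration, not the driver's actual code.

```c
#include <assert.h>

/* Hypothetical helper: returns the smallest listed tilebuffer allocation
 * (in bits per pixel) that can hold bits_per_pixel of render target data,
 * or 0 if none of the listed sizes is large enough. */
static unsigned
pick_tilebuffer_bits(unsigned bits_per_pixel)
{
   /* Sizes named in the commit message; the selection policy is assumed. */
   static const unsigned sizes[] = { 256, 1024, 2048 };

   assert(bits_per_pixel > 0);

   for (unsigned i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
      if (bits_per_pixel <= sizes[i])
         return sizes[i];
   }

   return 0;
}
```

Under this assumption, 8 render targets at 128 bits each need 8 * 128 = 1024 bits, so `pick_tilebuffer_bits(8 * 128)` selects the 1024-bit allocation.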
-
Icecream95 authored
For most GPUs RGTC is disabled, so it needs to be emulated, using the fake_rgtc option of u_transfer_helper. Passes the rgtc-teximage tests in piglit. v2: Update docs/features.txt (Alyssa) Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!5975>
-
Rhys Perry authored
If the SPIR-V had a shared+image memory barrier, we would emit two NIR barriers: a shared barrier and an image barrier. Unlike a single barrier, two barriers allow transformations such as:
   intrinsic image_deref_store (ssa_27, ssa_33, ssa_34, ssa_32, ssa_25) (1)
   intrinsic memory_barrier_shared () ()
   intrinsic memory_barrier_image () ()
   intrinsic store_shared (ssa_35, ssa_24) (0, 1, 4, 0)
->
   intrinsic memory_barrier_shared () ()
   intrinsic store_shared (ssa_35, ssa_24) (0, 1, 4, 0)
   intrinsic image_deref_store (ssa_27, ssa_33, ssa_34, ssa_32, ssa_25) (1)
   intrinsic memory_barrier_image () ()
This commit fixes two dEQP-VK.memory_model.* CTS tests with ACO. Signed-off-by:
Rhys Perry <pendingchaos02@gmail.com> Reviewed-by:
Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <!5951>
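The reordering hazard described in this commit can be modelled very roughly as follows. This is a simplified sketch, not NIR's actual API: the `mem_mode` bits and `can_reorder_across` are hypothetical names, and the only rule modelled is that a store may not move across a barrier covering its own memory mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical memory-mode bits (not Mesa's real enums). */
enum mem_mode {
   MODE_SHARED = 1 << 0,
   MODE_IMAGE  = 1 << 1,
};

/* Simplified rule: a store can be moved across a barrier only if the
 * barrier does not cover the store's memory mode. */
static bool
can_reorder_across(unsigned barrier_modes, enum mem_mode store_mode)
{
   assert(barrier_modes != 0);
   return (barrier_modes & store_mode) == 0;
}
```

With two separate barriers, the image store may cross the shared-only barrier and the shared store may cross the image-only barrier, so the two stores can end up swapped. A single barrier with `MODE_SHARED | MODE_IMAGE` blocks both moves.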
-
Samuel Pitoiset authored
When the VRAM size is small and the preferred heap is VRAM only, the kernel always tries to honor the requested heap and performs a ton of evictions, which is a disaster for performance. On APUs, VRAM and GTT have similar performance, so allow the kernel to choose the best placement (GTT or VRAM) itself. This gives a huge performance boost with Doom Eternal on RAVEN. Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <!5665>
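A minimal sketch of the placement policy this commit describes. The domain bits and the function `allowed_domains` are hypothetical and do not reflect the real amdgpu uAPI values or the winsys code; the sketch also ignores the VRAM-size condition mentioned in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical placement bits; not the real amdgpu uAPI values. */
enum bo_domain {
   BO_DOMAIN_VRAM = 1 << 0,
   BO_DOMAIN_GTT  = 1 << 1,
};

/* On an APU, widen a VRAM-only request so the kernel may place the
 * buffer in either VRAM or GTT. Other requests are left unchanged. */
static unsigned
allowed_domains(unsigned preferred, bool is_apu)
{
   assert(preferred != 0);

   if (is_apu && preferred == BO_DOMAIN_VRAM)
      return BO_DOMAIN_VRAM | BO_DOMAIN_GTT;

   return preferred;
}
```

On a discrete GPU the request stays VRAM-only, since there GTT is assumed to be noticeably slower than VRAM.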
-
Samuel Pitoiset authored
AMDGPU_GEM_CREATE_CPU_GTT_USWC should be faster when CPU reads are not expected (because such memory is not cached). Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <!5959>
-
Pierre-Eric Pelloux-Prayer authored
This fixes si_compute_copy_image for YUYV images (which use PIPE_FORMAT_R8G8_R8B8_UNORM). With this change, the following gst pipeline produces the expected results for various image sizes (with or without AMD_DEBUG=nodma):
   gst-launch-1.0 filesrc location=input.jpg ! jpegparse ! vaapijpegdec ! filesink location=output.yuv
Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by:
Marek Olšák <marek.olsak@amd.com> Part-of: <!5841>
-
Pierre-Eric Pelloux-Prayer authored
Otherwise we might get VM_L2_PROTECTION_FAULT_STATUS errors. Fixes: 8275dc1e ("ac/surface: fix epitch when modifying surf_pitch") Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by:
Marek Olšák <marek.olsak@amd.com> Part-of: <!5841>
-
Gert Wollny authored
Fixes: arb_gpu_shader5-xfb-streams Signed-off-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <!5963>
-
Gert Wollny authored
Signed-off-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <!5963>
-
Gert Wollny authored
This fixes all the piglits from arb_sample_shading "samplemask * *" with the nir backend. Signed-off-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <!5963>
-
Gert Wollny authored
The sample mask must be applied when more than one sample is available or multisampling is not enabled, so add a shader key to track this. Signed-off-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <!5963>
-
Gert Wollny authored
Signed-off-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <!5963>
-
Gert Wollny authored
Signed-off-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <!5963>
-
Gert Wollny authored
This makes sure no components are written that shouldn't be written. Signed-off-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <!5963>
-
Gert Wollny authored
Signed-off-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <!5963>
-
Gert Wollny authored
Signed-off-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <!5963>
-
Gert Wollny authored
Setting the offset must happen in the same CF as using it, so don't emit ALU instructions between the tex instructions. Signed-off-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <!5963>
-
Gert Wollny authored
Signed-off-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <!5963>
-
Gert Wollny authored
Signed-off-by:
Gert Wollny <gert.wollny@collabora.com> Part-of: <!5963>
-
Dave Airlie authored
I hadn't realised these were disabled; llvmpipe now exposes this extension. One additional failure is acceptable in exchange for the added test coverage. Reviewed-by:
Michel Dänzer <mdaenzer@redhat.com> Part-of: <!5973>
-
Dave Airlie authored
v1.1: Merge two if blocks (Roland) Reviewed-by:
Roland Scheidegger <sroland@vmware.com> Part-of: <!5914>
-
- 19 Jul, 2020 1 commit
-
-
Dave Airlie authored
Running complete CTS turned up a missing cond render. Fixes KHR-GL45.compute_shader.conditional-dispatching Reviewed-by:
Roland Scheidegger <sroland@vmware.com> Part-of: <!5944>
-
- 18 Jul, 2020 17 commits
-
-
Rob Clark authored
Properly handle the difference between split and merged register files when determining where arrays can fit without conflicting with other arrays or pre-colored instructions:
1) if not mergedregs, only consider other things with the same precision as potentially conflicting
2) if mergedregs, calculate everything in terms of half-regs and convert back to fullregs at the end
Signed-off-by:
Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!5957>
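The two cases in this commit can be sketched as an overlap test. This is an illustrative model only: `struct reg_range` and `ranges_conflict` are hypothetical names, and the mapping of full register N onto half registers 2N and 2N+1 in the merged case is an assumption of the sketch rather than something stated in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of a register range (not ir3's real types). */
struct reg_range {
   unsigned base;  /* first register, in units of its own precision */
   unsigned len;   /* number of registers, in the same units */
   bool half;      /* true for half-precision (16-bit) registers */
};

/* Convert a range to half-reg units (assumes full reg N = halves 2N, 2N+1). */
static unsigned
to_half_units(const struct reg_range *r, unsigned value)
{
   return r->half ? value : value * 2;
}

static bool
ranges_conflict(const struct reg_range *a, const struct reg_range *b,
                bool mergedregs)
{
   assert(a->len > 0 && b->len > 0);

   if (!mergedregs) {
      /* Split register file: different precisions never conflict. */
      if (a->half != b->half)
         return false;
      return a->base < b->base + b->len && b->base < a->base + a->len;
   }

   /* Merged register file: compare everything in half-reg units. */
   unsigned a_start = to_half_units(a, a->base);
   unsigned a_end = a_start + to_half_units(a, a->len);
   unsigned b_start = to_half_units(b, b->base);
   unsigned b_end = b_start + to_half_units(b, b->len);

   return a_start < b_end && b_start < a_end;
}
```

For example, a one-element full array at register 1 and a one-element half array at half register 2 conflict only when the register file is merged.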
-
Rob Clark authored
We shouldn't divide-by-two for half-reg arrays. We set the proper node interference class, based on `arr->half`. Fixes an RA failure with 16b arrays:
   src/freedreno/ir3/ir3_ra.c:633: name_to_array: Assertion `!"invalid array name"' failed.
Caused by use/def iterators returning `arr->length` vreg names, but only assigning the array half that many names. Also, since we are assigning unique vreg names to each array element, there is no need to try to convert from a half-reg to its conflicting full reg when pre-coloring the array elements. This gets us closer to having half-arrays work sanely with split-register-file (a5xx and earlier). Signed-off-by:
Rob Clark <robdclark@chromium.org> Part-of: <!5957>
-
Rob Clark authored
Print out the assigned vreg names earlier. Also print the few special nodes. Signed-off-by:
Rob Clark <robdclark@chromium.org> Part-of: <!5957>
-
Rob Clark authored
Signed-off-by:
Rob Clark <robdclark@chromium.org> Part-of: <!5957>
-
Rob Clark authored
Signed-off-by:
Rob Clark <robdclark@chromium.org> Part-of: <!5957>
-
Rob Clark authored
Signed-off-by:
Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!5957>
-
Rob Clark authored
Signed-off-by:
Rob Clark <robdclark@chromium.org> Part-of: <!5957>
-
Rob Clark authored
Signed-off-by:
Rob Clark <robdclark@chromium.org> Part-of: <!5957>
-
Mike Blumenkrantz authored
these are all fairly large sources of leaks Reviewed-by:
Antonio Caggiano <antonio.caggiano@collabora.com> Reviewed-by:
Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <!5887>
-
Mike Blumenkrantz authored
more leaks Reviewed-by:
Antonio Caggiano <antonio.caggiano@collabora.com> Reviewed-by:
Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <!5887>
-
Mike Blumenkrantz authored
this is a big leak Reviewed-by:
Antonio Caggiano <antonio.caggiano@collabora.com> Reviewed-by:
Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <!5887>
-
Mike Blumenkrantz authored
There's no sense in keeping these objects around when they can never be used again. This requires adding a zink_context* pointer to each program in order to prune the hash table entry. Reviewed-by:
Antonio Caggiano <antonio.caggiano@collabora.com> Reviewed-by:
Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <!5887>
-
maurossi authored
Fixes the following build error:
In file included from external/mesa/src/panfrost/encoder/pan_blit.c:34:
In file included from external/mesa/src/panfrost/encoder/../midgard/midgard_compile.h:27:
external/mesa/src/compiler/nir/nir.h:52:10: fatal error: 'nir_opcodes.h' file not found
         ^~~~~~~~~~~~~~~
1 error generated.
Fixes: 293f2518 ("panfrost: Use Midgard-specific reloads") Signed-off-by:
Mauro Rossi <issor.oruam@gmail.com> Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!5961>
-
Icecream95 authored
The function now takes a bool flush_readers instead of an access type, but some calls were not updated. Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!5962>
-
Icecream95 authored
panfrost_bo_wait is often used after panfrost_flush_batches_accessing_bo, so make them take similar arguments for consistency. Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!5962>
-
Eric Anholt authored
Since I was going back to look at fine derivs again, add some tests of instruction encoding. Part-of: <!5699>
-
Eric Anholt authored
legalize_block() can get run multiple times, which I didn't notice when adding fine derivs support. Other instruction clones change things such that the legalization won't trigger again, but that didn't apply to the DS.PP legalization. To keep someone else from tripping over this, split the one-shot legalization out of the iterative sync flag application. Fixes failures in dEQP-VK.glsl.derivate.dfdxfine.* Closes: #3198 Part-of: <!5699>
-