- Dec 01, 2019
-
-
Connor Abbott authored
-
Connor Abbott authored
-
- Nov 24, 2019
-
-
Connor Abbott authored
-
Connor Abbott authored
While it's guaranteed that we can always eventually schedule any node, there were some rare cases where we keep unspilling other nodes so that we never actually get anything done. Particularly in a scenario like this:
1. The only fully-ready node would increase value register pressure by 2, but there are 11 live value registers (the other 10 are partially ready).
2. We try to speculatively schedule the fully-ready node, but we can only spill one other node so it fails.
3. Because there's now one slot free, we schedule a register store.
4. Now we're back to 2.
Fix this by disallowing any register stores from when spilling fails until the next node is scheduled.
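The fix described in this message can be pictured as a single flag on the scheduler state. The following is only a minimal sketch under that assumption: `toy_sched`, `toy_spill_failed`, `toy_node_scheduled` and `toy_try_store` are hypothetical names for illustration, not the gpir scheduler's actual data structures or functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical scheduler state, for illustration only. */
struct toy_sched {
    bool stores_blocked;   /* set when a speculative spill attempt fails */
    unsigned stores_done;  /* number of register stores scheduled so far */
};

/* Called when speculatively scheduling a node fails for lack of spill room. */
static void toy_spill_failed(struct toy_sched *s)
{
    assert(s);
    s->stores_blocked = true;
}

/* Called whenever a real node is scheduled; stores are allowed again. */
static void toy_node_scheduled(struct toy_sched *s)
{
    assert(s);
    s->stores_blocked = false;
}

/* Try to schedule a register store; refuse while stores are blocked. */
static bool toy_try_store(struct toy_sched *s)
{
    assert(s);
    if (s->stores_blocked)
        return false;
    s->stores_done++;
    return true;
}
```

With such a flag, step 3 of the scenario cannot happen between a failed spill and the next scheduled node, so the loop from step 4 back to step 2 is broken.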
-
- Nov 23, 2019
-
-
Connor Abbott authored
The usual linear-scan register allocation algorithm can't handle pre-allocated registers, since we might be forced to choose a color for a non-pre-allocated variable that overlaps with a pre-allocated variable. But in such cases we can simply split the live range of the offending variable when we reach the beginning of the pre-allocated variable's live range. This is still optimal in the sense that it always finds a coloring whenever one is possible, but we may not insert the smallest possible number of moves. However, since it's actually the scheduler which splits live ranges afterwards, we can simply fold in the move while keeping its fake dependencies, and then everything still works! In other words, inserting a live range split for a value register during register allocation is pretty much free.

This means that we can split register allocation in two. First, globally allocate the cross-block registers accessed through load_reg and store_reg instructions, which is still done via graph coloring, and then run a linear scan algorithm over each block, treating the load_reg and store_reg nodes as referring to pre-allocated registers.

This makes the existing RA more complicated, but it has two benefits. First, using round-robin with the linear scan allocator results in far fewer fake dependencies, resulting in around 15 fewer instructions in the glmark2 jellyfish shader and fixing a regression in instruction count since branching support went in. Second, it will simplify handling spilling. With just graph coloring for everything, every time we spill a node, we have to create new value registers which become new nodes in the graph and re-run RA. This is worsened by the fact that when writing a value to a temporary, we need to have an extra register available to load the write address with a load_const node. With the new scheme, we can ignore this entirely in the first part, and then in the second part we can just reserve an extra register in sections where we know we have to spill. So there is no re-running RA many times, and we can get a good result quickly.

The current implementation does linear scan backwards, so that we can insert the fake dependencies while allocating and avoid creating any move nodes at all when we have to split a live range. However, it turns out that this makes handling schedule_first nodes a bit more complicated, so it's not clear if that was worth it.
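To illustrate why a round-robin choice tends to produce fewer fake dependencies, here is a minimal sketch of a round-robin register picker. It is an assumption-laden toy, not the gpir allocator: `TOY_NUM_REGS`, `toy_alloc_round_robin`, the bitmask representation and the cursor are all hypothetical. The idea is that a register freed a moment ago is not handed out again immediately, so the new value is less likely to need an ordering constraint against the last reader of the old one.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NUM_REGS 8  /* hypothetical register count, for illustration */

/*
 * Pick the first free register at or after *cursor, wrapping around.
 * in_use is a bitmask with one bit per register. On success the register
 * is marked as used, the cursor moves past it, and its index is returned.
 * Returns -1 if every register is in use.
 */
static int toy_alloc_round_robin(uint32_t *in_use, unsigned *cursor)
{
    assert(*cursor < TOY_NUM_REGS);
    for (unsigned i = 0; i < TOY_NUM_REGS; i++) {
        unsigned reg = (*cursor + i) % TOY_NUM_REGS;
        if (!(*in_use & (1u << reg))) {
            *in_use |= 1u << reg;
            *cursor = (reg + 1) % TOY_NUM_REGS;
            return (int)reg;
        }
    }
    return -1;
}
```

If registers 0 and 1 are allocated and register 0 is then freed, the next allocation returns register 2 rather than reusing register 0 straight away, whereas a lowest-free-register policy would reuse 0.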
-
Connor Abbott authored
and use it with vertex shaders.
-
Connor Abbott authored
We also add a DCE pass to clean up the result of this pass, which turns out to also be necessary to clean up the result of nir->gpir in some cases that we didn't hit until the next commit.
-
- Nov 03, 2019
-
-
Connor Abbott authored
-
- Oct 13, 2019
-
-
Connor Abbott authored
We weren't using this function before. The name is confusing, but it changes the child while also fixing up the dependence link, if you don't have access to it already. Or at least, I think that's what the intention is, and what we'll need to change the branch condition in the next commit. Adding a dependency between the new and old source doesn't make any sense for this, and we also need to change the actual source.
-
- Oct 09, 2019
-
-
Daniel Schürmann authored
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
-
Daniel Schürmann authored
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
-
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
-
Vasily Khoruzhick authored
Cloning texture loads isn't a good idea since we may move them into a block that is not shared between all the invocations of the shader. We'd like to avoid that since it may result in undefined behavior. Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
-
Without this, the test jobs could spuriously run after the container job failed or was cancelled, even if the build job didn't run at all. Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
-
Samuel Pitoiset authored
The spec has probably been misinterpreted during RADV bringup. This fixes GPU hangs with dEQP-VK.binding_model.*offset_nonzero*. Fixes: f4e499ec ("radv: add initial non-conformant radv vulkan driver") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-
CC: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
-
Erik Faye-Lund authored
This allows drivers to communicate that they prefer R8 textures rather than A8 for glBitmap usage. Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
If it's not available, we fall back to A8. This should work on all drivers, because we depend on it in the display-list code already. Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Samuel Pitoiset authored
DOOM fails to handle more images than expected when the adaptive sync mode is enabled. Closes: mesa/mesa#1902 Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-
Samuel Pitoiset authored
NIR->LLVM and ACO already support nir_intrinsic_shader_clock. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
-
Kenneth Graunke authored
This should help avoid stalls in the pixel mask array in certain non-promoted depth cases. It especially helps for Z16, as each bit in the PMA corresponds to two pixels when using Z16, as opposed to the usual one pixel. Improves performance in GFXBench5 TRex by 22% (n=1).
-
- Oct 08, 2019
-
-
Caio Oliveira authored
-
Caio Oliveira authored
Anvil now supports and passes Vulkan CTS tests matching
    dEQP-VK.subgroups.*.ext_shader_subgroup_ballot.*
    dEQP-VK.subgroups.*.ext_shader_subgroup_vote.*
and crucible tests matching
    func.shader-ballot.*
    func.shader-subgroup-vote.*
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
-
Kenneth Graunke authored
Fixes Piglit's gl-2.1-polygon-stipple-fs on iris. Fixes: 63f24c3c ("gallium: Enable MESA_framebuffer_flip_y") Reviewed-by: Fritz Koenig <frkoenig@google.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
-
Fritz Koenig authored
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
-
Fritz Koenig authored
Implement glFramebufferParameteriMESA on GLES 3 so that the extension is not dependent on GLES 3.1. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
-
Fritz Koenig authored
GetFramebufferParameteriv was incorrectly spelled as GetFramebufferParameteri. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
-
Fritz Koenig authored
Bring in glFramebufferParameteriMESA/glGetFramebufferParameterivMESA Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
-
Clément Guérin authored
Fixes corruption on game startup. Closes: mesa/mesa#1888 Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
-
Kenneth Graunke authored
bound_vertex_buffers doesn't include extra draw parameters buffers. Tracking this correctly is kind of complicated, and iris_destroy_state isn't exactly in a hot path, so just loop over all VBO bindings. Fixes: 4122665d (iris: Enable ARB_shader_draw_parameters support) Reported-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
-
Eric Engestrom authored
On Solaris, sys/sysmacros.h has long-deprecated copies of major() & minor() but not makedev(). sys/mkdev.h has all three and is the preferred choice. Let's make sure we check for all three: major(), minor() and makedev(). Reported-by: Alan Coopersmith <alan.coopersmith@oracle.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Alan Coopersmith <alan.coopersmith@oracle.com> Tested-by: Alan Coopersmith <alan.coopersmith@oracle.com>
-
Eric Engestrom authored
`drm.h` was missing a `#include <stdint.h>`, which was completely breaking the non-linux builds after 272f9cfe ("dri: Use DRM_FORMAT_* instead of defining our own copy.") started making use of it. Fixes: 272f9cfe ("dri: Use DRM_FORMAT_* instead of defining our own copy.") Closes: mesa/mesa#950 Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
The list of AMD/ATI devices supported by radeon/r200/r300/r600 is complete, so anything else must use radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Bas Nieuwenhuizen authored
[212/893] Compiling C object 'src/amd/llvm/ce8261c@@amd_common_llvm@sta/ac_nir_to_llvm.c.o'.
../mesa/src/amd/llvm/ac_nir_to_llvm.c: In function ‘visit_image_atomic’:
../mesa/src/amd/llvm/ac_nir_to_llvm.c:2636:17: warning: unused variable ‘format’ [-Wunused-variable]
 2636 |    const GLenum format = nir_intrinsic_format(instr);
      |                 ^~~~~~
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Boris Brezillon authored
When only the depth/stencil bufs are cleared, we should make sure the color content is reloaded into the tile buffers if we want to preserve their content. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Boris Brezillon authored
glClear()s are expected to be the first thing GL apps do before drawing new things. If there's already an existing batch targeting the same FBO that has draws attached to it, we should make sure the new clear gets a new batch assigned to guarantee that the FB content is actually cleared with the requested color/depth/stencil values. We create a panfrost_get_fresh_batch_for_fbo() helper for that and call it from panfrost_clear(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
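The "fresh batch" idea in this message can be sketched with a toy model. This is an illustration only, assuming a single FBO and a context that tracks one current batch; `toy_batch`, `toy_ctx`, `toy_get_fresh_batch`, `toy_draw` and `toy_clear` are hypothetical names and do not reflect the actual panfrost_get_fresh_batch_for_fbo() implementation. What happens to the old batch is reduced to a counter here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy batch: just enough state to show the idea. */
struct toy_batch {
    unsigned id;
    unsigned draw_count;  /* draws already recorded in this batch */
    bool has_clear;       /* a clear has been recorded in this batch */
};

/* Toy context tracking the batch bound to a single, implicit FBO. */
struct toy_ctx {
    struct toy_batch batch;
    unsigned next_id;
    unsigned retired;     /* batches replaced because they had draws */
};

/* Return a batch with no draws attached, replacing the current one if needed. */
static struct toy_batch *toy_get_fresh_batch(struct toy_ctx *ctx)
{
    if (ctx->batch.draw_count > 0) {
        /* Stand-in for whatever the driver does with the old batch. */
        ctx->retired++;
        ctx->batch = (struct toy_batch){ .id = ctx->next_id++ };
    }
    return &ctx->batch;
}

static void toy_draw(struct toy_ctx *ctx)
{
    ctx->batch.draw_count++;
}

/* A clear always lands in a batch that has no earlier draws. */
static void toy_clear(struct toy_ctx *ctx)
{
    struct toy_batch *b = toy_get_fresh_batch(ctx);
    assert(b->draw_count == 0);
    b->has_clear = true;
}
```

In this model a clear issued on an empty batch reuses it, while a clear issued after a draw is recorded in a new batch, so the clear values cannot be mixed up with earlier draws to the same FBO.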
-
Kenneth Graunke authored
You can't render to PIPE_BUFFER so there's no reason to prefer RGBX. PBO upload would like to use proper RGB textures as source data.
-
Hitting any fallback path on Broxton as we require clflushing the whole buffer even for an upload of a subtexture. However, since gallium provides a pbo upload path, allow it to sample packed RGB if supported. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
-
Tapani Pälli authored
This fixes a case where the user first creates an image and then later binds it with memory created from an AHW buffer. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
Bas Nieuwenhuizen authored
It gets used by the gallium auxiliary draw module, which gets used pretty much always when LLVM is used as JIT. At the same time most builds don't hit the issue here because the shared library of LLVM contains all modules. Fixes: d32690b4 ("gallivm: add coroutine pass manager support") Closes: mesa/mesa#951 Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
-