- Sep 13, 2019
-
-
Add a ppir dummy node for nir_ssa_undef_instr, create a reg for it and mark it as undefined, so that regalloc can set it non-interfering to avoid register pressure. Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
-
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
-
Building w/ AOSP, I was hitting the following error: external/mesa3d/src/amd/Android.common.mk:95: error: missing separator. Which was due to the changes to mesa-build-with-llvm missing a line continuation. Fixes: 96b59269 Signed-off-by: John Stultz <john.stultz@linaro.org>
-
Boris Brezillon authored
We are about to patch panfrost_flush() to flush all pending batches, not only the current one. In order to do that, we need to move the 'flush single batch' code to panfrost_batch_submit(). While at it, we get rid of the existing pipelining logic, which is currently unused, and replace it with an unconditional wait at the end of panfrost_batch_submit(). New pipelining logic will be introduced later on. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Boris Brezillon authored
panfrost_flush() is about to be reworked to flush all pending batches, but we want the fence to block on the last one. Let's move the fence creation logic in panfrost_flush() to prepare for this situation. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Boris Brezillon authored
panfrost_draw_vbo() might call the primconvert/without_prim_restart helpers, which will enter ->draw_vbo() again. Let's delay payloads[].offset_start initialization so we don't initialize them twice. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Boris Brezillon authored
panfrost_attach_vt_xxx() functions are now passed a batch, and the generated FB desc is kept in panfrost_batch so we can switch FBs without forcing a flush. The postfix->framebuffer field is restored on the next attach_vt_framebuffer() call if the batch already has an FB desc. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Boris Brezillon authored
So we can emit SET_VALUE jobs for a batch that's not currently bound to the context. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Boris Brezillon authored
We'll soon be able to flush a batch that's not currently bound to the context, which means ctx->pipe_framebuffer will not necessarily be the FBO targeted by the wallpaper draw. Let's prepare for this case and use ctx->wallpaper_batch in panfrost_blit_wallpaper(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Boris Brezillon authored
So we can emit such jobs to a batch that's not currently bound to the context. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Boris Brezillon authored
We need that if we want to upload transient buffers to a batch that's not currently bound to the context, which in turn will be needed if we want to relax the batch serialization we have right now (only flush batches when we need to: on a flush request, or when one batch depends on the result of other batches). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Boris Brezillon authored
Rename panfrost_is_scanout() into panfrost_batch_is_scanout(), pass it a batch instead of a context and move the code to pan_job.c. With this in place, we can now test if a batch is targeting a scanout FB even if this batch is not bound to the context. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Boris Brezillon authored
Will be replaced by something similar, but using BOs as keys instead of resources. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Boris Brezillon authored
This way we have all the fb_state information directly attached to a batch and can pass only the batch to functions emitting CMDs, which is needed if we want to be able to queue CMDs to a batch that's not currently bound to the context. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
-
Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>
-
Eric Engestrom authored
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
-
Boris Brezillon authored
mir_foreach_instr_in_block_safe() is based on list_for_each_entry_safe(), which is designed to protect against removal of the current entry, but removing the entry placed just after the current one will lead to a use-after-free situation. Luckily, the midgard_pair_load_store() logic guarantees that the instruction being removed (if any) is never placed just after ins, which in turn guarantees that the hidden __next variable always points to a valid object. It took me a bit of time to realize that this code was safe, so I'm suggesting we get rid of the inner mir_foreach_instr_in_block_from() loop and rework the code so that the removed instruction is always the current one (which is what the list_for_each_entry_safe() API was initially designed for). While at it, we also get rid of the unnecessary insert(ins)/remove(ins) dance by simply moving the instruction around. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
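The constraint described above can be illustrated with a minimal sketch. It assumes a hand-rolled circular list rather than Mesa's own list helpers; `struct node`, `remove_odd()` and `demo_remaining_sum()` are hypothetical names used only for illustration. The loop caches the successor before the body runs, which is why deleting the current node is safe, while deleting the cached successor would leave a dangling pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal circular doubly linked list (not Mesa's list API). */
struct node {
    struct node *prev, *next;
    int value;
};

static void list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void list_add_tail(struct node *head, struct node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void list_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = NULL;
}

/* "Safe" iteration: `next` is read before the body runs, so the body may
 * unlink and free `cur`. Freeing `next` instead would make the cached
 * pointer dangle, which is the hazard the commit message describes. */
static int remove_odd(struct node *head)
{
    int removed = 0;
    for (struct node *cur = head->next, *next = cur->next;
         cur != head;
         cur = next, next = cur->next) {
        if (cur->value % 2 != 0) {
            list_del(cur);
            free(cur);
            removed++;
        }
    }
    return removed;
}

/* Build 1..n, drop the odd values, return the sum of what is left
 * (or -1 on allocation failure). All remaining nodes are freed. */
static int demo_remaining_sum(int n)
{
    struct node head;
    int sum = 0;
    int failed = 0;

    list_init(&head);
    for (int i = 1; i <= n && !failed; i++) {
        struct node *item = malloc(sizeof(*item));
        if (!item) {
            failed = 1;
        } else {
            item->value = i;
            list_add_tail(&head, item);
        }
    }
    if (!failed)
        remove_odd(&head);

    /* Drain the list; only the current node is ever removed. */
    for (struct node *cur = head.next, *next = cur->next;
         cur != &head;
         cur = next, next = cur->next) {
        sum += cur->value;
        list_del(cur);
        free(cur);
    }
    return failed ? -1 : sum;
}
```

Calling `demo_remaining_sum(5)` builds the values 1 to 5, removes 1, 3 and 5 during iteration, and sums the remaining 2 and 4.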
-
Boris Brezillon authored
list_for_each_entry() does not allow modifying the current item pointer. Let's rework the skip-instructions logic in schedule_block() to not break this rule. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
The V3D documentation states that primitive counters are reset when we emit Tile Binning Mode Configuration items, which we do at the start of each draw call. However, in the actual hardware this doesn't seem to take effect when transform feedback is not active (this doesn't happen in the simulator). This causes a problem in the following scenario:

    glBeginTransformFeedback()
    glDrawArrays()
    glPauseTransformFeedback()
    glDrawArrays()
    glResumeTransformFeedback()
    glEndTransformFeedback()

The TF pause will trigger a flush of the primitive counters, which results in a correct number of primitives up to that point. In theory, the counter should then be reset when we execute the draw after pausing TF, but that doesn't happen. Since TF is enabled again by the resume command before we end recording, we check the counters again when we end the transform feedback recording, but instead of reading 0, we read the same value we read at the time we paused, incorrectly accumulating that value again.

In theory, we should be able to avoid this by using the other method to reset the primitive counters: using operation 1 instead of 0 when we flush the counts to the buffer at the time we pause. Again, this doesn't seem to work, and we still see obsolete counts by the time we end transform feedback.

This patch fixes the problem by not accumulating TF primitive counts unless we know we have actually queued draw calls during transform feedback, since that seems to effectively reset the counters. This should also be more performant, since it saves unnecessary stalls for the primitive counters to be updated when we know there haven't been any new primitives drawn.

Fixes CTS tests: dEQP-GLES3.functional.transform_feedback.* Reviewed-by: Eric Anholt <eric@anholt.net>
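The fix described above can be sketched with a toy model. This is not the v3d driver code: `struct tf_model`, `tf_draw()`, `tf_read_counter()` and the way the stand-in hardware counter behaves are all assumptions made for illustration, chosen to mimic the behaviour the commit message reports (a draw without active transform feedback leaves the counter at its stale value).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, NOT the v3d driver: every name here is hypothetical. */
struct tf_model {
    bool active;            /* TF is recording (begun or resumed, not paused) */
    bool draws_queued;      /* a draw was queued with TF active since last read */
    bool counter_stale;     /* counter was read and has not been reset since */
    uint32_t hw_counter;    /* stand-in for the hardware primitive counter */
    uint32_t prims_written; /* driver-side accumulated total */
};

static void tf_draw(struct tf_model *tf, uint32_t prims)
{
    /* Assumed hardware behaviour: with TF inactive the counter is neither
     * reset nor advanced, so it keeps its stale value. */
    if (!tf->active)
        return;
    if (tf->counter_stale) {
        tf->hw_counter = 0;
        tf->counter_stale = false;
    }
    tf->hw_counter += prims;
    tf->draws_queued = true;
}

/* Called on pause and on end. With `guard` set, the read is skipped when
 * no draw was queued during TF, which is the idea behind the patch. */
static void tf_read_counter(struct tf_model *tf, bool guard)
{
    if (guard && !tf->draws_queued)
        return;
    tf->prims_written += tf->hw_counter;
    tf->draws_queued = false;
    tf->counter_stale = true;
}

static uint32_t run_pause_resume_scenario(bool guard)
{
    struct tf_model tf = {0};

    tf.active = true;              /* glBeginTransformFeedback() */
    tf_draw(&tf, 10);              /* glDrawArrays() */
    tf_read_counter(&tf, guard);   /* glPauseTransformFeedback() */
    tf.active = false;
    tf_draw(&tf, 7);               /* glDrawArrays() while paused */
    tf.active = true;              /* glResumeTransformFeedback() */
    tf_read_counter(&tf, guard);   /* glEndTransformFeedback() */
    tf.active = false;

    return tf.prims_written;
}
```

In this model, `run_pause_resume_scenario(false)` counts the 10 primitives twice, while `run_pause_resume_scenario(true)` skips the second read and reports 10.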
-
This was updating the counter for the indexed draw path only, but we are already updating the counter for all paths a bit later, so this is only duplicating counts for indexed paths. Reviewed-by: Eric Anholt <eric@anholt.net>
-
Fixes: 0f2d1dfe ("v3d: use the GPU to record primitives written to transform feedback") Reviewed-by: Eric Anholt <eric@anholt.net>
-
Reviewed-by: Eric Anholt <eric@anholt.net>
-
Tomeu Vizoso authored
So we can better correlate different results to versions of the runner. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
-
Tomeu Vizoso authored
We haven't updated in a long time, so better do it now and again when 5.3 is released. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
-
Tomeu Vizoso authored
Instead of running it with the Wayland platform, which introduces unwanted dependencies and complexity. Makes tests run 30% faster, as well. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
-
Samuel Pitoiset authored
streamout_buffers is assigned after that function, so the previous fix was completely wrong. This probably fixes something when streamout buffers and push constants are used/inlined in the same shader. Fixes: 378e2d24 ("radv: fix computing number of user SGPRs for streamout buffers") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-
Faith Ekstrand authored
When the UNDEF instruction was added, we didn't do anything special in split_virtual_grfs. This meant that anything with an UNDEF wasn't getting split, which causes problems for the compiler. Among other things, it makes RA harder because things are in bigger chunks. It also meant that dvec4s weren't getting split, which means that they are larger than the maximum register size.

Shader-db results on Kaby Lake:

total instructions in shared programs: 14959202 -> 14960035 (<.01%)
instructions in affected programs: 96197 -> 97030 (0.87%)
helped: 140
HURT: 128
helped stats (abs) min: 1 max: 17 x̄: 1.62 x̃: 1
helped stats (rel) min: 0.09% max: 6.15% x̄: 0.65% x̃: 0.45%
HURT stats (abs) min: 1 max: 825 x̄: 8.28 x̃: 1
HURT stats (rel) min: 0.13% max: 139.83% x̄: 1.70% x̃: 0.50%
95% mean confidence interval for instructions value: -2.96 9.18
95% mean confidence interval for instructions %-change: -0.56% 1.51%
Inconclusive result (value mean confidence interval includes 0).

total loops in shared programs: 4372 -> 4372 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total cycles in shared programs: 352646771 -> 352840997 (0.06%)
cycles in affected programs: 218600800 -> 218795026 (0.09%)
helped: 21167
HURT: 21411
helped stats (abs) min: 1 max: 2924 x̄: 36.89 x̃: 10
helped stats (rel) min: <.01% max: 41.90% x̄: 2.97% x̃: 0.98%
HURT stats (abs) min: 1 max: 26027 x̄: 45.54 x̃: 10
HURT stats (rel) min: <.01% max: 324.46% x̄: 3.88% x̃: 1.06%
95% mean confidence interval for cycles value: 2.87 6.26
95% mean confidence interval for cycles %-change: 0.40% 0.55%
Cycles are HURT.

total spills in shared programs: 8840 -> 8953 (1.28%)
spills in affected programs: 126 -> 239 (89.68%)
helped: 1
HURT: 2

total fills in shared programs: 21782 -> 21914 (0.61%)
fills in affected programs: 431 -> 563 (30.63%)
helped: 1
HURT: 3

LOST: 0
GAINED: 5

Shader-db results on Haswell:

total instructions in shared programs: 13320918 -> 13320769 (<.01%)
instructions in affected programs: 40998 -> 40849 (-0.36%)
helped: 146
HURT: 56
helped stats (abs) min: 1 max: 8 x̄: 2.73 x̃: 2
helped stats (rel) min: 0.16% max: 8.60% x̄: 2.52% x̃: 2.22%
HURT stats (abs) min: 2 max: 23 x̄: 4.45 x̃: 4
HURT stats (rel) min: 0.21% max: 10.26% x̄: 6.83% x̃: 10.26%
95% mean confidence interval for instructions value: -1.26 -0.21
95% mean confidence interval for instructions %-change: -0.62% 0.77%
Inconclusive result (%-change mean confidence interval includes 0).

total loops in shared programs: 4373 -> 4373 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total cycles in shared programs: 374518258 -> 374384193 (-0.04%)
cycles in affected programs: 231101954 -> 230967889 (-0.06%)
helped: 21427
HURT: 19438
helped stats (abs) min: 1 max: 2035 x̄: 31.09 x̃: 8
helped stats (rel) min: <.01% max: 40.95% x̄: 2.42% x̃: 0.86%
HURT stats (abs) min: 1 max: 20875 x̄: 27.38 x̃: 8
HURT stats (rel) min: <.01% max: 59.09% x̄: 2.49% x̃: 0.80%
95% mean confidence interval for cycles value: -4.49 -2.07
95% mean confidence interval for cycles %-change: -0.14% -0.04%
Cycles are helped.

total spills in shared programs: 23406 -> 23411 (0.02%)
spills in affected programs: 3 -> 8 (166.67%)
helped: 0
HURT: 2

total fills in shared programs: 34845 -> 34850 (0.01%)
fills in affected programs: 3 -> 8 (166.67%)
helped: 0
HURT: 2

LOST: 0
GAINED: 0

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111566 Fixes: f4ef34f2 "intel/fs: Add an UNDEF instruction to avoid..." Reviewed-by: Francisco Jerez <currojerez@riseup.net>
-
Jiadong Zhu authored
_mesa_texstore_z32f_x24s8 calculates the source rowStride in 64-bit units, which produces an inaccurate offset if the width of the source image is an odd number. Change the src pointer to int_32*, as the source image format is gl_float, which is 32 bits per pixel. Reviewed by Ilia Mirkin Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>
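The odd-width problem can be shown with a small arithmetic sketch. It is not the Mesa function itself: `row_offset_bytes()` is a hypothetical helper, and it assumes the stride is obtained by dividing a tightly packed byte stride by the size of the pointed-to element, so that the integer division truncates when the element is 64 bits wide and the row holds an odd number of 32-bit pixels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of `row` when the row stride is expressed
 * in elements of `elem_size` bytes. Source pixels are assumed to be 32 bits
 * each (one float depth value per pixel), tightly packed. */
static size_t row_offset_bytes(size_t width, size_t row, size_t elem_size)
{
    size_t stride_bytes = width * sizeof(uint32_t); /* true row size */
    size_t stride_elems = stride_bytes / elem_size; /* truncates if not a multiple */
    return row * stride_elems * elem_size;
}
```

For a 3-pixel-wide image the true row size is 12 bytes. `row_offset_bytes(3, 1, sizeof(uint64_t))` yields 8, which is 4 bytes short of the second row, whereas `row_offset_bytes(3, 1, sizeof(uint32_t))` yields the expected 12. With an even width such as 4, both element sizes agree on 16.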
-
Rob Clark authored
The AnTuTu "garden" benchmark overflows the fixed-size constbuffer stateobject, so let's be more clever and calculate the actual size (a potentially slightly pessimistic one). Signed-off-by: Rob Clark <robdclark@chromium.org>
-
Adam Jackson authored
Accidentally dropped in 4fdd455e. Fixes: 4fdd455e ("gallium: Require LLVM >= 3.4") Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
-
- Sep 12, 2019
-
-
Emma Anholt authored
Fixes: 272f9cfe ("dri: Use DRM_FORMAT_* instead of defining our own copy.") Reviewed-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
-
Emma Anholt authored
It started showing up as unreliable post-merge. There's a valgrind complaint, but even fixing that doesn't make it stable.
-
Rob Clark authored
fd6_blitter.c:724:31: warning: passing argument 1 of ‘fd_resource_level_linear’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers] Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>
-
Emma Anholt authored
Since freedreno's kernel and GPU reset seem to be totally solid, we don't need to have the complexity of the LAVA setup that panfrost has. Instead, we can register some boards as shared gitlab runners and have the jobs run out of a docker container just like we do for llvmpipe. Just make sure that the DRI device node is passed through to the containers in the gitlab config ('devices = ["/dev/dri"]' under runners.docker). If a runner fails (networking dies, kernel panic, etc.) it'll take out one build but the rest can keep going since gitlab-runner is what pulls jobs. Since the runner pulls jobs, it also means that they can live behind firewalls instead of needing some public address to be accessed by gitlab.fd.o. For now, enable it just on db410c (A307) and cheza (A630) as those are the hardware that I have plenty of. A307 is only testing GLES2 since running all of GLES3 takes too long for the number of boards I've brought up. Acked-by: Rob Clark <robdclark@chromium.org> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
-
Emma Anholt authored
Sometimes you just want confirmation that dEQP really picked up the driver you thought we built. This is not as good as one might like, because git isn't present in the cross-build image. Acked-by: Rob Clark <robdclark@chromium.org> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
-
Emma Anholt authored
A handful of tests on freedreno have been close to the watchdog timeout, and now sporadically fail since range analysis has slowed down the compiler for them. Acked-by: Rob Clark <robdclark@chromium.org> Acked-by: Kenneth Graunke <kenneth@whitecape.org>
-
This brings back the fallback previously present in st_nir_lookup_parameter_index(): if there's no parameter associated with the variable, use a parameter from a variable with the same prefix. We'll have to sort out something for SPIR-V, but in the meantime let's fix GLSL. Fixes: b6384e57 ("mesa/st: Lookup parameters without using names") Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Eric Anholt <eric@anholt.net>
-
Adam Jackson authored
This slot is always filled in with __glFillImage. Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net>
-
Eric Engestrom authored
"partial" because `nir_intrinsics_h` was missing. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
-
Eric Engestrom authored
"partial" because `nir_intrinsics_h` was missing. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
-