Commits · lima-pp-undef · Andreas Baierl / mesa

Sep 13, 2019

lima/ppir: Add undef handling · 4b1a14fd

Andreas Baierl authored Aug 20, 2019 and

Erico Nunes committed Sep 13, 2019



Add a ppir dummy node for nir_ssa_undef_instr, create a reg for it and mark
it as undefined, so that regalloc can set it non-interfering to avoid
register pressure.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Vasily Khozuzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>

4b1a14fd

lima/ppir: Rename ppir_op_dummy to ppir_op_undef · 4ddadd63

Andreas Baierl authored Sep 12, 2019 and

Erico Nunes committed Sep 13, 2019



Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>

4ddadd63

Android.mk: Fix missing \ from recent llvm change · 3976c86e

John Stultz authored Sep 12, 2019 and

Adam Jackson committed Sep 13, 2019



Building w/ AOSP, I was hitting the following error:
external/mesa3d/src/amd/Android.common.mk:95: error: missing separator.

Which was due to the changes to mesa-build-with-llvm  missing
a line continuation.

Fixes: 96b59269
Signed-off-by: John Stultz <john.stultz@linaro.org>

3976c86e

panfrost: Move the batch submission logic to panfrost_batch_submit() · 6ddfd37c

Boris Brezillon authored Sep 05, 2019



We are about to patch panfrost_flush() to flush all pending batches,
not only the current one. In order to do that, we need to move the
'flush single batch' code to panfrost_batch_submit().

While at it, we get rid of the existing pipelining logic, which is
currently unused and replace it by an unconditional wait at the end of
panfrost_batch_submit(). A new pipeline logic will be introduced later
on.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

6ddfd37c

panfrost: Move the fence creation in panfrost_flush() · 2fc91a16

Boris Brezillon authored Sep 05, 2019

panfrost_flush() is about to be reworked to flush all pending batches,
but we want the fence to block on the last one. Let's move the fence
creation logic in panfrost_flush() to prepare for this situation.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

2fc91a16

panfrost: Delay payloads[].offset_start initialization · 835439b8

Boris Brezillon authored Sep 05, 2019



panfrost_draw_vbo() Might call the primeconvert/without_prim_restart
helpers which will enter the ->draw_vbo() again. Let's delay
payloads[].offset_start initialization so we don't initialize them
twice.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

835439b8

panfrost: Prepare things to avoid flushes on FB switch · 4166ca92

Boris Brezillon authored Sep 05, 2019



panfrost_attach_vt_xxx() functions are now passed a batch, and the
generated FB desc is kept in panfrost_batch so we can switch FBs
without forcing a flush. The postfix->framebuffer field is restored
on the next attach_vt_framebuffer() call if the batch already has an
FB desc.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

4166ca92

panfrost: Pass a batch to panfrost_set_value_job() · e5c7701a

Boris Brezillon authored Sep 05, 2019



So we can emit SET_VALUE jobs for a batch that's not currently bound
to the context.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

e5c7701a

panfrost: Use ctx->wallpaper_batch in panfrost_blit_wallpaper() · bc0f6c0b

Boris Brezillon authored Sep 05, 2019

We'll soon be able to flush a batch that's not currently bound to the
context, which means ctx->pipe_framebuffer will not necessarily be the
FBO targeted by the wallpaper draw. Let's prepare for this case and
use ctx->wallpaper_batch in panfrost_blit_wallpaper().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

bc0f6c0b

panfrost: Pass a batch to functions emitting FB descs · aa851a62

Boris Brezillon authored Sep 01, 2019



So we can emit such jobs to a batch that's not currently bound to the
context.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

aa851a62

panfrost: Pass a batch to panfrost_{allocate,upload}_transient() · 07a68835

Boris Brezillon authored Sep 01, 2019



We need that if we want to upload transient buffers to a batch that's
not currently bound to the context, which in turn will be needed if we
want to relax the batch serialization we have right now (only flush
batches when we need to: on a flush request, or when one batch depends
on the result of other batches).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

07a68835

panfrost: Allow testing if a specific batch is targeting a scanout FB · e46d95d5

Boris Brezillon authored Sep 01, 2019



Rename panfrost_is_scanout() into panfrost_batch_is_scanout(), pass it
a batch instead of a context and move the code to pan_job.c.

With this in place, we can now test if a batch is targeting a scanout
FB even if this batch is not bound to the context.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

e46d95d5

panfrost: Get rid of the unused 'flush jobs accessing res' infra · 40e20324

Boris Brezillon authored Sep 05, 2019



Will be replaced by something similar but using a BOs as keys instead
of resources.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

40e20324

panfrost: Use a pipe_framebuffer_state as the batch key · 1b5873b7

Boris Brezillon authored Sep 01, 2019



This way we have all the fb_state information directly attached to a
batch and can pass only the batch to functions emitting CMDs, which is
needed if we want to be able to queue CMDs to a batch that's not
currently bound to the context.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

1b5873b7

radeon/vcn: exclude raven2 from vcn 2.0 encode initialization · 92765f85
Indrajit Kumar Das authored Sep 10, 2019 and Leo Liu committed Sep 13, 2019
```
Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
```
92765f85

gitlab-ci: rename stages to something simpler · a0f8a073

Eric Engestrom authored Sep 11, 2019



Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>

a0f8a073

panfrost: Rework midgard_pair_load_store() to kill the nested foreach loop · c9bebae2

Boris Brezillon authored Aug 27, 2019



mir_foreach_instr_in_block_safe() is based on list_for_each_entry_safe()
which is designed to protect against removal of the current entry, but
removing the entry placed just after the current one will lead to a
use-after-free situation.

Luckily, the midgard_pair_load_store() logic guarantees that the
instruction being removed (if any) is never placed just after ins which
in turn guarantees that the hidden __next variable always points to a
valid object.
Took me a bit of time to realize that this code was safe, so I'm
suggesting to get rid of the inner mir_foreach_instr_in_block_from()
loop and rework the code so that the removed instruction is always the
current one (which is what the list_for_each_entry_safe() API was
initially designed for).

While at it, we also get rid of the unecessary insert(ins)/remove(ins)
dance by simply moving the instruction around.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

c9bebae2

panfrost: Fix a list_assert() in schedule_block() · 0e513ccc

Boris Brezillon authored Aug 27, 2019



list_for_each_entry() does not allow modifying the current item pointer.
Let's rework the skip-instructions logic in schedule_block() to not
break this rule.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

0e513ccc

v3d: fix TF primitive counts for resume without draw · 2eace10c

Iago Toral authored Sep 09, 2019 and

Iago Toral committed Sep 13, 2019



The V3D documentation states that primitive counters are reset when
we emit Tile Binning Mode Configuration items, which we do at the start
of each draw call, however, in the actual hardware this doesn't seem to
take effect when transform feedback is not active (this doesn't happen in
the simulator). This causes a problem in the following scenario:

glBeginTransformFeedback()
   glDrawArrays()
   glPauseTransformFeedback()
   glDrawArrays()
   glResumeTransformFeedback()
glEndTransformFeedback()

The TF pause will trigger a flush of the primitive counters, which results
in a correct number of primitives up to that point. In theory, the counter
should then be reset when we execute the draw after pausing TF, but that
doesn't happen, and since TF is enabled again by the resume command before
we end recording, by the time we end the transform feedback recording we
again check the counters, but instead of reading 0, we read again the same
value we read at the time we paused, incorrectly accumulating that value
again.

In theory, we should be able to avoid this by using the other method to
reset the primitive counters: using operation 1 instead of 0 when we
flush the counts to the buffer at the time we pause, but again, this
doesn't seem to be work and we still see obsolete counts by the time we
end transform feedback.

This patch fixes the problem by not accumulating TF primitive counts
unless we know we have actually queued draw calls during transform
feedback, since that seems to effectively reset the counters. This should
also be more performant, since it saves unnecessary stalls for the
primitive counters to be updated when we know there haven't been any
new primitives drawn.

Fixes CTS tests:
dEQP-GLES3.functional.transform_feedback.*

Reviewed-by: Eric Anholt <eric@anholt.net>

2eace10c

v3d: remove redundant update of queued draw calls · ded6ea92

Iago Toral authored Sep 11, 2019 and

Iago Toral committed Sep 13, 2019



This was updating the counter for the indexed draw path only, but we are
already updating the counter for all paths a bit later, so this is only
duplicating counts for indexed paths.

Reviewed-by: Eric Anholt <eric@anholt.net>

ded6ea92

v3d: make sure we have enough space in the CL for the primitive counts packet · b9a07eed
Iago Toral authored Sep 10, 2019 and Iago Toral committed Sep 13, 2019
```
Fixes: 0f2d1dfe ("v3d: use the GPU to record primitives written to transform feedback")

Reviewed-by: Eric Anholt <eric@anholt.net>
```
b9a07eed
v3d: add missing line break for performance debug message · b69f51a5
Iago Toral authored Sep 09, 2019 and Iago Toral committed Sep 13, 2019
```
Reviewed-by: Eric Anholt <eric@anholt.net>
```
b69f51a5

panfrost/ci: Use releases for Volt dEQP · bc79e5c4

Tomeu Vizoso authored Sep 10, 2019



So we can better correlate different results to versions of the runner.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

bc79e5c4

panfrost/ci: Update kernel to 5.3-rc8 · c301fc02

Tomeu Vizoso authored Sep 10, 2019



We haven't updated in a long time, so better do it now and again when
5.3 is released.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

c301fc02

panfrost/ci: Run dEQP with the surfaceless platform · ca4e6637

Tomeu Vizoso authored Sep 10, 2019



Instead of running it with the Wayland platform, which introduces
unwanted dependencies and complexity.

Makes tests run 30% faster, as well.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

ca4e6637

radv: fix allocating number of user sgprs if streamout is used · 8137df3a

Samuel Pitoiset authored Sep 12, 2019

streamout_buffers is assigned after that function, so the previous
fix was completely wrong. This probably fix something when streamout
buffers and push constants are used/inlined in the same shader.

Fixes: 378e2d24 ("radv: fix computing number of user SGPRs for streamout buffers")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

8137df3a

intel/fs: Handle UNDEF in split_virtual_grfs · acfa2340

Faith Ekstrand authored Sep 06, 2019

When the UNDEF instruction was added, we didn't do anything special in
split_virtual_grfs.  This mean that anything with an UNDEF wasn't
getting split which causes problems for the compiler.  Among other
things, it makes RA harder because things are in bigger chunks.  It also
meant that dvec4s weren't getting split which means that they are larger
than the maximum register size.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 14959202 -> 14960035 (<.01%)
    instructions in affected programs: 96197 -> 97030 (0.87%)
    helped: 140
    HURT: 128
    helped stats (abs) min: 1 max: 17 x̄: 1.62 x̃: 1
    helped stats (rel) min: 0.09% max: 6.15% x̄: 0.65% x̃: 0.45%
    HURT stats (abs)   min: 1 max: 825 x̄: 8.28 x̃: 1
    HURT stats (rel)   min: 0.13% max: 139.83% x̄: 1.70% x̃: 0.50%
    95% mean confidence interval for instructions value: -2.96 9.18
    95% mean confidence interval for instructions %-change: -0.56% 1.51%
    Inconclusive result (value mean confidence interval includes 0).

    total loops in shared programs: 4372 -> 4372 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total cycles in shared programs: 352646771 -> 352840997 (0.06%)
    cycles in affected programs: 218600800 -> 218795026 (0.09%)
    helped: 21167
    HURT: 21411
    helped stats (abs) min: 1 max: 2924 x̄: 36.89 x̃: 10
    helped stats (rel) min: <.01% max: 41.90% x̄: 2.97% x̃: 0.98%
    HURT stats (abs)   min: 1 max: 26027 x̄: 45.54 x̃: 10
    HURT stats (rel)   min: <.01% max: 324.46% x̄: 3.88% x̃: 1.06%
    95% mean confidence interval for cycles value: 2.87 6.26
    95% mean confidence interval for cycles %-change: 0.40% 0.55%
    Cycles are HURT.

    total spills in shared programs: 8840 -> 8953 (1.28%)
    spills in affected programs: 126 -> 239 (89.68%)
    helped: 1
    HURT: 2

    total fills in shared programs: 21782 -> 21914 (0.61%)
    fills in affected programs: 431 -> 563 (30.63%)
    helped: 1
    HURT: 3

    LOST:   0
    GAINED: 5

Shader-db results on Haswell:

    total instructions in shared programs: 13320918 -> 13320769 (<.01%)
    instructions in affected programs: 40998 -> 40849 (-0.36%)
    helped: 146
    HURT: 56
    helped stats (abs) min: 1 max: 8 x̄: 2.73 x̃: 2
    helped stats (rel) min: 0.16% max: 8.60% x̄: 2.52% x̃: 2.22%
    HURT stats (abs)   min: 2 max: 23 x̄: 4.45 x̃: 4
    HURT stats (rel)   min: 0.21% max: 10.26% x̄: 6.83% x̃: 10.26%
    95% mean confidence interval for instructions value: -1.26 -0.21
    95% mean confidence interval for instructions %-change: -0.62% 0.77%
    Inconclusive result (%-change mean confidence interval includes 0).

    total loops in shared programs: 4373 -> 4373 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total cycles in shared programs: 374518258 -> 374384193 (-0.04%)
    cycles in affected programs: 231101954 -> 230967889 (-0.06%)
    helped: 21427
    HURT: 19438
    helped stats (abs) min: 1 max: 2035 x̄: 31.09 x̃: 8
    helped stats (rel) min: <.01% max: 40.95% x̄: 2.42% x̃: 0.86%
    HURT stats (abs)   min: 1 max: 20875 x̄: 27.38 x̃: 8
    HURT stats (rel)   min: <.01% max: 59.09% x̄: 2.49% x̃: 0.80%
    95% mean confidence interval for cycles value: -4.49 -2.07
    95% mean confidence interval for cycles %-change: -0.14% -0.04%
    Cycles are helped.

    total spills in shared programs: 23406 -> 23411 (0.02%)
    spills in affected programs: 3 -> 8 (166.67%)
    helped: 0
    HURT: 2

    total fills in shared programs: 34845 -> 34850 (0.01%)
    fills in affected programs: 3 -> 8 (166.67%)
    helped: 0
    HURT: 2

    LOST:   0
    GAINED: 0

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111566


Fixes: f4ef34f2 "intel/fs: Add an UNDEF instruction to avoid..."
Reviewed-by: Francisco Jerez <currojerez@riseup.net>

acfa2340

mesa: fix texStore for FORMAT_Z32_FLOAT_S8X24_UINT · 33aa039a

Jiadong Zhu authored Jul 30, 2019



_mesa_texstore_z32f_x24s8 calculates source rowStride at a
pace of 64-bit, this will make inaccuracy offset if the width
of src image is an odd number. Modify src pointer to int_32* as
source image format is gl_float which is 32-bit per pixel.

Reviewed by Ilia Mirkin

Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>

33aa039a

freedreno/a6xx: pre-calculate userconst stateobj size · b4df115d

Rob Clark authored Sep 11, 2019



The AnTuTu "garden" benchmark overflows the fixed size constbuffer
stateobject, so lets be more clever and calculate (a potentially
slightly pessimistic) actual size.

Signed-off-by: Rob Clark <robdclark@chromium.org>

b4df115d

gallium: Restore VSX for llvm >= 4 · 5a9dec75

Adam Jackson authored Sep 12, 2019



Accidentally dropped in 4fdd455e.

Fixes: 4fdd455e ("gallium: Require LLVM >= 3.4)
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>

5a9dec75

Sep 12, 2019

egl/android: Fix build since the DRI fourcc removal. · 2efc8048

Emma Anholt authored Sep 12, 2019



Fixes: 272f9cfe ("dri: Use DRM_FORMAT_* instead of defining our own copy.")

Reviewed-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>

2efc8048

gitlab-ci/a630: Disable flappy layout_binding.ssbo.fragment_binding_array · 89e840ec

Emma Anholt authored Sep 12, 2019

It started showing up as unreliable post-merge.  There's a valgrind
complaint, but even fixing that doesn't make it stable.

89e840ec

freedreno: fix compiler warning · 966b7c3e

Rob Clark authored Sep 11, 2019

fd6_blitter.c:724:31: warning: passing argument 1 of ‘fd_resource_level_linear’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>

966b7c3e

freedreno: Introduce gitlab-based CI. · 6f0dc087

Emma Anholt authored Jun 28, 2019



Since freedreno's kernel and GPU reset seem to be totally solid, we
don't need to have the complexity of the LAVA setup that panfrost has.
Instead, we can register some boards as shared gitlab runners and have
the jobs run out of a docker container just like we do for llvmpipe.
Just make sure that the DRI device node is passed through to the
containers in the gitlab config ('devices = ["/dev/dri"]' under
runners.docker).

If a runner fails (networking dies, kernel panic, etc.) it'll take out
one build but the rest can keep going since gitlab-runner is what
pulls jobs.  Since the runner pulls jobs, it also means that they can
live behind firewalls instead of needing some public address to be
accessed by gitlab.fd.o.

For now, enable it just on db410c (A307) and cheza (A630) as those are
the hardware that I have plenty of.  A307 is only testing GLES2 since
running all of GLES3 takes too long for the number of boards I've
brought up.

Acked-by: Rob Clark <robdclark@chromium.org>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>

6f0dc087

gitlab-ci: Log the driver version that got tested. · 0b6b0c09

Emma Anholt authored Aug 26, 2019



Sometimes you just want confirmation that dEQP really picked up the
driver we built you thought.  This is not as good as one might like,
because git isn't present in the cross-build image.

Acked-by: Rob Clark <robdclark@chromium.org>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>

0b6b0c09

gitlab-ci: Disable dEQP's watchdog timer. · 8d4742fe

Emma Anholt authored Sep 03, 2019



A handful of tests on freedreno have been close to the watchdog
timeout, and now sporadically fail since range analysis has slowed
down the compiler for them.

Acked-by: Rob Clark <robdclark@chromium.org>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>

8d4742fe

mesa/st: Fallback to name lookup when the variable have no Parameter · f479878c

Caio Oliveira authored Sep 11, 2019 and

Emma Anholt committed Sep 12, 2019



This brings back the fallback previously present in
st_nir_lookup_parameter_index(): if there's no parameter associated
with the variable, use a parameter from a variable with the same
prefix.

We'll have to sort out something for SPIR-V, but in the meantime let's
fix GLSL.

Fixes: b6384e57 ("mesa/st: Lookup parameters without using names")
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Eric Anholt <eric@anholt.net>

f479878c

glx: Remove unused indirection for glx_context->fillImage · ad9c1838

Adam Jackson authored Aug 23, 2019



This slot is always filled in with __glFillImage.

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

ad9c1838

meson/v3d: replace partial list of nir dep files with idep_nir_headers · f812cbfd

Eric Engestrom authored Sep 11, 2019



"partial" because `nir_intrinsics_h` was missing.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>

f812cbfd

meson/iris: replace partial list of nir dep files with idep_nir_headers · f418de54

Eric Engestrom authored Sep 11, 2019



"partial" because `nir_intrinsics_h` was missing.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>

f418de54

Admin message