Commits · drop_superfluous_mmap · Rohan Garg / mesa

Aug 14, 2019

Drop superfluous panfrost_drm_mmap_bo · 3fe89a83

Rohan Garg authored Aug 15, 2019



We mmap the import BO as required now, so we can drop the TODO
as well as mmap'ing it right after the import.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>

3fe89a83

pan/midgard: Allocate spill_slot once · 6c84a266

Alyssa Rosenzweig authored Aug 13, 2019



Multiple spill moves share a single spill slot. Issue found in Krita.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

6c84a266

pan/midgard: Use hint on midgard_instruction for spill_move · 2a9031ea

Alyssa Rosenzweig authored Aug 13, 2019

This allows us to have multiple spill moves, whereas otherwise for N
spill moves, the first N-1 would be clobbered. Issue found in Krita.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

2a9031ea

panfrost: Remove panfrost_add_dependency asserts · 3e6f2e7a

Alyssa Rosenzweig authored Aug 13, 2019



It doesn't... make a ton of sense to need to assert and this routine is
hotter than you might expect. Doesn't matter for release builds, of
course.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

3e6f2e7a

radeonsi: add support for Renoir · aafc95ce

Marek Olšák authored Jan 02, 2019

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

aafc95ce

meson: add nir tests to the compiler/nir test suite · a3d60241

Eric Engestrom authored Jul 20, 2019



Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

a3d60241

EGL: sync headers with Khronos · d0916edf

Eric Engestrom authored Aug 08, 2019

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>

d0916edf

relnotes: Add new ext on etnaviv for 19.2. · 2c4fe6af

Christian Gmeiner authored Aug 14, 2019



Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>

2c4fe6af

etnaviv: fix weird indentation · 17200bb6

Christian Gmeiner authored Aug 14, 2019



Fixes: 797a2e4f ("etnaviv: update logic to determine uniform limits")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>

17200bb6

nir/algebraic: Reassociate shift-by-constant of shift-by-constant · 0e6581b8

Ian Romanick authored Aug 06, 2019



v2: After some review discussion with Alyssa, the replacements now
correctly account for cases where (b+c) >= bitsize.

v3: Use a temporary to simplify the Python code quite a bit.  Suggested
by Jason.

Haswell and all Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16251155 -> 16249576 (<.01%)
instructions in affected programs: 232627 -> 231048 (-0.68%)
helped: 547
HURT: 1
helped stats (abs) min: 1 max: 15 x̄: 2.89 x̃: 3
helped stats (rel) min: 0.04% max: 7.84% x̄: 1.14% x̃: 1.06%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12%
95% mean confidence interval for instructions value: -3.12 -2.65
95% mean confidence interval for instructions %-change: -1.20% -1.06%
Instructions are helped.

total cycles in shared programs: 365924392 -> 365372103 (-0.15%)
cycles in affected programs: 59207053 -> 58654764 (-0.93%)
helped: 497
HURT: 34
helped stats (abs) min: 1 max: 29300 x̄: 1118.16 x̃: 16
helped stats (rel) min: <.01% max: 10.59% x̄: 1.82% x̃: 1.82%
HURT stats (abs)   min: 2 max: 424 x̄: 101.03 x̃: 63
HURT stats (rel)   min: 0.07% max: 46.17% x̄: 4.72% x̃: 2.06%
95% mean confidence interval for cycles value: -1426.41 -653.77
95% mean confidence interval for cycles %-change: -1.66% -1.15%
Cycles are helped.

total spills in shared programs: 8870 -> 8871 (0.01%)
spills in affected programs: 104 -> 105 (0.96%)
helped: 0
HURT: 1

Ivy Bridge and all pre-Gen7 platforms had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11956236 -> 11955635 (<.01%)
instructions in affected programs: 94110 -> 93509 (-0.64%)
helped: 106
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 5.67 x̃: 4
helped stats (rel) min: 0.12% max: 4.71% x̄: 1.96% x̃: 0.76%
95% mean confidence interval for instructions value: -6.62 -4.72
95% mean confidence interval for instructions %-change: -2.27% -1.64%
Instructions are helped.

total cycles in shared programs: 179296340 -> 178788044 (-0.28%)
cycles in affected programs: 51009603 -> 50501307 (-1.00%)
helped: 82
HURT: 7
helped stats (abs) min: 5 max: 27820 x̄: 6199.00 x̃: 16
helped stats (rel) min: 0.30% max: 8.16% x̄: 2.58% x̃: 3.11%
HURT stats (abs)   min: 2 max: 8 x̄: 3.14 x̃: 2
HURT stats (rel)   min: 0.02% max: 1.40% x̄: 0.34% x̃: 0.10%
95% mean confidence interval for cycles value: -7649.38 -3773.00
95% mean confidence interval for cycles %-change: -2.71% -1.99%
Cycles are helped.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> [v2]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

0e6581b8
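
As a rough illustration of the v2 note in the commit above, the following Python sketch models 32-bit left shifts and checks that two constant shifts can be merged into one, provided the (b+c) >= bitsize case is handled explicitly. It assumes the shift count is masked to the low bits of the bit size; the helper names are hypothetical, and this is a model of the idea rather than the actual rule in nir_opt_algebraic.py.

```python
BITS = 32
VALUE_MASK = (1 << BITS) - 1


def ishl(a, count):
    """Model of a 32-bit left shift; the count is masked (assumption)."""
    return (a << (count & (BITS - 1))) & VALUE_MASK


def shift_twice(a, b, c):
    """Original form: (a << b) << c."""
    return ishl(ishl(a, b), c)


def shift_once(a, b, c):
    """Reassociated form: a << (b + c), with the overflow case made explicit."""
    total = (b & (BITS - 1)) + (c & (BITS - 1))
    if total >= BITS:
        # Every bit has been shifted out, so the result is zero.
        return 0
    return ishl(a, total)


def naive_shift_once(a, b, c):
    """Incorrect merge: the summed count wraps when b + c >= BITS."""
    return ishl(a, b + c)
```

With b = c = 20, the two-step shift yields 0, while the naive merge shifts by 40 & 31 = 8 and keeps bits that should have been discarded.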

nir/algebraic: Reassociate add-and-shift to be shift-and-add · 73aaeac0

Ian Romanick authored Jul 10, 2019



A common thing in many shaders:

    uniform vs { vec4 bones[...]; };

    ...

    x = some_calculation(bones[i + 0]);
    y = some_calculation(bones[i + 1]);
    z = some_calculation(bones[i + 2]);

This turns into stuff like

    vec1 32 ssa_12 = iadd ssa_11, ssa_0
    vec1 32 ssa_13 = ishl ssa_12, ssa_3
    vec1 32 ssa_14 = intrinsic load_ssbo (ssa_7, ssa_13) (16, 4, 0)
    vec1 32 ssa_15 = iadd ssa_11, ssa_1
    vec1 32 ssa_16 = ishl ssa_15, ssa_3
    vec1 32 ssa_17 = intrinsic load_ssbo (ssa_7, ssa_16) (16, 4, 0)
    vec1 32 ssa_18 = iadd ssa_11, ssa_2
    vec1 32 ssa_19 = ishl ssa_18, ssa_3
    vec1 32 ssa_20 = intrinsic load_ssbo (ssa_7, ssa_19) (16, 4, 0)

By reassociating the shift and the add, we can reduce this to

    vec1 32 ssa_12 = ishl ssa_11, ssa_3
    vec1 32 ssa_13 = iadd ssa_12, ssa_0
    vec1 32 ssa_14 = intrinsic load_ssbo (ssa_7, ssa_13) (16, 4, 0)
    vec1 32 ssa_16 = iadd ssa_12, ssa_1
    vec1 32 ssa_17 = intrinsic load_ssbo (ssa_7, ssa_16) (16, 4, 0)
    vec1 32 ssa_19 = iadd ssa_12, ssa_2
    vec1 32 ssa_20 = intrinsic load_ssbo (ssa_7, ssa_19) (16, 4, 0)

v2: Add some commentary from Rhys Perry's nearly identical patch.

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16277758 -> 16250704 (-0.17%)
instructions in affected programs: 1440284 -> 1413230 (-1.88%)
helped: 4920
HURT: 6
helped stats (abs) min: 1 max: 69 x̄: 5.50 x̃: 4
helped stats (rel) min: 0.10% max: 18.33% x̄: 2.21% x̃: 1.79%
HURT stats (abs)   min: 1 max: 12 x̄: 4.50 x̃: 3
HURT stats (rel)   min: 0.18% max: 3.23% x̄: 1.91% x̃: 2.55%
95% mean confidence interval for instructions value: -5.67 -5.31
95% mean confidence interval for instructions %-change: -2.26% -2.16%
Instructions are helped.

total cycles in shared programs: 367118526 -> 365895358 (-0.33%)
cycles in affected programs: 93504145 -> 92280977 (-1.31%)
helped: 2754
HURT: 1269
helped stats (abs) min: 1 max: 47039 x̄: 460.66 x̃: 16
helped stats (rel) min: <.01% max: 34.93% x̄: 3.77% x̃: 1.12%
HURT stats (abs)   min: 1 max: 1500 x̄: 35.85 x̃: 9
HURT stats (rel)   min: 0.01% max: 17.35% x̄: 2.18% x̃: 0.75%
95% mean confidence interval for cycles value: -387.31 -220.78
95% mean confidence interval for cycles %-change: -2.11% -1.68%
Cycles are helped.

LOST:   1
GAINED: 1

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

73aaeac0
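
The reassociation in the commit above relies on a left shift distributing over addition in wrapping integer arithmetic. The Python sketch below checks that identity for 32-bit values; the function names are hypothetical and the masking simply models 32-bit wraparound.

```python
BITS = 32
MASK = (1 << BITS) - 1


def add_then_shift(index, offset, shift):
    """Original form: (index + offset) << shift."""
    return (((index + offset) & MASK) << shift) & MASK


def shift_then_add(index, offset, shift):
    """Reassociated form: (index << shift) + (offset << shift)."""
    shifted_index = (index << shift) & MASK    # shared by every bones[i + k]
    folded_offset = (offset << shift) & MASK   # constant, can be folded
    return (shifted_index + folded_offset) & MASK
```

Because the shifted index no longer depends on the constant offset, it can be computed once and reused for each array access, which is where the instruction savings come from.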

nir/find_array_copies: Reject copies with mismatched lengths · ff2225cf

Andrii Simiklit authored Aug 07, 2019 and

Faith Ekstrand committed Aug 14, 2019



copy_deref for wildcard dereferences requires arrays of the same
length; otherwise it leads to a crash in optimizations such as
'nir_opt_copy_prop_vars', which expect 'copy_deref' to be used only
for arrays of the same length.

v2: check was moved to 'try_match_deref' to fix aoa cases
                 (Jason Ekstrand <jason@jlekstrand.net>)
v3: -fixed comment
    -the condition merged with other one
                 (Jason Ekstrand <jason@jlekstrand.net>)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111286


Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>

ff2225cf

pan/midgard: Prefix blobber-db output for grepping · c4a4f3db

Alyssa Rosenzweig authored Aug 14, 2019

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

c4a4f3db

pan/midgard: Implement blobber-db · 5f0f9e13

Alyssa Rosenzweig authored Aug 14, 2019



We wire through some shader-db-style stats on the current shader in the
disassembler so we can get a quick estimate of shader complexity from a
trace.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Suggested-by: Rob Clark <robdclark@chromium.org>

5f0f9e13

pan/midgard: Break, not return, in disassembler · 863bdd1f

Alyssa Rosenzweig authored Aug 14, 2019



We'll want to dump some stats after the shader, and I refuse to use one
teensy little goto.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

863bdd1f

nir/range-analysis: Fail gracefully on non-SSA sources · f2965fde

Ian Romanick authored Aug 12, 2019

Tested-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>

f2965fde

etnaviv: split destroy_shader · 1290cc3e

Christian Gmeiner authored Aug 12, 2019



Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>

1290cc3e

etnaviv: split link_shader · f90b23b8

Christian Gmeiner authored Aug 12, 2019



Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>

f90b23b8

etnaviv: split dump_shader · 0765a1dd

Christian Gmeiner authored Aug 14, 2019



Also this adds the missing impl for etna_dump_shader_nir(..).

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>

0765a1dd

etnaviv: mv etnaviv_compiler.c etnaviv_compiler_tgsi.c · a36d04da

Christian Gmeiner authored Aug 14, 2019



Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>

a36d04da

etnaviv: correct PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE handling · b2da8a83

Christian Gmeiner authored Aug 14, 2019



Have a correct answer to GL_MAX_FRAGMENT_UNIFORM_VECTORS and
GL_MAX_VERTEX_UNIFORM_VECTORS.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>

b2da8a83

etnaviv: update logic to determine uniform limits · 797a2e4f

Christian Gmeiner authored Aug 14, 2019



Taken 1:1 from the header file.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>

797a2e4f

etnaviv: put uniform limit determination into own function · 45cb5eee

Christian Gmeiner authored Aug 14, 2019



Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>

45cb5eee

etnaviv: Use reentrant screen lock around flush · 8f97262c

Marek Vasut authored Jun 04, 2019 and

Lucas Stach committed Aug 14, 2019



The flush callback may be called on the same pipe context, and thus
the same stream, from two different threads of execution. However,
etna_cmd_stream_flush{,2}() must not be called on the same stream
from two different threads of execution as that would mess up the
etna_bo refcounting and likely have other ugly side effects.

Fix this by using a reentrant screen lock around the flush callback.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>

8f97262c
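
A minimal sketch of the locking pattern described in the commit above, using Python's `threading.RLock` as a stand-in for the reentrant screen lock. The `Screen` and `Context` classes and their fields are hypothetical and do not mirror the etnaviv structures.

```python
import threading


class Screen:
    def __init__(self):
        # Reentrant: the owning thread may take the lock again without deadlock.
        self.lock = threading.RLock()


class Context:
    def __init__(self, screen):
        self.screen = screen
        self.queued = 0
        self.submitted = 0

    def queue(self, count=1):
        with self.screen.lock:
            self.queued += count

    def flush(self):
        with self.screen.lock:
            # queue() re-acquires the lock on the same thread; a plain
            # non-reentrant lock would deadlock here.
            self.queue(1)
            self.submitted += self.queued
            self.queued = 0
```

Two threads calling `flush()` on the same context are serialized by the lock, while nested acquisition from within the flush path remains safe.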

etnaviv: Add valgrind support · 6bb4b6d0

Marek Vasut authored Jun 09, 2019 and

Lucas Stach committed Aug 14, 2019



Add Valgrind support for etnaviv to track BO leaks.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>

6bb4b6d0

etnaviv: Use hash table to track BO indexes · cf920742

Marek Vasut authored Jun 03, 2019 and

Lucas Stach committed Aug 14, 2019



Use hash table instead of ad-hoc arrays.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>

cf920742
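
The idea of tracking BO indexes with a hash table can be sketched as follows. This is an illustrative Python model with hypothetical names (`SubmitTable`, `index_for`), not the etnaviv code: a dictionary maps each BO handle to its position in the submit list, so a lookup does not need to scan an array.

```python
class SubmitTable:
    def __init__(self):
        self.bos = []        # submit list, in insertion order
        self.index_of = {}   # BO handle -> index into self.bos

    def index_for(self, handle):
        """Return the submit index for a BO, adding it on first use."""
        index = self.index_of.get(handle)
        if index is None:
            index = len(self.bos)
            self.bos.append(handle)
            self.index_of[handle] = index
        return index
```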

etnaviv: Fix double-free in etna_bo_cache_free() · 23f5f126

Marek Vasut authored Jun 02, 2019 and

Lucas Stach committed Aug 14, 2019



The following situation can happen in a multithreaded OpenGL application.
A BO is submitted from etna_cmd_stream #1 with flags set for read.
A BO is submitted from etna_cmd_stream #2 with flags set for write.
This triggers a flush on stream #1 and clears the BO's current_stream
pointer. If, at this point, stream #2 attempts to queue the BO again, which
does happen, the BO will be added to the submit list twice. The Linux
kernel driver correctly detects this and warns about it with "BO at
index %u already on submit list" kernel message.

However, when cleaning the BO cache in etna_bo_cache_free(), the BO
which was submitted twice will also be free()d twice, thus triggering
glibc's double-free detector.

The fix is easy: even if the BO does not have current_stream set,
iterate over the current stream's list of BOs before adding the BO to it
and verify that the BO is not yet there.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>

23f5f126
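
The scenario and fix from the commit above can be modeled in a few lines of Python. The classes and method names here are hypothetical; the sketch only shows why trusting `current_stream` alone allows a duplicate entry, and how scanning the stream's BO list first prevents it.

```python
class BO:
    def __init__(self, name):
        self.name = name
        self.current_stream = None


class CmdStream:
    def __init__(self):
        self.bos = []

    def add_bo_unchecked(self, bo):
        """Buggy variant: relies only on bo.current_stream."""
        if bo.current_stream is not self:
            self.bos.append(bo)
            bo.current_stream = self

    def add_bo(self, bo):
        """Fixed variant: also verify the BO is not already on the list."""
        already_listed = any(entry is bo for entry in self.bos)
        if bo.current_stream is not self and not already_listed:
            self.bos.append(bo)
        bo.current_stream = self


def queue_twice(add):
    """Queue a BO, simulate another stream's flush, then queue it again."""
    stream, bo = CmdStream(), BO("texture")
    add(stream, bo)
    bo.current_stream = None   # cleared by a flush on a different stream
    add(stream, bo)
    return len(stream.bos)
```

Calling `queue_twice(CmdStream.add_bo_unchecked)` leaves two entries for the same BO, whereas `queue_twice(CmdStream.add_bo)` leaves one.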

kmsro: Add missing definitions to Android.mk · 1ea95e37

Roman Stratiienko authored Aug 06, 2019



Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com>
Reviewed-by: Rob Herring <robh@kernel.org>

1ea95e37

softpipe: Add support for ARB_derivative_control · 742d3c91

Gert Wollny authored Aug 13, 2019 and

Gert Wollny committed Aug 14, 2019



Enables and passes piglits:

spec/ARB_derivative_control/
        dfdx-coarse
        dfdx-dfdy
        dfdx-fine
        dfdy-coarse
        dfdy-fine

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>

742d3c91

lima/ppir: print srcs and dests in ppir_node_print_prog() · b579af77

Vasily Khoruzhick authored Aug 01, 2019



Now that we have accessors for ppir src, it's possible to easily
print all srcs and dests while dumping the ppir representation.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>

b579af77

lima/ppir: use src accessors in ppir regalloc · 6920710a

Vasily Khoruzhick authored Aug 01, 2019



Get rid of most switch/case by using src accessors

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>

6920710a

lima/ppir: add ppir_node to ppir_src · a5e7c12c

Vasily Khoruzhick authored Jul 24, 2019



We'll need it if we want to walk through node sources

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>

a5e7c12c

lima/ppir: introduce accessors for ppir_node sources · afa64a21

Vasily Khoruzhick authored Jul 24, 2019



Sometimes we need to walk through ppir_node sources; a common
accessor for all node types will simplify the code a lot.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>

afa64a21

Aug 13, 2019

iris: Expose aux buffer as 2nd plane w/modifiers · 0f5be81e

Jordan Justen authored Jun 24, 2019



Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

0f5be81e

iris: Export and import surfaces with modifiers that have aux data · 246eebba

Jordan Justen authored Jul 09, 2019



The DRI interface for modifiers with aux data treats the aux data as a
separate plane of the main surface.

When the dri layer requests the plane associated with the aux data, we
save the required information into the dri aux plane image.

Later when the image is used, the dri plane image will be available in
the pipe_resource structure's `next` field. Therefore in iris, we
reconstruct the aux setup from this separate dri plane image when the
image is used.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

246eebba

iris: Do proper format checks for Y+CCS modifier support · 99c8eb99

Kenneth Graunke authored Mar 26, 2019 and

Jordan Justen committed Aug 13, 2019

We need to ensure that the DRI image format supports CCS.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

99c8eb99

iris: Create single bo for surfaces with modifiers and aux data · 51f941c2

Jordan Justen authored Jun 23, 2019



Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

51f941c2

iris: Split iris_resource_alloc_aux to enable aux modifiers · 2c7b577e

Jordan Justen authored Jun 23, 2019



Reworks:

 * If the aux-state is not ISL_AUX_STATE_AUX_INVALID, then use memset
   even when memset_value is zero. The hiz buffer initial aux-state
   will be set to invalid, and therefore we can skip the memset. But,
   for CCS it will be set to ISL_AUX_STATE_PASS_THROUGH, and therefore
   the aux data must be cleared to 0 with the memset. Previously we
   would use BO_ALLOC_ZEROED with the CCS aux data, so this memset
   wasn't required. Now, the CCS aux data may be part of the main
   surface. We prefer to not use BO_ALLOC_ZEROED excessively, so the
   memset is needed for the CCS case. (Nanley)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

2c7b577e
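
The rework described in the commit above amounts to a small decision rule, sketched here in Python under stated assumptions. The function name, the string constants, and the use of a `bytearray` for the mapped buffer are all hypothetical; only the rule itself (skip the memset for an invalid aux state, otherwise clear even when the value is zero) is taken from the commit message.

```python
AUX_INVALID = "ISL_AUX_STATE_AUX_INVALID"
PASS_THROUGH = "ISL_AUX_STATE_PASS_THROUGH"


def init_aux(buffer, aux_offset, aux_size, aux_state, memset_value=0):
    """Initialize the aux region of a shared buffer; return True if written."""
    if aux_state == AUX_INVALID:
        # e.g. HiZ: the contents are considered invalid, so skip the memset.
        return False
    # e.g. CCS in pass-through: must be cleared, even when the value is 0,
    # because the buffer is no longer assumed to be zero-allocated.
    end = aux_offset + aux_size
    buffer[aux_offset:end] = bytes([memset_value]) * aux_size
    return True
```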

iris: Add aux offset into hiz_address · aad36dfd

Jordan Justen authored Aug 13, 2019



This is not currently required because the hiz buffer is in a separate
buffer, and therefore the offset is 0. If we combine the aux buffer
with the main surface buffer, then the hiz offset may become non-zero.

Suggested-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

aad36dfd

tgsi_to_nir: add assertions for max varying slots · f5e1f9cc

Marek Olšák authored Aug 13, 2019

Nine uses GENERIC slots > 31.

Trivial.

f5e1f9cc
