Commits · panfrost-deqp-gles3 · Tomeu Vizoso / mesa

Jan 13, 2020

gles3 · 9abd7c8d
Tomeu Vizoso authored Jan 13, 2020

9abd7c8d
temp · 188865a9
Tomeu Vizoso authored Jan 08, 2020

188865a9

panfrost: Use DBG macro to avoid noise in the console · 47f76c15

Tomeu Vizoso authored Jan 06, 2020



It pollutes the output of programs that use Panfrost and can confuse its
callers, such as test runners.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

47f76c15
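
As an illustration of the idea behind 47f76c15 (not code from the commit), a debug macro can gate console output on a runtime flag so that ordinary runs stay quiet. This is a minimal sketch; `EXAMPLE_DBG`, `EXAMPLE_DBG_MSGS`, `example_debug_flags` and `example_parse_debug` are hypothetical names, and the driver's real macro and flag handling may differ.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical flag bit; the driver's real debug flags are not shown above. */
#define EXAMPLE_DBG_MSGS (1u << 0)

/* Hypothetical global holding the enabled debug flags. */
static unsigned example_debug_flags;

/* Parse a debug string, e.g. the value of an environment variable. */
static unsigned example_parse_debug(const char *value)
{
   unsigned flags = 0;

   if (value && strstr(value, "msgs"))
      flags |= EXAMPLE_DBG_MSGS;
   return flags;
}

/* Print to stderr only when message debugging was requested. */
#define EXAMPLE_DBG(...)                                        \
   do {                                                         \
      if (example_debug_flags & EXAMPLE_DBG_MSGS) {             \
         fprintf(stderr, "%s:%d: ", __func__, __LINE__);        \
         fprintf(stderr, __VA_ARGS__);                          \
      }                                                         \
   } while (0)
```

With the flag unset, `EXAMPLE_DBG("x = %d\n", x)` prints nothing, so a test runner parsing the program's output sees only what the program itself wrote.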

pan/midgard: Handle nir_intrinsic_load_barycentric_centroid · a0e63347

Tomeu Vizoso authored Jan 03, 2020



To avoid hitting the assert in the default case, add a nop for this
intrinsic.

dEQP-GLES3.functional.transform_feedback.random.interleaved.lines.3

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

a0e63347

TEMP: decode all of midgard_format · 954cf46b
Tomeu Vizoso authored Dec 20, 2019

954cf46b
panfrost: Add more info to some assertions · ba27209d
Tomeu Vizoso authored Dec 19, 2019

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

ba27209d
panfrost: Print intended field when decoding · db8257f0
Tomeu Vizoso authored Dec 19, 2019

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

db8257f0
temp · 86e7437e
Tomeu Vizoso authored Jan 07, 2020

86e7437e

gitlab-ci: Run GLES3 tests in dEQP on Panfrost · e3902e74

Tomeu Vizoso authored Dec 18, 2019

We are able to run only 1/5th of the tests in around the same time that
dEQP-GLES2 takes, so do that for now while more DUTs are installed.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

e3902e74

gitlab-ci: Consolidate container and build stages for LAVA · c34c73a6

Tomeu Vizoso authored Dec 17, 2019



Use the normal build job to also prepare the artifacts for LAVA jobs.

For that, the build container needs to also build the test suites,
kernel, ramdisk, etc.

Then the build job will place the just-built Mesa in the ramdisk and the
test job can generate a LAVA job and point to those artifacts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

c34c73a6

panfrost: fix transform feedback · babb97ae

Louis-Francis Ratté-Boulianne authored Oct 12, 2019 and

Tomeu Vizoso committed Jan 13, 2020



Fix different use cases for transform feedback by setting:

 - PIPE_CAP_PACKED_STREAM_OUTPUT=0
 - PIPE_CAP_VIEWPORT_TRANSFORM_LOWERED=1
 - PIPE_CAP_PSIZ_CLAMPED=1

This is enough for all dEQP xfb-related test cases to run
successfully.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>

babb97ae
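
The three values listed in babb97ae can be pictured as a driver answering capability queries. The sketch below is illustrative only: the enum and `example_get_param` are hypothetical stand-ins, not the gallium `pipe_screen` interface.

```c
#include <assert.h>

/* Hypothetical stand-ins for the three caps named in the commit message. */
enum example_cap {
   EXAMPLE_CAP_PACKED_STREAM_OUTPUT,
   EXAMPLE_CAP_VIEWPORT_TRANSFORM_LOWERED,
   EXAMPLE_CAP_PSIZ_CLAMPED,
   EXAMPLE_CAP_COUNT
};

/* Return the value the commit message lists for each cap. */
static int example_get_param(enum example_cap cap)
{
   assert(cap < EXAMPLE_CAP_COUNT);

   switch (cap) {
   case EXAMPLE_CAP_PACKED_STREAM_OUTPUT:
      return 0; /* xfb varyings must not be packed */
   case EXAMPLE_CAP_VIEWPORT_TRANSFORM_LOWERED:
      return 1; /* gl_Position is rewritten by a lowering pass */
   case EXAMPLE_CAP_PSIZ_CLAMPED:
      return 1; /* gl_PointSize is clamped */
   default:
      return 0;
   }
}
```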

gallium: add PIPE_CAP_PSIZ_CLAMPED · c85c67c5

Louis-Francis Ratté-Boulianne authored Oct 12, 2019 and

Tomeu Vizoso committed Jan 13, 2020

This new capability indicates that the point size has been clamped.
This also means that the gl_PointSize has been modified and that
its value should be lowered for transform feedback, if needed.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>

c85c67c5

gallium: add PIPE_CAP_VIEWPORT_TRANSFORM_LOWERED · 8fccc861

Louis-Francis Ratté-Boulianne authored Oct 12, 2019 and

Tomeu Vizoso committed Jan 13, 2020

This new capability indicates that the nir_lower_viewport_transform
pass is enabled. This also means that the gl_Position value is
modified and should be lowered for transform feedback, if needed.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>

8fccc861

gallium: add PIPE_CAP_PACKED_STREAM_OUTPUT · 5f16fde3

Louis-Francis Ratté-Boulianne authored Oct 12, 2019 and

Tomeu Vizoso committed Jan 13, 2020



Setting this cap to 0 (default is 1) should disable packing
optimization for stream output (e.g. GL transform feedback captured
variables).

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>

5f16fde3

glsl/linker: add xfb workaround for modified built-in variables · ee27cc38

Louis-Francis Ratté-Boulianne authored Oct 12, 2019 and

Tomeu Vizoso committed Jan 13, 2020



Some lowering passes modify the value of built-in variables in
order for drivers to work properly. However, modifying such values
will also break transform feedback as the captured value won't
match what's expected.

For example, on some hardware, the vertex shaders are expected to
output gl_Position in screen space. However, the transform
feedback captured value is still supposed to be the world-space
coordinates (see nir_lower_viewport_transform).

To fix that, we create a new variable that contains the
pre-transformation value and use it for transform feedback instead
of the built-in one.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>

ee27cc38
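
A rough sketch of the workaround described in ee27cc38, written as plain C rather than as a shader or compiler pass: the pre-transformation value is copied into a separate output before the built-in is overwritten. The structure names are hypothetical, and the transform shown (perspective divide, then scale and translate, with 1/w stored in w) is an assumption about what the lowering does, not something stated in the commit message.

```c
#include <assert.h>

struct example_vec4 { float x, y, z, w; };

/* Hypothetical viewport parameters: per-axis scale and translate. */
struct example_viewport { float scale[3]; float translate[3]; };

struct example_vs_out {
   struct example_vec4 position;     /* lowered value consumed by the hardware */
   struct example_vec4 xfb_position; /* untouched copy captured by xfb */
};

static struct example_vs_out
example_lower_position(struct example_vec4 clip,
                       const struct example_viewport *vp)
{
   struct example_vs_out out;
   float inv_w;

   assert(clip.w != 0.0f);

   /* Keep the pre-transformation value for transform feedback. */
   out.xfb_position = clip;

   /* Assumed lowering: divide by w, then apply the viewport transform. */
   inv_w = 1.0f / clip.w;
   out.position.x = clip.x * inv_w * vp->scale[0] + vp->translate[0];
   out.position.y = clip.y * inv_w * vp->scale[1] + vp->translate[1];
   out.position.z = clip.z * inv_w * vp->scale[2] + vp->translate[2];
   out.position.w = inv_w;
   return out;
}
```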

glsl/linker: handle array/struct members for DisableXfbPacking · 39dec3c8

Louis-Francis Ratté-Boulianne authored Oct 12, 2019 and

Tomeu Vizoso committed Jan 13, 2020



When varying packing is disabled for transform feedback and an xfb
declaration points to an array element or structure member, the
element/member should be aligned to the start of a slot as well.
If that's not the case, a new varying is created and the
element/member value is copied.

There might be a way to further optimize the number of slots allocated
or the number of copies necessary if the performance cost is
problematic. For example, in some cases simply padding the top-level
variable might correctly align all the captured values.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>

39dec3c8

glsl/linker: add DisableTransformFeedbackPacking workaround · f46cfc14

Louis-Francis Ratté-Boulianne authored Oct 12, 2019 and

Tomeu Vizoso committed Jan 13, 2020



Some drivers (e.g. Panfrost) don't support packing of varyings when
used for transform feedback. This new constant ensures that any
varying used for xfb is aligned at the start of a slot and won't be
packed with other varyings.

Scenarios where transform feedback declarations are related to an
array element or a struct member will be handled in a subsequent
patch.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>

f46cfc14
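
To make the slot-alignment rule in f46cfc14 concrete, here is a simplified location assigner. It is a sketch under stated assumptions: slots hold four components, every varying has one to four components, and all names are hypothetical. It is not the GLSL linker's algorithm.

```c
#include <assert.h>
#include <stdbool.h>

#define EXAMPLE_SLOT_COMPONENTS 4u

struct example_varying {
   unsigned components; /* 1..4 in this sketch */
   bool used_for_xfb;
   unsigned location;   /* output: slot * 4 + component */
};

/* Round a component index up to the start of the next slot. */
static unsigned example_align_to_slot(unsigned c)
{
   return (c + EXAMPLE_SLOT_COMPONENTS - 1) / EXAMPLE_SLOT_COMPONENTS *
          EXAMPLE_SLOT_COMPONENTS;
}

/* Assign locations in order and return the number of slots used. */
static unsigned example_assign_locations(struct example_varying *v,
                                         unsigned count,
                                         bool disable_xfb_packing)
{
   unsigned cursor = 0;

   for (unsigned i = 0; i < count; i++) {
      bool isolate = disable_xfb_packing && v[i].used_for_xfb;

      assert(v[i].components >= 1 &&
             v[i].components <= EXAMPLE_SLOT_COMPONENTS);

      /* An isolated varying starts a slot; others must not straddle one. */
      if (isolate ||
          cursor % EXAMPLE_SLOT_COMPONENTS + v[i].components >
             EXAMPLE_SLOT_COMPONENTS)
         cursor = example_align_to_slot(cursor);

      v[i].location = cursor;
      cursor += v[i].components;

      /* Nothing else may share the slot of an isolated varying. */
      if (isolate)
         cursor = example_align_to_slot(cursor);
   }
   return example_align_to_slot(cursor) / EXAMPLE_SLOT_COMPONENTS;
}
```

For a float, a vec2 captured by xfb, and another float, packing fits everything into one slot, while disabling xfb packing spreads them over three slots. That is the cost the follow-up commit mentions when it talks about further optimizing slot usage.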

panfrost: Fix headers and gpu_headers memory leak · 63288574

Daniel Ogorchock authored Jan 07, 2020 and

Tomeu Vizoso committed Jan 13, 2020



The per-batch headers/gpu_headers dynarrays need to be freed during the
batch cleanup to prevent leaking.

Signed-off-by: Daniel Ogorchock <daniel.ogorchock@garmin.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <!3308>
Part-of: <!3308>

63288574

panfrost: Fix panfrost_bo_access memory leak · 2848edc0

Daniel Ogorchock authored Jan 06, 2020 and

Tomeu Vizoso committed Jan 13, 2020



The bo access needs to be freed prior to removing it from its hash
table. This prevents the accesses from leaking over time.

Signed-off-by: Daniel Ogorchock <daniel.ogorchock@garmin.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!3308>

2848edc0
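
The ordering described in 2848edc0 (free the payload, then drop the table entry) can be sketched with a toy fixed-size table. All names here are hypothetical; the driver uses its own hash table and access structure.

```c
#include <assert.h>
#include <stdlib.h>

#define EXAMPLE_TABLE_SIZE 8u

/* Hypothetical per-BO bookkeeping record. */
struct example_access { int writer; };

/* Toy table: the key is simply an index into a pointer array. */
struct example_table {
   struct example_access *entries[EXAMPLE_TABLE_SIZE];
   unsigned live; /* number of allocated entries */
};

/* Allocate a record for key; returns 1 on success, 0 otherwise. */
static int example_insert(struct example_table *t, unsigned key)
{
   assert(key < EXAMPLE_TABLE_SIZE);
   if (t->entries[key])
      return 0;
   t->entries[key] = calloc(1, sizeof(*t->entries[key]));
   if (!t->entries[key])
      return 0;
   t->live++;
   return 1;
}

static void example_remove(struct example_table *t, unsigned key)
{
   assert(key < EXAMPLE_TABLE_SIZE);
   if (!t->entries[key])
      return;
   free(t->entries[key]);  /* free the payload first ... */
   t->entries[key] = NULL; /* ... then drop the entry itself */
   t->live--;
}
```

Clearing the entry without the `free()` call would lose the only pointer to the record, which is the kind of leak the commit fixes.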

radv/gfx10: improve performance for TES using PrimID but not exporting it · ecace268

Samuel Pitoiset authored Jan 08, 2020



This field is for the primitive ID export to the fragment shader.
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

ecace268

radv/gfx10: add support for NGG passthrough mode · 1db276ba

Samuel Pitoiset authored Jan 09, 2020



Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

1db276ba

radv/gfx10: do not declare LDS for NGG if useless · 471738e9

Samuel Pitoiset authored Jan 08, 2020



Only needed for NGG without passthrough mode or for NGG streamout.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

471738e9

radv/gfx10: determine if a pipeline is eligible for NGG passthrough · 0758f645

Samuel Pitoiset authored Jan 09, 2020



It can't be enabled for geometry shaders, for NGG streamout and
for vertex shaders that export the primitive ID. NGG passthrough
requires that LDS isn't used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

0758f645

radv/gfx10: disable vertex grouping · c65015f8

Samuel Pitoiset authored Jan 07, 2020



RadeonSI and AMDVLK do that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

c65015f8

Jan 12, 2020

nvc0: treat all draws without color0 broadcast as MRT · 201b88a9

Ilia Mirkin authored Jan 12, 2020



Per the semi-recently-released NVIDIA docs, when this bit is not
enabled, then the result for RT[0] will be used. So if e.g. only a
single RT is drawn to and it's not RT[0], the results will not be
visible. Fixes
GTF-GL45.gtf33.GL3Tests.explicit_attrib_location.explicit_attrib_location_pipeline
which was failing due to a frag shader outputting only to location=2.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

201b88a9

gm107/ir: avoid combining geometry shader stores at 0x60 · 3e9aacb1

Ilia Mirkin authored Jan 06, 2020

This corresponds to gl_PrimitiveID and gl_Layer. When both of these are
stored in a single AST.64 or AST.128 operation, then it appears as
though the whole store fails. Fixes the recently extended
glsl-1.50-transform-feedback-builtins piglit, and also
gtf30.GL3Tests.transform_feedback.transform_feedback_builtins.

The issue was reproduced on GM107 and GP108 but not GK208 nor GK104.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

3e9aacb1

nvc0: add dummy reset status support · 3be708eb

Ilia Mirkin authored Dec 30, 2019



Perhaps in a future implementation, such events could be passed back to
the driver, or queried directly. However for now, this is required for
GL 4.3 robustness contexts.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

3be708eb

nv50,nvc0: fix destination coordinates of blit · 83811846

Ilia Mirkin authored Dec 29, 2019



The fix was found by Karol Herbst a long time ago, but it was unclear
why it helped or if it would create additional problems. This change
adds a comment that explains what's going on, and in the process also
normalizes the nv50 implementation to match.

The coordinates which are fed to gl_Position map directly to pixel
coordinates, since the viewport transform is disabled. If the
framebuffer is MSAA, then that doesn't affect the pixel coordinates at
all, it's just that each pixel has multiple samples.

Note that this makes it really clear that this approach is inappropriate
for EXT_framebuffer_multisample_blit_scaled, and also the 3d path will
fail terribly for direct copies. Thankfully the 2d path normally takes
care of this.

Fixes KHR-GL43.packed_depth_stencil.blit.depth32f_stencil8 as well as
scaling issues in a number of EXT_framebuffer_multisample-related piglit
tests (although they continue to fail due to inaccuracies).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

83811846

radv: Use new scanout gfx9 metadata flag. · bfd9e7ff

Bas Nieuwenhuizen authored Dec 31, 2019



This updates for the new metadata ABI in radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <mesa/mesa!3244>
Part-of: <mesa/mesa!3244>

bfd9e7ff

lima: fix PIPE_CAP_* to mark features that aren't supported yet · f06be794

Vasily Khoruzhick authored Jan 10, 2020



lima doesn't support alpha test, flat shading, two-sided color nor
clip planes. We can enable these caps when corresponding hw features
are implemented in the driver.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>

f06be794

lima: implement polygon offset · 8a421135

Vasily Khoruzhick authored Jan 10, 2020



Fixes some of dEQP-GLES2.functional.polygon_offset.* tests and shadows in Q3A.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>

8a421135

lima: fix viewport clipping · b936b1f9

Vasily Khoruzhick authored Jan 10, 2020



Apparently Mali4x0 doesn't do viewport clipping, so anything drawn outside the
viewport is still rendered. Looks like we need to use scissors to do clipping.

Fixes most of dEQP-GLES2.functional.clipping.*; 6 of the 7 remaining failures
also fail on the blob. The remaining one [1] fails on many other gallium drivers.

[1] dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z

Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>

b936b1f9
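
One way to picture the scissor-based clipping from b936b1f9 is to intersect the scissor rectangle with the viewport rectangle before programming the hardware. The sketch below assumes integer pixel rectangles with exclusive maxima; the names are hypothetical and this is not the lima code.

```c
/* Pixel rectangle; max coordinates are exclusive in this sketch. */
struct example_rect { int minx, miny, maxx, maxy; };

static int example_min(int a, int b) { return a < b ? a : b; }
static int example_max(int a, int b) { return a > b ? a : b; }

/* Shrink the scissor so that nothing outside the viewport is drawn. */
static struct example_rect
example_clip_scissor(struct example_rect scissor, struct example_rect viewport)
{
   struct example_rect r;

   r.minx = example_max(scissor.minx, viewport.minx);
   r.miny = example_max(scissor.miny, viewport.miny);
   r.maxx = example_min(scissor.maxx, viewport.maxx);
   r.maxy = example_min(scissor.maxy, viewport.maxy);

   /* No overlap: collapse to an empty rectangle. */
   if (r.maxx < r.minx)
      r.maxx = r.minx;
   if (r.maxy < r.miny)
      r.maxy = r.miny;
   return r;
}
```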

lima: fix PLBU_CMD_PRIMITIVE_SETUP command · 997a30d7

Vasily Khoruzhick authored Jan 10, 2020



Apparently it doesn't depend on the primitive type. The value
only depends on whether we specify the point size via a PLBU command:
bit 12 is set in that case.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>

997a30d7

glsl: fix potential bug in nir uniform linker · 6bafd230

Timothy Arceri authored Jan 10, 2020



The state value of main_uniform_storage_index will be wrong for
add_parameter() when find_and_update_previous_uniform_storage()
finds a uniform if there is more than 1 uniform used in
multiple shader stages.

The new code is also simpler.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>

6bafd230

Jan 11, 2020

etnaviv: add deqp debug option · db7967ef

Christian Gmeiner authored Jan 10, 2020



This new debug option will fake some driver CAPs to be able to run dEQP
for GLES3.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <mesa/mesa!3351>
Part-of: <mesa/mesa!3351>

db7967ef

aco/wave32: Set the definitions of v_cmp instructions to the lane mask. · 44a6b17d

Timur Kristóf authored Jan 10, 2020



The output of v_cmp instructions is s1 (a single SGPR) in wave32 mode,
as opposed to s2 (an SGPR-pair) in wave64 mode.
A couple of cases where this should have been fixed were omitted from
the previous patch by mistake.

Fixes: e0bcefc3
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

44a6b17d
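
The s1 versus s2 distinction in 44a6b17d follows from the lane mask needing one bit per lane while each SGPR holds 32 bits. A minimal sketch, using a hypothetical helper name rather than anything from ACO:

```c
#include <assert.h>

/* Number of 32-bit SGPRs needed to hold one bit per lane. */
static unsigned example_lane_mask_sgprs(unsigned wave_size)
{
   assert(wave_size == 32 || wave_size == 64);
   return wave_size / 32; /* 1 (s1) in wave32, 2 (s2) in wave64 */
}
```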

Jan 10, 2020

pan/midgard: Support indirect UBO offsets · 59d30fd4

Alyssa Rosenzweig authored Jan 10, 2020



...in case we have arrays in a UBO block that we'd like to access
indirectly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <mesa/mesa!3352>
Part-of: <mesa/mesa!3352>

59d30fd4

intel/fs: Make implied_mrf_writes() an fs_inst method. · c20dc9b8

Francisco Jerez authored Dec 27, 2019



This will be convenient in a later commit enabling SIMD32 fragment
shaders, and happens to fix the calculation for MATH instructions
which is currently inaccurate for SIMD-lowered instructions on Gen4-5
platforms (all of them on Gen4 in SIMD16 mode), since it was based on
the shader's dispatch width rather than on the actual execution size
of the instruction.

This causes some shader-db noise on Gen4 due to the more compact
register allocation interacting with the SEND dependency workarounds,
but otherwise no major changes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

c20dc9b8

intel/fs/cse: Fix non-deterministic behavior due to inaccurate liveness calculation. · 591f146f

Francisco Jerez authored Dec 29, 2019



The liveness calculation done by the local CSE pass in order to prune
AEB entries whose sources are no longer live is currently inaccurate,
because the live intervals are calculated once at the beginning of the
pass, so they don't take into account any of the copy instructions
inserted by the CSE pass as it makes progress.  However the IP counter
used in that calculation is based on the start_ip of the basic block,
which is updated automatically whenever any instructions are inserted
into the CFG.  This causes the IP counter and liveness intervals to
get out of sync in programs with multiple basic blocks, causing the
CSE pass to toss AEB entries prematurely, which can lead to missed
optimization opportunities rather non-deterministically.

On BDW this leads to the following shader-db changes:

 total instructions in shared programs: 14952488 -> 14951763 (-0.00%)
 instructions in affected programs: 45416 -> 44691 (-1.60%)
 helped: 40
 HURT: 4

 total spills in shared programs: 20989 -> 20970 (-0.09%)
 spills in affected programs: 103 -> 84 (-18.45%)
 helped: 3
 HURT: 0

 total fills in shared programs: 24981 -> 24926 (-0.22%)
 fills in affected programs: 127 -> 72 (-43.31%)
 helped: 3
 HURT: 0

In addition, it avoids a number of regressions in combination with some
of the optimization changes I'm working on for SIMD32, which would
have made CSE more effective but, astonishingly, caused it to be less
effective elsewhere in the program.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

591f146f
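
The percentages in the shader-db report for 591f146f are relative changes against the "before" count. A small helper (hypothetical name, not part of shader-db) reproduces the arithmetic:

```c
#include <assert.h>

/* Relative change from before to after, in percent. */
static double example_percent_change(long before, long after)
{
   assert(before != 0);
   return 100.0 * (double)(after - before) / (double)before;
}
```

For the affected programs, 45416 -> 44691 instructions is a difference of 725, and 725 / 45416 is about 1.60%, matching the reported -1.60%.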

intel/fs: Fix nir_intrinsic_load_barycentric_at_sample for SIMD32. · cc0ea482

Francisco Jerez authored Dec 27, 2019

For uniform sample ID, only the first channel of msg_data will be
initialized. We need to pass that component only to the SEND message
for SIMD lowering to unzip the descriptor source correctly.

Fixes several dozen conformance test failures with SIMD32 fragment
shaders enabled, including:

dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.dynamic_sample_number.*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

cc0ea482
