Commits · refactor/panfrost_job · Rohan Garg / mesa

Jun 12, 2019
- Fixes 88ae2f58: "panfrost/midgard: Remove unnecessary variables" · 782ae2b2
  Rohan Garg authored Jun 08, 2019
```
Make sure we link the last vertex job to the first tiler job.
```
  782ae2b2
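
As a rough illustration of the fix described above, the sketch below appends a tiler chain to the end of a vertex chain. The `sketch_job` type and `sketch_link_chains` helper are hypothetical; the real driver presumably links jobs through GPU addresses in the job descriptor header rather than CPU pointers.

```c
#include <stddef.h>

/* Hypothetical stand-in for a hardware job descriptor. */
struct sketch_job {
    int id;
    struct sketch_job *next;
};

/* Append the tiler chain to the end of the vertex chain and return the
 * head of the combined chain. Either chain may be empty (NULL). */
static struct sketch_job *
sketch_link_chains(struct sketch_job *vertex_head,
                   struct sketch_job *tiler_head)
{
    if (vertex_head == NULL)
        return tiler_head;

    /* Walk to the last vertex job. */
    struct sketch_job *last = vertex_head;
    while (last->next != NULL)
        last = last->next;

    /* Link the last vertex job to the first tiler job. */
    last->next = tiler_head;
    return vertex_head;
}
```
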
Jun 07, 2019
- panfrost/midgard: Move requirement setting into panfrost_job · f0aa7522
  Rohan Garg authored Jun 07, 2019
```
Move panfrost_job_set_requirements into panfrost_get_job_for_fbo;
requirements should be set when acquiring a job from a context.
```
  f0aa7522
- panfrost/midgard: Move draw_count into panfrost_job · f91fd5c1
  Rohan Garg authored Jun 06, 2019
```
Refactor code to use draw_counts from a panfrost_job
```
  f91fd5c1
- panfrost/midgard: Remove duplicated header · 1e5b3aa6
  Rohan Garg authored Jun 05, 2019
  
  1e5b3aa6
- panfrost/midgard: Remove unnecessary variables · 88ae2f58
  Rohan Garg authored Jun 05, 2019
```
These are not required anymore since Mali jobs are
now linked lists, i.e. u_vertex_jobs and u_tiler_jobs.
```
  88ae2f58
- panfrost/midgard: Move clearing logic into pan_job · 4d072c06
  Rohan Garg authored Jun 05, 2019
  
  4d072c06
Jun 05, 2019
- panfrost/midgard: Figure out job requirements in pan_job.c · 6439c9b2
  Rohan Garg authored Jun 05, 2019
```
Requirements for a job should be figured out in pan_job.c
```
  6439c9b2
- panfrost/midgard: Reset job counters once the job is submitted · b084aab2
  Rohan Garg authored Jun 05, 2019
```
Move the reset out of frame invalidation into job submission
```
  b084aab2
- panfrost/midgard: Initial implementation of panfrost_job_submit · 66e41e43
  Rohan Garg authored Jun 05, 2019
```
Start fleshing out panfrost_job
```
  66e41e43
Jun 04, 2019

panfrost/midgard: .pos propagation · 4a03d378

Alyssa Rosenzweig authored May 23, 2019



A previous optimization converts fmax(x, 0.0) instructions to fmov.pos.
This pass then propagates the .pos from the move up to the source
instruction (when possible). From there, copy propagation will eliminate
the move.

In the future, we might prefer to do this in common NIR code like we do
for saturate, as Bifrost can also benefit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

4a03d378
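
A minimal sketch of the propagation step, assuming a toy SSA-style IR. The `sketch_pos_ins` type, the enums, and both helpers are hypothetical and do not mirror the actual MIR data structures. The `.pos` modifier is hoisted from the move onto the producing instruction only when the move is the sole reader and the producer carries no output modifier yet.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_op { SKETCH_FADD, SKETCH_FMUL, SKETCH_FMOV };
enum sketch_outmod { SKETCH_OUTMOD_NONE, SKETCH_OUTMOD_POS };

/* Hypothetical SSA-style instruction: each dest index is written once. */
struct sketch_pos_ins {
    enum sketch_op op;
    int dest;
    int src0, src1;            /* src1 is -1 when unused */
    enum sketch_outmod outmod;
};

/* Count how many source slots across the program read `reg`. */
static int
sketch_count_uses(const struct sketch_pos_ins *ins, size_t n, int reg)
{
    int uses = 0;
    for (size_t i = 0; i < n; i++) {
        uses += (ins[i].src0 == reg);
        uses += (ins[i].src1 == reg);
    }
    return uses;
}

/* Returns true if any .pos modifier was moved onto a producer. */
static bool
sketch_propagate_pos(struct sketch_pos_ins *ins, size_t n)
{
    bool progress = false;

    for (size_t i = 0; i < n; i++) {
        struct sketch_pos_ins *mov = &ins[i];
        if (mov->op != SKETCH_FMOV || mov->outmod != SKETCH_OUTMOD_POS)
            continue;

        /* Search backwards for the instruction that produced the source. */
        for (size_t j = i; j-- > 0;) {
            struct sketch_pos_ins *def = &ins[j];
            if (def->dest != mov->src0)
                continue;

            /* Only hoist when the move is the sole reader and the
             * producer has no output modifier of its own. */
            if (def->outmod == SKETCH_OUTMOD_NONE &&
                sketch_count_uses(ins, n, def->dest) == 1) {
                def->outmod = SKETCH_OUTMOD_POS;
                mov->outmod = SKETCH_OUTMOD_NONE;
                progress = true;
            }
            break;
        }
    }
    return progress;
}
```

After this runs, the move is a plain copy, which is what lets a later copy-propagation pass remove it.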

panfrost/midgard: Cleanup copy propagation · 5da0a33f

Alyssa Rosenzweig authored May 23, 2019



Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

5da0a33f

panfrost/midgard: Implement "pipeline register" prepass · 33800f46

Alyssa Rosenzweig authored May 23, 2019

This prepass, run after scheduling but before RA, specializes to
pipeline registers where possible. It walks the IR, checking whether
sources are ever used outside of the immediate bundle in which they are
written. If they are not, they are rewritten to a pipeline register (r24
or r25), valid only within the bundle itself. This has theoretical
benefits for power consumption and register pressure (and performance by
extension). While this is tested to work, it's not clear how much of a
win it really is, especially without an out-of-order scheduler (yet!).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

33800f46
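
The walk described above might look roughly like the following. This is a sketch under stated assumptions: the `sketch_pipe_ins` type and helper names are hypothetical, instructions are stored flat and sorted by bundle, destinations are SSA-like, and virtual register indices start well above 25 so they cannot collide with r24/r25.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PIPELINE_BASE  24   /* r24 and r25, as named in the message */
#define SKETCH_PIPELINE_COUNT 2

/* Hypothetical scheduled instruction. */
struct sketch_pipe_ins {
    int bundle;       /* index of the bundle it was scheduled into (>= 0) */
    int dest;
    int src0, src1;   /* -1 when unused */
};

/* True if `reg` is read at least once, and only by instructions that
 * live in `bundle`. */
static bool
sketch_only_read_in_bundle(const struct sketch_pipe_ins *ins, size_t n,
                           int reg, int bundle)
{
    bool read_inside = false;
    for (size_t i = 0; i < n; i++) {
        if (ins[i].src0 != reg && ins[i].src1 != reg)
            continue;
        if (ins[i].bundle != bundle)
            return false;
        read_inside = true;
    }
    return read_inside;
}

/* Rewrites bundle-local values to pipeline registers.
 * Returns the number of destinations rewritten. */
static int
sketch_pipeline_prepass(struct sketch_pipe_ins *ins, size_t n)
{
    int rewritten = 0, current = -1, used = 0;

    for (size_t i = 0; i < n; i++) {
        if (ins[i].bundle != current) {
            current = ins[i].bundle;
            used = 0;             /* pipeline registers are per-bundle */
        }

        int old = ins[i].dest;
        if (used == SKETCH_PIPELINE_COUNT ||
            !sketch_only_read_in_bundle(ins, n, old, current))
            continue;

        /* Rewrite the write and every read inside the same bundle. */
        int preg = SKETCH_PIPELINE_BASE + used++;
        ins[i].dest = preg;
        for (size_t j = 0; j < n; j++) {
            if (ins[j].bundle != current)
                continue;
            if (ins[j].src0 == old) ins[j].src0 = preg;
            if (ins[j].src1 == old) ins[j].src1 = preg;
        }
        rewritten++;
    }
    return rewritten;
}
```

Values that are never read, or that are read by a later bundle, keep their original index and are left for the register allocator.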

panfrost/midgard: Helpers for pipeline · 2a79afc5

Alyssa Rosenzweig authored May 23, 2019



Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

2a79afc5

panfrost/midgard: Refactor schedule/emit pipeline · 3c7abbfb

Alyssa Rosenzweig authored May 22, 2019



First, this moves the scheduler and emitter out of midgard_compile.c
into their own dedicated files.

More interestingly, this slims down midgard_bundle to be essentially an
array of _pointers_ to midgard_instructions (plus some bundling
metadata), rather than the instructions and packing themselves. The
difference is critical, as it means that (within reason, i.e. as long as
it doesn't affect the schedule) midgard_instructions can now be modified
_after_ scheduling while having changes updated in the final binary.

On a more philosophical level, this removes an IR. Previously, the IR
before scheduling (MIR) was separate from the IR after scheduling
(post-schedule MIR), requiring a separate set of utilities to traverse,
using different idioms. There was no good reason for this, and it
restricts our flexibility with the RA. So unify all the things!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

3c7abbfb
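
To illustrate why storing pointers matters, here is a small sketch in which a bundle references instructions instead of copying them. All names (`sketch_bundle`, `sketch_bundle_ins`, the helpers, and the capacity of 4) are hypothetical and chosen only for the example.

```c
#include <stddef.h>

#define SKETCH_BUNDLE_MAX 4   /* arbitrary capacity for this sketch */

struct sketch_bundle_ins {
    int dest;
    int src;
};

/* The bundle stores pointers into the IR, not copies of the instructions. */
struct sketch_bundle {
    struct sketch_bundle_ins *instructions[SKETCH_BUNDLE_MAX];
    size_t count;
};

/* Returns 0 on success, -1 if the bundle is full. */
static int
sketch_bundle_add(struct sketch_bundle *b, struct sketch_bundle_ins *ins)
{
    if (b->count == SKETCH_BUNDLE_MAX)
        return -1;
    b->instructions[b->count++] = ins;
    return 0;
}

/* Stand-in for the emitter: records each instruction's current dest and
 * returns how many instructions were visited. */
static size_t
sketch_bundle_emit(const struct sketch_bundle *b, int *out_dests)
{
    for (size_t i = 0; i < b->count; i++)
        out_dests[i] = b->instructions[i]->dest;
    return b->count;
}
```

Because the emitter reads through the pointers, changing an instruction's `dest` after it was added to the bundle is reflected in the emitted output, with no second copy to keep in sync.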

panfrost/midgard: Cleanup RA (stylistic changes) · 0524ab9c

Alyssa Rosenzweig authored May 22, 2019



Trivial.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

0524ab9c

panfrost/midgard: Share MIR utilities · debc29b9

Alyssa Rosenzweig authored May 22, 2019

These are more generally useful than the files they were constrained to.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

debc29b9

panfrost/midgard: Misc. cleanup for readability · 1bfa0d6c

Alyssa Rosenzweig authored May 21, 2019



Mostly, this fixes a number of instances of lines >> 80 chars,
refactoring them into something legible.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

1bfa0d6c

panfrost/midgard: Extend RA to non-vec4 sources · 2d980223

Alyssa Rosenzweig authored May 22, 2019



This represents a major break with the former RA design. We now use
conflicting register classes to represent the subdivision of Midgard's
128-bit registers into varying sizes and arrangement. We determine class
based on the number of components in the instructions' masks. To support
this, we include a number of helpers in the RA to allow composing
swizzles and masks, such that MIR written implicitly assuming .xyzw
sources can be transformed to use actual (non-aligned) sources.

The net result is a marked decrease in register pressure on
non-vec4-exclusive shaders. We could still be doing much better. Not
implemented yet are:

   - Register spilling
   - Per-component liveness

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

2d980223
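
The swizzle and mask helpers mentioned above could take a shape like the one below. This is only a sketch: the 2-bits-per-component encoding, the `SKETCH_SWIZZLE` macro, and the function names are assumptions made for the example, not the helpers the commit actually adds.

```c
#include <stdint.h>

/* Hypothetical encoding: 2 bits per component, x in the lowest bits. */
#define SKETCH_SWIZZLE(a, b, c, d) \
    ((uint8_t)((a) | ((b) << 2) | ((c) << 4) | ((d) << 6)))

/* Extract which source component output component `i` selects. */
static unsigned
sketch_swizzle_get(uint8_t swz, unsigned i)
{
    return (swz >> (2 * i)) & 0x3;
}

/* Compose two swizzles so that applying `first` and then `second` to a
 * value equals applying the result once: out[i] = first[second[i]]. */
static uint8_t
sketch_compose_swizzle(uint8_t first, uint8_t second)
{
    unsigned out = 0;
    for (unsigned i = 0; i < 4; i++) {
        unsigned c = sketch_swizzle_get(first, sketch_swizzle_get(second, i));
        out |= c << (2 * i);
    }
    return (uint8_t)out;
}

/* Number of components enabled in a 4-bit write mask. */
static unsigned
sketch_mask_components(uint8_t mask)
{
    unsigned count = 0;
    for (unsigned i = 0; i < 4; i++)
        count += (mask >> i) & 1;
    return count;
}
```

For example, composing `.yzwx` with `.xxyy` gives `.yyzz`, and a mask of `0x3` has two components, which under the scheme described above would select a two-component register class.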

panfrost/midgard: Set masks on ld_vary · c1715b55

Alyssa Rosenzweig authored May 22, 2019

These masks distinguish scalar/vec2/vec3 loads from the default vec4,
which helps with assembly readability (since it's immediately obvious
how many components are _actually_ affected, rather than doing
mysterious things to an unknown number of unused components). Later in
the series, this will enable smarter register allocation, as the unused
components will not be interpreted abnormally.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

c1715b55

panfrost/midgard: Fix liveness analysis bugs · 550be763

Alyssa Rosenzweig authored May 22, 2019



This fixes liveness analysis with respect to inline constants and
branching. In practice, the symptom is abnormally high register
pressure.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

550be763

panfrost/midgard: Set int outmod for "pasted" code · c54f3f42

Alyssa Rosenzweig authored May 22, 2019



These snippets of integer assembly are injected for various purposes.
Eventually, we'll want to implement these in NIR directly. Regardless,
the "default" output modifier is different between floats and ints, so
let's set the right one.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

c54f3f42

panfrost/midgard: Hoist some utility functions · 51196c35

Alyssa Rosenzweig authored May 22, 2019



These were static to midgard_compile.c but are more generally useful
across the compiler.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

51196c35

panfrost/midgard: Remove pinning · 005d9b1a

Alyssa Rosenzweig authored May 20, 2019

This mechanism is only used by blend shaders, so just use a move here.
Ideally, it'll be copy-propped and DCE'd away; this removes a source of
considerable indirection and will simplify RA logic.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

005d9b1a

nir/algebraic: Simplify max(abs(a), 0.0) -> abs(a) · d2d3cc66

Alyssa Rosenzweig authored Jun 04, 2019



This pattern was noticed in glmark's jellyfish scene.

v2: Add inexact qualifier due to NaN behaviour.

Minimal shader-db changes (slightly helped).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>

d2d3cc66
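
The NaN caveat behind the inexact qualifier can be seen in a small model. The sketch assumes fmax behaves like C's `fmax`, returning the non-NaN operand when exactly one operand is NaN; the `sketch_*` helpers are hand-written stand-ins, not NIR code, and they ignore the sign of zero.

```c
#include <stdbool.h>

static bool
sketch_is_nan(float x)
{
    return x != x;   /* NaN is the only value unequal to itself */
}

/* Assumed fmax semantics: if exactly one operand is NaN, return the other. */
static float
sketch_fmax(float a, float b)
{
    if (sketch_is_nan(a)) return b;
    if (sketch_is_nan(b)) return a;
    return a > b ? a : b;
}

static float
sketch_fabs(float x)
{
    return x < 0.0f ? -x : x;   /* NaN compares false and passes through */
}

/* The pattern before the rewrite: max(abs(a), 0.0). */
static float
sketch_before(float a)
{
    return sketch_fmax(sketch_fabs(a), 0.0f);
}

/* The pattern after the rewrite: abs(a). */
static float
sketch_after(float a)
{
    return sketch_fabs(a);
}
```

For ordinary inputs both forms agree. For a NaN input, `sketch_before` yields 0.0 while `sketch_after` yields NaN, so under these assumed semantics the rewrite is not bit-exact.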

mesa: prevent common string formatting security issues · c9c1e261

Mark Janes authored Jun 03, 2019

Adds a compile-time error for obvious security issues like:

  printf(string_var);

The proposed flag is more tolerant than -Wformat-nonliteral.
Specifically, it tolerates common mesa formatting like:

  static const char *shader_template = "really long string %d";
  printf(shader_template, uniform_number);

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110833


Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>

c9c1e261
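
A short sketch of the two patterns, assuming the flag behaves like GCC's `-Wformat-security`, which warns when a non-literal format string is passed with no format arguments. The helper names and the `sketch_template` string are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Format an untrusted string safely: the literal "%s" keeps any '%'
 * characters inside `text` from being interpreted as conversions. */
static int
sketch_format_untrusted(char *buf, size_t size, const char *text)
{
    return snprintf(buf, size, "%s", text);
}

/* The tolerated pattern: a non-literal template that is passed together
 * with its format arguments. */
static const char *sketch_template = "uniform %d";

static int
sketch_format_uniform(char *buf, size_t size, int n)
{
    return snprintf(buf, size, sketch_template, n);
}
```

Passing `"100%d"` to `sketch_format_untrusted` copies it verbatim, whereas handing the same string directly to a printf-style function as the format would read a nonexistent argument.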

intel/fs: Add an UNDEF instruction to avoid excess live ranges · f4ef34f2

Faith Ekstrand authored May 29, 2019

With 8 and 16-bit types and anything where we have to use registers with
non-trivial strides to deal with restrictions, we end up with things that
look like partial writes even though we don't care about any values in
the register except those written by that instruction. This is
particularly important when dealing with loops because liveness sees
is_partial_write and the fact that an old version from a previous loop
iteration may be valid at that point and extends all purely partially
written values to the entire loop.

This commit adds a new UNDEF instruction which does nothing (the
generator doesn't emit anything) but which does a fake write to the
register. This informs liveness that we don't care about any values
before that point so it won't consider those registers to be falsely
live. We can safely emit UNDEF instructions for all SSA values that
come in from NIR and nearly all temporaries generated by various stages
of the compiler. In particular, we need to insert UNDEF instructions
when we handle region restrictions because the newly allocated registers
are almost guaranteed to be partially written.

No shader-db changes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110432

Reviewed-by: Matt Turner <mattst88@gmail.com>

f4ef34f2
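
A toy model of the liveness effect, under simplifying assumptions: a single register, one basic block (for instance a loop body), and the register is not live after the block. The `sketch_access` enum and `sketch_live_in` function are hypothetical and far simpler than the real analysis.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_access {
    SKETCH_READ,
    SKETCH_PARTIAL_WRITE,   /* may preserve part of the old value */
    SKETCH_FULL_WRITE,
    SKETCH_UNDEF            /* fake write: old value is not needed */
};

/* Returns true if the register's old value is live on entry to the
 * block, i.e. it may be observed before something fully redefines it. */
static bool
sketch_live_in(const enum sketch_access *ops, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        switch (ops[i]) {
        case SKETCH_READ:
            return true;     /* old contents may be observed */
        case SKETCH_FULL_WRITE:
        case SKETCH_UNDEF:
            return false;    /* earlier values are dead from here on */
        case SKETCH_PARTIAL_WRITE:
            break;           /* does not end the old live range */
        }
    }
    return false;
}
```

With the sequence partial write then read, the register is live on entry, which in a loop stretches its range across the back edge. Prefixing the same sequence with an UNDEF makes it dead on entry.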

spirv: Update the OpenCL.std.h header · d482a8f6

Caio Oliveira authored Jun 03, 2019



This corresponds to commit 8b911bd2ba37677037b38c9bd286c7c05701bcda on
GitHub.

We previously tweaked OpenCL.std.h from upstream so that it could be
included in C code.  Now the upstream header can be included directly;
however, the symbol names are slightly different (they include an
OpenCLstd_ prefix), so this patch also fixes vtn_opencl.c to use those.

Reviewed-by: Karol Herbst <kherbst@redhat.com>

d482a8f6

radv: Use bo metadata for imported image tiling on Android. · 9701cb10
Bas Nieuwenhuizen authored May 13, 2019
```
This way we handle linear images etc. correctly.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
```
9701cb10

vl: Enable DRM by default. · 392c6092

Bas Nieuwenhuizen authored May 30, 2019



If libdrm is found the pipe loader enables drm anyway, and that is
pretty much the only extra dependency this code has.

This enables creating libva display using a drm fd without having
to enable the DRM (GBM really) backend of EGL, which is completely
unrelated.

Leaving the X11 platforms alone as they would still result in the
additional inclusion of extra deps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

392c6092

anv: Advertise support for VK_EXT_fragment_shader_interlock · c2a0335b
Faith Ekstrand authored May 17, 2019
```
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
```
c2a0335b
spirv: Implement SPV_EXT_fragment_shader_interlock · 51768054
Faith Ekstrand authored May 17, 2019
```
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
```
51768054

spirv: Update the headers from latest Khronos master · b5aa76b1

Faith Ekstrand authored May 17, 2019

This corresponds to 8b911bd2ba37677037b38c9bd286c7c05701bcda in
https://github.com/KhronosGroup/SPIRV-Headers

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

b5aa76b1

vulkan: Update the XML and headers to 1.1.110 · 8339e3f0
Faith Ekstrand authored May 17, 2019
```
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
```
8339e3f0

ac/nir: mark some texture intrinsics as convergent · 73dda855

Rhys Perry authored May 29, 2019



Otherwise LLVM can sink them and their texture coordinate calculations
into divergent branches.

v2: simplify the conditions on which the intrinsic is marked as convergent
v3: only mark as convergent in FS and CS with derivative groups

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

73dda855

radv: fix some compiler warnings · d4a2f8b3

Rhys Perry authored May 30, 2019



Fixes -Woverflow warnings with GCC 9.1.1

v2: use a cast instead of a bitwise and

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

d4a2f8b3

intel/fs: Skip registers faster when setting spill costs · a84de3fb

Faith Ekstrand authored Jun 03, 2019

This might be slightly faster since we're doing one read rather than
two before we decide to skip.  The more important reason, however, is
that no_spill prevents us from re-spilling spill registers.  In the
new world in which we don't re-calculate liveness every spill, we may
not have valid liveness for spill registers so we shouldn't even look
their live ranges up.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110825


Fixes: e99081e7 "intel/fs/ra: Spill without destroying the..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>

a84de3fb

radeonsi/nir: Fix type in bindless address computation · d68218db

Connor Abbott authored May 24, 2019



Bindless handles in GL are 64-bit. This fixes an assert failure in LLVM.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

d68218db

etnaviv: implement set_active_query_state(..) for hw queries · a6e87998

Christian Gmeiner authored May 28, 2019



Clear w/ quad uses a normal draw, which adds to the occlusion query (OQ)
count. st/meta uses set_active_query_state(..) to tell the driver to
pause queries in such cases.

Fixes spec@arb_occlusion_query@occlusion_query_meta_save piglit.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>

a6e87998
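
The pause-and-resume behaviour can be modelled in a few lines. This is a sketch with hypothetical types (`sketch_ctx`, `sketch_draw`); it mimics the idea of the hook and is not the etnaviv implementation or the exact Gallium interface.

```c
#include <stdbool.h>

/* Hypothetical context with a single occlusion counter. */
struct sketch_ctx {
    bool queries_active;
    unsigned samples_passed;
};

/* Stand-in for the driver hook: pause or resume active queries. */
static void
sketch_set_active_query_state(struct sketch_ctx *ctx, bool enable)
{
    ctx->queries_active = enable;
}

/* A draw only contributes to the query while queries are active. */
static void
sketch_draw(struct sketch_ctx *ctx, unsigned samples)
{
    if (ctx->queries_active)
        ctx->samples_passed += samples;
}
```

A meta operation such as a quad clear would disable queries, issue its draw, and re-enable them, so its samples never reach the application's query result.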

radv: do not use gfx fast depth clears for layered depth/stencil images · 8a35eb06

Samuel Pitoiset authored Jun 03, 2019



The driver should only do fast depth clears with the graphics path
when the view covers all image layers, otherwise this might
corrupt layers when HTILE is enabled.

Cc: 19.0 19.1 mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

8a35eb06
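
The condition described above amounts to a simple predicate. The sketch below uses hypothetical `sketch_image` and `sketch_view` types and field names; the real radv structures and checks differ.

```c
#include <stdbool.h>

/* Hypothetical, minimal image and view descriptions. */
struct sketch_image {
    unsigned array_layers;
};

struct sketch_view {
    unsigned base_layer;
    unsigned layer_count;
};

/* True only when the view starts at layer 0 and spans every layer of
 * the image, the case in which a graphics fast clear is allowed here. */
static bool
sketch_view_covers_all_layers(const struct sketch_image *image,
                              const struct sketch_view *view)
{
    return view->base_layer == 0 &&
           view->layer_count == image->array_layers;
}
```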

ac,radv: do not emit vec3 for raw load/store on SI · 33f4e04d

Samuel Pitoiset authored Jun 03, 2019



It's unsupported; only format load/store is supported with vec3.

Fixes: 6970a9a6 ("ac,radv: remove the vec3 restriction with LLVM 9+")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

33f4e04d
