Commits · mesa-18.3.4 · Max Verevkin / mesa

Feb 18, 2019
- docs: add release notes for 18.3.4 · b26488de
  Emil Velikov authored 5 years ago
  
  Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
  mesa-18.3.4
  
  b26488de
- Update version to 18.3.4 · a41881fc
  Emil Velikov authored 5 years ago
  
  Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
  a41881fc
Feb 16, 2019

vc4: Fix copy-and-paste fail in backport of NEON asm fixes. · 55f3a4fa

Emma Anholt authored 5 years ago and

Emil Velikov committed 5 years ago

One of the cpu pointers wasn't marked as read-write, causing gcc to complain:

../src/gallium/drivers/vc4/vc4_tiling_lt.c:181:17: error: output operand constraint lacks ‘=’
                 __asm__ volatile (

Cc: Emil Velikov <emil.l.velikov@gmail.com>
Fixes: 813f0a82 ("vc4: Declare the cpu pointers as being modified in NEON asm.")

55f3a4fa
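
The constraint problem described in the vc4 fix above can be reproduced in a few lines. The following is a minimal sketch, not the vc4 tiling code: `touch_pointer` is a hypothetical helper and the asm body is left empty so that it builds on any GCC or Clang target. It shows the `"+r"` form, which tells the compiler that the asm statement both reads and writes the pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration only. The asm body is empty, so
 * nothing is actually modified; the point is the operand constraint.
 * "+r" declares `p` as a read-write register operand. Listing `p` among
 * the outputs with a plain "r" is what produces
 * "output operand constraint lacks '='". */
static inline const uint8_t *
touch_pointer(const uint8_t *p)
{
    __asm__ volatile ("" : "+r"(p) : : "memory");
    return p;
}
```

Real NEON code would advance the pointer inside the asm body (for example through post-increment addressing), which is why the compiler has to be told that the operand is modified.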

Feb 15, 2019

meson: Add dependency on genxml to anvil · d000488c

Dylan Baker authored 5 years ago and

Emil Velikov committed 5 years ago


Currently the Intel "anvil" driver races with the generation of genxml
files, while i965 has an explicit dependency. This patch adds the same
dependency to anvil.

Fixes: d1992255
       ("meson: Add build Intel "anv" vulkan driver")
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 279060cd)

d000488c

radv: always export gl_SampleMask when the fragment shader uses it · 4aa92b54

Samuel Pitoiset authored 5 years ago and

Emil Velikov committed 5 years ago

For some reason, this breaks tree rendering in Project Cars.

Fixes: 85010585 ("radv: only enable gl_SampleMask if MSAA is enabled too")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109401


Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 334da034)

4aa92b54

get-pick-list: Add --pretty=medium to the arguments for Cc patches · 08ab660b

Dylan Baker authored 5 years ago and

Emil Velikov committed 5 years ago


Because none of them have been picked up for 19.0 due to this bug
being reintroduced.

v2: - Fix fixes tags

Fixes: e6b3a3b2
       ("bin/get-pick-list.sh: handle "typod" usecase.")
Fixes: fac10169
       ("bin/get-pick-list.sh: prefix output with "[stable] "")
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit aff52dd2)

08ab660b

radeonsi: Fix guardband computation for large render targets · 4bb51927

Oscar Blumberg authored 5 years ago and

Emil Velikov committed 5 years ago


Stop using 12.12 quantization for viewports that are not contained in
the lower 4k corner of the render target as the hardware needs to keep
both absolute and relative coordinates representable.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3c540e0a)

4bb51927

anv/cmd_buffer: check for NULL framebuffer · 7662965c

Juan A. Suárez authored 5 years ago and

Emil Velikov committed 5 years ago


This can happen when we record a VkCmdDraw in a secondary buffer that
was created inheriting from the primary buffer, but with the framebuffer
set to NULL in the VkCommandBufferInheritanceInfo.

Vulkan 1.1.81 spec says that "the application must ensure (using scissor
if necessary) that all rendering is contained in the render area [...]
[which] must be contained within the framebuffer dimensions".

While this should be done by the application, commit 465e5a86 added the
clamp to the framebuffer size, in case the application does not do it.
But this requires knowing the framebuffer dimensions.

If we do not have a framebuffer at that moment, the best compromise we
can make is to just apply the scissor as it is, and let the application
ensure the rendering is contained in the render area.

v2: do not clamp to framebuffer if there isn't a framebuffer

v3 (Jason):
- clamp earlier in the conditional
- clamp to render area if command buffer is primary

v4: clamp also x and y to render area (Jason)

v5: rename used variables (Jason)

Fixes: 465e5a86 ("anv: Clamp scissors to the framebuffer boundary")
CC: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 1ad26f94)

7662965c
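
The NULL-framebuffer handling described in the anv commit above can be sketched as follows. This is an illustration under assumptions, not the anv implementation: `struct rect`, `struct fb_size` and `clamp_scissor` are hypothetical names, the maximum coordinates are treated as exclusive bounds, and the render-area clamp for primary command buffers mentioned in v3 is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; these are not anv's structs.
 * x_max and y_max are exclusive bounds. */
struct rect { uint32_t x_min, y_min, x_max, y_max; };
struct fb_size { uint32_t width, height; };

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }

/* Clamp the scissor to the framebuffer only when one is known.
 * With fb == NULL (for example a secondary command buffer whose
 * inheritance info carries no framebuffer) the scissor is returned
 * unchanged and the application is trusted to stay in bounds. */
static struct rect
clamp_scissor(struct rect s, const struct fb_size *fb)
{
    if (fb != NULL) {
        s.x_max = min_u32(s.x_max, fb->width);
        s.y_max = min_u32(s.y_max, fb->height);
        s.x_min = min_u32(s.x_min, s.x_max);
        s.y_min = min_u32(s.y_min, s.y_max);
    }
    return s;
}
```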

cherry-ignore: radv: fix using LOAD_CONTEXT_REG with old GFX ME firmwares on GFX8 · 6cea56e2
Emil Velikov authored 5 years ago
```
stable The commit addresses functionality not present in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
```
6cea56e2

intel: Add more PCI Device IDs for Coffee Lake and Ice Lake. · 5b48a260

Rodrigo Vivi authored 5 years ago and

Emil Velikov committed 5 years ago


Align with kernel commits:

5e0f5a58b167 ("drm/i915/cfl: Adding another PCI Device ID.")
03ca3cf8e9aa ("drm/i915/icl: Adding few more device IDs for Ice Lake")

Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 56c3b497)

5b48a260

egl/wayland-drm: Only announce formats via wl_drm which the driver supports. · d3f49ece

Mario Kleiner authored 6 years ago and

Emil Velikov committed 5 years ago


Check if a pixel format is supported by the Wayland server's gpu driver
before exposing it to the client via wl_drm, so we avoid reporting formats
to the client which the server gpu can't handle.

Restrict this reporting to the new color depth 30 formats for now, as the
ARGB/XRGB8888 and RGB565 formats are probably supported by every gpu under
the sun.

At the moment this is mostly useful to allow proper PRIME renderoffload for
depth 30 formats on the typical Intel iGPU + NVidia dGPU "NVidia Optimus"
laptop combo.

Tested on Intel, AMD, NVidia with single-gpu setup and on an Intel + NVidia
Optimus setup.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 820dfcea)

d3f49ece

egl/wayland: Allow client->server format conversion for PRIME offload. (v2) · ecad528a

Mario Kleiner authored 6 years ago and

Emil Velikov committed 5 years ago


Support PRIME render offload between a Wayland server gpu and a Wayland
client gpu with different channel ordering for their color formats,
e.g., between Intel drivers which currently only support ARGB2101010
and XRGB2101010 import/display and nouveau which only supports ABGR2101010
rendering and display on nv-50 and later.

In the wl_visuals table, we also store for each format an alternate
sibling format which stores colors at the same precision, but with
different channel ordering, e.g., ARGB2101010 <-> ABGR2101010.

If a given client-gpu renderable format is not supported by the server
for import, but the alternate format is supported by the server, expose
the client-gpu renderable format as a valid EGLConfig to the client. At
eglSwapBuffers time, during the blitImage() detiling blit from the client
backbuffer to the linear buffer, the client format is converted to the
server supported format. As we have to do a copy for PRIME anyway,
this channel swizzling conversion comes essentially for free.

Note that even if a server gpu in principle does support sampling
from the client's native format, this conversion will be a performance
advantage if it allows converting to the server's preferred format
for direct scanout, as the Wayland compositor may then be able to
directly page-flip a fullscreen client wl_buffer onto the primary
plane, or onto a hardware overlay plane, avoiding an extra data copy
for desktop composition.

Tested so far under Weston with: nouveau single-gpu, Intel single-gpu,
AMD single-gpu, "Optimus" Intel server iGPU for display + NVidia
client dGPU for rendering.

v2: Implement minor review comments by Eric Engestrom: Add some
    comment and assert, and some style fixes for clarity.
    No functional change.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit a34b0d68)

ecad528a
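
The ARGB2101010 <-> ABGR2101010 sibling relationship from the commit above comes down to swapping two 10-bit fields. The sketch below does this on the CPU for a single pixel purely as an illustration; in the commit the conversion happens on the GPU during the blitImage() copy. The function name is hypothetical, and the bit layout assumed is the packed 32-bit one (alpha in bits 31:30, then three 10-bit channels).

```c
#include <assert.h>
#include <stdint.h>

/* Convert one 32-bit pixel between ARGB2101010 and ABGR2101010.
 * Both layouts keep alpha in bits 31:30 and green in bits 19:10;
 * only the 10-bit red and blue fields trade places, so the same
 * function converts in either direction. */
static uint32_t
swap_red_blue_2101010(uint32_t px)
{
    uint32_t keep = px & 0xC00FFC00u;     /* alpha and green stay put */
    uint32_t hi   = (px >> 20) & 0x3FFu;  /* field in bits 29:20 */
    uint32_t lo   = px & 0x3FFu;          /* field in bits 9:0 */
    return keep | (lo << 20) | hi;
}
```

Because the operation is its own inverse, applying it twice returns the original pixel, which is consistent with the two formats storing colors at the same precision.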

intel/compiler: do not copy-propagate strided regions to ddx/ddy arguments · f036a040

Iago Toral authored 6 years ago and

Emil Velikov committed 5 years ago


The implementation of these opcodes in the generator assumes that their
arguments are packed, and it generates register regions based on that
assumption.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 39189432)

f036a040

radv: fix compiler issues with GCC 9 · 5694279c

Samuel Pitoiset authored 5 years ago and

Emil Velikov committed 5 years ago

"The C standard says that compound literals which occur inside of
the body of a function have automatic storage duration associated
with the enclosing block. Older GCC releases were putting such
compound literals into the scope of the whole function, so their
lifetime actually ended at the end of containing function. This
has been fixed in GCC 9. Code that relied on this extended lifetime
needs to be fixed, move the compound literals to whatever scope
they need to accessible in."

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109543


Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 129a9f49)

5694279c
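
The lifetime rule quoted in the radv commit above can be illustrated with a small sketch. The names `struct extent` and `pick_width` are hypothetical and unrelated to the radv sources; the commented-out fragment shows the pattern that breaks, and the function shows one way to give the storage the scope it needs.

```c
#include <assert.h>

struct extent { int width, height; };

/* Problematic with GCC 9: the compound literal lives only inside the
 * inner block, so `e` dangles once that block ends.
 *
 *     const struct extent *e = requested;
 *     if (use_default) {
 *         e = &(struct extent){ 64, 64 };
 *     }
 *     return e->width;    // reads storage whose lifetime has ended
 */

/* Fixed: the fallback object is declared in the scope where the
 * pointer to it is actually used. */
static int
pick_width(int use_default, const struct extent *requested)
{
    struct extent fallback = { 64, 64 };  /* lives for the whole function */
    const struct extent *e = use_default ? &fallback : requested;
    return e->width;
}
```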

st/mesa: Limit GL_MAX_[NATIVE_]PROGRAM_PARAMETERS_ARB to 2048 · 75340edb

Kenneth Graunke authored 5 years ago and

Emil Velikov committed 5 years ago


Piglit's vp-max-array test creates a vertex program containing a uniform
array sized to the value of GL_MAX_NATIVE_PROGRAM_PARAMETERS_ARB.  Mesa
will then add additional state-var parameters for things like the MVP
matrix.

radeonsi currently exposes a value of 4096, derived from constant buffer
upload size.  This means the array will have 4096 elements, and the
extra MVP state-vars would get a prog_src_register::Index of 4096 or more.

Unfortunately, prog_src_register::Index is a signed 13-bit integer, so
values of 4096 and above end up turning into negative numbers.  Negative
source indexes are only valid for relative addressing, so this ends up
generating illegal IR.

In prog_to_nir, this would cause an out of bounds array access.
st_mesa_to_tgsi checks for a negative value, assumes it's bogus,
and remaps it to parameter 0 in order to get something in-range.
This isn't right - instead of reading the MVP matrix, it would read
the first element of the vertex program's large array.  But the test
only checks that the program compiles, so we never noticed that it
was broken.

This patch lowers the program parameter limits, with the understanding
that we may need to generate additional state-vars internally.  i965 has
exposed 1024 for this limit for years, so I don't expect lowering it to
2048 will cause any practical problems for radeonsi or other drivers.

Fixes vp-max-array with prog_to_nir.c.

Cc: "19.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit f45dd6d3)

75340edb
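
The index wraparound described in the st/mesa commit above is plain two's-complement arithmetic on a 13-bit field, whose representable range is -4096 to 4095. The helper below is a hypothetical sketch that computes the wrapped value explicitly rather than relying on how a particular compiler converts out-of-range values into a signed bitfield.

```c
#include <assert.h>

/* Interpret the low 13 bits of `value` as a two's-complement signed
 * integer. This models what typically happens when an index is stored
 * in a signed 13-bit field. Representable range: -4096 .. 4095. */
static int
as_signed_13bit(unsigned value)
{
    unsigned low = value & 0x1FFFu;         /* keep 13 bits */
    return (int)(low ^ 0x1000u) - 0x1000;   /* sign-extend bit 12 */
}
```

With a 4096-element array occupying indices 0 to 4095, the first extra state-var lands on index 4096, which this model maps to -4096. With the limit lowered to 2048 there is room for 2048 further parameters before the field overflows.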

st/va/vp9: set max reference as default of VP9 reference number · dafa02c9

Leo Liu authored 5 years ago and

Emil Velikov committed 5 years ago


This applies if there is no information about the number of render targets.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a0a52a03)

dafa02c9

st/va: fix the incorrect max profiles report · 36258308

Leo Liu authored 5 years ago and

Emil Velikov committed 5 years ago

Add "PIPE_VIDEO_PROFILE_MAX" to the enum so that the reported maximum
stays correct when more profiles are added in the future.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109107



Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 21cdb828)

36258308

winsys/amdgpu: don't drop manually added fence dependencies · f1eccd09

Marek Olšák authored 5 years ago and

Emil Velikov committed 5 years ago


Wow, it's hard to believe that fence and syncobj dependencies were ignored.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit ddfe209a)

f1eccd09

radeonsi: fix EXPLICIT_FLUSH for flush offsets > 0 · 945aa874

Marek Olšák authored 5 years ago and

Emil Velikov committed 5 years ago


Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 61c678d4)

945aa874

gallium/u_threaded: fix EXPLICIT_FLUSH for flush offsets > 0 · b3b0a97f

Marek Olšák authored 5 years ago and

Emil Velikov committed 5 years ago


Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 4522f01d)

b3b0a97f

nir/deref: Rematerialize parents in rematerialize_derefs_in_use_blocks · 35459869

Faith Ekstrand authored 5 years ago and

Emil Velikov committed 5 years ago

When nir_rematerialize_derefs_in_use_blocks_impl was first written, I
attempted to optimize things a bit by not bothering to re-materialize
the sources of deref instructions figuring that the final caller would
take care of that.  However, in the case of more complex deref chains
where the first link or two lives in block A and then another link and
the load/store_deref intrinsic live in block B it doesn't work.  The
code in rematerialize_deref_in_block looks at the tail of the chain,
sees that it's already in block B and skips it, not realizing that part
of the chain also lives in block A.

The easy solution here is to just rematerialize deref sources of deref
instructions as well.  This may potentially lead to a few more deref
instructions being created but the conditions required for that to
actually happen are fairly unlikely and, thanks to the caching, it's all
linear time regardless.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109603


Fixes: 7d1d1208 "nir: Add a small pass to rematerialize derefs per-block"
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit 9e6a6ef0)

35459869

nvc0: we have 16k-sized framebuffers, fix default scissors · a9c0e146

Ilia Mirkin authored 5 years ago and

Emil Velikov committed 5 years ago


For some reason we don't use view volume clipping by default, and use
scissors instead. These scissors were set to an 8k max fb size, while
the driver advertises 16k-sized framebuffers.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cc79a148)

a9c0e146

cherry-ignore: add more 19.0 only nominations from Ilia · 541eb984
Emil Velikov authored 5 years ago
```
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
```
541eb984

Feb 14, 2019

freedreno/a6xx: Emit blitter dst with OUT_RELOCW · fb63b1b3

Kristian Høgsberg authored 5 years ago and

Emil Velikov committed 5 years ago


We're writing to the bo and the kernel needs to know for
fd_bo_cpu_prep() to work.

Fixes: f93e4312 ("freedreno/a6xx: Enable blitter")
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
(cherry picked from commit 357ea7da)

fb63b1b3

amd/common: Use correct writemask for shared memory stores. · 08834a37

Bas Nieuwenhuizen authored 6 years ago and

Emil Velikov committed 5 years ago


The check was for 1 bit being set, which is clearly not what we want.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 3c24fc64)

08834a37

radv: Only look at pImmutableSamples if the descriptor has a sampler. · f04d57ff

Bas Nieuwenhuizen authored 6 years ago and

Emil Velikov committed 5 years ago


Equivalent of ANV patch c7f4a286

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 39ab4e12)

f04d57ff

xvmc: fix string comparison · 45c3bf14

Eric Engestrom authored 5 years ago and

Emil Velikov committed 5 years ago


Fixes: 6fca1869 "g3dvl: Update XvMC unit tests."
Cc: Younes Manton <younes.m@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 40b53a72)

45c3bf14

xvmc: fix string comparison · 2180aa1b

Eric Engestrom authored 5 years ago and

Emil Velikov committed 5 years ago


Fixes: c7b65dca "xvmc: Define some Xv attribs to allow users
                             to specify color standard and procamp"
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 110a6e18)

2180aa1b

gallium-xlib: query MIT-SHM before using it. · fdb66dd1

Bart Oldeman authored 5 years ago and

Emil Velikov committed 5 years ago

When Mesa is compiled for gallium-xlib using e.g.
./configure --enable-glx=gallium-xlib --disable-dri --disable-gbm
--disable-egl
and is used by an X server (usually remotely via SSH X11 forwarding)
that does not support MIT-SHM such as XMing or MobaXterm, OpenGL
clients report error messages such as
Xlib:  extension "MIT-SHM" missing on display "localhost:11.0".
ad infinitum.

The reason is that the code in src/gallium/winsys/sw/xlib uses
MIT-SHM without checking for its existence, unlike the code
in src/glx/drisw_glx.c and src/mesa/drivers/x11/xm_api.c.
I copied the same check using XQueryExtension, and tested with
glxgears on MobaXterm.

This issue was reported before here:
https://lists.freedesktop.org/archives/mesa-users/2016-July/001183.html



Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a203eaa4)

fdb66dd1

cherry-ignore: nv50,nvc0: add explicit settings for recent caps · e868c776
Emil Velikov authored 5 years ago
```
stable Explicit 19.0 only nomination.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
```
e868c776

Feb 12, 2019

meson: drop the xcb-xrandr version requirement · a19ddce9

Marek Olšák authored 6 years ago and

Emil Velikov committed 5 years ago


autotools doesn't have any requirement. This fixes meson on Ubuntu 16.04.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 1e85cfb9)

a19ddce9

intel/fs: Do the grf127 hack on SIMD8 instructions in SIMD16 mode · 7bf9cf29

Faith Ekstrand authored 6 years ago and

Emil Velikov committed 5 years ago


Previously, we only applied the fix to shaders with a dispatch mode of
SIMD8 but the code it relies on for SIMD16 mode only applies to SIMD16
instructions.  If you have a SIMD8 instruction in a SIMD16 shader,
neither would trigger and the restriction could still be hit.

Fixes: 232ed898 "i965/fs: Register allocator shoudn't use grf127..."
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b4f0d062)

7bf9cf29

v3d: Fix leak in resource setup error path · e0eba40a

Ernestas Kulik authored 6 years ago and

Emil Velikov committed 5 years ago


Reported by Coverity: in the case of unsupported modifier request, the
code does not jump to the “fail” label to destroy the acquired resource.

CID: 1435704
Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com>
Fixes: 45bb8f29 ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.")
(cherry picked from commit 90458bef)

e0eba40a
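
The leak pattern reported by Coverity in the v3d commit above is the usual early return that bypasses a cleanup label. The sketch below uses hypothetical names (`resource_create`, `live_resources`) and a simple counter to make the leak observable; it is not the v3d code. The vc4 query fix that follows in this log addresses the same kind of mistake with its own err_free_query label.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static int live_resources;   /* allocations that have not been freed yet */

struct resource { int modifier; };

static struct resource *
resource_create(int modifier, bool modifier_supported)
{
    struct resource *rsc = calloc(1, sizeof(*rsc));
    if (!rsc)
        return NULL;
    live_resources++;

    if (!modifier_supported)
        goto fail;   /* an early `return NULL` here would leak rsc */

    rsc->modifier = modifier;
    return rsc;

fail:
    free(rsc);
    live_resources--;
    return NULL;
}
```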

vc4: Fix leak in HW queries error path · 1a2b227f

Ernestas Kulik authored 6 years ago and

Emil Velikov committed 5 years ago


Reported by Coverity: in the case where there exist hardware and
non-hardware queries, the code does not jump to err_free_query and leaks
the query.

CID: 1430194
Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com>
Fixes: 9ea90ffb ("broadcom/vc4: Add support for HW perfmon")
(cherry picked from commit f6e49d5a)

1a2b227f

intel/fs: Handle IMAGE_SIZE in size_read() and is_send_from_grf() · 6beaa2d7

Faith Ekstrand authored 6 years ago and

Emil Velikov committed 5 years ago


Like all the other sends, it's just mlen * REG_SIZE.

Fixes: 3cbc02e4 "intel: Use TXS for image_size when we have..."
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit cf42b0f9)

6beaa2d7

freedreno: stop frob'ing pipe_resource::nr_samples · 434f19a8

Rob Clark authored 6 years ago and

Emil Velikov committed 5 years ago


Previously we tried to normalize nr_samples to MAX2(1, nr_samples) to
avoid having to deal with 0 vs 1 everywhere.  But this causes problems
in mesa/st, for example st_finalize_texture() will think there is a
nr_samples mismatch and recreate the texture.  Somehow this manifests
as corrupt x11 font rendering on generations that do not support MSAA
(but apparently works fine on a5xx and a6xx which do support MSAA.)

Fixes: cf0c7258 freedreno/a5xx: MSAA
Signed-off-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit c3baa077)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/freedreno/freedreno_batch_cache.c

434f19a8
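
The nr_samples problem from the freedreno commit above can be sketched with two small helpers. Everything here is hypothetical: `struct resource_template` stands in for the relevant field of pipe_resource, `samples_match` stands in for the kind of strict comparison the commit attributes to st_finalize_texture(), and `effective_samples` is one possible way to treat 0 and 1 alike without rewriting the stored value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant field of pipe_resource. */
struct resource_template { unsigned nr_samples; };

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Treat 0 and 1 as the same single-sampled case at the point of use,
 * leaving the stored value exactly as the caller provided it. */
static unsigned
effective_samples(const struct resource_template *t)
{
    return MAX2(1u, t->nr_samples);
}

/* A strict field comparison: 0 and 1 count as different values. */
static bool
samples_match(const struct resource_template *a,
              const struct resource_template *b)
{
    return a->nr_samples == b->nr_samples;
}
```

If the driver overwrites a requested value of 0 with 1, a strict comparison against the original request no longer matches, which is the mismatch the commit message describes.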

Jan 31, 2019

docs: add sha256 checksums for 18.3.3 · 7475d772
Emil Velikov authored 6 years ago
```
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
```
7475d772
docs: add release notes for 18.3.3 · 190a79f4
Emil Velikov authored 6 years ago
```
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
```
mesa-18.3.3

190a79f4

glsl: Fix copying function's out to temp if dereferenced by array · 871aea89

Danylo Piliaiev authored 6 years ago and

Emil Velikov committed 6 years ago


Function's out variable could be an array dereferenced by an array:
 func(v[w[i]]);
or something more complicated.

Copy index in any case.

Fixes: 76c27e47 ("glsl: Copy function out to temp if we don't directly ref a variable")

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0862929b)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109488


Nominated-by: Matt Turner <mattst88@gmail.com>

871aea89

glsl: Copy function out to temp if we don't directly ref a variable · f2c1d7ac

Timothy Arceri authored 6 years ago and

Emil Velikov committed 6 years ago


Otherwise we can end up with IR that looks like this:

    (
      (declare (temporary ) vec4 f@8)
      (assign  (xyzw) (var_ref f@8)  (var_ref f) )
      (call f16  ((swiz y (var_ref f@8) )))

      (assign  (xyzw) (var_ref f)  (var_ref f@8) )
    ))

When we really need:

      (declare (temporary ) float inout_tmp)
      (assign  (x) (var_ref inout_tmp)  (swiz y (var_ref f) ))
      (call f16  ((var_ref inout_tmp) ))

      (assign  (y) (var_ref f)  (swiz y (swiz xxxx (var_ref inout_tmp) )))
      (declare (temporary ) void void_var)

The GLSL IR function inlining code seemed to produce correct code
even without this but we need the correct IR for GLSL IR -> NIR to
be able to understand what's going on.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 76c27e47)
Nominated-by: Matt Turner <mattst88@gmail.com>

f2c1d7ac
