Commits · mesa-19.2.7 · Trigger Huang / mesa

Dec 04, 2019

VERSION: bump version for 19.2.7 · 65d255cd
Dylan Baker authored 5 years ago

mesa-19.2.7

65d255cd

docs: Add release notes for 19.2.7 · d8e767ed
Dylan Baker authored 5 years ago

d8e767ed

radv: set writes_memory for global memory stores/atomics · 4a0199b6

Rhys Perry authored 5 years ago and

Dylan Baker committed 5 years ago


Fixes: 13ab63bb ('radv: Implement VK_EXT_buffer_device_address.')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 35fab1ba)

4a0199b6

radv: fix compute pipeline keys when optimizations are disabled · 3ed8c942

Samuel Pitoiset authored 5 years ago and

Dylan Baker committed 5 years ago


If an app first creates a compute pipeline with
VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT set, then re-compiles it
without that flag, the driver should re-compile the compute shader.
Otherwise, it will return the unoptimized one.
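
A minimal sketch of why the flag has to take part in the pipeline key. The struct, field, and function names are hypothetical and are not radv's actual code; only the idea (an optimized and an unoptimized compile of the same shader must not share a key) is taken from the message above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT (0x1 in Vulkan). */
#define CREATE_DISABLE_OPTIMIZATION_BIT 0x00000001u

/* Hypothetical compute pipeline key, not radv's real structure. */
struct compute_pipeline_key {
    uint32_t shader_hash;            /* stand-in for the shader module hash */
    bool     optimizations_disabled; /* must be part of the key */
};

static struct compute_pipeline_key
make_key(uint32_t shader_hash, uint32_t create_flags)
{
    struct compute_pipeline_key key;
    key.shader_hash = shader_hash;
    key.optimizations_disabled =
        (create_flags & CREATE_DISABLE_OPTIMIZATION_BIT) != 0;
    return key;
}

/* Two keys hit the same cache entry only if every field matches. */
static bool
keys_equal(struct compute_pipeline_key a, struct compute_pipeline_key b)
{
    return a.shader_hash == b.shader_hash &&
           a.optimizations_disabled == b.optimizations_disabled;
}
```

With the flag in the key, the second pipeline creation misses the cache and triggers a fresh, optimized compile.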

Fixes: ce188813 ("radv: add initial support for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 9ab27647)

3ed8c942

radv/gfx10: fix implementation of exclusive scans · 5c98b365

Samuel Pitoiset authored 5 years ago and

Dylan Baker committed 5 years ago

This implementation is loosely based on ROCm.
https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl



This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive* on GFX10.

Fixes: 227c29a8 ("amd/common/gfx10: implement scan & reduce operations")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit c9aa8439)
Conflicts resolved by Dylan Baker

5c98b365

radv: fix enabling sample shading with SampleID/SamplePosition · a3869c14

Samuel Pitoiset authored 5 years ago and

Dylan Baker committed 5 years ago


When a fragment shader includes an input variable decorated with
SampleId or SamplePosition, sample shading should be enabled
because minSampleShadingFactor is expected to be 1.0.

Cc: 19.2, 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 86a5fbfd)
Conflicts resolved by Dylan Baker

a3869c14

meson: Fix linkage of libgallium_nine with libgalliumvl · bda6890f

Yevhenii Kolesnikov authored 5 years ago and

Dylan Baker committed 5 years ago

Do not link libgallium_nine with libgalliumvl_stub if it's already
linked with libgalliumvl. Linking with the stub leads to "duplicate
symbol" errors.

Fixes: 6b4c7047 ("meson: build gallium nine state_tracker")
Closes: mesa/mesa#2040



Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 9af22ccd)
Conflicts resolved by Dylan Baker

bda6890f

anv: Set up SBE_SWIZ properly for gl_Viewport · 2bf47550

Faith Ekstrand authored 5 years ago


gl_Viewport is also in the VUE header so we need to whack the read
offset to 0 and emit a default (no overrides) SBE_SWIZ entry in that
case as well.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit b1f37688)

2bf47550

i965: update Makefile.sources for perf changes · b4e83559

Jonathan Gray authored 5 years ago and

Dylan Baker committed 5 years ago


brw_performance_query_metrics.h was removed in 134e750e and
brw_performance_query.h was removed in 8ae66679.

Remove references to these files from Makefile.sources.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Fixes: 134e750e ("i965: extract performance query metrics")
Fixes: 8ae66679 ("intel/perf: move query_object into perf")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 34dda0ca)

b4e83559

gallium: Fix the ->set_damage_region() implementation · a39d364a

Boris Brezillon authored 5 years ago and

Dylan Baker committed 5 years ago


BACK_LEFT attachment can be outdated when the user calls
KHR_partial_update() (->lastStamp != ->texture_stamp), leading to a
damage region update on the wrong pipe_resource object.
Let's delay the ->set_damage_region() call until the attachments are
updated when we're in that case.

Reported-by: Carsten Haitzler <raster@rasterman.com>
Fixes: 492ffbed ("st/dri2: Implement DRI2bufferDamageExtension")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b196e1a8)

a39d364a

winsys/amdgpu: avoid double simple_mtx_unlock() · c166435d

Jonathan Gray authored 5 years ago and

Dylan Baker committed 5 years ago


Calling pthread_mutex_unlock() on an unlocked mutex is documented by
POSIX as undefined behaviour.  On OpenBSD pthread_mutex_unlock() will
call abort(3) if this happens.

This occurs in amdgpu_winsys_create() after
cb446dc0 ("winsys/amdgpu: Add amdgpu_screen_winsys").

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 3fe3bde4)

c166435d

radv: Unify max_descriptor_set_size. · 52a8d43a

Bas Nieuwenhuizen authored 5 years ago and

Dylan Baker committed 5 years ago


They were out of sync. Besides syncing, let's ensure they never diverge
again.

Fixes: 8d2654a4 "radv: Support VK_EXT_inline_uniform_block."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4cde0e04)

52a8d43a

radv: Allocate cmdbuffer space for buffer marker write. · 2e379b0a

Bas Nieuwenhuizen authored 5 years ago and

Dylan Baker committed 5 years ago


Fixes: 946193ae "radv: add support for VK_AMD_buffer_marker"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 25bc9102)

2e379b0a

Revert "draw: revert using correct order for prim decomposition." · 336f59f8

Zeb Figura authored 5 years ago and

Dylan Baker committed 5 years ago

This reverts commit f97b731c.

Closes: mesa/mesa#250



Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit a3c8bc10)

336f59f8

intel/fs: Disable conditional discard optimization on Gen4 and Gen5 · 38c8af9e

Ian Romanick authored 5 years ago and

Dylan Baker committed 5 years ago


The CMP instruction on Gen4 and Gen5 generates one bit (the LSB) of
valid data and 31 bits of junk.  Results of comparisons that are used as
Boolean values need to have a fixup applied to generate the proper 0/~0
values.

Calling fs_visitor::nir_emit_alu with need_dest=false prevents the fixup
code from being generated.  This results in a sequence like:

        cmp.l.f0.0(16)  g8<1>F          g14<8,8,1>F     0x0F  /* 0F */
        ...
        cmp.l.f0.0(16)  g4<1>F          g6<8,8,1>F      0x0F  /* 0F */
(+f0.1) or.z.f0.1(16) null<1>UD g4<8,8,1>UD     g8<8,8,1>UD

instead of

        cmp.l.f0.0(16)  g8<1>F          g14<8,8,1>F     0x0F  /* 0F */
        ...
        cmp.l.f0.0(16)  g4<1>F          g6<8,8,1>F      0x0F  /* 0F */
        or(16) g4<1>UD g4<8,8,1>UD     g8<8,8,1>UD
(+f0.1) and.z.f0.1(16) null<1>UD g4<8,8,1>UD     1UD

I examined a couple of the shaders hurt by this change, and ALL of them
would have been affected by this bug. :(
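
The difference between the two instruction sequences can be modelled in C. This is only a sketch of the bit-level behaviour described above (one valid LSB, 31 junk bits); the function names are made up for illustration and do not correspond to anything in the compiler.

```c
#include <stdint.h>

/* Normalize a Gen4/Gen5-style comparison result: only bit 0 is valid. */
static uint32_t
fixup_bool(uint32_t raw_cmp)
{
    return (uint32_t)0 - (raw_cmp & 1u);   /* 0 -> 0, 1 -> ~0 */
}

/* Models the buggy sequence: the OR result is tested with junk bits included. */
static int
either_true_buggy(uint32_t raw_a, uint32_t raw_b)
{
    return (raw_a | raw_b) != 0;
}

/* Models the fixed sequence: AND with 1 before testing the result. */
static int
either_true_fixed(uint32_t raw_a, uint32_t raw_b)
{
    return ((raw_a | raw_b) & 1u) != 0;
}
```

When both comparisons are false but their upper bits hold junk, the buggy form reports "true" while the fixed form reports "false".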

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Closes: mesa/mesa#1836
Fixes: 0ba9497e ("intel/fs: Improve discard_if code generation")

Iron Lake
total instructions in shared programs: 8122757 -> 8122957 (<.01%)
instructions in affected programs: 8307 -> 8507 (2.41%)
helped: 0
HURT: 100
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.84% max: 6.67% x̄: 2.81% x̃: 2.76%
95% mean confidence interval for instructions value: 2.00 2.00
95% mean confidence interval for instructions %-change: 2.58% 3.03%
Instructions are HURT.

total cycles in shared programs: 188510100 -> 188510376 (<.01%)
cycles in affected programs: 76018 -> 76294 (0.36%)
helped: 0
HURT: 55
HURT stats (abs)   min: 2 max: 12 x̄: 5.02 x̃: 4
HURT stats (rel)   min: 0.07% max: 3.75% x̄: 0.86% x̃: 0.56%
95% mean confidence interval for cycles value: 4.33 5.71
95% mean confidence interval for cycles %-change: 0.60% 1.12%
Cycles are HURT.

GM45
total instructions in shared programs: 4994403 -> 4994503 (<.01%)
instructions in affected programs: 4212 -> 4312 (2.37%)
helped: 0
HURT: 50
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.84% max: 6.25% x̄: 2.76% x̃: 2.72%
95% mean confidence interval for instructions value: 2.00 2.00
95% mean confidence interval for instructions %-change: 2.45% 3.07%
Instructions are HURT.

total cycles in shared programs: 128928750 -> 128928982 (<.01%)
cycles in affected programs: 67442 -> 67674 (0.34%)
helped: 0
HURT: 47
HURT stats (abs)   min: 2 max: 12 x̄: 4.94 x̃: 4
HURT stats (rel)   min: 0.09% max: 3.75% x̄: 0.75% x̃: 0.53%
95% mean confidence interval for cycles value: 4.19 5.68
95% mean confidence interval for cycles %-change: 0.50% 1.00%
Cycles are HURT.

(cherry picked from commit e51eda99)

38c8af9e

Nov 22, 2019

VERSION: bump to 19.2.6 · 5836dd66
Dylan Baker authored 5 years ago

mesa-19.2.6

5836dd66

docs: Add release notes for 19.2.6 · 264d1187
Dylan Baker authored 5 years ago

264d1187

Nov 21, 2019

meson: generate .pc files for gles and gles2 with old glvnd · aa620fdf

Dylan Baker authored 5 years ago

Closes: mesa/mesa#1921

aa620fdf

glsl: Enable textureSize for samplerExternalOES · 05d5784e

Yevhenii Kolesnikov authored 5 years ago and

Dylan Baker committed 5 years ago

From OES_EGL_image_external_essl3

Closes: mesa/mesa#1901



Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>

05d5784e

llvmpipe/ppc: fix if/ifdef confusion in backport. · b1f50546

Dave Airlie authored 5 years ago and

Dylan Baker committed 5 years ago


Fixes: 32aba91c (llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders)
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Closes: mesa/mesa#2131

b1f50546

freedreno/ir3: fix printing output registers of FS. · c2488d81

Hyunjun Ko authored 5 years ago and

Dylan Baker committed 5 years ago


Fixes: cea39af2 ("freedreno/ir3: Generalize ir3_shader_disasm()")

Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit d0f38394)

c2488d81

v3d: adds an extra MOV for any sig.ld* · e594e4ce

Alejandro Piñeiro authored 5 years ago and

Dylan Baker committed 5 years ago


Specifically when we are in non-uniform control flow, as we would need
to set the condition for the last instruction. If (for example) an
image atomic load stores its value directly in a NIR register,
last_inst would be a nop, and setting the condition on it would fail.

Fixes piglit test:
spec/glsl-es-3.10/execution/cs-ssbo-atomic-if-else-2.shader_test

Fixes: 6281f26f ("v3d: Add support for shader_image_load_store.")

v2: (Changes suggested by Eric Anholt)
   * Cover all sig.ld* signals, not just ldunif and ldtmu, as all of
     them have the same restriction.
   * Update comment explaining why we add a MOV in that case
   * Tweak commit message.

v3:
   * Drop extra set of parens (Eric)
   * Add missing ld signal to is_ld_signal to fix shader-db regression.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit b4bc59e3)

e594e4ce

v3d: Fix predication with atomic image operations · 8f526ee7

José María Casanova Crespo authored 5 years ago and

Dylan Baker committed 5 years ago


Fixes dEQP test:
dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.image_atomic_multiple_interleaved_write_read

Fixes piglit test:
spec/glsl-es-3.10/execution/cs-image-atomic-if-else.shader_test

Fixes: 6281f26f ("v3d: Add support for shader_image_load_store.")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit d9830551)

8f526ee7

vulkan: delete typo'd header · be8a46d0

Eric Engestrom authored 5 years ago

Two files exist in that directory:
- vulkan_xlib_randr.h
- vulkan_xlib_xrandr.h

Both were imported in 205c2715 ("vulkan: Update the XML and
headers to 1.1.70") with identical contents (ie. the
VK_EXT_acquire_xlib_display extension), but the former was never
included anywhere and can't be found upstream [1], while the latter is
included in vulkan.h and found upstream.

[1] https://github.com/KhronosGroup/Vulkan-Headers/tree/master/include/vulkan



Fixes: 205c2715 ("vulkan: Update the XML and headers to 1.1.70")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 344859c3)

be8a46d0

Nov 20, 2019

docs/relnotes/19.2.5: Add SHA256 sum · 4c86eda2
Dylan Baker authored 5 years ago

4c86eda2

VERSION: bump for 19.2.5 · f418e923
Dylan Baker authored 5 years ago

mesa-19.2.5

f418e923

docs: Add relnotes for 19.2.5 · 9e0a0d2e
Dylan Baker authored 5 years ago

9e0a0d2e

anv: Stop bounds-checking pushed UBOs · e10851ff

Faith Ekstrand authored 5 years ago


The bounds checking is actually less safe than just pushing the data.
If the bounds checking actually ever kicks in and it's not on the last
UBO push range, then the shrinking will cause all subsequent ranges to
be pushed to the wrong place in the GRF.  One of the behaviors we
definitely don't want is for OOB UBO access to result in completely
unrelated UBOs returning garbage values.  It's safer to just push the
UBOs as-requested.  If we're really concerned about robustness, we can
emit shader code to do bounds checking which should be stupid cheap (a
CMP followed by SEL).
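
A rough sketch of both points, under the assumption that push ranges are packed back to back so each range starts where the previous one ends. The constant and function names are hypothetical and are not anv's code.

```c
#include <stdint.h>

#define MAX_PUSH_RANGES 4  /* hypothetical limit for this sketch */

/* Register offset at which range `idx` lands when ranges are packed
 * back to back: the sum of the lengths of all earlier ranges. */
static uint32_t
range_start(const uint32_t lengths[MAX_PUSH_RANGES], int idx)
{
    uint32_t start = 0;
    for (int i = 0; i < idx; i++)
        start += lengths[i];
    return start;
}

/* Shader-side alternative: one compare plus one select.
 * Out-of-bounds reads yield 0 and nothing else moves. */
static uint32_t
bounded_load(const uint32_t *ubo, uint32_t size_dwords, uint32_t index)
{
    uint32_t in_bounds = index < size_dwords;   /* CMP */
    return in_bounds ? ubo[index] : 0u;         /* SEL */
}
```

If the first range is shrunk from 4 registers to 2, the second range starts at 2 instead of the 4 the shader was compiled for, which is the "wrong place in the GRF" problem.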

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

e10851ff

Call shmget() with permission 0600 instead of 0777 · 023ddb01

Brian Paul authored 5 years ago and

Dylan Baker committed 5 years ago


A security advisory (TALOS-2019-0857/CVE-2019-5068) found that
creating shared memory regions with permission mode 0777 could allow
any user to access that memory.  Several Mesa drivers use shared-
memory XImages to implement back buffers for improved performance.

This patch changes the shmget() calls to use 0600 (user r/w).
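
For illustration, a small sketch of a shmget() call with owner-only permissions. The helper names are invented for this example and are not the functions touched by the change; shmget(), shmctl(), IPC_PRIVATE, IPC_CREAT, and IPC_STAT are the standard System V IPC interfaces.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Create a private segment that only the owner can read and write.
 * Returns the segment id, or -1 on error. */
static int
create_private_shm(size_t size)
{
    return shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
}

/* Read back the permission bits of a segment; returns -1 on error. */
static int
shm_mode(int shmid)
{
    struct shmid_ds ds;
    if (shmctl(shmid, IPC_STAT, &ds) == -1)
        return -1;
    return ds.shm_perm.mode & 0777;
}
```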

Tested with legacy Xlib driver and llvmpipe.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
(cherry picked from commit 02c3dad0)

023ddb01

i965: Unify CC_STATE and BLEND_STATE atoms on Haswell as a workaround · 3199172e

Danylo Piliaiev authored 5 years ago

Re-emitting 3DSTATE_CC_STATE_POINTERS after emitting
3DSTATE_BLEND_STATE_POINTERS fixes the shadow flickering in
SuperTuxCart and Tropico 6 which was seen only on Haswell.
The reason for this is unknown and the fix was found empirically.

The closest mention in PRM is that it should improve performance.
From the HSW PRM, volume 2b, page 823 (3DSTATE_BLEND_STATE_POINTERS):
 "When the BLEND_STATE pointer changes but not the CC_STATE pointer,
  driver needs to force a CC_STATE pointer change to improve
  blend performance in pixel backend."

Closes: mesa/mesa#1834


Fixes: eca4a654 ("i965: Disable dual source blending when shader doesn't support it on gen8+")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6f17fe06)

3199172e

llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders · ae071434

Ben Crocker authored 5 years ago and

Dylan Baker committed 5 years ago

Large programs, e.g. gnome-shell and firefox, may tax the
addressability of the Medium code model once a (potentially unbounded)
number of dynamically generated JIT-compiled shader programs are
linked in and relocated.  Yet the default code model as of LLVM 8 is
Medium or even Small.

The cost of changing from Medium to Large is negligible:
- an additional 8-byte pointer stored immediately before the shader entrypoint;
- change an add-immediate (addis) instruction to a load (ld).

Testing with WebGL Conformance
(https://www.khronos.org/registry/webgl/sdk/tests/webgl-conformance-tests.html)
yields clean runs with this change (and crashes without it).

Testing with glxgears shows no detectable performance difference.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1753327, 1753789, 1543572, 1747110, and 1582226

Closes: mesa/mesa#223



Co-authored by: Nemanja Ivanovic <nemanjai@ca.ibm.com>, Tom Stellard <tstellar@redhat.com>

CC: mesa-stable@lists.freedesktop.org

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
(cherry picked from commit 9c3be6d2)
Conflicts resolved by Dylan (PIPE_ARCH -> UTIL_ARCH rename)

ae071434

radeonsi: fix shader disk cache key · 60c299c5

Pierre-Eric Pelloux-Prayer authored 5 years ago

Use unsigned values, otherwise sign extension will produce a 64-bit value where
the 32 left-most bits are 1.
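
A minimal sketch of the effect. The function names are hypothetical, and the value used below (a 32-bit flag word with bit 31 set) is an assumption chosen to trigger the behaviour, not the driver's actual flags.

```c
#include <stdint.h>

/* Widening a signed 32-bit value copies bit 31 into the upper 32 bits. */
static uint64_t
widen_signed(int32_t flags)
{
    return (uint64_t)flags;
}

/* Widening an unsigned 32-bit value zero-fills the upper 32 bits. */
static uint64_t
widen_unsigned(uint32_t flags)
{
    return (uint64_t)flags;
}
```

For a value with bit 31 set, the signed path yields 0xFFFFFFFF80000000 while the unsigned path yields 0x0000000080000000.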

Fixes: 2afeed30 ("radeonsi: tell the shader disk cache what IR is used")

60c299c5

radeonsi: tell the shader disk cache what IR is used · b5b09acb

Pierre-Eric Pelloux-Prayer authored 5 years ago


Until 8bef4df1 the IR (TGSI or NIR) was used in disk_cache driver_flags.
This commit restores this feature to avoid crashing when switching from
one IR to the other.

As radeonsi's default is TGSI, I used "driver_flags & 0x8000000 = 0" for TGSI
to keep the same driver_flags.

Fixes: 8bef4df1 ("radeonsi: add si_debug_options for convenient adding/removing of options")

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>

b5b09acb

radeonsi: disable sdma for gfx10 · 38bd621f

Pierre-Eric Pelloux-Prayer authored 5 years ago

Disable sdma on gfx10 until all timeout bugs are fixed.

See:
    mesa/mesa#1907
    https://bugs.freedesktop.org/show_bug.cgi?id=111481



Reviewed-by: Marek Olšák <marek.olsak@amd.com>

38bd621f

tgsi_to_nir: handle PIPE_FORMAT_NONE in image opcodes · 0e7e56aa

Marek Olšák authored 5 years ago


radeonsi doesn't use the format and internal shaders don't set it.

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
(cherry picked from commit f704fb7f)
Closes: mesa/mesa#2112

0e7e56aa

tgsi_to_nir: fix masked out image loads · 2353a63a

Marek Olšák authored 5 years ago


This caused a failure in NIR validation.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3906fce8)

2353a63a

mesa/main: Ignore filter state for MS texture completeness · 856c7edd

Illia Iorin authored 5 years ago and

Dylan Baker committed 5 years ago

After the discussion in
https://github.com/KhronosGroup/OpenGL-API/issues/45
section 8.17 (texture completeness) of the OpenGL 4.6 core profile
was changed to explicitly say that multisample texture completeness
ignores filter state of the texture.

"Using the preceding definitions, a texture is complete unless any of the
 following conditions hold true:
   ...
  - The minification filter requires a mipmap (is neither NEAREST nor LINEAR),
    the texture is not multisample, and the texture is not mipmap complete.
  - The texture is not multisample; either the magnification filter is not
    NEAREST, or the minification filter is neither NEAREST nor
    NEAREST_MIPMAP_NEAREST; and any of
    – The internal format of the texture is integer (see table 8.12).
    – The internal format is STENCIL_INDEX.
    – The internal format is DEPTH_STENCIL, and the value of
      DEPTH_STENCIL_TEXTURE_MODE for the texture is STENCIL_INDEX."
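
The two quoted conditions can be sketched as a predicate. Everything here is hypothetical (the enum, the struct, and the single `unfilterable_format` flag that stands in for the three internal-format cases); it is not Mesa's texture completeness code.

```c
#include <stdbool.h>

enum filter { NEAREST, LINEAR, NEAREST_MIPMAP_NEAREST, OTHER_MIPMAP_FILTER };

struct tex_state {
    bool multisample;
    bool mipmap_complete;
    bool unfilterable_format; /* integer, STENCIL_INDEX, or stencil-mode DEPTH_STENCIL */
    enum filter min_filter;
    enum filter mag_filter;
};

/* Returns true if one of the two quoted filter conditions makes the
 * texture incomplete. */
static bool
fails_filter_rules(const struct tex_state *t)
{
    if (t->multisample)
        return false;   /* filter state is ignored for multisample textures */

    bool needs_mipmap = t->min_filter != NEAREST && t->min_filter != LINEAR;
    if (needs_mipmap && !t->mipmap_complete)
        return true;

    bool filtered = t->mag_filter != NEAREST ||
                    (t->min_filter != NEAREST &&
                     t->min_filter != NEAREST_MIPMAP_NEAREST);
    return filtered && t->unfilterable_format;
}
```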

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 6b672e34)

856c7edd

nir/algebraic: Mark other comparison exact when removing a == a · 4c82d426

Ian Romanick authored 5 years ago and

Dylan Baker committed 5 years ago

This prevents some additional optimizations that would change the
original result.  This includes things like (b < a && b < c) => b <
min(a, c) and !(a < b) => b >= a.  Both of these optimizations were
specifically observed in the piglit tests added in piglit!160.
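
A short C illustration of why such rewrites are unsafe once NaN is involved. The helper names are invented for this sketch; it shows only the floating-point behaviour, not the NIR pass itself.

```c
#include <math.h>
#include <stdbool.h>

/* a == a is false only when a is NaN, so it acts as a hand-written !isnan(a). */
static bool
is_not_nan(float a)
{
    return a == a;
}

/* Negated ordered comparison: true whenever either operand is NaN. */
static bool
not_less(float a, float b)
{
    return !(a < b);
}

/* The "flipped" ordered comparison: false whenever either operand is NaN. */
static bool
greater_equal(float a, float b)
{
    return a >= b;
}
```

For ordinary numbers not_less() and greater_equal() agree; with a NaN operand they give opposite answers, so replacing one with the other changes the original result.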

This was discovered while investigating mesa/mesa#1958.  However, the
problem in that issue was that Chrome or ANGLE is replacing calls to isnan()
with some stuff that we (correctly) optimize to false.  If they had left
the calls to isnan() alone, everything would have just worked.

No shader-db changes on any Intel platform.

I also tried marking the comparison generated by the isnan() function
precise.  The precise marker "infects" every computation involved in
calculating the parameter to the isnan() function, and this severely
hurt all of the (few) shaders in shader-db that use isnan().

I also considered adding a new ir_unop_isnan opcode that would implement
the functionality.  During GLSL IR-to-NIR translation, the resulting
comparison operation would be marked exact (and the same thing would need
to happen in SPIR-V translation).

The approach taken by this patch seemed easier, but we may want to do
the ir_unop_isnan thing anyway.

Fixes: d55835b8 ("nir/algebraic: Add optimizations for "a == a && a CMP b"")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit 9be4a422)

4c82d426

nir/algebraic: Add the ability to mark a replacement as exact · d8a37880

Ian Romanick authored 5 years ago and

Dylan Baker committed 5 years ago

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit ea19f2fb)

d8a37880

intel/compiler: fix nir_op_{i,u}*32 on ICL · bd2f6150

Paulo Zanoni authored 5 years ago and

Dylan Baker committed 5 years ago


On ICL we have the src1 restriction which is applied through
fix_byte_src() and potentially changes the type of the operands from 8
to 32 bits. When this change happens, we fall into the "else if
(bit_size < 32)" case and miscompute src_type because it takes into
consideration bit_size (8) instead of the adjusted size of temp_op
(32). This results in the shader reading unused memory, giving us
mostly failures, but occasional passes due to whatever was already in
the registers we were reading.

This commit fixes a lot of dEQP subgroup i8vec2 tests on ICL, such as:
    dEQP-VK.subgroups.arithmetic.compute.subgroupadd_i8vec2

This can also be verified by simply changing fix_byte_src() to apply
on all platforms.

Fixes: 5847de6e ("intel/compiler: don't use byte operands for src1 on ICL")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit eb635216)

bd2f6150
