- Dec 04, 2019
-
-
Dylan Baker authored
-
Dylan Baker authored
-
Fixes: 13ab63bb ('radv: Implement VK_EXT_buffer_device_address.') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit 35fab1ba)
-
If an app first creates a compute pipeline with VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT set, then re-compile it without that flag, the driver should re-compile the compute shader. Otherwise, it will return the unoptimized one. Fixes: ce188813 ("radv: add initial support for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit 9ab27647)
-
This implementation is loosely based on ROCm. https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive* on GFX10. Fixes: 227c29a8 ("amd/common/gfx10: implement scan & reduce operations") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit c9aa8439) Conflicts resolved by Dylan Baker
-
When a fragment shader includes an input variable decorated with SampleId or SamplePosition, sample shading should be enabled because minSampleShadingFactor is expected to be 1.0. Cc: 19.2, 19.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit 86a5fbfd) Conflicts resolved by Dylan Baker
-
Do not link libgallium_nine with libgalliumvl_stub if it's already linked with libgalliumvl. Linking with stub leads to "duplicate symbol" errors. Fixes: 6b4c7047 ("meson: build gallium nine state_tracker") Closes: mesa/mesa#2040 Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (cherry picked from commit 9af22ccd) Conflicts resolved by Dylan Baker
-
Faith Ekstrand authored
gl_Viewport is also in the VUE header so we need to whack the read offset to 0 and emit a default (no overrides) SBE_SWIZ entry in that case as well. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit b1f37688)
-
brw_performance_query_metrics.h was removed in 134e750e and brw_performance_query.h was removed in 8ae66679 remove reference to these files from Makefile.sources Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Fixes: 134e750e ("i965: extract performance query metrics") Fixes: 8ae66679 ("intel/perf: move query_object into perf") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit 34dda0ca)
-
BACK_LEFT attachment can be outdated when the user calls KHR_partial_update() (->lastStamp != ->texture_stamp), leading to a damage region update on the wrong pipe_resource object. Let's delay the ->set_damage_region() call until the attachments are updated when we're in that case. Reported-by: Carsten Haitzler <raster@rasterman.com> Fixes: 492ffbed ("st/dri2: Implement DRI2bufferDamageExtension") Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit b196e1a8)
-
pthread_mutex_unlock() when unlocked is documented by posix as being undefined behaviour. On OpenBSD pthread_mutex_unlock() will call abort(3) if this happens. This occurs in amdgpu_winsys_create() after cb446dc0 winsys/amdgpu: Add amdgpu_screen_winsys Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit 3fe3bde4)
-
They were out of sync. Besides syncing, lets ensure they never diverge again. Fixes: 8d2654a4 "radv: Support VK_EXT_inline_uniform_block." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit 4cde0e04)
-
Fixes: 946193ae "radv: add support for VK_AMD_buffer_marker" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit 25bc9102)
-
This reverts commit f97b731c. Closes: mesa/mesa#250 Reviewed-by: Roland Scheidegger <sroland@vmware.com> (cherry picked from commit a3c8bc10)
-
The CMP instruction on Gen4 and Gen5 generates one bit (the LSB) of valid data and 31 bits of junk. Results of comparisons that are used as Boolean values need to have a fixup applied to generate the proper 0/~0 values. Calling fs_visitor::nir_emit_alu with need_dest=false prevents the fixup code from being generated. This results in a sequence like: cmp.l.f0.0(16) g8<1>F g14<8,8,1>F 0x0F /* 0F */ ... cmp.l.f0.0(16) g4<1>F g6<8,8,1>F 0x0F /* 0F */ (+f0.1) or.z.f0.1(16) null<1>UD g4<8,8,1>UD g8<8,8,1>UD instead of cmp.l.f0.0(16) g8<1>F g14<8,8,1>F 0x0F /* 0F */ ... cmp.l.f0.0(16) g4<1>F g6<8,8,1>F 0x0F /* 0F */ or(16) g4<1>UD g4<8,8,1>UD g8<8,8,1>UD (+f0.1) and.z.f0.1(16) null<1>UD g4<8,8,1>UD 1UD I examined a couple of the shaders hurt by this change, and ALL of them would have been affected by this bug. :( Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Closes: mesa/mesa#1836 Fixes: 0ba9497e ("intel/fs: Improve discard_if code generation") Iron Lake total instructions in shared programs: 8122757 -> 8122957 (<.01%) instructions in affected programs: 8307 -> 8507 (2.41%) helped: 0 HURT: 100 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.84% max: 6.67% x̄: 2.81% x̃: 2.76% 95% mean confidence interval for instructions value: 2.00 2.00 95% mean confidence interval for instructions %-change: 2.58% 3.03% Instructions are HURT. total cycles in shared programs: 188510100 -> 188510376 (<.01%) cycles in affected programs: 76018 -> 76294 (0.36%) helped: 0 HURT: 55 HURT stats (abs) min: 2 max: 12 x̄: 5.02 x̃: 4 HURT stats (rel) min: 0.07% max: 3.75% x̄: 0.86% x̃: 0.56% 95% mean confidence interval for cycles value: 4.33 5.71 95% mean confidence interval for cycles %-change: 0.60% 1.12% Cycles are HURT. GM45 total instructions in shared programs: 4994403 -> 4994503 (<.01%) instructions in affected programs: 4212 -> 4312 (2.37%) helped: 0 HURT: 50 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.84% max: 6.25% x̄: 2.76% x̃: 2.72% 95% mean confidence interval for instructions value: 2.00 2.00 95% mean confidence interval for instructions %-change: 2.45% 3.07% Instructions are HURT. total cycles in shared programs: 128928750 -> 128928982 (<.01%) cycles in affected programs: 67442 -> 67674 (0.34%) helped: 0 HURT: 47 HURT stats (abs) min: 2 max: 12 x̄: 4.94 x̃: 4 HURT stats (rel) min: 0.09% max: 3.75% x̄: 0.75% x̃: 0.53% 95% mean confidence interval for cycles value: 4.19 5.68 95% mean confidence interval for cycles %-change: 0.50% 1.00% Cycles are HURT. (cherry picked from commit e51eda99)
-
- Nov 22, 2019
-
-
Dylan Baker authored
-
Dylan Baker authored
-
- Nov 21, 2019
-
-
Dylan Baker authored
Closes: mesa/mesa#1921
-
From OES_EGL_image_external_essl3 Closes: mesa/mesa#1901 Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Tapani Pälli <tapani.palli@intel.com>
-
Fixes: 32aba91c (llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders) Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Closes: mesa/mesa#2131
-
Fixes: cea39af2 ("freedreno/ir3: Generalize ir3_shader_disasm()") Reviewed-by: Rob Clark <robdclark@gmail.com> (cherry picked from commit d0f38394)
-
Specifically when we are in non-uniform control flow, as we would need to set the condition for the last instruction. If (for example) a image atomic load stores directly their value on a NIR register, last_inst would be a nop, and would fail when set the condition. Fixes piglit test: spec/glsl-es-3.10/execution/cs-ssbo-atomic-if-else-2.shader_test Fixes: 6281f26f ("v3d: Add support for shader_image_load_store.") v2: (Changes suggested by Eric Anholt) * Cover all sig.ld* signals, not just ldunif and ldtmu, as all of them have the same restriction. * Update comment explaining why we add a MOV in that case * Tweak commit message. v3: * Drop extra set of parens (Eric) * Add missing ld signal to is_ld_signal to fix shader-db regression. Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit b4bc59e3)
-
Fixes dEQP test: dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.image_atomic_multiple_interleaved_write_read Fixes piglit test: spec/glsl-es-3.10/execution/cs-image-atomic-if-else.shader_test Fixes: 6281f26f ("v3d: Add support for shader_image_load_store.") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit d9830551)
-
Eric Engestrom authored
Two files exist in that directory: - vulkan_xlib_randr.h - vulkan_xlib_xrandr.h Both were imported in 205c2715 ("vulkan: Update the XML and headers to 1.1.70") with identical contents (ie. the VK_EXT_acquire_xlib_display extension), but the former was never included anywhere and can't be found upstream [1], while the latter is included in vulkan.h and found upstream. [1] https://github.com/KhronosGroup/Vulkan-Headers/tree/master/include/vulkan Fixes: 205c2715 ("vulkan: Update the XML and headers to 1.1.70") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit 344859c3)
-
- Nov 20, 2019
-
-
Dylan Baker authored
-
Dylan Baker authored
-
Dylan Baker authored
-
Faith Ekstrand authored
The bounds checking is actually less safe than just pushing the data. If the bounds checking actually ever kicks in and it's not on the last UBO push range, then the shrinking will cause all subsequent ranges to be pushed to the wrong place in the GRF. One of the behaviors we definitely don't want is for OOB UBO access to result in completely unrelated UBOs returning garbage values. It's safer to just push the UBOs as-requested. If we're really concerned about robustness, we can emit shader code to do bounds checking which should be stupid cheap (a CMP followed by SEL). Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
A security advisory (TALOS-2019-0857/CVE-2019-5068) found that creating shared memory regions with permission mode 0777 could allow any user to access that memory. Several Mesa drivers use shared- memory XImages to implement back buffers for improved performance. This path changes the shmget() calls to use 0600 (user r/w). Tested with legacy Xlib driver and llvmpipe. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> (cherry picked from commit 02c3dad0)
-
Danylo Piliaiev authored
Re-emitting 3DSTATE_CC_STATE_POINTERS after emitting 3DSTATE_BLEND_STATE_POINTERS fixes the shadow flickering in SuperTuxCart and Tropico 6 which was seen only on Haswell. The reason for this is unknown and fix was found empirically. The closest mention in PRM is that it should improve performance. From the HSW PRM, volume 2b, page 823 (3DSTATE_BLEND_STATE_POINTERS): "When the BLEND_STATE pointer changes but not the CC_STATE pointer, driver needs to force a CC_STATE pointer change to improve blend performance in pixel backend." Closes: mesa/mesa#1834 Fixes: eca4a654 ("i965: Disable dual source blending when shader doesn't support it on gen8+") Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit 6f17fe06)
-
Large programs, e.g. gnome-shell and firefox, may tax the addressability of the Medium code model once a (potentially unbounded) number of dynamically generated JIT-compiled shader programs are linked in and relocated. Yet the default code model as of LLVM 8 is Medium or even Small. The cost of changing from Medium to Large is negligible: - an additional 8-byte pointer stored immediately before the shader entrypoint; - change an add-immediate (addis) instruction to a load (ld). Testing with WebGL Conformance (https://www.khronos.org/registry/webgl/sdk/tests/webgl-conformance-tests.html) yields clean runs with this change (and crashes without it). Testing with glxgears shows no detectable performance difference. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1753327, 1753789, 1543572, 1747110, and 1582226 Closes: mesa/mesa#223 Co-authored by: Nemanja Ivanovic <nemanjai@ca.ibm.com>, Tom Stellard <tstellar@redhat.com> CC: mesa-stable@lists.freedesktop.org Signed-off-by: Ben Crocker <bcrocker@redhat.com> (cherry picked from commit 9c3be6d2) Conflicts resolved Dylan (PIPE_ARCH -> UTIL_ARCH rename)
-
Pierre-Eric Pelloux-Prayer authored
Use unsigned values otherwise signed extension will produce a 64 bits value where the 32 left-most bits are 1. Fixes: 2afeed30 ("radeonsi: tell the shader disk cache what IR is used")
-
Pierre-Eric Pelloux-Prayer authored
Until 8bef4df1 the IR (TGSI or NIR) was used in disk_cache driver_flags. This commit restores this features to avoid crashing when switching from one IR to the other. As radeonsi's default is TGSI, I used "driver_flags & 0x8000000 = 0" for TGSI to keep the same driver_flags. Fixes: 8bef4df1 ("radeonsi: add si_debug_options for convenient adding/removing of options") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
-
Pierre-Eric Pelloux-Prayer authored
Disable sdma on gfx10 until all timeouts bugs are fixed. See: mesa/mesa#1907 https://bugs.freedesktop.org/show_bug.cgi?id=111481 Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Marek Olšák authored
radeonsi doesn't use the format and internal shaders don't set it. Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> (cherry picked from commit f704fb7f) Closes: mesa/mesa#2112
-
Marek Olšák authored
This caused a failure in NIR validation. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit 3906fce8)
-
After the discussion in https://github.com/KhronosGroup/OpenGL-API/issues/45 the section 8.17 (texture completeness) of the OpenGL 4.6 core profile was changed to explicitly say that multisample texture completeness ignores filter state of the texture. "Using the preceding definitions, a texture is complete unless any of the following conditions hold true: ... - The minification filter requires a mipmap (is neither NEAREST nor LINEAR), the texture is not multisample, and the texture is not mipmap complete. - The texture is not multisample; either the magnification filter is not NEAREST, or the minification filter is neither NEAREST nor NEAREST_- MIPMAP_NEAREST; and any of – The internal format of the texture is integer (see table 8.12). – The internal format is STENCIL_INDEX. – The internal format is DEPTH_STENCIL, and the value of DEPTH_- STENCIL_TEXTURE_MODE for the texture is STENCIL_INDEX." Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Signed-off-by: Illia Iorin <illia.iorin@globallogic.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit 6b672e34)
-
This prevents some additional optimizations that would change the original result. This includes things like (b < a && b < c) => b < min(a, c) and !(a < b) => b >= a. Both of these optimizations were specifically observed in the piglit tests added in piglit!160. This was discovered while investigating mesa/mesa#1958 . However, the problem in that issue was Chrome or Angle is replacing calls to isnan() with some stuff that we (correctly) optimize to false. If they had left the calls to isnan() alone, everything would have just worked. No shader-db changes on any Intel platform. I also tried marking the comparison generated by the isnan() function precise. The precise marker "infects" every computation involved in calculating the parameter to the isnan() function, and this severely hurt all of the (few) shaders in shader-db that use isnan(). I also considered adding a new ir_unop_isnan opcode that would implement the functionality. During GLSL IR-to-NIR translation, the resulting comparison operation would be marked exact (and the samething would need to happen in SPIR-V translation). This approach taken by this patch seemed easier, but we may want to do the ir_unop_isnan thing anyway. Fixes: d55835b8 ("nir/algebraic: Add optimizations for "a == a && a CMP b"") Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (cherry picked from commit 9be4a422)
-
Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (cherry picked from commit ea19f2fb)
-
On ICL we have the src1 restriction which is applied through fix_byte_src() and potentially changes the type of the operands from 8 to 32 bits. When this change happens, we fall into the "else if (bit_size < 32)" case and miscompute src_type because it takes into consideration bit_size (8) instead of the adjusted size of temp_op (32). This results in the shader reading unused memory, giving us mostly failures, but occasional passes due to whatever was already in the registers we were reading. This commit fixes a lot of dEQP subgroup i8vec2 tests on ICL, such as: dEQP-VK.subgroups.arithmetic.compute.subgroupadd_i8vec2 This can also be verified by simply changing fix_byte_src() to apply on all platforms. Fixes: 5847de6e ("intel/compiler: don't use byte operands for src1 on ICL") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> (cherry picked from commit eb635216)
-