- Aug 28, 2018
-
-
Sagar Ghuge authored
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
-
Sagar Ghuge authored
Gen >= 9 have ability to control clamping of depth values separately at near and far plane. z_w is clamped to the range [min(n,f), 0] if clamping at near plane is enabled, [0, max(n,f)] if clamping at far plane is enabled and [min(n,f) max(n,f)] if clamping at both plane is enabled. v2: 1) Use better coding style (Ian Romanick) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
-
Sagar Ghuge authored
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Sagar Ghuge authored
_mesa_set_enable() and _mesa_IsEnabled() extended to accept new two tokens GL_DEPTH_CLAMP_NEAR_AMD and GL_DEPTH_CLAMP_FAR_AMD. v2: Remove unnecessary parentheses (Marek Olsak) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Sagar Ghuge authored
Enable _mesa_PushAttrib() and _mesa_PopAttrib() to handle GL_DEPTH_CLAMP_NEAR_AMD and GL_DEPTH_CLAMP_FAR_AMD tokens. Remove DepthClamp, because DepthClampNear + DepthClampFar replaces it, as suggested by Marek Olsak. Driver that enables AMD_depth_clamp_separate will only ever look at DepthClampNear and DepthClampFar, as suggested by Ian Romanick. v2: 1) Remove unnecessary parentheses (Marek Olsak) 2) if AMD_depth_clamp_separate is unsupported, TEST_AND_UPDATE GL_DEPTH_CLAMP only (Marek Olsak) 3) Clamp against near and far plane separately (Marek Olsak) 4) Clip point separately for near and far Z clipping plane (Marek Olsak) v3: Clamp raster position zw to the range [min(n,f), 0] for near plane and [0, max(n,f)] for far plane (Marek Olsak) v4: Use MIN2 and MAX2 instead of CLAMP (Marek Olsak) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Sagar Ghuge authored
Add some basic types and storage for the AMD_depth_clamp_separate extension. v2: 1) Drop unnecessary definition (Marek Olsak) 2) Expose extension in compatibility profile (Marek Olsak) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Sagar Ghuge authored
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Dylan Baker authored
Currently we run the script but don't actually load any files, even in a tarball where they exist. Fixes: 3218056e ("meson: Build i965 and dri stack") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
-
Caio Oliveira authored
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
-
Adds suppport for INTEL_fragment_shader_ordering. We achieve the fragment ordering by using the same instruction as for beginInvocationInterlockARB() which is by issuing a memory fence via sendc. Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
-
This extension provides new GLSL built-in function beginFragmentShaderOrderingIntel() that guarantees (taking wording of GL_INTEL_fragment_shader_ordering extension) that any memory transactions issued by shader invocations from previous primitives mapped to same xy window coordinates (and same sample when per-sample shading is active), complete and are visible to the shader invocation that called beginFragmentShaderOrderingINTEL(). One advantage of INTEL_fragment_shader_ordering over ARB_fragment_shader_interlock is that it provides a function that operates as a memory barrie (instead of a defining a critcial section) that can be called under arbitary control flow from any function (in contrast the begin/end of ARB_fragment_shader_interlock may only be called once, from main(), under no control flow. Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
-
Andrii Simiklit authored
When the SVBI Payload Enable is false I guess the register R1.4 which contains the Maximum Streamed Vertex Buffer Index is filled by zero and GS stops to write transform feedback when the transform feedback is not active. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107579 Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
-
Rhys Perry authored
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewied-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 18.2: <mesa-stable@lists.freedesktop.org>
-
Erik Faye-Lund authored
This is quite useful for debugging shader-transpiling issues in virglrenderer. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
-
Erik Faye-Lund authored
This adds an environment-varaible that can be used for driver-specific flags, as well as a flag for it to enable verbose output. While we're at it, quiet some overly chatty debug-output by default. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
-
Erik Faye-Lund authored
This is the only direct call-site for fprintf in virgl; all other call-sites call debug_printf instead. So let's follow in style here. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
-
Erik Faye-Lund authored
This is just debug-cruft left over. Let's just get rid of it. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
-
- Aug 27, 2018
-
-
There's no Vulkan support for arm atm. Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
-
V2: Add one missing @0@ Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
-
INTEL_DEBUG=hex prints 32 bit hex value and due to endianness of CPU byte order is reversed. In order to disassemble binary files, print each byte instead of 32 bit hex value. v2: Print blank spaces in order to vertically align output of compacted instructions hex value with uncompacted instructions hex value. (Matt Turner) v3: Fix line wrap at correct length Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
-
Lionel Landwerlin authored
Gen7.5 has a BLEND_STATE of size 0 which includes a variable length group. We did not deal with that very well, leading to an endless loop. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107544 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
-
Rhys Perry authored
Gives a +7.79% increase in FPS with Hitman on lowest quality settings on my GTX 1060. total instructions in shared programs : 5787979 -> 5748677 (-0.68%) total gprs used in shared programs : 669901 -> 669373 (-0.08%) total shared used in shared programs : 548832 -> 548832 (0.00%) total local used in shared programs : 21068 -> 21064 (-0.02%) local shared gpr inst bytes helped 1 0 152 274 274 hurt 0 0 0 0 0 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>
-
Rhys Perry authored
Rather than the usual three that would be created. total instructions in shared programs : 5796385 -> 5786560 (-0.17%) total gprs used in shared programs : 670103 -> 669968 (-0.02%) total shared used in shared programs : 548832 -> 548832 (0.00%) total local used in shared programs : 21164 -> 21068 (-0.45%) local shared gpr inst bytes helped 1 0 64 1040 1040 hurt 0 0 27 0 0 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>
-
Rhys Perry authored
total instructions in shared programs : 5819319 -> 5796385 (-0.39%) total gprs used in shared programs : 670571 -> 670103 (-0.07%) total shared used in shared programs : 548832 -> 548832 (0.00%) total local used in shared programs : 21164 -> 21164 (0.00%) local shared gpr inst bytes helped 0 0 318 1758 1758 hurt 0 0 63 0 0 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>
-
Rhys Perry authored
With this commit, OP_MAD is handled on nv50 too. This commit is also useful for later commits. Also, instead of creating a shladd, it relies on LateAlgebraicOpt to create one. This simplifies the code and helps shader-db slightly overall. total instructions in shared programs : 5820882 -> 5819319 (-0.03%) total gprs used in shared programs : 670595 -> 670571 (-0.00%) total shared used in shared programs : 548832 -> 548832 (0.00%) total local used in shared programs : 21164 -> 21164 (0.00%) local shared gpr inst bytes helped 0 0 18 230 230 hurt 0 0 8 263 263 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>
-
Rhys Perry authored
This hits the shader-db numbers a good bit, though a few xmads is way faster than an imul or imad and the cost is mitigated by the next commit, which optimizes many multiplications by immediates into shorter and less register heavy instructions than the xmads. total instructions in shared programs : 5768871 -> 5820882 (0.90%) total gprs used in shared programs : 669919 -> 670595 (0.10%) total shared used in shared programs : 548832 -> 548832 (0.00%) total local used in shared programs : 21068 -> 21164 (0.46%) local shared gpr inst bytes helped 0 0 38 0 0 hurt 1 0 365 3076 3076 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>
-
Rhys Perry authored
v4: make the immediate field 16 bits v5: don't ever emit h1 flags for immediates Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>
-
Rhys Perry authored
v4: remove uint16_t(...) v4: don't allow immediates outside [0,65535] in insnCanLoad() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>
-
>From Section 4.3.4 (Inputs) of the GLSL 1.50 spec: "Only the input variables that are actually read need to be written by the previous stage; it is allowed to have superfluous declarations of input variables." Fixes: * interstage-multiple-shader-objects.shader_test v2: Update comment in ir.h since the usage of "used" field has been extended. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101247 Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
-
Faith Ekstrand authored
We had two different implementations in different files. May as well have one and put it in nir.h. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
-
Samuel Iglesias Gonsálvez authored
VkPhysicalDeviceProtectedMemoryProperties structure is new on Vulkan 1.1. Fixes Vulkan CTS CL#2849. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
- Aug 25, 2018
-
-
Faith Ekstrand authored
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
Faith Ekstrand authored
This fixes a GPU hang in DOOM 2016 running under wine. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104809 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
The recent commit 4616639b introduced the new function aubinator_error() which is a trivial wrapper around fprintf() to STDERR. The call to fprintf() however is passed the message msg directly: fprintf(stderr, msg); This is a format-security violation and leads to an FTBFS with -Werror=format-security (GCC 8): ../../../src/intel/tools/aubinator.c: In function 'aubinator_error': ../../../src/intel/tools/aubinator.c:74:4: error: format not a string literal and no format arguments [-Werror=format-security] fprintf(stderr, msg); ^~~~~~~ This patch fixes this trivially by introducing a catch-all "%s" format argument. Fixes: 4616639b ("intel: tools: split aub parsing from aubinator") Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
Faith Ekstrand authored
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
Faith Ekstrand authored
Instead of printing addresses like everyone else, we were accidentally printing the offset from state base address. Also, state_map is a void pointer so we were incrementing in bytes instead of dwords and every state other than the first was wrong. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
Faith Ekstrand authored
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
Faith Ekstrand authored
First of all, setting iter->name in advance_field is unnecessary because it gets set by gen_decode_field which gets called immediately after gen_decode_field in the one call-site. Second, we weren't properly initializing start_bit and end_bit in the initial condition of gen_field_iterator_next so the first field of a struct would get printed wrong if it doesn't start on the first bit. This is fixed by adding a iter_start_field helper which sets the field and also sets up the other bits we need. This fixes decoding of 3DSTATE_SBE_SWIZ. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
Kenneth Graunke authored
Some hardware can do PIPE_TEX_WRAP_MIRROR_REPEAT but not PIPE_TEX_WRAP_MIRROR_CLAMP and PIPE_TEX_WRAP_MIRROR_CLAMP_TO_BORDER. Drivers for such hardware would like to advertise support for ARB_texture_mirror_clamp_to_edge but not EXT_texture_mirror_clamp. This commit adds a new PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE bit, changes the extension enable to be based on that, and enables it in all upstream drivers which supported PIPE_CAP_TEXTURE_MIRROR_CLAMP (so they continue supporting this mode).
-
- Aug 24, 2018
-
-
Lionel Landwerlin authored
The batch decoder looks for a field with a particular name to decide whether an MI_BB_START leads into a second batch buffer level. Because the names are different between Gen7.5/8 and the newer generation we fail that test and keep on reading (invalid) instructions. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107544 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
-