- 20 Jun, 2019 5 commits
-
-
Timothy Arceri authored
This helps reduce the amount of abstraction in this pass and allows us to retain more information about the src such as any swizzles. Retaining the swizzle information is required for a bugfix in a following patch. Fixes: 6772a17a ("nir: Add a loop analysis pass") Tested-by:
Brian Paul <brianp@vmware.com>
-
Nicolai Hähnle authored
Tested-by:
Dieter Nützel <Dieter@nuetzel-hh.de>
-
Marek Olšák authored
Tested-by:
Dieter Nützel <Dieter@nuetzel-hh.de>
-
Nicolai Hähnle authored
Tested-by:
Dieter Nützel <Dieter@nuetzel-hh.de>
-
Nicolai Hähnle authored
Tested-by:
Dieter Nützel <Dieter@nuetzel-hh.de>
-
- 19 Jun, 2019 35 commits
-
-
Bas Nieuwenhuizen authored
Just as was allowed by autotools. Fixes: 108d257a "meson: build libEGL" Reviewed-by:
Eric Engestrom <eric.engestrom@intel.com>
-
Bas Nieuwenhuizen authored
Apparently the android part was never ported to meson. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by:
Eric Engestrom <eric.engestrom@intel.com>
-
Bas Nieuwenhuizen authored
Apparently the android part was never ported to meson. CC: <mesa-stable@lists.freedesktop.org> Acked-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com>
-
Jason Ekstrand authored
For the BLOCK_TEXEL_VIEW_COMPATIBLE case, this didn't matter because the flags were already more-or-less what we wanted. However, for gen7 stencil shadow images, it still had ISL_SURF_USAGE_STENCIL_BIT, so we were getting W-tiled, which isn't what we want for the shadow. By passing just ISL_SURF_USAGE_TEXTURE_BIT (and CUBE if we care), we now get something that's actually texturable. Fixes: f3ea0cf8 "anv: Add stencil texturing support for gen7"
-
Jason Ekstrand authored
Copies to a shadow image happen during a VkCmdPipelineBarrier or at subpass transitions. We could potentially be a bit more conservative but these transitions shouldn't happen often and it's better to have our bases covered. Fixes: f3ea0cf8 "anv: Add stencil texturing support for gen7"
-
Jason Ekstrand authored
Most places in NIR, we treat matrices like arrays. The one annoying exception to this has been nir_constant where a matrix is a first-class thing. This commit changes that so a matrix nir_constant is the same as an array nir_constant. This makes matrix nir_constants a tiny bit more expensive but shrinks all others by 96B. Reviewed-by:
Karol Herbst <kherbst@redhat.com>
-
Jason Ekstrand authored
Reviewed-by:
Karol Herbst <kherbst@redhat.com>
-
Jason Ekstrand authored
Reviewed-by:
Karol Herbst <kherbst@redhat.com>
-
Jason Ekstrand authored
Now that nir_const_value is a scalar, there's no reason why we need multiple paths here and it's just extra paths to keep working. While we're here, we also add a vtn_fail_if check that component indices are in-bounds. Reviewed-by:
Karol Herbst <kherbst@redhat.com>
-
Jason Ekstrand authored
Reviewed-by:
Karol Herbst <kherbst@redhat.com>
-
Jason Ekstrand authored
Reviewed-by:
Karol Herbst <kherbst@redhat.com>
-
Jason Ekstrand authored
Reviewed-by:
Karol Herbst <kherbst@redhat.com>
-
Jason Ekstrand authored
It only accepts 32-bit integers so it should have a more descriptive name. This patch should not be a functional change. Reviewed-by:
Karol Herbst <kherbst@redhat.com>
-
Jason Ekstrand authored
All of the callers of this function are looking at interpolation qualifiers and want to make sure they're declared flat. Any 64-bit integer inputs need to be flat. It also makes the function make more sense, since "integer" is fairly generic. Reviewed-by:
Karol Herbst <kherbst@redhat.com>
-
Jason Ekstrand authored
All of the callers of this function really just want to know if the type is an integer and don't care about bit size. Reviewed-by:
Karol Herbst <kherbst@redhat.com>
-
Caio Marcelo de Oliveira Filho authored
Even if only variables' access flags are changed, the existing NIR infrastructure expects metadata to be explicitly preserved, so do that. Don't bother avoiding a second call to preserve, since the cost is negligible. This scenario can be triggered by dead variables, and also by other intrinsics that read the variables but do not cause progress to be made when processing the intrinsics. Fixes: f2d0e48d "glsl/nir: Add optimization pass for access flags" Reviewed-by:
Kenneth Graunke <kenneth@whitecape.org>
-
Caio Marcelo de Oliveira Filho authored
Unwrap any array in the variable type so we can get the sampler dim. This fixes piglit test spec/arb_arrays_of_arrays/execution/image_store/basic-imageStore-const-uniform-index.shader_test. Fixes: f2d0e48d "glsl/nir: Add optimization pass for access flags" Reviewed-by:
Kenneth Graunke <kenneth@whitecape.org>
-
Jory A. Pratt authored
Rather than checking __GLIBC__/__UCLIBC__ macros as a proxy for execinfo.h presence, just check directly. This allows the build to work on musl. Reviewed-by:
Matt Turner <mattst88@gmail.com> Reviewed-by:
Eric Anholt <eric@anholt.net> Reviewed-by:
Eric Engestrom <eric.engestrom@intel.com>
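A minimal sketch of the pattern this commit message describes, guarding on the header's presence rather than on libc identity macros. The macro name `HAVE_EXECINFO_H` and the helper `capture_backtrace` are assumptions for illustration; the message does not name the macro the build system defines.

```c
/* HAVE_EXECINFO_H is assumed to be defined by the build system
 * when <execinfo.h> was found (hypothetical macro name). */
#ifdef HAVE_EXECINFO_H
#include <execinfo.h>
#endif

/* Hypothetical helper: returns the number of frames captured,
 * or 0 when backtrace support is unavailable (e.g. on musl). */
static int capture_backtrace(void **frames, int max_frames)
{
#ifdef HAVE_EXECINFO_H
    return backtrace(frames, max_frames);
#else
    (void)frames;
    (void)max_frames;
    return 0;
#endif
}
```

Callers treat a return value of 0 as "no backtrace available", so the same code builds on any libc without testing for `__GLIBC__` or `__UCLIBC__`.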
-
Jory A. Pratt authored
The disk cache code tries to allocate a 256 Kbyte buffer on the stack. Since musl only gives 80 Kbyte of stack space per thread, this causes a trap. See https://wiki.musl-libc.org/functional-differences-from-glibc.html#Thread-stack-size (in musl-1.1.21 the default stack size was increased to 128K).
[mattst88]: Original author unknown, but I think this is small enough that it is not copyrightable.
Reviewed-by:
Matt Turner <mattst88@gmail.com> Reviewed-by:
Eric Anholt <eric@anholt.net> Reviewed-by:
Eric Engestrom <eric.engestrom@intel.com>
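The message does not say how the buffer was moved off the stack. One common approach is a heap allocation, sketched below under that assumption. `SCRATCH_SIZE` and `checksum_with_heap_scratch` are hypothetical names, not taken from the disk cache code.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* 256 KiB, the buffer size quoted in the commit message. */
#define SCRATCH_SIZE (256 * 1024)

/* Hypothetical helper: fills a scratch buffer and sums its bytes.
 * The buffer lives on the heap, so it does not depend on the
 * per-thread stack size. Returns 0 on success, -1 on allocation failure. */
static int checksum_with_heap_scratch(unsigned char fill, unsigned long *out)
{
    unsigned char *scratch = malloc(SCRATCH_SIZE);
    if (scratch == NULL)
        return -1;

    memset(scratch, fill, SCRATCH_SIZE);

    unsigned long sum = 0;
    for (size_t i = 0; i < SCRATCH_SIZE; i++)
        sum += scratch[i];

    free(scratch);
    *out = sum;
    return 0;
}
```

Unlike a stack array, the allocation can fail, so the caller has to handle the error return.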
-
Kenneth Graunke authored
%lu is for unsigned long, %zu is for size_t. Just cast the data.
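As a small illustration of the point above: the argument type has to match the printf conversion, either by casting to `unsigned long` for `%lu` or by passing a `size_t` to `%zu`. The helper names are hypothetical, and the message does not say which of the two forms the commit used.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: cast the value so it matches %lu exactly. */
static int format_size_cast(char *buf, size_t buflen, size_t nbytes)
{
    return snprintf(buf, buflen, "%lu", (unsigned long)nbytes);
}

/* Hypothetical helper: pass a size_t with the matching %zu conversion. */
static int format_size_zu(char *buf, size_t buflen, size_t nbytes)
{
    return snprintf(buf, buflen, "%zu", nbytes);
}
```

Both produce the same text for values that fit in `unsigned long`. Passing a `size_t` to `%lu` without the cast is undefined behavior on platforms where the two types differ.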
-
Kenneth Graunke authored
We don't execute any of the commands to record snapshots, so we can't actually produce a real result. We do however need to avoid waiting on a syncpt which will never be signalled. So, just return 0.
-
David Riley authored
Support a new virgl bind type for shared buffers. Signed-off-by:
David Riley <davidriley@chormium.org> Reviewed-By:
Gert Wollny <gert.wollny@collabora.com>
-
Lionel Landwerlin authored
Using the existing VK_EXT_debug_report extension. Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Jason Ekstrand <jason@jlekstrand.net>
-
Connor Abbott authored
This brings the nir path in line with the TGSI path.
Totals from affected shaders:
SGPRS: 2984 -> 2984 (0.00 %)
VGPRS: 2792 -> 2652 (-5.01 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 247380 -> 248072 (0.28 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 121 -> 132 (9.09 %)
Wait states: 0 -> 0 (0.00 %)
Most of the change came from DiRT: Showdown, and came from sinking SSBO loads.
Reviewed-by:
Timothy Arceri <tarceri@itsqueeze.com>
-
Connor Abbott authored
No changes with radeonsi shader-db. Reviewed-by:
Timothy Arceri <tarceri@itsqueeze.com>
-
Connor Abbott authored
This is simple now, but we're going to be adding a few more conditions to this later. Reviewed-by:
Timothy Arceri <tarceri@itsqueeze.com>
-
Connor Abbott authored
Nothing uses its results yet; that will come with the following commits. Reviewed-by:
Timothy Arceri <tarceri@itsqueeze.com>
-
Connor Abbott authored
Right now, this just deduces when we can arbitrarily reorder SSBO and image loads, matching the existing logic in radeonsi's TGSI->LLVM pass. This approach can't handle some things that nir_opt_copy_prop_vars can, but it can handle images, and with GCM it lets us hoist reads outside of loops. We can also pass this information to LLVM which lets it do its own optimizations on it. This is GLSL only as I haven't tested it on Vulkan yet, and it would probably need a few changes to work there. Reviewed-by:
Timothy Arceri <tarceri@itsqueeze.com>
-
Connor Abbott authored
Reviewed-by:
Timothy Arceri <tarceri@itsqueeze.com>
-
Connor Abbott authored
The spec explicitly says that volatile writes can't be removed and volatile reads do not guarantee that the same value will still be around after the read, as if there were a barrier after each read/write. Just ignore them. Reviewed-by:
Timothy Arceri <tarceri@itsqueeze.com>
-
Connor Abbott authored
We were completely ignoring these before, except for putting them on variables. While we're here, don't set access qualifiers when converting to bindless since glsl_to_nir will already have set a more accurate qualifier that includes any qualifiers on struct members that are dereferenced. Reviewed-by:
Timothy Arceri <tarceri@itsqueeze.com>
-
Connor Abbott authored
In the next commit, we'll properly handle access qualifiers on struct members by propagating them to load/store instructions, but these instructions had no way to specify the qualifier. Reviewed-by:
Timothy Arceri <tarceri@itsqueeze.com>
-
Connor Abbott authored
inaccessiblememonly means that it doesn't modify memory accessible via normal LLVM pointers. This lets LLVM's dead store elimination, memcpy forwarding, etc. ignore functions with this attribute. We don't represent descriptors as pointers, so this property is always true of buffer and image stores. There are plans to represent descriptors via pointers, but this just means that now nothing is inaccessiblememonly, as LLVM will then understand loads/stores via its usual alias analysis. Radeonsi was mistakenly only setting it if the driver could prove that there were no reads, and then it was cargo-culted into ac_llvm_build and ac_llvm_to_nir. Rip it out of everything.
Statistics with nir enabled:
Totals from affected shaders:
SGPRS: 152 -> 152 (0.00 %)
VGPRS: 128 -> 132 (3.12 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 9324 -> 9244 (-0.86 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Max Waves: 17 -> 17 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
The only difference was a manhattan31 shader.
Acked-by:
Timothy Arceri <tarceri@itsqueeze.com> Acked-by:
Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by:
Marek Olšák <marek.olsak@amd.com>
-
Eric Engestrom authored
close() is in <unistd.h> Signed-off-by:
Eric Engestrom <eric.engestrom@intel.com> Reviewed-by:
Tapani Pälli <tapani.palli@intel.com> Reviewed-by:
Emil Velikov <emil.velikov@collabora.com>
-
Samuel Pitoiset authored
This fixes new CTS dEQP-VK.pipeline.depth_range_unrestricted.*. Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-