- Aug 25, 2019
-
-
Tomeu Vizoso authored
If two jobs use the same GEM object at the same time, the job that finishes first will (previous to this commit) close the GEM object, even if there's a job still referencing it. To prevent this, have all jobs use the same panfrost_bo for a given GEM object, so it's only closed once the last job is done with it. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
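The sharing scheme described in the commit above can be illustrated with a small reference-counting sketch. This is not the panfrost code: `sketch_bo`, `sketch_screen`, `sketch_bo_get()` and `sketch_bo_put()` are hypothetical names, and the fixed-size lookup table is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_BOS 16

/* Hypothetical stand-in for a buffer object wrapping one GEM handle. */
struct sketch_bo {
    uint32_t gem_handle;
    int refcount;
};

/* Hypothetical per-screen table: at most one sketch_bo per GEM handle. */
struct sketch_screen {
    struct sketch_bo *bos[SKETCH_MAX_BOS];
};

/* Return the shared object for a handle, creating it on first use. */
static struct sketch_bo *
sketch_bo_get(struct sketch_screen *screen, uint32_t gem_handle)
{
    int free_slot = -1;

    for (int i = 0; i < SKETCH_MAX_BOS; i++) {
        struct sketch_bo *bo = screen->bos[i];
        if (bo && bo->gem_handle == gem_handle) {
            bo->refcount++;          /* a second job reuses the same object */
            return bo;
        }
        if (!bo && free_slot < 0)
            free_slot = i;
    }

    if (free_slot < 0)
        return NULL;

    struct sketch_bo *bo = calloc(1, sizeof(*bo));
    if (!bo)
        return NULL;

    bo->gem_handle = gem_handle;
    bo->refcount = 1;
    screen->bos[free_slot] = bo;
    return bo;
}

/* Drop one reference. Returns true only when the last user is gone,
 * which is the point where the GEM handle would actually be closed. */
static bool
sketch_bo_put(struct sketch_screen *screen, struct sketch_bo *bo)
{
    assert(bo->refcount > 0);
    if (--bo->refcount > 0)
        return false;

    for (int i = 0; i < SKETCH_MAX_BOS; i++) {
        if (screen->bos[i] == bo)
            screen->bos[i] = NULL;
    }
    free(bo);
    return true;
}
```

With two jobs holding the same handle, the first `sketch_bo_put()` only decrements the count; the object is released when the second job finishes.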
-
- Aug 20, 2019
-
-
Boris Brezillon authored
The generic list helpers are too restrictive for us: we want to be able to update the instruction pointer within the foreach body, and the list_assert() check done in list_for_each_entry() prevents it. Sometimes we also want to update the next_ins pointer (in case we delete the next instruction or replace it with something else). Let's implement our own iterators (still based on the existing list helpers) to address this limitation. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
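A minimal sketch of an iterator whose body may reassign both the current and the next pointer. It uses a plain singly linked list rather than Mesa's list helpers, and `sketch_ins`, `sketch_foreach_ins_safe()` and `sketch_sum_skipping()` are hypothetical names chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction node; the real MIR instruction is richer. */
struct sketch_ins {
    int value;
    struct sketch_ins *next;
};

/* The successor is cached in next_ins before the body runs, and the loop
 * advances to whatever next_ins holds afterwards, so the body is free to
 * redirect the walk. */
#define sketch_foreach_ins_safe(ins, next_ins, head)                  \
    for ((ins) = (head), (next_ins) = (ins) ? (ins)->next : NULL;     \
         (ins) != NULL;                                               \
         (ins) = (next_ins), (next_ins) = (ins) ? (ins)->next : NULL)

/* Sum all values, but skip the instruction following any negative one.
 * The skip is done by rewriting next_ins inside the loop body. */
static int
sketch_sum_skipping(struct sketch_ins *head)
{
    struct sketch_ins *ins, *next_ins;
    int sum = 0;

    /* Sanity check: a one-element self loop would never terminate. */
    assert(head == NULL || head->next != head);

    sketch_foreach_ins_safe(ins, next_ins, head) {
        sum += ins->value;
        if (ins->value < 0 && next_ins)
            next_ins = next_ins->next;
    }

    return sum;
}
```

The same pattern covers deletion: the body can unlink `ins` or the node after it, as long as it leaves `next_ins` pointing at a node that is still in the list (or NULL).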
-
- Aug 19, 2019
-
-
Boris Brezillon authored
mir_foreach_instr_in_block_safe() is only needed if the caller intends to remove the current item from the list. Downgrade to mir_foreach_instr_in_block() when this is not the case. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
-
Faith Ekstrand authored
Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
Rather than using a regalloc based on live intervals, computed hastily with repeated invocations of a forward-analysis pass, we switch to computing liveness information on a per-block basis.

Within a given basic block, we compute liveness backwards with a linear-time algorithm; for common shaders, this may help RA terminate quicker. Across blocks, we use a work list (really a work set) and check if we're making progress. This isn't terribly efficient, but it gets the job done. Point is, we get the live_in/live_out for each block. From there, it's simple to rerun the linear-time update algorithm to compute the interference graph.

The benefit of this technique is the ability to ignore "gaps" in liveness across intermediate blocks that are never executed. On simple shaders like the loops in glmark, this results in a minor reduction in register pressure. The motivation was a complex shader in Krita that failed register allocation due to an unfortunate interaction between texture pipeline registers and control flow. This shader now compiles successfully.

total instructions in shared programs: 3439 -> 3438 (-0.03%)
instructions in affected programs: 22 -> 21 (-4.55%)
helped: 1
HURT: 0

total bundles in shared programs: 2077 -> 2076 (-0.05%)
bundles in affected programs: 12 -> 11 (-8.33%)
helped: 1
HURT: 0

total quadwords in shared programs: 3457 -> 3456 (-0.03%)
quadwords in affected programs: 20 -> 19 (-5.00%)
helped: 1
HURT: 0

total registers in shared programs: 341 -> 338 (-0.88%)
registers in affected programs: 9 -> 6 (-33.33%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 33.33% max: 33.33% x̄: 33.33% x̃: 33.33%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
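The cross-block part of the commit above (a work set processed until no progress is made) can be sketched as follows. This is a simplified model, not the Midgard implementation: registers are bits in a 32-bit mask, each block carries precomputed `use`/`def` summaries instead of an instruction list, and all `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_BLOCKS 32
#define SKETCH_MAX_SUCCS 2

struct sketch_block {
    uint32_t use;      /* registers read before being written in the block */
    uint32_t def;      /* registers written in the block */
    uint32_t live_in;  /* result: live on entry */
    uint32_t live_out; /* result: live on exit */
    int succ[SKETCH_MAX_SUCCS]; /* successor indices, -1 when unused */
};

static void
sketch_compute_liveness(struct sketch_block *blocks, int count)
{
    bool pending[SKETCH_MAX_BLOCKS]; /* the work set */
    bool progress = true;

    assert(count <= SKETCH_MAX_BLOCKS);

    for (int i = 0; i < count; i++) {
        blocks[i].live_in = blocks[i].live_out = 0;
        pending[i] = true;
    }

    while (progress) {
        progress = false;

        /* Walk backwards so information tends to flow in one sweep. */
        for (int i = count - 1; i >= 0; i--) {
            if (!pending[i])
                continue;

            pending[i] = false;
            progress = true;

            /* live_out is the union of the successors' live_in. */
            uint32_t out = 0;
            for (int s = 0; s < SKETCH_MAX_SUCCS; s++) {
                if (blocks[i].succ[s] >= 0)
                    out |= blocks[blocks[i].succ[s]].live_in;
            }
            blocks[i].live_out = out;

            /* Backwards transfer: reads are live, writes kill. */
            uint32_t in = blocks[i].use | (out & ~blocks[i].def);
            if (in == blocks[i].live_in)
                continue;
            blocks[i].live_in = in;

            /* live_in changed, so every predecessor must be revisited. */
            for (int p = 0; p < count; p++) {
                for (int s = 0; s < SKETCH_MAX_SUCCS; s++) {
                    if (blocks[p].succ[s] == i)
                        pending[p] = true;
                }
            }
        }
    }
}
```

Predecessors are found here by scanning every block's successor list, which is quadratic; a real pass would more likely keep an explicit predecessor set per block.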
-
Alyssa Rosenzweig authored
If there's a nontrivial swizzle fed into an extra (shortened) argument, we bail on copyprop. No glmark changes (since it doesn't use fancy texturing/loads). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
It's always been ambiguous which they are, but their primary register is their output, not their input; therefore, they are loads. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
Same issue with liveness analysis. If we store out a vec3, we should not reference the .w component. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
The texture coordinate for a 2D texture could be a vec2 or a vec3, depending on whether it's an array texture or not. If it's a vec2 (non-array texture), we should not reference the z component; otherwise, liveness analysis will get very confused when z is never written. v2: Fix typo (Ilia). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
If we need to lower a move for a read from a vec2 texture coordinate, we shouldn't write zw, even incidentally. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
Fixes shaders with control flow like:

    out = 0;
    if (A) {
        if (B)
            out = texture(A, ...)
    } else {
        out = texture(B, ...)
    }

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
Just as a sanity check. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
Better than having pointers flying about. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
This is repeated often enough. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
Now we should be able to walk the control-flow graph naturally. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
It's ugly, but c'est la vie. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
The exit block has been 'dangling' in the successors graph, so let's ensure it's linked in. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
The exit block is guaranteed to be empty, signaling the end of the program. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
While we already compute the successors array, for backwards data flow analysis, it is useful to walk the control flow graph backwards based on predecessors, so let's compute that information as well. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
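Deriving predecessors from an existing successors array amounts to inverting the edges. A minimal sketch, assuming blocks are addressed by index and using hypothetical `sketch_` names and fixed-size arrays in place of whatever container the compiler actually uses:

```c
#include <assert.h>

#define SKETCH_MAX_BLOCKS 8
#define SKETCH_MAX_SUCCS 2

struct sketch_cfg_block {
    int succs[SKETCH_MAX_SUCCS];  /* indices of successor blocks */
    int nr_succs;
    int preds[SKETCH_MAX_BLOCKS]; /* filled in by the pass below */
    int nr_preds;
};

/* For every edge A -> B in the successors arrays, record A in B's preds. */
static void
sketch_link_predecessors(struct sketch_cfg_block *blocks, int count)
{
    assert(count <= SKETCH_MAX_BLOCKS);

    for (int i = 0; i < count; i++)
        blocks[i].nr_preds = 0;

    for (int i = 0; i < count; i++) {
        for (int s = 0; s < blocks[i].nr_succs; s++) {
            int target = blocks[i].succs[s];

            /* Both successors may name the same block; record it once. */
            if (s > 0 && target == blocks[i].succs[0])
                continue;

            assert(target >= 0 && target < count);
            struct sketch_cfg_block *succ = &blocks[target];
            succ->preds[succ->nr_preds++] = i;
        }
    }
}
```

A backwards data-flow pass can then start from the exit block and follow `preds` instead of scanning every block for matching successors.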
-
Alyssa Rosenzweig authored
This will allow us to get some level of automatic memory management. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Alyssa Rosenzweig authored
A block can't have more. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
Roman Stratiienko authored
Fixes incremental builds on Android. Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
-
- Aug 18, 2019
-
-
Erico Nunes authored
This is primarily so that this build gets tested in CI and we don't break it again. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>
-
Connor Abbott authored
By adding one more helper to ac_llvm_build, we can also easily keep vector stores together. Fixes the tests/spec/glsl-1.30/execution/fs-large-local-array-vec4.shader_test piglit test. Fixes: 74470bae ("ac/nir: Lower large indirect variables to scratch") Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
- Aug 17, 2019
-
-
Vasily Khoruzhick authored
Otherwise lima standalone compiler fails when trying to compile fragment shader with: lima_compiler: ../src/compiler/nir/nir.c:55: nir_shader_create: Assertion `si->stage == stage' failed Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
-
Faith Ekstrand authored
Fixes: aebca396 "iris: Fix handling of SIMD32 fragment shaders" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
-
- Aug 16, 2019
-
-
PIPE_FORMAT_YV12 is not handled so switching to PIPE_FORMAT_IYUV and adding back YVU support. Signed-off-by: James Xiong <james.xiong@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
-
Erico Nunes authored
PIPE_TIMEOUT_INFINITE is unsigned and gets assigned to signed fields, where it ends up as -1. When this reaches the kernel as a timeout, it gets interpreted as no timeout, which causes the waiting functions to return immediately instead of actually waiting for completion. This seems to cause unstable results with lima, where even piglit tests randomly fail. Handle this by setting the signed max value in case of an infinite timeout. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>
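The conversion problem described above can be sketched in isolation. `SKETCH_TIMEOUT_INFINITE` is a hypothetical stand-in assumed to be an all-ones 64-bit value, and `sketch_timeout_to_signed()` is not the lima function; note that this sketch clamps every value above INT64_MAX, which is slightly broader than special-casing the infinite sentinel alone.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the "infinite" sentinel is the largest unsigned 64-bit value.
 * Stored directly in an int64_t it would read back as -1 on the usual
 * two's-complement targets. */
#define SKETCH_TIMEOUT_INFINITE UINT64_MAX

/* Convert an unsigned nanosecond timeout to the signed field a kernel
 * interface might expect, saturating instead of wrapping to a negative. */
static int64_t
sketch_timeout_to_signed(uint64_t timeout_ns)
{
    if (timeout_ns > (uint64_t)INT64_MAX)
        return INT64_MAX;

    int64_t result = (int64_t)timeout_ns;
    assert(result >= 0);
    return result;
}
```

Finite timeouts pass through unchanged, while the sentinel becomes the largest positive wait rather than a value the receiver could read as "do not wait".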
-
Rhys Perry authored
Helps some Dawn of War 3 and F1 2017 shaders with ACO:

Totals from affected shaders:
SGPRS: 2136 -> 2128 (-0.37 %)
VGPRS: 1624 -> 1628 (0.25 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 168068 -> 164332 (-2.22 %) bytes
LDS: 44 -> 44 (0.00 %) blocks
Max Waves: 222 -> 221 (-0.45 %)
Wait states: 0 -> 0 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>
-
- Aug 15, 2019
-
-
Vasily Khoruzhick authored
Fixes: e0aeee94 ("lima: add summary report for shader-db") Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
-
Bas Nieuwenhuizen authored
Quite useless without DCC for LAYOUT_GENERAL. Fixes: b4dad3af Revert "radv: Do not decompress on LAYOUT_GENERAL." Acked-by: Dave Airlie <airlied@redhat.com>
-
Bas Nieuwenhuizen authored
Causes issues with a bunch of games with DXVK. Fixes: 50add1b3 "radv: Do not decompress on LAYOUT_GENERAL." Acked-by: Dave Airlie <airlied@redhat.com>
-
Dave Airlie authored
Control-flow Enforcement Technology adds new instructions on x86 processors to denote where indirect jumps can land. GCC automatically adds the instruction (which encodes as a NOP on older CPUs) to entry points, but assembler files need it added manually. This adds it to all the entry points in the Mesa x86/x86-64 assembler files. This will only happen if Mesa is built with the -fcf-protection flag to GCC, as some distros want to do. Acked-by: Eric Anholt <eric@anholt.net>
-
Alyssa Rosenzweig authored
Fixes errors for some people building Mesa:

../src/panfrost/bifrost/bifrost_sched.c:32:31: error: initializer element is not constant
 const unsigned max_vec2_reg = max_primary_reg / 2;
../src/panfrost/bifrost/bifrost_sched.c:33:31: error: initializer element is not constant
 const unsigned max_vec3_reg = max_primary_reg / 4; // XXX: Do we need to align vec3 to vec4 boundary?
../src/panfrost/bifrost/bifrost_sched.c:34:31: error: initializer element is not constant
 const unsigned max_vec4_reg = max_primary_reg / 4;
../src/panfrost/bifrost/bifrost_sched.c:35:32: error: initializer element is not constant
 const unsigned max_registers = max_primary_reg +
../src/panfrost/bifrost/bifrost_sched.c:40:28: error: initializer element is not constant
 const unsigned vec2_base = primary_base + max_primary_reg;
../src/panfrost/bifrost/bifrost_sched.c:41:28: error: initializer element is not constant
 const unsigned vec3_base = vec2_base + max_vec2_reg;
../src/panfrost/bifrost/bifrost_sched.c:42:28: error: initializer element is not constant
 const unsigned vec4_base = vec3_base + max_vec3_reg;
../src/panfrost/bifrost/bifrost_sched.c:43:27: error: initializer element is not constant
 const unsigned vec4_end = vec4_base + max_vec4_reg;

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
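In C, a `const unsigned` object is not a constant expression, so some compilers reject it in the initializer of another file-scope object. One common way around this is to use enumeration constants (or `#define`s) instead. The sketch below shows that approach only; it is not necessarily how the commit fixed it, and the value 64 and the base of 0 are hypothetical placeholders.

```c
#include <assert.h>

/* Enumeration constants are true integer constant expressions, so they
 * may depend on each other at file scope. All values are illustrative. */
enum {
    SKETCH_PRIMARY_BASE    = 0,
    SKETCH_MAX_PRIMARY_REG = 64,
    SKETCH_MAX_VEC2_REG    = SKETCH_MAX_PRIMARY_REG / 2,
    SKETCH_MAX_VEC3_REG    = SKETCH_MAX_PRIMARY_REG / 4,
    SKETCH_MAX_VEC4_REG    = SKETCH_MAX_PRIMARY_REG / 4,

    SKETCH_VEC2_BASE = SKETCH_PRIMARY_BASE + SKETCH_MAX_PRIMARY_REG,
    SKETCH_VEC3_BASE = SKETCH_VEC2_BASE + SKETCH_MAX_VEC2_REG,
    SKETCH_VEC4_BASE = SKETCH_VEC3_BASE + SKETCH_MAX_VEC3_REG,
    SKETCH_VEC4_END  = SKETCH_VEC4_BASE + SKETCH_MAX_VEC4_REG
};

/* Because the bound is a constant expression, it can size a static array. */
static unsigned char sketch_reg_class[SKETCH_VEC4_END];

/* Map a flat register index to the width of its class (1, 2, 3 or 4). */
static unsigned
sketch_class_width(unsigned index)
{
    assert(index < SKETCH_VEC4_END);
    if (index >= SKETCH_VEC4_BASE)
        return 4;
    if (index >= SKETCH_VEC3_BASE)
        return 3;
    if (index >= SKETCH_VEC2_BASE)
        return 2;
    return 1;
}
```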
-
Erik Faye-Lund authored
This method returns size_t, but the multiplication multiplies two integers, leading to overflow rather than type widening. Noticed by compiling with MSVC, which emits a warning. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>
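The pitfall is that the multiplication is carried out in the operands' type before the result is converted to size_t. A small sketch with hypothetical function names; it uses `unsigned` operands so that the wrap-around is well defined (with signed `int`, as in the commit description, the overflow would be undefined behaviour), and it assumes a 32-bit `unsigned`.

```c
#include <assert.h>
#include <stddef.h>

/* Narrow version: the product is computed in unsigned arithmetic and can
 * wrap before it is widened to size_t on return. */
static size_t
sketch_bytes_narrow(unsigned count, unsigned elem_size)
{
    return count * elem_size;
}

/* Widened version: converting one operand first makes the whole
 * multiplication happen in size_t. */
static size_t
sketch_bytes_wide(unsigned count, unsigned elem_size)
{
    size_t bytes = (size_t)count * elem_size;
    assert(elem_size == 0 || bytes / elem_size == count || sizeof(size_t) <= 4);
    return bytes;
}
```

On a target with a 64-bit size_t, 65536 elements of 65536 bytes give 4 GiB from the widened version, while the narrow version wraps to 0.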
-
Erik Faye-Lund authored
On Windows, p_atomic_inc_return returns an unsigned long long rather than the type the pointer refers to, so let's make sure we cast the result to the right type. Otherwise, we'll trigger a warning about the wrong format-string for the type. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>
-
Erik Faye-Lund authored
There were two incompatible definitions of strcasecmp, which led to a compiler warning. Let's clean this up by only leaving one of them, and using that one all the time. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>
-