- 25 Sep, 2021 4 commits
-
-
Rob Clark authored
Bitset encoding tends to have a lot of duplication, for example many instructions with the same encoding modulo the fixed pattern. Now that encode_bitset is split out into its own template, so that we can capture the result, use a hash table to de-duplicate the bitset encoding into "snippet" functions so that bitset cases with identical encoding can re-use the same generated code. Signed-off-by:
Rob Clark <robdclark@chromium.org>
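A minimal sketch of the de-duplication idea described in this commit message, assuming the rendered encoding text itself is used as the hash key. The names `SnippetTable`, `intern`, and the `snippet%d` naming scheme are hypothetical and are not taken from the patch; a plain Python dict stands in for the hash table.

```python
class SnippetTable:
    """Maps rendered encoding text to a shared snippet-function name.

    Hypothetical helper for illustration only, not the isaspec code.
    """

    def __init__(self):
        self._names = {}  # rendered body -> snippet function name

    def intern(self, rendered_body):
        # Identical rendered bodies map to the same key, so every bitset
        # case with the same encoding reuses one generated function.
        name = self._names.get(rendered_body)
        if name is None:
            name = "snippet%d" % len(self._names)
            self._names[rendered_body] = name
        return name

    def items(self):
        # (body, name) pairs, one per unique snippet to emit.
        return self._names.items()


table = SnippetTable()
a = table.intern("val |= pack(src0, 0, 7);")
b = table.intern("val |= pack(src0, 0, 7);")  # duplicate of the first
c = table.intern("val |= pack(dst, 8, 15);")
```

Here `a` and `b` refer to the same snippet name, so only two functions would need to be generated for the three cases.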
-
Rob Clark authored
In the next patch, we are going to want to be able to capture the result of rendering the template as a py variable, which I don't think you can do otherwise with a <%def>. Signed-off-by:
Rob Clark <robdclark@chromium.org>
-
Rob Clark authored
Signed-off-by:
Rob Clark <robdclark@chromium.org>
-
Rob Clark authored
These were never used, leftovers from an earlier iteration of isaspec which used an RPN based thing for expressions. Signed-off-by:
Rob Clark <robdclark@chromium.org>
-
- 24 Sep, 2021 1 commit
-
-
Caio Oliveira authored
Reviewed-by:
Timur Kristóf <timur.kristof@gmail.com> Part-of: <mesa/mesa!12951>
-
- 21 Sep, 2021 10 commits
-
-
Caio Oliveira authored
Map it to the existing ACCESS_STREAM_CACHE_POLICY access mode. Reviewed-by:
Jason Ekstrand <jason@jlekstrand.net> Part-of: <mesa/mesa!12945>
-
Christian Gmeiner authored
This helps to get a really nice and aligned disasm output. Just use :align=X to define where in the line the field should be printed. Signed-off-by:
Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by:
Rob Clark <robdclark@chromium.org> Part-of: <!11321>
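A rough sketch of how column alignment of this kind can work, assuming the printer tracks how many characters it has written on the current line and pads with spaces up to the requested column. `LineWriter` and `write_aligned` are hypothetical names; the behavior when the line is already past the requested column (no padding) is also an assumption, not something the commit message states.

```python
import io


class LineWriter:
    """Wraps an output stream and tracks the current column."""

    def __init__(self, out):
        self.out = out
        self.column = 0  # characters written since the last newline

    def write(self, text):
        self.out.write(text)
        newline = text.rfind("\n")
        if newline >= 0:
            # Column restarts after the last newline in the text.
            self.column = len(text) - newline - 1
        else:
            self.column += len(text)

    def write_aligned(self, text, align):
        # Pad with spaces so that `text` starts at column `align`.
        if self.column < align:
            self.write(" " * (align - self.column))
        self.write(text)


buf = io.StringIO()
writer = LineWriter(buf)
writer.write("mov")
writer.write_aligned("r0.x,", 8)
writer.write_aligned("r1.y", 16)
writer.write("\n")
```

Each operand starts at a fixed column regardless of the mnemonic length, which is what makes the output line up across instructions.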
-
Christian Gmeiner authored
Signed-off-by:
Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by:
Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11321>
-
Christian Gmeiner authored
To support field alignment we need to keep track of how much data we have printed to our out FILE. This is a prep commit. Signed-off-by:
Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by:
Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11321>
-
Christian Gmeiner authored
This commit moves isaspec out of freedreno into a more generic new home - src/compiler/isaspec. Signed-off-by:
Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by:
Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11321>
-
Christian Gmeiner authored
Prep work for the next commit. Signed-off-by:
Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by:
Rob Clark <robdclark@chromium.org> Part-of: <mesa/mesa!11321>
-
Jason Ekstrand authored
Reviewed-by:
Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <!12959>
-
Bas Nieuwenhuizen authored
Reviewed-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!12592>
-
Bas Nieuwenhuizen authored
Reviewed-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!12592>
-
Bas Nieuwenhuizen authored
That way we can get the address to the entry, which is needed for some nir builtins because extra data in the entry can be used as shader input. Reviewed-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!12592>
-
- 20 Sep, 2021 1 commit
-
-
Timur Kristóf authored
These are I/O variables which are not going to be removed anyway. However, get_variable_io_mask handles their location incorrectly. Found using the GCC undefined behavior sanitizer. Fixes the following error: runtime error: shift exponent 4294967258 is too large for 64-bit type 'long unsigned int' Closes: #5319 Fixes: cf5f8f55 Signed-off-by:
Timur Kristóf <timur.kristof@gmail.com> Reviewed-by:
Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!12719>
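The exponent in the sanitizer message is worth decoding: 4294967258 is 2^32 - 38, which is what -38 looks like when stored in an unsigned 32-bit value. The sketch below only checks that arithmetic and shows a guarded shift; reading it as a negative location offset that wrapped around is an interpretation, and `as_u32` and `checked_mask` are hypothetical helpers, not Mesa functions.

```python
def as_u32(value):
    """Reinterpret an integer as an unsigned 32-bit value."""
    return value & 0xFFFFFFFF


# -38 wrapped to unsigned 32-bit gives the exponent from the error message.
wrapped = as_u32(-38)


def checked_mask(shift, width=64):
    """Return 1 << shift, rejecting exponents that would be UB in C."""
    if not 0 <= shift < width:
        raise ValueError("shift exponent %d out of range for %d-bit type"
                         % (shift, width))
    return 1 << shift
```

In C, shifting a 64-bit value by 64 or more is undefined behavior, which is why the sanitizer flags it rather than producing a defined result.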
-
- 17 Sep, 2021 2 commits
-
-
Ian Romanick authored
Most modern hardware needs the edge flag added as a hidden vertex input and needs code added to the vertex shader to copy the input to an output. Intel hardware is a little different. Gfx4 and Gfx5 hardware works in the previously described manner. Gfx6+ hardware needs the edge flag as a specific vertex shader input, and that input is magically processed by fixed-function hardware without need for extra shader code. This flag signals only that the vertex shader input is needed. It would be nice if we could decouple adding the vertex shader input from generating the copy-to-output code, but that has proven to be challenging. Not having that code causes other passes to want to eliminate that shader input. v2: Convert conditional to assertion. This pass is only called for vertex shaders. Suggested by Ken. Reviewed-by:
Kenneth Graunke <kenneth@whitecape.org> Part-of: <!12858>
-
Rhys Perry authored
This allows for more MAD/FMA instructions to be created.
fossil-db (Sienna Cichlid):
Totals from 50134 (33.46% of 149839) affected shaders:
VGPRs: 2436536 -> 2436000 (-0.02%); split: -0.05%, +0.03%
SpillSGPRs: 13136 -> 13135 (-0.01%); split: -0.02%, +0.02%
CodeSize: 206621424 -> 206278292 (-0.17%); split: -0.23%, +0.07%
MaxWaves: 1116804 -> 1117448 (+0.06%); split: +0.07%, -0.01%
Instrs: 38977460 -> 38862886 (-0.29%); split: -0.33%, +0.04%
Latency: 832425389 -> 827432260 (-0.60%); split: -0.63%, +0.03%
InvThroughput: 184193457 -> 183563350 (-0.34%); split: -0.37%, +0.03%
Signed-off-by:
Rhys Perry <pendingchaos02@gmail.com> Reviewed-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Ian Romanick <ian.d.romanick@intel.com> Part-of: <mesa/mesa!7458>
-
- 16 Sep, 2021 2 commits
-
-
Jason Ekstrand authored
They're no longer ralloc'd. Fixes: 879a5698 "nir: Switch from ralloc to malloc for NIR instructions." Reviewed-by:
Emma Anholt <emma@anholt.net> Reviewed-by:
Ian Romanick <ian.d.romanick@intel.com> Part-of: <!12884>
-
Jason Ekstrand authored
Now that they're no longer ralloc'd, we have to be much more careful about indirects. We have to make sure every time a source or destination is overwritten, its indirect (if any) is freed. We also have to choose a memory ownership convention for the rewrite functions. Assuming that they will be called with the source from some other instruction, we choose to always make a copy of the indirect (if any). It's the responsibility of the caller to ensure its copy of the indirect is freed. Unfortunately, all this extra logic is going to make nir_instr_rewrite/move_src/dest more expensive because they now have all the logic of nir_src/dest_copy instead of a simple struct assignment. Fortunately, the vast majority of rewrite calls are done by nir_ssa_def_rewrite_uses which is an SSA-only fast-path. Fixes: 879a5698 "nir: Switch from ralloc to malloc for NIR instructions." Reviewed-by:
Emma Anholt <emma@anholt.net> Part-of: <mesa/mesa!12884>
-
- 14 Sep, 2021 14 commits
-
-
Emma Anholt authored
Now that we don't use ralloc, we don't need this arg to get at the right ralloc ctx. Reviewed-by:
Matt Turner <mattst88@gmail.com> Part-of: <mesa/mesa!11776>
-
Emma Anholt authored
By replacing the 48-byte ralloc header with our exec_node gc_node (16 bytes), runtime of shader-db on my system across this series drops -4.21738% +/- 1.47757% (n=5). Inspired by discussion on #5034. Reviewed-by:
Matt Turner <mattst88@gmail.com> Part-of: <!11776>
-
Emma Anholt authored
With the de-ralloc changes, having the register dest not have its .reg properly initialized caused crashes. Reviewed-by:
Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!11776>
-
Emma Anholt authored
Preparation for de-rallocing instrs. Reviewed-by:
Matt Turner <mattst88@gmail.com> Part-of: <mesa/mesa!11776>
-
Emma Anholt authored
Right now we're using ralloc to GC our NIR instructions, but ralloc has significant overhead for its recursive nature so it would be nice to use a simpler mechanism for GCing instructions. Reviewed-by:
Matt Turner <mattst88@gmail.com> Part-of: <mesa/mesa!11776>
-
Emma Anholt authored
The arg says it's supposed to be the instr, not the shader. Reviewed-by:
Matt Turner <mattst88@gmail.com> Part-of: <mesa/mesa!11776>
-
Emma Anholt authored
We were using the ralloc parent in some places, which should work out to be the shader I think, but to de-ralloc the instrs we should just pass the existing shader pointer in. Reviewed-by:
Matt Turner <mattst88@gmail.com> Part-of: <mesa/mesa!11776>
-
Emma Anholt authored
This code was being tricky with passing a mem_ctx instead of the shader, then freeing the mem_ctx when the pass was done and all the parallel copies had been removed from the shader. Use the right type for instr creation and do a bit of manual list management to prepare the way for non-ralloc NIR instrs. Reviewed-by:
Matt Turner <mattst88@gmail.com> Part-of: <mesa/mesa!11776>
-
Emma Anholt authored
With the de-rallocing, we're going to have some more places that free a list of instrs. Reviewed-by:
Matt Turner <mattst88@gmail.com> Part-of: <mesa/mesa!11776>
-
Emma Anholt authored
This will gain another step shortly. Reviewed-by:
Matt Turner <mattst88@gmail.com> Part-of: <!11776>
-
Ian Romanick authored
Calling this lower pass twice in a row would cause spurious set_vertex_and_primitive_count(0, undef) intrinsics after the proper set_vertex_and_primitive_count intrinsic. This pretty much turns any geometry shader into garbage. Fix this by treating nir_intrinsic_emit_vertex_with_counter and nir_intrinsic_end_primitive_with_counter just like the non-_with_counter versions. If no blocks would need set_vertex_and_primitive_count intrinsics added, exit the pass before doing any work. This prevents the need for DCE to do extra clean up later. Since this pass is potentially called multiple times via multiple invocations of a finalize_nir callback, it is (hypothetically?) possible that control flow could be changed to add new blocks that need this intrinsic. The check implemented in this commit should be robust against that possibility. v2: Add a_block_needs_set_vertex_and_primitive_count. Suggested by Timur. Reviewed-by:
Timur Kristóf <timur.kristof@gmail.com> Part-of: <mesa/mesa!12802>
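A toy model of the two fixes described in this commit message: recognizing the already-lowered `_with_counter` forms, and exiting early when no block needs the count intrinsic. Blocks are modeled as plain lists of intrinsic names, and `block_needs_set_count` and `lower_gs` are hypothetical stand-ins; none of this is the NIR API or the actual check used in the pass.

```python
SET_COUNT = "set_vertex_and_primitive_count"
PLAIN = ("emit_vertex", "end_primitive")


def block_needs_set_count(block):
    # Simplified check: a block needs the intrinsic if it lacks one.
    return SET_COUNT not in block


def lower_gs(exit_blocks):
    """Lower in place; return True if the shader was changed."""
    # Early exit: if nothing needs the intrinsic, do no work at all,
    # so a second invocation leaves nothing for DCE to clean up.
    if not any(block_needs_set_count(b) for b in exit_blocks):
        return False

    for block in exit_blocks:
        for i, op in enumerate(block):
            # Only plain forms are rewritten; names that already end in
            # _with_counter are left as they are.
            if op in PLAIN:
                block[i] = op + "_with_counter"
        if block_needs_set_count(block):
            block.append(SET_COUNT)
    return True


blocks = [["emit_vertex", "end_primitive"]]
first = lower_gs(blocks)   # lowers and appends the count intrinsic
second = lower_gs(blocks)  # second run finds nothing to do
```

The property of interest is that running the pass twice yields the same shader as running it once.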
-
Ian Romanick authored
Reviewed-by:
Timur Kristóf <timur.kristof@gmail.com> Fixes: 542d40d6 ("nir: Add new GS intrinsics that maintain a count of emitted vertices.") Part-of: <mesa/mesa!12802>
-
Mike Blumenkrantz authored
This is really hard to pin down later on, so catch it here instead. Gotta have those dimensions. Reviewed-by:
Jason Ekstrand <jason@jlekstrand.net> Part-of: <mesa/mesa!12825>
-
Kenneth Graunke authored
It only makes sense to call this pass for fragment shaders, and the first thing the pass does is read a FS-specific field out of a union, so it isn't safe to call it for other shader stages. We could make it early return, but instead we just assert, so that drivers know to only call it when appropriate. (A previous version of this patch, which early returned instead of asserting, was Reviewed-by: Emma Anholt <emma@anholt.net> as well.) Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!12839>
-
- 13 Sep, 2021 1 commit
-
-
Emma Anholt authored
For drivers that don't lower advanced blend to FBFETCH, we need the bitmask to be in the NIR shader so that it gets carried over to TGSI successfully. Reviewed-by:
Rob Clark <robdclark@chromium.org> Reviewed-By:
Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <mesa/mesa!12813>
-
- 09 Sep, 2021 4 commits
-
-
Bas Nieuwenhuizen authored
Sadly need to poke a bit in the src internals to avoid using yet another heap-allocated data structure. Fixes: 52515485 ("nir: Add a nir_instr_remove that recursively removes dead code.") Closes: mesa/mesa#5323 Reviewed-by:
Emma Anholt <emma@anholt.net> Part-of: <mesa/mesa!12726>
-
Rhys Perry authored
Signed-off-by:
Rhys Perry <pendingchaos02@gmail.com> Reviewed-by:
Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: e76ae39a ("nir: add support for user defined select control") Fixes: b56451f8 ("nir: add support for user defined loop control") Part-of: <mesa/mesa!12778>
-
Qiang Yu authored
Drivers like radeonsi load varyings in a scalar manner, so they prefer to pack varyings with different interpolation qualifiers into the same slot to save space. But drivers like panfrost/bifrost can load varyings in a vector manner, so they prefer to pack varyings with the same interpolation qualifier. A driver can add the interpolation qualifiers that are able to be packed into the same varying slot to the pack_varying_options nir option. Reviewed-by:
Marek Olšák <marek.olsak@amd.com> Signed-off-by:
Qiang Yu <yuq825@gmail.com> Part-of: <mesa/mesa!12537>
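One way to picture the option described in this commit message is as a bitmask of interpolation qualifiers that a driver allows to share a slot. The sketch below assumes that reading; `PackInterp`, its members, and `can_share_slot` are hypothetical names for illustration and do not reproduce the actual NIR definitions.

```python
from enum import IntFlag


class PackInterp(IntFlag):
    """Hypothetical qualifier bits a driver may allow to be mixed."""
    SMOOTH = 1
    FLAT = 2
    NOPERSPECTIVE = 4


def can_share_slot(a, b, pack_options):
    # Varyings with the same qualifier can always be packed together.
    if a == b:
        return True
    # Different qualifiers may share a slot only if the driver listed both.
    return bool(a & pack_options) and bool(b & pack_options)


# A scalar-loading driver allows mixing everything to save space.
scalar_driver = (PackInterp.SMOOTH | PackInterp.FLAT
                 | PackInterp.NOPERSPECTIVE)
# A vector-loading driver allows no mixing.
vector_driver = PackInterp(0)

mixed_scalar = can_share_slot(PackInterp.SMOOTH, PackInterp.FLAT,
                              scalar_driver)
mixed_vector = can_share_slot(PackInterp.SMOOTH, PackInterp.FLAT,
                              vector_driver)
same_vector = can_share_slot(PackInterp.FLAT, PackInterp.FLAT,
                             vector_driver)
```

With an empty mask the packing falls back to grouping by identical qualifier, which matches the vector-load preference described in the message.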
-
Qiang Yu authored
These qualifiers should be respected for different varying load code generation. Reviewed-by:
Marek Olšák <marek.olsak@amd.com> Signed-off-by:
Qiang Yu <yuq825@gmail.com> Part-of: <mesa/mesa!12537>
-
- 08 Sep, 2021 1 commit
-
-
Marcin Ślusarz authored
Fixes compiler crashes on:
    struct Foo { float does_exist_member; };
    in vec2 tex;
    out vec4 color;
    void main(void) {
        Foo foo;
        foo.does_not_exist_member %= 3; /* or any of: <<=, >>=, &=, |=, ^= */
        color = vec4(tex.xy, tex.xy);
    }
Signed-off-by:
Marcin Ślusarz <marcin.slusarz@intel.com> CC: mesa-stable Reviewed-by:
Matt Turner <mattst88@gmail.com> Reviewed-by:
Tapani Pälli <tapani.palli@intel.com> Part-of: <mesa/mesa!12717>
-