- Nov 09, 2020
-
Anuj Phogat authored
v2: Apply the workaround to all gen hardware Ref: GEN:BUG:1409725701 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <mesa/mesa!7463>
-
Arcady Goldmints-Orlov authored
Since spills and fills use the TMU, special care has to be taken to avoid putting one between a TMU setup instruction and the corresponding reads or writes. This change adds logic to move fills up and move spills down to avoid interrupting such sequences. This allows compiling 6 more programs from shader-db. Other stats: total spills in shared programs: 446 -> 446 (0.00%) spills in affected programs: 0 -> 0 helped: 0 HURT: 0 total fills in shared programs: 606 -> 610 (0.66%) fills in affected programs: 38 -> 42 (10.53%) helped: 0 HURT: 2 total instructions in shared programs: 19330 -> 19363 (0.17%) instructions in affected programs: 3299 -> 3332 (1.00%) helped: 0 HURT: 5 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <mesa/mesa!6606>
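The placement rule described above can be sketched on a deliberately simplified model. This is not the compiler's actual code: the flat `in_tmu_seq` array and the function names are hypothetical, and the model assumes that two distinct TMU sequences never sit directly next to each other.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: in_tmu_seq[i] is true when instruction i belongs to a
 * TMU sequence (from the setup up to the corresponding read or write).
 * Inserting just before index pos splits a sequence when both neighbours
 * belong to one. */
bool splits_tmu_sequence(const bool *in_tmu_seq, int n, int pos)
{
    return pos > 0 && pos < n && in_tmu_seq[pos - 1] && in_tmu_seq[pos];
}

/* A fill must land before its use, so move it up until it is safe. */
int fill_insert_pos(const bool *in_tmu_seq, int n, int use_idx)
{
    assert(use_idx >= 0 && use_idx < n);
    int pos = use_idx;
    while (splits_tmu_sequence(in_tmu_seq, n, pos))
        pos--;
    return pos;
}

/* A spill must land after the definition, so move it down until it is safe. */
int spill_insert_pos(const bool *in_tmu_seq, int n, int def_idx)
{
    assert(def_idx >= 0 && def_idx < n);
    int pos = def_idx + 1;
    while (splits_tmu_sequence(in_tmu_seq, n, pos))
        pos++;
    return pos;
}
```

In this model the returned value is the index before which the new instruction would be inserted. Moving a fill up or a spill down can lengthen live ranges, which may be one reason the fill and instruction counts above go up slightly.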
-
Samuel Pitoiset authored
(src0 & src1) | (~src0 & src2) to (src0 & src1). fossils-db (Polaris10): Totals from 873 (0.63% of 138014) affected shaders: SGPRs: 33781 -> 33733 (-0.14%) VGPRs: 37704 -> 37520 (-0.49%); split: -0.51%, +0.02% CodeSize: 3861460 -> 3853424 (-0.21%); split: -0.21%, +0.00% MaxWaves: 5306 -> 5305 (-0.02%) Instrs: 743798 -> 743486 (-0.04%); split: -0.04%, +0.00% Cycles: 10962244 -> 10960936 (-0.01%); split: -0.01%, +0.00% VMEM: 128309 -> 128350 (+0.03%); split: +0.33%, -0.30% SMEM: 44797 -> 44113 (-1.53%); split: +0.02%, -1.54% Copies: 71875 -> 71674 (-0.28%); split: -0.31%, +0.03% PreSGPRs: 23484 -> 23479 (-0.02%) PreVGPRs: 34582 -> 34529 (-0.15%) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <mesa/mesa!7479>
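The condition under which this rewrite applies is cut off in the text above. Assuming it is the case where src2 is a constant zero, the identity can be checked with a small sketch (the function names are hypothetical, not ACO's):

```c
#include <assert.h>
#include <stdint.h>

/* Bitfield select: bits of src1 where src0 is set, bits of src2 elsewhere. */
uint32_t bitfield_select(uint32_t src0, uint32_t src1, uint32_t src2)
{
    return (src0 & src1) | (~src0 & src2);
}

/* Simplified form, valid only under the ASSUMPTION that src2 == 0:
 * the (~src0 & 0) term vanishes, leaving a plain AND. */
uint32_t bitfield_select_zero_src2(uint32_t src0, uint32_t src1)
{
    return src0 & src1;
}
```

For any mask `m` and value `v`, `bitfield_select(m, v, 0)` and `bitfield_select_zero_src2(m, v)` return the same result.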
-
Boris Brezillon authored
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7472>
-
Boris Brezillon authored
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7472>
-
Boris Brezillon authored
Linear Z/S buffers should be handled correctly now. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7472>
-
Boris Brezillon authored
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7472>
-
Boris Brezillon authored
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7472>
-
Boris Brezillon authored
src1 exists, and must be set to ZERO. If we don't add this source, lane2 refers to src2 which does not exist. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7472>
-
Boris Brezillon authored
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7472>
-
Boris Brezillon authored
Now that we lower uniforms to UBO we can get rid of bi_emit_ld_uniform(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7472>
-
Boris Brezillon authored
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7472>
-
Boris Brezillon authored
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7472>
-
Boris Brezillon authored
The number of src swizzles to initialize depends on the source properties (size and number of components), not the destination ones. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7472>
-
Boris Brezillon authored
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7472>
-
Boris Brezillon authored
So we can extend bi_emit_ld_vary() to support centroid and sample modes. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7472>
-
Boris Brezillon authored
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7472>
-
Boris Brezillon authored
If we don't do that, pixels might be killed early thus preventing the fragment shader from being called and updating the depth/stencil value. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!7501>
-
Suresh Guttula authored
Currently dpb_size for VP9 profile0 and profile2 is the same, even though profile2 requires dpb_size to be multiplied by an extra 3/2. As a result, we are seeing VM_L2_PROTECTION_FAULT errors and ring vcn_dec timeouts because dpb_size is too small for VP9_2. This patch corrects dpb_size for VP9_2 and fixes the issue. Signed-off-by: SureshGuttula <suresh.guttula@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Part-of: <mesa/mesa!7480>
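The scaling described above amounts to the following arithmetic. This is a minimal sketch only: the enum, the function name, and the idea of a precomputed `base_dpb_size` are assumptions for illustration, and the driver's real size computation is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical profile tag, not the driver's real enum. */
enum vp9_profile { VP9_PROFILE0, VP9_PROFILE2 };

/* Scale the decoded picture buffer size by 3/2 for profile2 only. */
uint32_t vp9_dpb_size(uint32_t base_dpb_size, enum vp9_profile profile)
{
    if (profile == VP9_PROFILE2) {
        /* Widen first so the multiplication cannot overflow 32 bits. */
        uint64_t scaled = (uint64_t)base_dpb_size * 3 / 2;
        assert(scaled <= UINT32_MAX);
        return (uint32_t)scaled;
    }
    return base_dpb_size;
}
```

With a base size of 1000, profile0 keeps 1000 while profile2 gets 1500.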
-
Faith Ekstrand authored
Intel hardware supports 8-bit arithmetic but it's tricky and annoying: - Byte operations don't actually execute with a byte type. The execution type for byte operations is actually word. (I don't know if this has implications for the HW implementation. Probably?) - Destinations are required to be strided out to at least the execution type size. This means that B-type operations always have a stride of at least 2. This wreaks havoc on the back-end in multiple ways. - Thanks to the strided destination, we don't actually save register space by storing things in bytes. We could, in theory, interleave two byte values into a single 2B-strided register but that's both a pain for RA and would lead to piles of false dependencies pre-Gen12 and, on Gen12+, we'd need some significant improvements to the SWSB pass. - Also thanks to the strided destination, all byte writes are treated as partial writes by the back-end and we don't know how to copy-prop them. - On Gen11, they added a new hardware restriction that byte types aren't allowed in the 2nd and 3rd sources of instructions. This means that we have to emit B->W conversions all over to resolve things. If we emit said conversions in NIR, instead, there's a chance NIR can get rid of some of them for us. We can get rid of a lot of this pain by just asking NIR to get rid of 8-bit arithmetic for us. It may lead to a few more conversions in some cases but having back-end copy-prop actually work is probably a bigger bonus. There is still a bit we have to handle in the back-end. In particular, basic MOVs and conversions because 8-bit load/store ops still require 8-bit types. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <mesa/mesa!7482>
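Conceptually, getting rid of 8-bit arithmetic means widening the sources, doing the operation at word size, and truncating the result. The sketch below illustrates that idea for an integer add in plain C. It is not NIR code, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* An 8-bit add performed at 16 bits: widen, operate, truncate. */
uint8_t lowered_iadd8(uint8_t a, uint8_t b)
{
    uint16_t wa = a;                      /* B->W conversion of each source */
    uint16_t wb = b;
    uint16_t wide = (uint16_t)(wa + wb);  /* arithmetic at word size */
    return (uint8_t)(wide & 0xff);        /* W->B conversion of the result */
}
```

The truncation keeps the wrap-around behaviour of a real 8-bit add: `lowered_iadd8(200, 100)` yields 44, the same as 300 modulo 256.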
-
Faith Ekstrand authored
We can't really support these directly on any platform. May as well let NIR lower them. The NIR lowering is potentially one more instruction for scan/reduce ops thanks to not being able to do the B->W conversion as part of SEL_EXEC. For imax/imin exclusive scan, it's yet another instruction thanks to the extra imax/imin NIR has to insert to deal with the fact that the first live channel will contain the identity value which, when signed, will cast wrong. However, it does let us drop some complexity from our back-end so it's probably worth it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <mesa/mesa!7482>
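One reading of the identity problem mentioned above, shown as a scalar simulation rather than actual subgroup code: if an 8-bit imax exclusive scan is run at 16 bits, the first channel receives the 16-bit identity (INT16_MIN), whose low byte is 0 rather than INT8_MIN. An extra imax against INT8_MIN before truncating restores the expected value. All names here are hypothetical, and the `clamp` flag exists only to show the difference.

```c
#include <assert.h>
#include <stdint.h>

int16_t max16(int16_t a, int16_t b) { return a > b ? a : b; }

/* Keep the low 8 bits of x and reinterpret them as a signed byte. */
int8_t truncate_to_i8(int16_t x)
{
    int low = x & 0xff;
    return (int8_t)(low >= 128 ? low - 256 : low);
}

/* Exclusive imax scan over 8-bit inputs, computed at 16 bits.
 * clamp != 0 applies the extra imax with INT8_MIN before truncation. */
void exclusive_imax_scan8(const int8_t *src, int8_t *dst, int n, int clamp)
{
    int16_t acc = INT16_MIN;              /* identity of imax at 16 bits */
    for (int i = 0; i < n; i++) {
        int16_t v = acc;
        if (clamp)
            v = max16(v, INT8_MIN);       /* the extra imax */
        dst[i] = truncate_to_i8(v);
        acc = max16(acc, src[i]);
    }
}
```

For the input {-5, 3, -7}, the clamped scan produces {-128, -5, 3}, while the unclamped one produces 0 in the first channel.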
-
Faith Ekstrand authored
We want to use it for more than just ALU. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <mesa/mesa!7482>
-
Faith Ekstrand authored
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <mesa/mesa!7482>
-
Faith Ekstrand authored
This way we can start supporting more than just ALU ops. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <mesa/mesa!7482>
-
Faith Ekstrand authored
Some ALU ops (comparisons being the primary example) have a fixed bit-size destination and, in that case, we don't want to insert a conversion on the destination. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <mesa/mesa!7482>
-
Rhys Perry authored
The extension is only exposed on ACO and LLVM 11+ because of an LLVM bug. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!7234>
-
Rhys Perry authored
64-bit image atomics only work with LLVM 11+ because of an LLVM bug. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!7234>
-
Rhys Perry authored
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!7234>
-
Rhys Perry authored
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!7234>
-
Rhys Perry authored
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!7234>
-
Louis-Francis Ratté-Boulianne authored
The fmul operation takes the maximum number of components from either of its operands. We only need to use 2 components from the fragment coordinates. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <mesa/mesa!7507>
-
Erik Faye-Lund authored
We forgot to initialize the sample_count member here, leading to it being undefined. This causes problems on MSVC when compiling in debug-mode, where we get a run-time error for using an undefined variable. To avoid similar problems in the future if more fields are added, let's initialize the whole struct to zero to start with. This also allows us to remove a no-longer-needed zero-initialization. Fixes: cf170616 ("gallium: Add a util_blitter path for using a custom VS and FS.") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Part-of: <mesa/mesa!7503>
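The pattern described above, zeroing the whole struct before filling in individual fields, looks roughly like this. The struct and function here are hypothetical stand-ins, not the actual blitter types.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the real struct. */
struct blit_params {
    unsigned width;
    unsigned height;
    unsigned sample_count;
};

struct blit_params make_blit_params(unsigned width, unsigned height)
{
    struct blit_params p;

    /* Zero everything first so fields added later never start undefined. */
    memset(&p, 0, sizeof(p));

    p.width = width;
    p.height = height;
    /* sample_count is intentionally not assigned: it is already 0. */
    return p;
}
```

Any field that is not explicitly assigned, such as `sample_count` here, reads back as 0 instead of an indeterminate value.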
-
Samuel Pitoiset authored
fossils-db (Vega10): Totals from 7786 (5.70% of 136546) affected shaders: SGPRs: 517778 -> 518626 (+0.16%); split: -0.01%, +0.17% VGPRs: 488252 -> 488084 (-0.03%); split: -0.04%, +0.01% CodeSize: 42282068 -> 42250152 (-0.08%); split: -0.16%, +0.09% MaxWaves: 35697 -> 35716 (+0.05%); split: +0.06%, -0.01% Instrs: 8319309 -> 8304792 (-0.17%); split: -0.18%, +0.00% Cycles: 88619440 -> 88489636 (-0.15%); split: -0.16%, +0.01% VMEM: 2788278 -> 2780431 (-0.28%); split: +0.06%, -0.35% SMEM: 570364 -> 569370 (-0.17%); split: +0.12%, -0.30% VClause: 144906 -> 144908 (+0.00%); split: -0.05%, +0.05% SClause: 302143 -> 302055 (-0.03%); split: -0.04%, +0.01% Copies: 579124 -> 578779 (-0.06%); split: -0.14%, +0.08% PreSGPRs: 327695 -> 328845 (+0.35%); split: -0.00%, +0.35% PreVGPRs: 434280 -> 433954 (-0.08%) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!7438>
-
Faith Ekstrand authored
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!7509>
-
Faith Ekstrand authored
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!7509>
-
Faith Ekstrand authored
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!7509>
-
Faith Ekstrand authored
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!7509>
-
Faith Ekstrand authored
GLSL requires that image atomics have formats and there are rules about things matching properly. We should enforce those in NIR unless we have reason to do otherwise. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!7509>
-
Faith Ekstrand authored
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <mesa/mesa!7509>
-
Faith Ekstrand authored
This corresponds to 5ab5c96198f30804a6a29961b8905f292a8ae600 ("Reserve additional loop control bit for Intel extension (NoFusionINTEL) (#175)") in https://github.com/KhronosGroup/SPIRV-Headers . Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <mesa/mesa!7509>
-