 22 Feb, 2022 40 commits


Saves some upload boilerplate.

Switching this over took a bit of cleanup of vc4's special 4to2 upload to match the pattern of other util_upload_index_buffer users.

This lets drivers call the function at the top of draw_vbo, unref the temp resource at the end, and otherwise forget about user indices and their upload offset. The cost should be minimal: a bit of stack space regardless, plus a copy of the info and a divide if we uploaded things (though the upload itself will probably drown out that overhead). Avoids threading the extra index offset through freedreno's draw functions.
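As a rough illustration of the calling pattern described above (not Mesa's actual code: `DrawInfo`, `Uploader` and `upload_user_indices` are hypothetical stand-ins), the helper below returns an adjusted copy of the draw info, a start index shifted by the upload offset divided by the index size, and the temporary buffer the caller would release at the end of the draw:

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class DrawInfo:                       # hypothetical stand-in, not a Mesa struct
    index_size: int                   # bytes per index: 1, 2 or 4
    user_indices: Optional[bytes]     # None when the indices already live in a buffer
    index_buffer: Optional[bytearray] = None


class Uploader:                       # hypothetical stand-in for an upload manager
    def __init__(self):
        self.buffer = bytearray()

    def upload(self, data, alignment):
        # Pad so the returned byte offset is a multiple of the alignment.
        self.buffer += bytes(-len(self.buffer) % alignment)
        offset = len(self.buffer)
        self.buffer += data
        return offset


def upload_user_indices(info, start, uploader):
    """Return (info, start, temp); temp is None when nothing was uploaded."""
    if info.user_indices is None:
        return info, start, None
    offset = uploader.upload(info.user_indices, alignment=info.index_size)
    new_info = replace(info, user_indices=None, index_buffer=uploader.buffer)
    # The one divide: convert the byte offset into a number of indices.
    return new_info, start + offset // info.index_size, uploader.buffer
```

The rest of a draw function can then use the returned info and start index without caring whether the indices originally came from user memory.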

To disable the new scoreboarding optimizations when debugging. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <mesa/mesa!14298>

Extend our existing bi_scoreboard infrastructure with a simple data flow analysis pass that calculates which dependency slots need waiting. We still lack a heuristic for selecting dependency slots. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <mesa/mesa!14298>
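A minimal sketch of what such a forward data flow analysis could look like, under simplifying assumptions: this is a generic model, not the bi_scoreboard code. `Instr`, `Block`, `NUM_SLOTS` and the register names are invented for illustration, slots are taken as already assigned, and only hazards on registers written by in-flight messages are tracked.

```python
from dataclasses import dataclass, field
from typing import Optional

NUM_SLOTS = 2  # assumption for this sketch, not taken from the commit


@dataclass
class Instr:
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()
    slot: Optional[int] = None                 # set for asynchronous messages
    waits: set = field(default_factory=set)    # filled in by analyze()


@dataclass
class Block:
    instrs: list
    succs: list = field(default_factory=list)  # indices of successor blocks


def transfer(instrs, state_in, record):
    """Walk a block; state holds, per slot, the registers still in flight."""
    state = [set(s) for s in state_in]
    for ins in instrs:
        touched = ins.reads | ins.writes
        waits = {s for s in range(NUM_SLOTS) if state[s] & touched}
        for s in waits:
            state[s].clear()                   # waiting retires the whole slot
        if ins.slot is not None:
            state[ins.slot] |= ins.writes      # result arrives asynchronously
        if record:
            ins.waits = waits
    return tuple(frozenset(s) for s in state)


def analyze(blocks):
    empty = tuple(frozenset() for _ in range(NUM_SLOTS))
    state_in = [empty for _ in blocks]
    worklist = list(range(len(blocks)))
    while worklist:                            # iterate to a fixed point
        i = worklist.pop()
        out = transfer(blocks[i].instrs, state_in[i], record=False)
        for s in blocks[i].succs:
            merged = tuple(a | b for a, b in zip(state_in[s], out))
            if merged != state_in[s]:          # join is a per-slot union
                state_in[s] = merged
                worklist.append(s)
    for i, block in enumerate(blocks):
        transfer(block.instrs, state_in[i], record=True)
```

Because the join is a union, an instruction waits on a slot if a message on that slot may still be outstanding along any path that reaches it.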

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <mesa/mesa!14298>

To a limited degree, scoreboarding must be global, so add the data structures for tracking this to the IR. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <mesa/mesa!14298>

Fix minor silly things. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <mesa/mesa!14298>

The "generic" one is a vestige of Midgard. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <mesa/mesa!14298>

Useful for data flow analysis. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <mesa/mesa!14298>

Bifrost post-RA dead code elimination can cull the destinations of regular ALU instructions, by weakening from a register write to a temporary write. However, there is no way to suppress staging writes, so culling the destinations will result in invalid code generation. Fixes a regression in dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_static_vertex with scoreboarding. The root cause there is the backend dead code elimination not being sufficiently aggressive in the presence of control flow. Usually this does not matter, since the backend optimizations are intended to be local with global optimizations happening in NIR. Unfortunately, our implementation of IDVS hits this hard. That will need to be optimized (probably by specializing IDVS shaders in NIR instead of the backend). In the meantime, let's fix the actual bug affecting scoreboarding. No shader-db changes. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <mesa/mesa!14298>
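The rule in the first two sentences can be sketched as a backward walk over one block. This is a simplified model with hypothetical names (`Op`, `cull_dead_dests`), not the Bifrost pass itself: a dead destination is weakened to "no register" unless the instruction writes through a staging register, in which case it is left alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Op:                       # hypothetical instruction record
    dest: Optional[str]         # None models a temporary-only write
    srcs: tuple = ()
    staging: bool = False       # destination is written via a staging register


def cull_dead_dests(ops, live_out):
    """Weaken dead register writes in place; return the live-in set."""
    live = set(live_out)
    for op in reversed(ops):
        if op.dest is not None:
            if op.dest in live or op.staging:
                live.discard(op.dest)   # the write stays and kills liveness
            else:
                op.dest = None          # dead ALU result: drop the register write
        live.update(op.srcs)
    return live
```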

They are useless (given the semantics of DTSEL_IMM) and complicate scoreboarding. Just remove them in the pass that removes all the other silly register destinations. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <mesa/mesa!14298>

Barriers need to wait on all outstanding messages. This is more of an API requirement than a hardware requirement, but it's still an invariant the scoreboarding pass must respect. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <mesa/mesa!14298>

I always intended this to be covered by the MIT license like with the rest of my contributions, but somehow forgot to add it. Let's add that license to make things clear. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <!14751>

Adam Jackson authored
This just legalizes a few of the pixelstore pack parameters in GLES2 that are already legal in desktop and GLES3. glamor takes advantage of this in the GetImage and software-fallback paths. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <mesa/mesa!14977>

total instructions in shared programs: 1939513 -> 1935815 (-0.19%) instructions in affected programs: 809066 -> 805368 (-0.46%) helped: 3195 HURT: 865 helped stats (abs) min: 1.0 max: 15.0 x̄: 1.99 x̃: 1 helped stats (rel) min: 0.10% max: 25.00% x̄: 2.26% x̃: 1.28% HURT stats (abs) min: 1.0 max: 22.0 x̄: 3.09 x̃: 2 HURT stats (rel) min: 0.10% max: 83.33% x̄: 2.67% x̃: 1.39% 95% mean confidence interval for instructions value: -1.00 -0.82 95% mean confidence interval for instructions %change: -1.34% -1.08% Instructions are helped.
total tuples in shared programs: 1523194 -> 1521789 (-0.09%) tuples in affected programs: 745526 -> 744121 (-0.19%) helped: 2947 HURT: 1844 helped stats (abs) min: 1.0 max: 18.0 x̄: 2.06 x̃: 1 helped stats (rel) min: 0.15% max: 25.00% x̄: 2.65% x̃: 1.59% HURT stats (abs) min: 1.0 max: 29.0 x̄: 2.54 x̃: 1 HURT stats (rel) min: 0.09% max: 40.00% x̄: 2.32% x̃: 1.52% 95% mean confidence interval for tuples value: -0.39 -0.20 95% mean confidence interval for tuples %change: -0.85% -0.62% Tuples are helped.
total clauses in shared programs: 329158 -> 325350 (-1.16%) clauses in affected programs: 111654 -> 107846 (-3.41%) helped: 2787 HURT: 498 helped stats (abs) min: 1.0 max: 17.0 x̄: 1.57 x̃: 1 helped stats (rel) min: 0.76% max: 40.00% x̄: 6.92% x̃: 5.26% HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.14 x̃: 1 HURT stats (rel) min: 0.87% max: 50.00% x̄: 4.73% x̃: 3.77% 95% mean confidence interval for clauses value: -1.21 -1.10 95% mean confidence interval for clauses %change: -5.39% -4.93% Clauses are helped.
total cycles in shared programs: 172084.50 -> 166827.62 (-3.05%) cycles in affected programs: 74698.83 -> 69441.96 (-7.04%) helped: 3706 HURT: 568 helped stats (abs) min: 0.041665999999999315 max: 19.0 x̄: 1.44 x̃: 1 helped stats (rel) min: 0.24% max: 75.00% x̄: 9.48% x̃: 6.90% HURT stats (abs) min: 0.041665999999999315 max: 1.0 x̄: 0.15 x̃: 0 HURT stats (rel) min: 0.25% max: 50.00% x̄: 2.21% x̃: 1.42% 95% mean confidence interval for cycles value: -1.28 -1.18 95% mean confidence interval for cycles %change: -8.18% -7.67% Cycles are helped.
total arith in shared programs: 57145.04 -> 57211.37 (0.12%) arith in affected programs: 27595.12 -> 27661.46 (0.24%) helped: 1933 HURT: 2259 helped stats (abs) min: 0.041665999999999315 max: 0.75 x̄: 0.09 x̃: 0 helped stats (rel) min: 0.16% max: 33.33% x̄: 2.74% x̃: 1.52% HURT stats (abs) min: 0.04166399999999726 max: 1.3333329999999997 x̄: 0.11 x̃: 0 HURT stats (rel) min: 0.10% max: 100.00% x̄: 2.79% x̃: 1.62% 95% mean confidence interval for arith value: 0.01 0.02 95% mean confidence interval for arith %change: 0.07% 0.40% Arith are HURT.
total texture in shared programs: 12857 -> 12857 (0.00%) texture in affected programs: 0 -> 0 helped: 0 HURT: 0
total vary in shared programs: 11157.75 -> 10222 (-8.39%) vary in affected programs: 5643 -> 4707.25 (-16.58%) helped: 3196 HURT: 0 helped stats (abs) min: 0.125 max: 1.875 x̄: 0.29 x̃: 0 helped stats (rel) min: 2.78% max: 75.00% x̄: 18.49% x̃: 15.00% 95% mean confidence interval for vary value: -0.30 -0.29 95% mean confidence interval for vary %change: -18.88% -18.11% Vary are helped.
total ldst in shared programs: 146420 -> 140270 (-4.20%) ldst in affected programs: 66027 -> 59877 (-9.31%) helped: 2942 HURT: 10 helped stats (abs) min: 1.0 max: 19.0 x̄: 2.09 x̃: 2 helped stats (rel) min: 0.90% max: 100.00% x̄: 16.81% x̃: 8.33% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 2.22% max: 50.00% x̄: 13.03% x̃: 3.33% 95% mean confidence interval for ldst value: -2.15 -2.02 95% mean confidence interval for ldst %change: -17.53% -15.89% Ldst are helped.
total quadwords in shared programs: 1398329 -> 1392117 (-0.44%) quadwords in affected programs: 704641 -> 698429 (-0.88%) helped: 3677 HURT: 1299 helped stats (abs) min: 1.0 max: 26.0 x̄: 2.51 x̃: 1 helped stats (rel) min: 0.10% max: 26.92% x̄: 2.64% x̃: 1.89% HURT stats (abs) min: 1.0 max: 20.0 x̄: 2.31 x̃: 1 HURT stats (rel) min: 0.11% max: 44.44% x̄: 2.34% x̃: 1.55% 95% mean confidence interval for quadwords value: -1.34 -1.16 95% mean confidence interval for quadwords %change: -1.44% -1.25% Quadwords are helped.
total threads in shared programs: 35234 -> 35311 (0.22%) threads in affected programs: 119 -> 196 (64.71%) helped: 91 HURT: 14 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: 0.60 0.87 95% mean confidence interval for threads %change: 70.08% 89.92% Threads are helped.
total loops in shared programs: 125 -> 125 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0
total spills in shared programs: 149 -> 144 (-3.36%) spills in affected programs: 22 -> 17 (-22.73%) helped: 1 HURT: 0
total fills in shared programs: 966 -> 956 (-1.04%) fills in affected programs: 44 -> 34 (-22.73%) helped: 1 HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <mesa/mesa!15090>
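Each percentage in the report is the relative change between the before and after totals. A quick check of a few of the totals quoted above, with reductions coming out negative (`pct_change` is a helper written for this check, not part of shader-db):

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Totals copied from the report above.
print(round(pct_change(1939513, 1935815), 2))      # instructions
print(round(pct_change(329158, 325350), 2))        # clauses
print(round(pct_change(172084.50, 166827.62), 2))  # cycles
print(round(pct_change(35234, 35311), 2))          # threads
```

Threads is the one metric here where an increase counts as "helped".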

It's a bit more code, but it's needed to chew through control flow since we don't have a backend version of dead_cf. Results are really good, meaning I really screwed this up the first time around (hence the cc mesa-stable).
total instructions in shared programs: 1963576 -> 1939513 (-1.23%) instructions in affected programs: 671053 -> 646990 (-3.59%) helped: 4436 HURT: 729 helped stats (abs) min: 1.0 max: 43.0 x̄: 5.75 x̃: 6 helped stats (rel) min: 0.21% max: 100.00% x̄: 6.47% x̃: 5.17% HURT stats (abs) min: 1.0 max: 22.0 x̄: 2.01 x̃: 1 HURT stats (rel) min: 0.50% max: 50.00% x̄: 10.45% x̃: 9.09% 95% mean confidence interval for instructions value: -4.77 -4.55 95% mean confidence interval for instructions %change: -4.36% -3.80% Instructions are helped.
total tuples in shared programs: 1533335 -> 1523194 (-0.66%) tuples in affected programs: 483167 -> 473026 (-2.10%) helped: 3414 HURT: 1288 helped stats (abs) min: 1.0 max: 20.0 x̄: 3.73 x̃: 2 helped stats (rel) min: 0.27% max: 100.00% x̄: 4.87% x̃: 3.03% HURT stats (abs) min: 1.0 max: 19.0 x̄: 2.02 x̃: 1 HURT stats (rel) min: 0.24% max: 38.10% x̄: 8.10% x̃: 5.88% 95% mean confidence interval for tuples value: -2.28 -2.03 95% mean confidence interval for tuples %change: -1.62% -1.02% Tuples are helped.
total clauses in shared programs: 351432 -> 329158 (-6.34%) clauses in affected programs: 142237 -> 119963 (-15.66%) helped: 5328 HURT: 3 helped stats (abs) min: 1.0 max: 43.0 x̄: 4.18 x̃: 4 helped stats (rel) min: 0.74% max: 100.00% x̄: 19.44% x̃: 17.24% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 9.09% max: 12.50% x̄: 10.90% x̃: 11.11% 95% mean confidence interval for clauses value: -4.25 -4.11 95% mean confidence interval for clauses %change: -19.72% -19.12% Clauses are helped.
total cycles in shared programs: 202830.92 -> 172084.50 (-15.16%) cycles in affected programs: 117078.42 -> 86332 (-26.26%) helped: 5450 HURT: 1 helped stats (abs) min: 0.083333 max: 49.0 x̄: 5.64 x̃: 5 helped stats (rel) min: 1.42% max: 100.00% x̄: 27.94% x̃: 25.64% HURT stats (abs) min: 0.25 max: 0.25 x̄: 0.25 x̃: 0 HURT stats (rel) min: 2.46% max: 2.46% x̄: 2.46% x̃: 2.46% 95% mean confidence interval for cycles value: -5.74 -5.54 95% mean confidence interval for cycles %change: -28.30% -27.58% Cycles are helped.
total arith in shared programs: 57274.29 -> 57145.04 (-0.23%) arith in affected programs: 16418.33 -> 16289.08 (-0.79%) helped: 2442 HURT: 1784 helped stats (abs) min: 0.041665999999999315 max: 0.75 x̄: 0.14 x̃: 0 helped stats (rel) min: 0.23% max: 100.00% x̄: 5.51% x̃: 2.87% HURT stats (abs) min: 0.041665999999999315 max: 0.9166670000000003 x̄: 0.12 x̃: 0 HURT stats (rel) min: 0.00% max: 100.00% x̄: 25.13% x̃: 9.09% 95% mean confidence interval for arith value: -0.04 -0.03 95% mean confidence interval for arith %change: 6.61% 8.24% Inconclusive result (value mean confidence interval and %change mean confidence interval disagree).
total texture in shared programs: 12857 -> 12857 (0.00%) texture in affected programs: 0 -> 0 helped: 0 HURT: 0
total vary in shared programs: 11157.75 -> 11157.75 (0.00%) vary in affected programs: 0 -> 0 helped: 0 HURT: 0
total ldst in shared programs: 177208 -> 146420 (-17.37%) ldst in affected programs: 117098 -> 86310 (-26.29%) helped: 5447 HURT: 0 helped stats (abs) min: 1.0 max: 49.0 x̄: 5.65 x̃: 5 helped stats (rel) min: 1.92% max: 100.00% x̄: 27.91% x̃: 25.64% 95% mean confidence interval for ldst value: -5.75 -5.55 95% mean confidence interval for ldst %change: -28.27% -27.56% Ldst are helped.
total quadwords in shared programs: 1436507 -> 1398329 (-2.66%) quadwords in affected programs: 515101 -> 476923 (-7.41%) helped: 5150 HURT: 111 helped stats (abs) min: 1.0 max: 39.0 x̄: 7.46 x̃: 6 helped stats (rel) min: 0.17% max: 100.00% x̄: 10.02% x̃: 8.24% HURT stats (abs) min: 1.0 max: 9.0 x̄: 2.01 x̃: 1 HURT stats (rel) min: 0.43% max: 21.62% x̄: 3.57% x̃: 1.94% 95% mean confidence interval for quadwords value: -7.41 -7.11 95% mean confidence interval for quadwords %change: -9.98% -9.49% Quadwords are helped.
total threads in shared programs: 35025 -> 35228 (0.58%) threads in affected programs: 218 -> 421 (93.12%) helped: 208 HURT: 5 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: 0.91 0.99 95% mean confidence interval for threads %change: 93.40% 99.55% Threads are helped.
total loops in shared programs: 128 -> 125 (-2.34%) loops in affected programs: 3 -> 0 helped: 3 HURT: 0 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
total spills in shared programs: 158 -> 149 (-5.70%) spills in affected programs: 15 -> 6 (-60.00%) helped: 9 HURT: 0
total fills in shared programs: 1133 -> 966 (-14.74%) fills in affected programs: 197 -> 30 (-84.77%) helped: 9 HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <mesa/mesa!15090>
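The opening of the commit message above is about dead code elimination that sees through control flow. One textbook way to get that is backward liveness iterated to a fixed point over the control flow graph. The sketch below is a generic illustration with invented names (`BasicBlock`, `global_dce`), not the backend's implementation; instructions are modelled as `(dest, srcs, side_effects)` tuples.

```python
from dataclasses import dataclass, field


@dataclass
class BasicBlock:
    instrs: list                                # (dest, srcs, side_effects) tuples
    succs: list = field(default_factory=list)   # indices of successor blocks


def live_in_of(block, live_out):
    """Backward walk; instructions with dead results contribute no uses."""
    live = set(live_out)
    for dest, srcs, side_effects in reversed(block.instrs):
        if side_effects or dest in live:
            live.discard(dest)
            live.update(srcs)
    return live


def global_dce(blocks):
    live_in = [set() for _ in blocks]
    changed = True
    while changed:                              # iterate until nothing changes
        changed = False
        for i in reversed(range(len(blocks))):
            live_out = set().union(*(live_in[s] for s in blocks[i].succs))
            new_in = live_in_of(blocks[i], live_out)
            if new_in != live_in[i]:
                live_in[i] = new_in
                changed = True
    for block in blocks:                        # drop instructions with dead results
        live = set().union(*(live_in[s] for s in block.succs))
        kept = []
        for dest, srcs, side_effects in reversed(block.instrs):
            if side_effects or dest in live:
                live.discard(dest)
                live.update(srcs)
                kept.append((dest, srcs, side_effects))
        block.instrs = kept[::-1]
```

A purely local pass has to assume every value is live at the end of each block, so a value whose only use sits in a dead instruction of a successor block would survive; the global version removes it.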

The important thing isn't the number of words pushed, it's that there are no UBOs required for us to upload. Check that instead. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <mesa/mesa!15090>

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <!15083>

Xaver Hugl authored
Signed-off-by: Xaver Hugl <xaver.hugl@gmail.com> Reviewed-by: Simon Ser <contact@emersion.fr> Part-of: <!10906>

based on PAL Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!15098>

preparation for a future commit Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!15098>

"raw" (IDXEN=0) and "structured" (IDXEN=1) do bounds checking differently. From `si_make_buffer_descriptor`:
 * For VMEM and inst.IDXEN == 0 or STRIDE == 0, it's in byte units.
 * For VMEM and inst.IDXEN == 1 and STRIDE != 0, it's in units of STRIDE.
So there is a difference between setting vindex = i32_0 and vindex = NULL. Instead of having the `structured` flag, we can just check if vindex is NULL.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!15098>

"raw" (IDXEN=0) and "structured" (IDXEN=1) do bounds checking differently. From `si_make_buffer_descriptor`:
 * For VMEM and inst.IDXEN == 0 or STRIDE == 0, it's in byte units.
 * For VMEM and inst.IDXEN == 1 and STRIDE != 0, it's in units of STRIDE.
So there is a difference between setting vindex = i32_0 and vindex = NULL. Instead of having the `structured` flag, we can just check if vindex is NULL.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!15098>

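To see why vindex = i32_0 and vindex = NULL are not interchangeable in the raw/structured messages above, here is a deliberately simplified model of the range check. It is an illustration only: the exact hardware rule is not given in the commit message, and `num_records`, `byte_offset` and `access_size` are names assumed for this sketch. A vindex of `None` stands for NULL (IDXEN=0).

```python
from typing import Optional


def in_bounds(num_records: int, stride: int, vindex: Optional[int],
              byte_offset: int, access_size: int = 4) -> bool:
    """Simplified range check: the unit of num_records depends on IDXEN and STRIDE."""
    idxen = vindex is not None
    if idxen and stride != 0:
        # Structured access: num_records counts STRIDE-sized records.
        return vindex < num_records
    # Raw access (or STRIDE == 0): num_records counts bytes.
    return byte_offset + access_size <= num_records


# Same descriptor, same byte offset: only the presence of vindex differs.
print(in_bounds(num_records=4, stride=16, vindex=0, byte_offset=8))
print(in_bounds(num_records=4, stride=16, vindex=None, byte_offset=8))
```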
This matches PAL and RADV behavior. It's for preemption. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!15098>

according to gfx10SwizzlePattern.h Fixes: 9fabbf21 - ac/surface: copy the HTILE equations to the surface Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!15098>

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!15098>

It was fixed in LLVM 14. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!15098>

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!15098>

Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!15098>

Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!15098>

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!15098>

based on PAL Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!15098>

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!15098>

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!15098>

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!15098>

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!15098>

glWaitSemaphoreEXT triggers the si_flush_resource callback on pipe buffer resources, which may cause a segmentation fault. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!15098>

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!15098>

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <mesa/mesa!15098>
