- 27 Jan, 2023 1 commit
-
-
Meson will already construct these paths for us, so let's reuse them instead of throwing away the result and reconstructing them. Reviewed-by:
Eric Engestrom <eric@igalia.com> Part-of: <!20907>
-
- 26 Jan, 2023 1 commit
-
-
If you're only affecting one or a couple of drivers, it would be nice if your pipeline buttons on the web UI weren't full of manual run buttons for all the other drivers. This is a bunch of duplicated lines, but less than it could have been now that we have !references. In some of these cases (i915g, nouveau, etnaviv), we have no non-manual jobs for those drivers, so I could have just rewritten the original "driver-rules" to "driver-manual-rules". I decided to keep things consistent between drivers, though, because this is all esoteric enough to readers already without making different drivers' rules look different. Fixes: #4891 Acked-by:
David Heidelberg <david.heidelberg@collabora.com> Part-of: <!17445>
-
- 24 Jan, 2023 1 commit
-
-
Since our X servers don't have a compositor, and we run tests in parallel, various swap and frontbuffer tests won't ever be stable. Rather than having every driver have to track those flakes, make a general X11 skips list as a known issue of our CI rather than pointing fingers at drivers. Acked-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by:
Karol Herbst <kherbst@redhat.com> Acked-by:
Martin Roukala <martin.roukala@mupuf.org> Acked-by:
David Heidelberg <david.heidelberg@collabora.com> Part-of: <mesa/mesa!20798>
-
- 19 Jan, 2023 1 commit
-
-
Which is now required, so these are useless Reviewed-by:
Eric Engestrom <eric@igalia.com> Part-of: <mesa/mesa!20752>
-
- 16 Jan, 2023 8 commits
-
-
These should all be unreachable, and what's left is dead code. Signed-off-by:
Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <!19350>
-
Simpler. Small shaderdb regressions from using IR registers instead of SSA, but that's probably what we needed for correctness (given that SSA is violated otherwise) hence the Cc.

total instructions in shared programs: 1520220 -> 1518127 (-0.14%)
instructions in affected programs: 167437 -> 165344 (-1.25%)
helped: 662
HURT: 206
helped stats (abs) min: 1.0 max: 46.0 x̄: 3.65 x̃: 2
helped stats (rel) min: 0.18% max: 22.22% x̄: 2.43% x̃: 1.71%
HURT stats (abs) min: 1.0 max: 7.0 x̄: 1.56 x̃: 1
HURT stats (rel) min: 0.17% max: 8.33% x̄: 2.66% x̃: 2.33%
95% mean confidence interval for instructions value: -2.65 -2.18
95% mean confidence interval for instructions %-change: -1.45% -0.99%
Instructions are helped.

total bundles in shared programs: 649844 -> 649345 (-0.08%)
bundles in affected programs: 59278 -> 58779 (-0.84%)
helped: 577
HURT: 249
helped stats (abs) min: 1.0 max: 39.0 x̄: 1.56 x̃: 1
helped stats (rel) min: 0.26% max: 30.00% x̄: 3.13% x̃: 2.19%
HURT stats (abs) min: 1.0 max: 12.0 x̄: 1.61 x̃: 1
HURT stats (rel) min: 0.58% max: 25.00% x̄: 5.25% x̃: 4.00%
95% mean confidence interval for bundles value: -0.78 -0.43
95% mean confidence interval for bundles %-change: -0.98% -0.23%
Bundles are helped.

total quadwords in shared programs: 1136767 -> 1134956 (-0.16%)
quadwords in affected programs: 141780 -> 139969 (-1.28%)
helped: 744
HURT: 311
helped stats (abs) min: 1.0 max: 9.0 x̄: 3.13 x̃: 2
helped stats (rel) min: 0.14% max: 26.67% x̄: 2.77% x̃: 2.13%
HURT stats (abs) min: 1.0 max: 8.0 x̄: 1.68 x̃: 1
HURT stats (rel) min: 0.35% max: 10.00% x̄: 3.17% x̃: 1.69%
95% mean confidence interval for quadwords value: -1.89 -1.54
95% mean confidence interval for quadwords %-change: -1.27% -0.77%
Quadwords are helped.

total registers in shared programs: 90461 -> 90273 (-0.21%)
registers in affected programs: 2833 -> 2645 (-6.64%)
helped: 250
HURT: 82
helped stats (abs) min: 1.0 max: 2.0 x̄: 1.08 x̃: 1
helped stats (rel) min: 6.67% max: 33.33% x̄: 14.06% x̃: 12.50%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 6.67% max: 50.00% x̄: 13.90% x̃: 12.50%
95% mean confidence interval for registers value: -0.67 -0.47
95% mean confidence interval for registers %-change: -8.62% -5.69%
Registers are helped.

total threads in shared programs: 55685 -> 55686 (<.01%)
threads in affected programs: 76 -> 77 (1.32%)
helped: 20
HURT: 17
helped stats (abs) min: 1.0 max: 2.0 x̄: 1.30 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.47 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -0.47 0.52
95% mean confidence interval for threads %-change: 5.81% 56.35%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 1387 -> 1379 (-0.58%)
spills in affected programs: 283 -> 275 (-2.83%)
helped: 5
HURT: 1

total fills in shared programs: 5256 -> 5176 (-1.52%)
fills in affected programs: 557 -> 477 (-14.36%)
helped: 5
HURT: 1

Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!19350>
-
Otherwise the lowering is fundamentally unsound due to incorrect constant folding, even though it worked by chance with the old pass ordering. We're about to slightly change the way we handle fsin/fcos, which was enough to trigger this unsoundness. shader-db results are mostly a toss-up.

total instructions in shared programs: 1520675 -> 1520220 (-0.03%)
instructions in affected programs: 96841 -> 96386 (-0.47%)
helped: 397
HURT: 3
helped stats (abs) min: 1.0 max: 4.0 x̄: 1.15 x̃: 1
helped stats (rel) min: 0.22% max: 6.25% x̄: 1.15% x̃: 0.40%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.58% max: 2.08% x̄: 1.08% x̃: 0.58%
95% mean confidence interval for instructions value: -1.19 -1.08
95% mean confidence interval for instructions %-change: -1.26% -1.01%
Instructions are helped.

total bundles in shared programs: 650088 -> 649844 (-0.04%)
bundles in affected programs: 31132 -> 30888 (-0.78%)
helped: 229
HURT: 23
helped stats (abs) min: 1.0 max: 4.0 x̄: 1.21 x̃: 1
helped stats (rel) min: 0.49% max: 7.14% x̄: 1.28% x̃: 0.71%
HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.48 x̃: 1
HURT stats (rel) min: 0.83% max: 8.33% x̄: 2.38% x̃: 1.85%
95% mean confidence interval for bundles value: -1.08 -0.86
95% mean confidence interval for bundles %-change: -1.15% -0.74%
Bundles are helped.

total quadwords in shared programs: 1137388 -> 1136767 (-0.05%)
quadwords in affected programs: 71826 -> 71205 (-0.86%)
helped: 367
HURT: 17
helped stats (abs) min: 1.0 max: 8.0 x̄: 1.80 x̃: 1
helped stats (rel) min: 0.31% max: 17.24% x̄: 2.27% x̃: 0.96%
HURT stats (abs) min: 1.0 max: 6.0 x̄: 2.29 x̃: 2
HURT stats (rel) min: 0.44% max: 11.11% x̄: 2.18% x̃: 1.47%
95% mean confidence interval for quadwords value: -1.76 -1.47
95% mean confidence interval for quadwords %-change: -2.36% -1.78%
Quadwords are helped.

total registers in shared programs: 90483 -> 90461 (-0.02%)
registers in affected programs: 890 -> 868 (-2.47%)
helped: 67
HURT: 44
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 8.33% max: 25.00% x̄: 10.52% x̃: 9.09%
HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.02 x̃: 1
HURT stats (rel) min: 9.09% max: 50.00% x̄: 31.15% x̃: 33.33%
95% mean confidence interval for registers value: -0.39 -0.01
95% mean confidence interval for registers %-change: 1.75% 10.25%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).

total threads in shared programs: 55694 -> 55685 (-0.02%)
threads in affected programs: 21 -> 12 (-42.86%)
helped: 1
HURT: 5
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -2.79 -0.21
95% mean confidence interval for threads %-change: -89.26% 39.26%
Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!19350>
-
We removed this path. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20707>
-
Now unused. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20707>
-
opaque should not be set when logicops are enabled; that needs blending even on Bifrost. The Fixes tag points to the commit where I believe the bug became possible to hit; the logical error is older. Fixes Piglit logicop tests again. Fixes: d849d977 ("panfrost: Avoid blend shader when not blending") Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20685>
-
This would have caught the issue from the previous commit. Split out to make backporting the previous change less onerous. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20683>
-
Future changes to nir_lower_blend cause fsat(reg.yx) instructions to be generated, which correspond to "FCLAMP.v2f16 x.h10" pseudoinstructions. These get their swizzles lowered, but we forgot to clear the swizzle out, so we end up with an extra swap (cancelling out the intended swizzle). Fix the lowering logic. Fixes: ac636f5a ("pan/bi: Use FCLAMP pseudo op for clamp prop") Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20683>
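The swizzle bug described above can be illustrated with a small Python sketch. This is not the compiler's code: `lower`, `read_source`, and the `clear_swizzle` flag are hypothetical names, and a 2-component tuple stands in for a v2f16 register. It only shows why leaving the swizzle in place after materialising it produces a second swap that cancels the first.

```python
IDENTITY = (0, 1)  # ".h01": read both halves in order
SWAPPED = (1, 0)   # ".h10": read the halves swapped


def swizzle(vec, swz):
    """Select components of vec according to swz."""
    return tuple(vec[i] for i in swz)


def lower(value, swz, clear_swizzle=True):
    """Materialise the swizzle as an explicit swap.

    Returns the swapped value and the swizzle still attached to the source.
    A correct lowering resets that swizzle to the identity.
    """
    lowered = swizzle(value, swz)
    remaining = IDENTITY if clear_swizzle else swz
    return lowered, remaining


def read_source(value, swz):
    """What the consuming instruction actually sees."""
    return swizzle(value, swz)


v = (1.0, 2.0)

# Correct lowering: the consumer sees the intended .h10 view.
fixed = read_source(*lower(v, SWAPPED))

# Buggy lowering: the stale swizzle is applied again and undoes the swap.
buggy = read_source(*lower(v, SWAPPED, clear_swizzle=False))
```

Here `fixed` is the swapped pair, while `buggy` is the original order, which is the "extra swap" the commit message describes.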
-
- 14 Jan, 2023 1 commit
-
-
This isn't allowed for the same reason that AFBC of regular luminance-alpha isn't allowed (and will raise DATA_INVALID_FAULTs). Reorder the checks to ensure these formats are checked. Fixes Piglit texwrap GL_EXT_texture_sRGB-s3tc. Fixes: 476be5cb ("panfrost: Don't use texture format swizzles on v7") Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20686>
-
- 06 Jan, 2023 1 commit
-
-
Now that renderonly.h includes util/simple_mtx.h, which itself includes valgrind.h, dep_valgrind is required by any module that includes renderonly.h.

In file included from ../src/gallium/auxiliary/renderonly/renderonly.h:33,
                 from ../src/gallium/winsys/kmsro/drm/kmsro_drm_winsys.c:39:
../src/util/simple_mtx.h:34:12: fatal error: valgrind.h: No such file or directory
   34 | # include <valgrind.h>
      |            ^~~~~~~~~~~~
compilation terminated.

dep_valgrind is part of idep_mesautil, which should be used instead of copying the list of deps for each util header included (which would have to be updated every time a util header changes its own includes), so let's add idep_mesautil everywhere that includes renderonly.h. Fixes: ad4d7ca8 ("kmsro: Fix renderonly_scanout BO aliasing") Tested-by:
Asahi Lina <lina@asahilina.net> Part-of: <!20530>
-
- 02 Jan, 2023 3 commits
-
-
The goal is to make files at the root of src/compiler/ apply to both Bifrost and Valhall, while ISA-specific code (e.g. instruction packing) goes in compiler/bifrost/ or compiler/valhall/. This is what Valhall is already doing; the Bifrost-specific stuff was just grandfathered in. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20455>
-
This functionality is now available on Linux with drm-shim + shader-db, and I suspect the version bundled here is broken anyway. Strictly this drops Windows/macOS support for the known-broken frontend to the shader compiler but I can't say I'm terribly worried about that. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20455>
-
This is the compiler for both Bifrost and Valhall, and presumably future Mali GPUs too. Give it a more generic name so we can use the bifrost/ path for something a bit more specific. For historical reasons the compiler's name is still "bifrost" and uses the prefix `bi_`. I think that's ok in the same way that i915 in the kernel supports way more than just i915. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20455>
-
- 28 Dec, 2022 3 commits
-
-
`(A & mask) | (B & ~mask)` Part-of: <!20441>
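As a quick illustration of the expression above, here is a minimal Python sketch of a bitwise select. The function name `bitselect` and the 32-bit default width are assumptions for the example, not names taken from the commit.

```python
def bitselect(a: int, b: int, mask: int, width: int = 32) -> int:
    """Take the bits of a where mask is 1 and the bits of b where mask is 0."""
    full = (1 << width) - 1
    # Python integers are unbounded, so clamp the result to the chosen width.
    return ((a & mask) | (b & ~mask)) & full


# Upper 16 bits come from a, lower 16 bits come from b.
result = bitselect(0xAAAA5555, 0x12345678, 0xFFFF0000)
```

With these inputs the upper half is `0xAAAA` and the lower half is `0x5678`.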
-
Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!20441>
-
Part-of: <!20441>
-
- 26 Dec, 2022 1 commit
-
-
Fixes: 84cd81e1 (panvk: Use common code for command buffer lifecycle management) Signed-off-by:
Hui Ni <shuizhuyuanluo@126.com> Reviewed-by:
Boris Brezillon <boris.brezillon@collabora.com> Part-of: <!20406>
-
- 25 Dec, 2022 1 commit
-
-
When importing a BO, if it is already imported, then the handle will alias an existing BO instance. It is possible for the existing owner to free the BO after the import and leave a dangling handle before we get a chance to increase the refcount, so we need to lock the BO table mutex before importing, to make sure nobody else goes through the free path during that window. Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by:
Asahi Lina <lina@asahilina.net> Part-of: <!20403>
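The locking order described above can be sketched in Python. This is an illustrative model under stated assumptions, not the driver's code: `BoTable`, `import_bo`, `unreference`, and the `fd_to_handle` callback are all hypothetical names. The point is only that the handle lookup and the refcount bump happen under the same lock as the free path.

```python
import threading


class BoTable:
    """Toy BO table mapping a kernel handle to a reference count."""

    def __init__(self):
        self.lock = threading.Lock()
        self.refcounts = {}

    def import_bo(self, fd, fd_to_handle):
        # Take the lock BEFORE resolving the handle, so nobody can free an
        # aliased BO between the import and the refcount increment.
        with self.lock:
            handle = fd_to_handle(fd)
            self.refcounts[handle] = self.refcounts.get(handle, 0) + 1
            return handle

    def unreference(self, handle):
        # The free path takes the same lock, closing the race window.
        with self.lock:
            self.refcounts[handle] -= 1
            if self.refcounts[handle] == 0:
                del self.refcounts[handle]


table = BoTable()
# Two imports that alias the same (made-up) handle 42.
first = table.import_bo(3, lambda fd: 42)
second = table.import_bo(4, lambda fd: 42)
```

If the lock were taken only after `fd_to_handle` returned, another thread could drop the last reference in between and leave the importer holding a dangling handle.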
-
- 24 Dec, 2022 6 commits
-
-
Messed up the "clang-format off" for this file. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Reported-by:
Aleksey Komarov <q4arus@ya.ru> Part-of: <!20431>
-
This switches us over to Mesa's code style [1], normalizing us within the tree. The results aren't perfect, but they bring us a hell of a lot closer to the rest of the tree. Panfrost doesn't feel so foreign relative to Mesa with this, which I think (in retrospect after a bunch of years of being "different") is the right call. I skipped PanVK because that's paused right now.

find panfrost/ -type f -name '*.h' | grep -v vulkan | xargs clang-format -i;
find panfrost/ -type f -name '*.c' | grep -v vulkan | xargs clang-format -i;
clang-format -i gallium/drivers/panfrost/*.c gallium/drivers/panfrost/*.h ;
find panfrost/ -type f -name '*.cpp' | grep -v vulkan | xargs clang-format -i

[1] https://docs.mesa3d.org/codingstyle.html

Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20425>
-
clang-format will make a mess of these otherwise. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20425>
-
Found while shuffling headers with clang-format. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20425>
-
We'll use the one in src/panfrost/.clang-format instead, which isn't identical but should be good enough. This way they don't conflict with each other. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20425>
-
Based on freedreno settings, tweaked for panfrost's foreach macros. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20425>
-
- 23 Dec, 2022 5 commits
-
-
I noticed a sequence like the following in a scheduled SuperTuxKart shader:

TEX_SINGLE.slot0 @r0:r1, ..
LD_VAR.wait0 @r2, ...
FMA r1, ...

Why do we stall waiting for the TEX_SINGLE instruction when it's not actually read? Because its upper channels are *never* read, leading to a write-after-write dependency when the register allocator puts some unrelated ALU destination in there. By appropriately masking the texture instruction's write, that false dependency disappears, avoiding the stall. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20426>
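The false dependency described above can be modelled with a short Python sketch. It assumes, purely for illustration, one channel per register; `written_regs` and `has_waw_hazard` are hypothetical helpers, not functions from the compiler.

```python
def written_regs(base, nr_channels, write_mask):
    """Registers an instruction writes, given a per-channel write mask."""
    return {base + c for c in range(nr_channels) if write_mask & (1 << c)}


def has_waw_hazard(pending_writes, new_dest):
    """A later write must wait if an in-flight instruction writes the same register."""
    return new_dest in pending_writes


# Unmasked texture write to r0:r1, followed by an ALU write to r1.
unmasked = written_regs(base=0, nr_channels=2, write_mask=0b11)
stall_before = has_waw_hazard(unmasked, new_dest=1)

# Only channel 0 is ever read, so mask the write down to r0.
masked = written_regs(base=0, nr_channels=2, write_mask=0b01)
stall_after = has_waw_hazard(masked, new_dest=1)
```

With the full mask the ALU write to r1 conflicts with the pending texture write; with the reduced mask r1 is no longer in the pending set, so there is nothing to wait for.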
-
We'll generate nontrivial ones in a moment. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20426>
-
Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20420>
-
Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20420>
-
There are so many problems with indirect draws on v7 that we never got this code path to the finish line, and none of us have a good plan (or reason) to fix this. Proper indirect draws are only possible since v10 on Mali. There was interest in using this path to implement indexed draws in PanVK, but that MR is stalled and it's not clear how much sense it makes to do Vulkan on anything older than v9 or v10 at this point. This code isn't *gone*, it'll still be in git history, but I don't see a lot of reason in keeping it in tree if it's unused and complicating e.g. the sysval upload path of the driver. Indirect dispatch remains supported on v7, as that path *is* working and flipped on for end users. Indirect dispatch on v7 is considerably less complicated than indirect draws. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20420>
-
- 17 Dec, 2022 1 commit
-
-
Add the GenXML hardware description for Mali architecture v10, as implemented in Mali-G610. This is not 100% complete but it should be good enough for parity with v9. The XML itself is forked off of v9, with all Job Managerisms replaced with CSFisms. This notably includes a large number of new structures defining the instructions that run on the Command Execution Unit (CEU). This is the first step towards supporting Mali-G610 (i.e. RK3588) upstream. Next up will be pandecode support. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <!20360>
-
- 16 Dec, 2022 6 commits
-
-
../../src/panfrost/vulkan/panvk_descriptor_set.c:67:13: error: variable 'dynoffset_idx' set but not used [-Werror,-Wunused-but-set-variable]
   unsigned dynoffset_idx = 0, img_idx = 0;
            ^
Signed-off-by:
Yonggang Luo <luoyonggang@gmail.com> Reviewed-by:
Jesse Natalie <jenatali@microsoft.com> Part-of: <mesa/mesa!19875>
-
They're too restricted for AFBC. Fix up instead. There are two problems at play:

1. We can't just map the format swizzle to the pixel format ordering on v7, because the "reordered" values aren't allowed with compression.
2. We can't just compose the format swizzle with the API swizzle, because the composed swizzle is applied to the border colour, so we need to be able to apply an inverted swizzle to the border colour. That only works for bijective format swizzles.

Fortunately, there's a neat solution: decompose the format's swizzle into two swizzles, the first mapping to a reordering that IS allowed for compression, and the second a bijection. Then we use the allowed reordering when texturing, apply the bijective swizzle to the API swizzle, and apply the inverse of the bijective swizzle to the border colour.

When we're sampling a border colour, what's now happening mathematically is:

(API swizzle o bijective swizzle)((bijective swizzle^-1)(border colour))
= (API swizzle o (bijective swizzle o bijective swizzle^-1))(border colour)
= API swizzle(border colour)

which is exactly what we wanted. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by:
Boris Brezillon <boris.brezillon@collabora.com> Part-of: <mesa/mesa!20311>
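The border colour identity above can be checked numerically with a small Python sketch. Swizzles are modelled here as tuples of source indices only (no constant 0/1 selectors), and `apply`, `compose`, and `invert` are hypothetical helpers written for this example rather than driver functions.

```python
def apply(swz, vec):
    """Apply a swizzle: output channel j reads input channel swz[j]."""
    return tuple(vec[i] for i in swz)


def compose(f, g):
    """Swizzle equivalent to applying g first, then f (that is, f o g)."""
    return tuple(g[i] for i in f)


def invert(bijection):
    """Inverse of a bijective swizzle."""
    inv = [0] * len(bijection)
    for out_channel, in_channel in enumerate(bijection):
        inv[in_channel] = out_channel
    return tuple(inv)


api_swizzle = (0, 0, 0, 3)   # arbitrary, deliberately not bijective
bijective = (1, 2, 0, 3)     # arbitrary permutation of the channels
border = (0.1, 0.2, 0.3, 0.4)

# What the hardware would see: composed swizzle on a pre-inverted border colour.
seen = apply(compose(api_swizzle, bijective), apply(invert(bijective), border))

# What the API asked for.
wanted = apply(api_swizzle, border)
```

The two results agree for any choice of `api_swizzle` as long as `bijective` really is a permutation, which is why the decomposition only needs the second swizzle to be invertible.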
-
On v6 and earlier, the hardware supports arbitrary format swizzles for AFBC, so there's no restriction on AFBC. On v8 and newer, the format swizzle gets applied to the *decompressed* interchange format, so we can effectively support BGRA of AFBC images without any special handling. (Confirmed working on v9. Obviously I can't test on v8 but the expression is cleaner if we assume optimistically it's like v9. Without hardware, we get to make that assumption :-p) That just leaves v7 as the only architecture where format swizzles are restricted for compression but there is no plane descriptor. Don't apply the restriction to the newer parts. This gets us AFBC of window surfaces on v9+. As the limiting case, fullscreen glmark2-es2-wayland -btexture (1080p) in sway on Mali-G57 goes from 1300fps to 2353fps. A 45% reduction in frame time is nothing to sneeze at. Achoo. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by:
Boris Brezillon <boris.brezillon@collabora.com> Part-of: <mesa/mesa!20311>
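For readers checking the arithmetic above, frame time is the reciprocal of the frame rate, so the relative reduction is 1 - fps_before / fps_after. A minimal Python sketch (the function name is made up for this example):

```python
def frame_time_reduction(fps_before: float, fps_after: float) -> float:
    """Fractional reduction in per-frame time when going from one fps to another."""
    time_before = 1.0 / fps_before
    time_after = 1.0 / fps_after
    return 1.0 - time_after / time_before


reduction = frame_time_reduction(1300, 2353)
percent = round(reduction * 100)
```

Going from 1300fps to 2353fps takes a frame from roughly 0.77 ms to roughly 0.42 ms, which rounds to the 45% quoted in the commit message.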
-
Introduce an enum to represent an AFBC compression mode. These modes are not formats; on Valhall they are decoupled from the format. As such, it does not make sense to use a pipe_format to represent them. Add an enum that we can use in a straightforward way on Midgard and Bifrost to fall back for texture views, and can map 1:1 to the Valhall hardware enum. In addition to being less overloaded semantically, this lets -Wswitch kick in to ensure that we handle all enums when translating. The straightforward translation raises the following warnings:

../src/panfrost/lib/pan_cs.c:437:9: warning: enumeration value ‘PAN_AFBC_MODE_R5G5B5A1’ not handled in switch [-Wswitch]
  437 | switch (panfrost_afbc_format(PAN_ARCH, format)) {
      | ^~~~~~

...indicating that some formats were missed, leading to assertion fails "unknown canonical AFBC format" when rendering RGB5A1, which dEQP-GLES31 does. Fixes regressions in dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.* on Valhall. Given how scarce v9 hardware is, that v10 isn't upstream yet, and the offending code was merged a week ago, this should not have actually affected anyone. At any rate, it's a good reminder we really do need CI for v9... Fixes: 8e125b6c ("panfrost: Enable AFBC of more formats") Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by:
Boris Brezillon <boris.brezillon@collabora.com> Part-of: <!20311>
-
The L8_UNORM, A8_UNORM, and L8A8_UNORM v7 formats do not support AFBC, regardless of swizzling. We're about to lift the restrictions on swizzling with AFBC on v7, so we'll need to handle these cases explicitly to avoid using AFBC in these cases. Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by:
Boris Brezillon <boris.brezillon@collabora.com> Part-of: <!20311>
-
When calculating legacy WSI strides for tiled AFBC, we need to account for the greater alignment requirement of tiled AFBC, or importing resources will fail later. Since tiled AFBC is only supported on v7 and later, and AFBC of window surfaces isn't being used on Linux on v7 and later, this probably hasn't been hit in practice. Probably. We're about to fix AFBC of window surfaces so we need to fix this side first. Fixes: 0255f554 ("panfrost: Advertise 16x16 tiled AFBC") Signed-off-by:
Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by:
Boris Brezillon <boris.brezillon@collabora.com> Part-of: <!20311>
-