lima: swap behaviour perhaps does not honour EGL_BUFFER_PRESERVED
System information
Distribution: Maemo Leste (Devuan Beowulf based)
# cat /etc/os-release | grep NAME
PRETTY_NAME="Devuan GNU/Linux 3 (beowulf)"
NAME="Devuan GNU/Linux"
VERSION_CODENAME=beowulf
Device: PinePhone (so no lspci shown here)
Kernel:
Linux devuan-pinephone 5.14.0-rc1 #1 SMP Fri Jul 16 09:11:17 UTC 2021 aarch64 GNU/Linux
Mesa version 21.2.5 (not distro default):
# glxinfo -B | grep "OpenGL version string"
OpenGL version string: 2.1 Mesa 21.2.5
Xorg version (not distro default):
# X -version
X.Org X Server 1.21.1.1
X Protocol Version 11, Revision 0
Current Operating System: Linux devuan-pinephone 5.14.0-rc1 #1 SMP Fri Jul 16 09:11:17 UTC 2021 aarch64
Kernel command line: console=tty0 console=ttyS0,115200 root=/dev/mmcblk0p2 rw rootwait rootfstype=ext4 fbcon=rotate:1
xorg-server 2:21.1.1-2 (https://www.debian.org/support)
Current version of pixman: 0.36.0
Before reporting problems, check http://wiki.x.org
to make sure that you have the latest version.
Window manager: hildon-desktop
Describe the issue
With compositing enabled in X, corruption appears in many applications; it usually manifests as parts of a window or buffer not being properly refreshed. We traced this to what is likely a problem in EGL swap buffer behaviour.
The test case I am using here is the Maemo terminal emulator, osso-xterm. When a simple finger swipe is used to select part of the terminal, the selected area gets rendered as black. This simple gesture is a good way to surface the problem.
hildon-desktop uses clutter 0.8, which has two ways of redrawing damaged areas: a fallback method or texture from pixmap. Both suffer from this behaviour, but the fallback method makes the problems easier to see.
clutter does not set EGL_SWAP_BEHAVIOR, but setting it doesn't seem to make much of a difference. The trace replays with the same problems on lima, but it seems to replay fine on many other drivers that we tested. (The same code also works fine on llvmpipe, the proprietary PowerVR driver, the mesa intel driver (GL_RENDERER: Mesa DRI Intel(R) UHD Graphics 620 (KBL GT2)), and the mesa amdgpu driver (GL_RENDERER: AMD RENOIR (DRM 3.40.0, 5.11.7-gentoo-dist, LLVM 13.0.0)).)
Here is a trace which shows the problem when replayed on lima but seems visually OK on other devices (when replayed): https://wizzup.org/dirlist/maemo-leste/lima/new-traces/lima-bug-report.trace
This issue doesn't just affect osso-xterm, but also firefox and many other applications when compositing is enabled (via hildon-desktop).
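To illustrate the symptom, here is a toy model of damage-only redraws on a double-buffered surface. It is a sketch under stated assumptions, not lima or clutter code: a "buffer" is just a list of cells, and the names `run`, `BLANK` and `frames` are made up for this example. It only shows why an application that repaints damaged areas without clearing depends on the back buffer still holding the previous frame.

```python
# Toy model only: a "buffer" is a list of cells, not a real EGL surface.
BLANK = "."


def run(frames, preserved, width=8):
    """Replay damage-only redraws and return what is on screen after each swap.

    frames: list of (start, end, char) damage spans (1-D, end exclusive).
    preserved: True models EGL_BUFFER_PRESERVED (the back buffer starts each
    frame as a copy of the frame just shown); False models plain two-buffer
    flipping with no copy-back.
    """
    front = [BLANK] * width
    back = [BLANK] * width
    shown = []
    for start, end, char in frames:
        # The application repaints only the damaged span and never clears.
        for i in range(start, end):
            back[i] = char
        # Swap: the back buffer becomes visible.
        front, back = back, front
        if preserved:
            # Preserved behaviour: the new back buffer holds the last frame.
            back = list(front)
        shown.append("".join(front))
    return shown


frames = [(0, 3, "A"), (3, 6, "B"), (6, 8, "C")]
print(run(frames, preserved=True))
print(run(frames, preserved=False))
```

With preserved buffers each frame builds on the previous one. Without them, every other frame is drawn on top of stale content, which resembles the alternating missing regions seen when the trace is replayed without preservation.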
Log files as attachment
dmesg:
$ dmesg | egrep '(lima|drm)'
[ 1.030154] sun4i-drm display-engine: bound 1100000.mixer (ops 0xffffffc0108d0398)
[ 1.038228] sun4i-drm display-engine: bound 1200000.mixer (ops 0xffffffc0108d0398)
[ 1.045208] sun4i-drm display-engine: No panel or bridge found... RGB output disabled
[ 1.051773] sun4i-drm display-engine: bound 1c0c000.lcd-controller (ops 0xffffffc0108cd6f8)
[ 1.059172] sun4i-drm display-engine: bound 1c0d000.lcd-controller (ops 0xffffffc0108cd6f8)
[ 1.066292] sun4i-drm display-engine: bound 1ca0000.dsi (ops 0xffffffc0108cf1f8)
[ 1.090145] sun4i-drm display-engine: bound 1ee0000.hdmi (ops 0xffffffc0108cf6d0)
[ 1.097216] [drm] Initialized sun4i-drm 1.0.0 20150629 for display-engine on minor 0
[ 6.385997] lima 1c40000.gpu: gp - mali400 version major 1 minor 1
[ 6.396171] lima 1c40000.gpu: pp0 - mali400 version major 1 minor 1
[ 6.409021] lima 1c40000.gpu: pp1 - mali400 version major 1 minor 1
[ 6.418020] lima 1c40000.gpu: l2 cache 64K, 4-way, 64byte cache line, 64bit external bus
[ 6.425369] lima 1c40000.gpu: bus rate = 200000000
[ 6.428990] lima 1c40000.gpu: mod rate = 432000000
[ 6.438909] [drm] Initialized lima 1.2.0 20200215 for 1c40000.gpu on minor 1
Screenshots/video files (if applicable)
This is not a video of the exact same trace, but it shows more or less what it ought to look like: https://wizzup.org/dirlist/maemo-leste/lima/new-traces/fallback.mp4 (made with ffmpeg using apitrace dump-images)
Any extra information would be greatly appreciated
IRC conversation:
16:28 < Wizzup> enunes: anarsoul: I figured out one of the problems that made things render poorly, With that now fixed, I have traces that renders fine on amdgpu, llvmpipe, nvidia binary drivers, but don't seem ok on the pinephone with lima (with X/glamor) for me
16:28 < Wizzup> https://wizzup.org/dirlist/maemo-leste/lima/new-traces/
16:30 < freemangordon> Wizzup: I should be registered now, could you confirm you see my messages?
16:31 < Wizzup> yep, can see
16:31 < freemangordon> ok
16:37 < Wizzup> this video is made using ffmpeg from dump-images in api trace, rendered with amdgpu (mesa): https://wizzup.org/dirlist/maemo-leste/lima/new-traces/fallback.mp4
16:38 < Wizzup> I can't get dump-images to actually work on my pinephone for some reason, but I can make a video with camera if that helps
18:11 < enunes> Wizzup: looking at hildon-desktop-doublebuf-fallback.trace it seems that the application does not glClear after eglSwapBuffers, but does not seem to use any extension like partial_update or buffer_age, do you know if that is on purpose?
18:12 < enunes> if it is hoping to reuse the frame, I think doing that is unreliable if not undefined
18:17 < Wizzup> I think EGL_BUFFER_PRESERVED was set in that trace (if not, I will have to re-check, but iirc the result was the same) - that sets the buffer age does it not?
18:18 < enunes> I cant find it searching by EGL_BUFFER_PRESERVED
18:18 < Wizzup> Yeah it should be an eglSurfaceAttrib
18:19 < Wizzup> I'll make a new trace, but I am pretty confident the outcome will be the same since I had that change locally earlier today and it didn't help
18:19 < enunes> I would doublecheck if apitrace actually implements replaying EGL_BUFFER_PRESERVED
18:20 < Wizzup> ok
18:20 < Wizzup> what I meant is that with eglSurfaceAttrib in my code on the pinephone the problem seemed to persist
18:25 < enunes> ok, so playing the trace on my intel system without EGL_BUFFER_PRESERVED does result in glitches, which seem to make sense since I get alternating blocks of black with selection as its not reusing the buffers properly
18:26 < enunes> I think EGL_BUFFER_PRESERVED is not too common so if the application relies on that, it seems fishy
18:27 < Wizzup> freemangordon might be in a better position to comment
18:28 < Wizzup> The application is using clutter, for what it is worth
18:35 < Wizzup> looks like eglSurfaceAttrib setting the EGL_SWAP_BEHAVIOUR to EGL_BUFFER_PRESERVED returns EGL_FALSE
18:36 < Wizzup> (at least in qapitrace)
18:42 < Wizzup> Rading https://lists.freedesktop.org/archives/mesa-dev/2018-July/199332.html (from https://gitlab.freedesktop.org/lima/mesa/-/issues/59) it looks lima specifically doesn't support the buffer age, if I understand correctly
18:51 < freemangordon> apitrace doesn;t seems to support EGL_SWAP_BEHAVIOUR
18:53 < freemangordon> enunes: I don;t understand what is fishy about EGL_SWAP_BEHAVIOUR, could you elaborate?
18:54 < freemangordon> https://www.khronos.org/registry/EGL/sdk/docs/man/html/eglSurfaceAttrib.xhtml
18:55 < freemangordon> BTW I have intel around, I will replay on it as well to see what the result will be
18:55 < enunes> I mean its not used by many applications so it is probably less tested and this is some place where there might be some bug. also if apitrace doesnt even handle it properly, it might be making your debug more difficult
18:57 < freemangordon> enunes: I see. Well, default behaviour is implementation specific, but it seems most if not all choose EGL_BUFFER_PRESERVED, at least by judging of replay on AMD/NV/llvmpipe and on-device behaviour on PVR
18:58 < enunes> it is also just not a very good idea to use it for something that targets the mali anyway, there is a writeup at https://community.arm.com/arm-community-blogs/b/graphics-gaming-and-vr-blog/posts/mali-performance-3-is-egl_5f00_buffer_5f00_preserved-a-good-thing
18:58 < freemangordon> maybe it makes sense to read what lima thinks about the default value
19:00 < freemangordon> enunes: still, there seems to be some issue in lima driver
19:01 < enunes> yes there are probably still issues in the driver :)
19:01 < freemangordon> :)
19:02 < freemangordon> so, clutter seems to prefer partial updates and it relies on buffers being preserved
19:03 < freemangordon> I guess we can tell it to do full scene update, but that would affect the performance on devices with slower GPUS (like d4 and friends and esp n900 with its dated sgx530)
19:03 < enunes> I think EGL_SWAP_BEHAVIOUR is a different thing than partial updates, if you do partial updates you should query the buffer at the beginning of the frame iirc
19:04 < freemangordon> not really, there is a note in buffer_age extension about EGL_SWAP_BEHAVIOUR
19:05 < freemangordon> if I read the docs correctly that is
19:05 < freemangordon> :)
19:05 < freemangordon> https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_buffer_age.txt
19:06 < freemangordon> EGL_BUFFER_PRESERVED gives 2 frames age, and this is supported, like incremental damage for the last 2 frames
19:06 < freemangordon> that's why I said "partial updates"
19:07 < freemangordon> by "last frames" I mean current and previous frame
19:08 < enunes> right but those are different things, what I understand is you do either buffer_age/partial_update (better) or BUFFER_PRESERVED, the blurb in the docs is more to say that it wont break in case you happen to try to use both
19:09 < enunes> in the trace I got there was none of them
19:09 < freemangordon> sure those are different what i was trying to say is that if buffer_age is not used, but BUFFER_PRESERVED, you are guaranteed to always have n-1 age :)
19:09 < freemangordon> sure, but we set BUFFER_PRESERVED to no use
19:10 < freemangordon> ofc we can provide traces with that call included
19:10 < freemangordon> but we though it doesn;t make sense to do it, as it changes nothing on device
19:10 < freemangordon> *thought
19:11 < enunes> it can be that its not the main issue, but I think it should be fixed before proceeding with further analysis
19:11 < enunes> otherwise we are basically debugging an invalid application
19:12 < enunes> with the current state it seems that just setting BUFFER_PRESERVED is the straightforward thing to do and even if its not the best practice, I agree that should work
19:12 < freemangordon> enunes: I think we shall first check what is the default value reported
19:12 < enunes> if we suspect BUFFER_PRESERVED is broken with lima maybe it can be validated with a simple triangle app or something
19:13 < freemangordon> kids are shouting at me that I am on the PC, ttyl :)
19:13 < enunes> well the default shouldnt matter if you rely on it being BUFFER_PRESERVED
20:11 < anarsoul> freemangordon: it's handled by winsys, not lima
20:12 < anarsoul> lima as backend driver doesn't really care if you want to preserve buffers or not. If frontend doesn't call clear explicitly, it will reload the buffer into tile buffer
20:13 < anarsoul> btw that's why using preserved buffers is expensive on tiling GPUs, if you don't do clear, you have to essentially draw a textured quad to restore contents of old buffer
20:14 < anarsoul> well, bad wording, it's not you that have to draw it, lima does it for you
20:14 < freemangordon> :nod:
20:15 < freemangordon> I understand it might not be the best performance wise in some ocasions
20:15 < anarsoul> freemangordon: actually in most cases it's a bad idea, especially if you don't use scissors
20:15 < freemangordon> we use
20:15 < anarsoul> OK, good
20:16 < freemangordon> and damage tracking
20:16 < freemangordon> well, 'we' is clutter
20:17 < freemangordon> but it either uses viewport(not used) or scissors to limit the updated area
20:17 < freemangordon> the point is that on all the GPUs we tested (we did not on intel), the trace renders correctly
20:17 < freemangordon> I will test on intel later on
20:18 < anarsoul> freemangordon: other GPUs may not require explicit tile reload
20:19 < freemangordon> well, we use the same code on sgx 530 and sgx 540
20:19 < freemangordon> both are tiling GPUs
20:19 < anarsoul> freemangordon: sorry, no idea how these work :)
20:19 < freemangordon> tiling
20:19 < anarsoul> freemangordon: it doesn't mean that they need to reload tile buffer explicitly
20:19 < anarsoul> there may be an implicit reload if driver doesn't send clear command
20:19 < anarsoul> or something like that
20:19 < freemangordon> won;t argue, I have no idea how those work too
20:20 < anarsoul> try with panfrost if you have any devices with newer Mali, it does explicit reload for sure :)
20:20 < freemangordon> anarsoul: well, the point is to have lete working with lima on pinephone
20:20 < freemangordon> *leste
20:21 < anarsoul> well, get us an apitrace that doesn't work on lima and works on intel (IIRC enunes' laptop uses Intel GPU, mine is also Intel)
20:22 < freemangordon> ok
20:22 < anarsoul> and someone will look into it eventually
20:22 < freemangordon> I have intel, will see what we can do :)
20:23 < anarsoul> freemangordon: tbh I haven't heard about issues like that with lima yet, so there must be something special about your compositor
20:24 < anarsoul> gnome-shell (X11 and wayland), plasma (X11 and wayland), xcompmgr (X11), sway (wayland), weston (wayland) - all work just fine on lima
20:25 < anarsoul> if you know what it is, that may help to pinpoint the issue :)
20:35 < freemangordon> yeah, I think we'll be able to identify what's going on
21:09 < freemangordon> anarsoul: is "GL_RENDERER: Mesa DRI Intel(R) UHD Graphics 620 (KBL GT2)" good enough?
21:10 < freemangordon> https://wizzup.org/dirlist/maemo-leste/lima/new-traces/hildon-desktop-doublebuf-fallback.trace replays just fine on that
21:13 < anarsoul> freemangordon: yep
21:14 < freemangordon> two other traces render just fine too
21:14 < freemangordon> the one with EGL_SWAP_BEHAVIOUR gives "invalide attrinute" warning or somesuch
21:24 < anarsoul> freemangordon: can you open a bug at https://gitlab.freedesktop.org/mesa/mesa/-/issues and attach the trace there?
21:25 < freemangordon> sure
21:26 < freemangordon> Wizzup: ^^^
21:47 < freemangordon> Wizzup: I guess it would be good if you can capture a video of what really happens on the device
22:10 < Wizzup> will do, tomorrow
22:10 < Wizzup> ty!
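For reference, the buffer_age approach mentioned in the log can be sketched as follows. This is a simplified model of how a client might use the age reported by EGL_EXT_buffer_age to decide what to repaint; `repaint_region` and its arguments are hypothetical names, and integer cells stand in for pixels. It is not taken from clutter, hildon-desktop or Mesa.

```python
def repaint_region(buffer_age, damage_history, full):
    """Return the set of cells to repaint for the current frame.

    buffer_age: 0 means the back buffer contents are undefined; n > 0 means
    the buffer holds the frame presented n swaps ago.
    damage_history: per-frame damage sets, oldest first; the last entry is
    the damage of the frame being drawn now.
    full: set of all cells on the surface.
    """
    # Unknown contents, or not enough recorded history: repaint everything.
    if buffer_age == 0 or buffer_age > len(damage_history):
        return set(full)
    # Otherwise repaint the union of the damage from the last `buffer_age`
    # frames, which brings the old buffer contents up to date.
    region = set()
    for damage in damage_history[-buffer_age:]:
        region |= damage
    return region


full = set(range(8))
history = [{0, 1}, {3, 4}, {6}]
print(sorted(repaint_region(1, history, full)))
print(sorted(repaint_region(2, history, full)))
```

An age of 1 means only the current damage needs repainting, while an age of 2 also requires the previous frame's damage. A client that repaints only the current damage is implicitly assuming an age of 1 on every frame.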