Erratic and slow glDrawBuffer and GLX_EXT_buffer_age on Intel IVB X11
System information
System 1 - Exhibits the problem.
- OS: Arch Linux
- GPU: 00:02.0 VGA compatible controller [0300]: Intel Corporation 3rd Gen Core processor Graphics Controller [8086:0166] (rev 09)
- Kernel version: 5.16.8 x86_64
- Mesa: 21.3.5
- Xserver: 1.21.1.3
- Desktop manager and compositor: SDDM, KDE, Kwin 5.24.0
System 2 - Does not exhibit the problem.
- OS: Arch Linux
- GPU: 03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Lucienne [1002:164c] (rev c2)
- Kernel version: 5.16.7-zen x86_64
- Mesa: 21.3.6
- Xserver: 1.21.1.3
- Desktop manager and compositor: SDDM, KDE, Kwin 5.24.1
Describe the issue
-
The primary issue is that on an Intel Ivy Bridge laptop, under certain workloads, the value of
buffer_age
returned byglXQueryDrawable(dpy, drawable, GLX_BACK_BUFFER_AGE_EXT, &buffer_age);
is erratic, in a way to be detailed later, and sometimes very large - usually at least 6, but can reach over 200. If the workload is an OpenGL-using GTK app, then this triggers slow paths inside GTK, which likes to have 1 <=
buffer_age
<= 3. By comparison, a second system I tested returned 2 consistently, and only occasionally 4. -
The second, related issue is that on the above-mentioned slow-path, the function
glDrawBuffer(GL_FRONT);
is called, and takes an enormous amount of time - sometimes nearly 8 ms! This wreaks havoc with the app, turning it into a juddering mess. Again, by comparison, the second system always ran this function instantaneously.
To give more detail, the GTK app above is a development snapshot of Inkscape that has OpenGL support and a built-in benchmark function. I made a video of the benchmark, showing the values of buffer_age
obtained by GTK each frame, and the time taken to run glDrawBuffer
if it was run.
https://drive.google.com/file/d/18kWeRZUmNOdaemO5NMqdnVkZuxSQNP8x/view
Here's a brief summary of what happens:
- Initially the Inkscape window is focused, and the buffer age is stuck at 34.
- I focus a terminal, and the buffer age sticks at 1.
- I move the mouse back to Inkscape and the buffer age jumps up to 234.
- I focus Inkscape, and the buffer age goes to 1 for about 13 frames then jumps to 13.
- The same thing then continues to happen whenever the focused window is changed, except with different values: 4, 13, 5, 6, 20, 5, 20. No more changes in response purely to mouse position are generated.
- All the while,
glDrawBuffer
was taking around 8 ms to run, except whenbuffer age
briefly dropped to 1 and it wasn't being called. - Disabling kwin's compositor doesn't affect the issue. Neither does focusing and full-screening Inkscape.
On a second more recent system, the same steps resulted in almost perfectly consistent values of 2. After enough testing, I managed to get it to stick at 4, and it didn't go back to 2 again until I changed the focused window by clicking. While it was at 4, glDrawBuffer
was returning instantaneously.
Reproducibility
The problem is 100% reproducible on System 1, but never occurs on System 2. On System 1, the only random element is the sequence of values taken by buffer_age
, but values or 5-6 are typical. Lower values are very rare, but it can stick at 1 for periods of time before jumping back up by itself. Switching from X11 to Wayland, the issue goes away, and performance is much better. (Specifically, Inkscape's redraw performance is 2x better because it is no longer spending half its time waiting for glDrawBuffer
.)
Other notes
As an API user (not a developer!), I am shocked that glDrawBuffer
takes any time at all, because it looks like a function that merely manipulates GL state rather than a function that actually does anything.
I am also puzzled as to why the buffer age is so large. You would think that the driver would attempt to minimise this value, so while it could be expected to jump up during periods of heavy load, it should settle down to 1-3 afterwards, especially if the window is focused and full-screen. If I've understood things correctly, a constant value of 234 for the buffer age suggests that the driver is maintaining a set of 234 buffers in a ring - which seems ridiculous! So is the value reported wrong, or is something like this actually going on?
Finally, I admittedly don't know enough to tell whether this is a bug in Mesa or the expected behaviour. The reason I think it's a bug is because 1) it doesn't happen on another system, and 2) when it happens the performance of the app goes through the floor. Still, if it does turn out this is expected behaviour, then what is triggering it, and how can apps avoid it from happening?
Extra information
-
The Inkscape development snapshot is pre-built in AppImage form at
https://gitlab.com/pbs3141/inkscape/-/jobs/2103043169
To access the benchmark, turn on Preferences / Rendering / Developer Mode / Animate.
-
It must be used with the following patched GTK in
LD_PRELOAD
, since it fixes various slowdowns which would otherwise mask the issue:https://gitlab.gnome.org/pbs3141/gtk/-/commit/2ce10d1999cdfa7c54247604d7f8c80f0afe8e98
It additionally prints
buffer_age
to the console andglDrawBuffer
timings to a file/tmp/framecheck.txt
. -
The timeline viewer at the top of the screen is optional and can be found here: