Glamor commits lead to Xorg freezing when using H264 Fluendo codecs
Description
When using the modesetting driver on Intel devices (only Valleyview and Baytrail have been tested so far) and starting playback of an H264 video with the Parole media player, Xorg freezes. The mouse pointer can still be moved and still changes its shape, but all other screen contents are frozen. This happens with the Fluendo GStreamer acceleration, and it happens immediately. Because the Fluendo codecs are not easy to obtain, and as a result of other tests, a much simpler way to trigger the freeze was found: disable DRI3 usage in GLAMOR and run glxinfo.
How to reproduce
The simplest way to reproduce the issue is to build a patched Xorg server; the freeze then occurs as soon as you run glxinfo. In glamor/glamor_egl.c, change the following function:
void
glamor_egl_screen_init(ScreenPtr screen, struct glamor_context *glamor_ctx)
{
    ScrnInfoPtr scrn = xf86ScreenToScrn(screen);
    struct glamor_egl_screen_private *glamor_egl =
        glamor_egl_get_screen_private(scrn);
#ifdef DRI3
    glamor_screen_private *glamor_priv = glamor_get_screen_private(screen);
#endif
    glamor_egl->saved_close_screen = screen->CloseScreen;
    screen->CloseScreen = glamor_egl_close_screen;
    glamor_egl->saved_destroy_pixmap = screen->DestroyPixmap;
    screen->DestroyPixmap = glamor_egl_destroy_pixmap;
    glamor_ctx->ctx = glamor_egl->context;
    glamor_ctx->display = glamor_egl->display;
    glamor_ctx->make_current = glamor_egl_make_current;
#ifdef DRI3
    /* Tell the core that we have the interfaces for import/export
     * of pixmaps.
     */
    glamor_enable_dri3(screen);
    /* If the driver wants to do its own auth dance (e.g. Xwayland
     * on pre-3.15 kernels that don't have render nodes and thus
     * has the wayland compositor as a master), then it needs us
     * to stay out of the way and let it init DRI3 on its own.
     */
    if (!(glamor_priv->flags & GLAMOR_NO_DRI3)) {
        /* To do DRI3 device FD generation, we need to open a new fd
         * to the same device we were handed in originally.
         */
        glamor_egl->device_path = drmGetDeviceNameFromFd2(glamor_egl->fd);
        if (!dri3_screen_init(screen, &glamor_dri3_info)) {
            xf86DrvMsg(scrn->scrnIndex, X_ERROR,
                       "Failed to initialize DRI3.\n");
        }
    }
#endif
}
to
void
glamor_egl_screen_init(ScreenPtr screen, struct glamor_context *glamor_ctx)
{
    ScrnInfoPtr scrn = xf86ScreenToScrn(screen);
    struct glamor_egl_screen_private *glamor_egl =
        glamor_egl_get_screen_private(scrn);
#ifdef DRI3
    glamor_screen_private *glamor_priv = glamor_get_screen_private(screen);
#endif
    glamor_egl->saved_close_screen = screen->CloseScreen;
    screen->CloseScreen = glamor_egl_close_screen;
    glamor_egl->saved_destroy_pixmap = screen->DestroyPixmap;
    screen->DestroyPixmap = glamor_egl_destroy_pixmap;
    glamor_ctx->ctx = glamor_egl->context;
    glamor_ctx->display = glamor_egl->display;
    glamor_ctx->make_current = glamor_egl_make_current;
#if 0
    /* Tell the core that we have the interfaces for import/export
     * of pixmaps.
     */
    glamor_enable_dri3(screen);
    /* If the driver wants to do its own auth dance (e.g. Xwayland
     * on pre-3.15 kernels that don't have render nodes and thus
     * has the wayland compositor as a master), then it needs us
     * to stay out of the way and let it init DRI3 on its own.
     */
    if (!(glamor_priv->flags & GLAMOR_NO_DRI3)) {
        /* To do DRI3 device FD generation, we need to open a new fd
         * to the same device we were handed in originally.
         */
        glamor_egl->device_path = drmGetDeviceNameFromFd2(glamor_egl->fd);
        if (!dri3_screen_init(screen, &glamor_dri3_info)) {
            xf86DrvMsg(scrn->scrnIndex, X_ERROR,
                       "Failed to initialize DRI3.\n");
        }
    }
#endif
}
With this change, Xorg freezes after glxinfo is called. I do not know what the Fluendo codecs do to trigger the same issue, as they are binary only. I'm sure you can also set the flags to GLAMOR_NO_DRI3 to get the same problem, but this is not really important here.
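Going only by the function body quoted above, the "#if 0" patch and the GLAMOR_NO_DRI3 flag do not skip exactly the same steps. The following is a minimal model of that control flow, not server code: it assumes DRI3 is defined at build time, and the names SKETCH_GLAMOR_NO_DRI3, struct dri3_steps, steps_unpatched and steps_patched are hypothetical (the real flag value lives in the glamor headers). Whether the difference matters for the freeze has not been tested.

```c
#include <stdbool.h>

/* Hypothetical stand-in bit; the real GLAMOR_NO_DRI3 value is defined
 * in the glamor headers and may differ. */
#define SKETCH_GLAMOR_NO_DRI3 (1u << 1)

/* Records which of the two DRI3-related calls would be reached. */
struct dri3_steps {
    bool enable_dri3_called;      /* models glamor_enable_dri3(screen) */
    bool dri3_screen_init_called; /* models dri3_screen_init(screen, ...) */
};

/* Models the unpatched function body (with DRI3 defined): the enable
 * call is unconditional, only the screen init is gated by the flag. */
static struct dri3_steps
steps_unpatched(unsigned int flags)
{
    struct dri3_steps s = { false, false };

    s.enable_dri3_called = true;
    if (!(flags & SKETCH_GLAMOR_NO_DRI3))
        s.dri3_screen_init_called = true;
    return s;
}

/* Models the "#if 0" patch: the whole block is compiled out, so
 * neither call is reached. */
static struct dri3_steps
steps_patched(void)
{
    struct dri3_steps s = { false, false };

    return s;
}
```

In this model, setting the flag still leaves the glamor_enable_dri3 step in place, while the patch removes both steps, so the two ways of disabling DRI3 may not be fully equivalent.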
Analysis
The issue only happens with Xorg 1.20, and there only since rc4. We therefore went through the commits between rc3 and rc4 and found that reverting the following two commits resolves the issue:
glamor: Reallocate pixmap storage without modifiers if necessary
glamor: Push make_exportable into callers
(Reverting "glamor: Push make_exportable into callers" is not strictly needed, but the two commits seem to be related.) With both reverted, Xorg no longer freezes. The Mesa versions used were 19.0.5 and 19.1.4, if this helps.
I'm not sure what exactly is happening here, but the code where the freeze happens is
    if (pixmap_priv->image &&
        (modifiers_ok || !pixmap_priv->used_modifiers))
        return TRUE;
which is in the glamor_make_pixmap_exportable function in glamor/glamor_egl.c.
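To make it easier to reason about when this early return is taken, here is a small standalone model of the condition. It is only a sketch: struct sketch_pixmap_priv and sketch_already_exportable are hypothetical names, and the struct keeps just the two fields that appear in the quoted check, not the real glamor pixmap private data.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the per-pixmap private data;
 * only the two fields used by the quoted condition are modelled. */
struct sketch_pixmap_priv {
    void *image;         /* non-NULL once an image has been created */
    bool used_modifiers; /* storage was allocated with modifiers */
};

/* Mirrors the quoted check: returns true when the function would take
 * the early "return TRUE" path, i.e. an image already exists and either
 * the caller accepts modifiers or none were used for the storage. */
static bool
sketch_already_exportable(const struct sketch_pixmap_priv *priv,
                          bool modifiers_ok)
{
    return priv->image != NULL &&
           (modifiers_ok || !priv->used_modifiers);
}
```

The only case in which a pixmap that already has an image does not take the early return is modifiers_ok being false while used_modifiers is true. Judging by the commit title, that is presumably the case the reallocation commit was meant to handle, but I have not confirmed this.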
Summary
Reverting the two commits mentioned above seems to help, and so far no negative side effects have appeared (more thorough testing is still needed). However, as I do not know what the reasons for the two commits were, I'm not happy with the reverts. It would be preferable if someone with more knowledge of this code could look into it and explain what is going wrong and what the correct fix should look like.