Assertion `!xcb_xlib_threads_sequence_lost' failed when glXCreateContextAttribsARB fails
The attached program crashes with libX11:
got error 2 req 152
[xcb] Unknown sequence number while processing queue
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
test: ../../src/xcb_io.c:269: poll_for_event: Assertion `!xcb_xlib_threads_sequence_lost' failed.
Annullato (Aborted)
It is the same assertion that fails in #141 (closed), but I am not sure if the underlying problem is the same, so I am filing another issue.
I can reproduce the crash with the libX11 package shipped with Debian unstable on my laptop, with the libX11 package shipped with Arch on a Steam Deck (the libX11 is unmodified from upstream Arch), and with libX11 built from git master on the Steam Deck. I discovered the bug while debugging a Steam game (The Last Campfire) running under Proton (basically a patched version of Wine) on the Steam Deck.
What's happening in the test program
The test program basically creates an X11 display and a window and then tries to create a GL context with glXCreateContextAttribsARB. However, notice that the attributes for that call are invalid, so an error has to be raised (for some reason I don't know, the game tries to use an invalid GLX_CONTEXT_FLAGS_ARB, but I think that anything that makes glXCreateContextAttribsARB return an error would cause the same problem). Just before calling glXCreateContextAttribsARB, the program maps the window and configures event delivery so that an Expose event arrives during the short sleep.
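Since the program is only described as an attachment here, a minimal sketch of what such a reproducer could look like (the attribute values, window setup and error-handler output are my reconstruction, not the exact attached program):

```c
/* Sketch of a reproducer along the lines described above; error checking
 * omitted for brevity. Build with: gcc test.c -lX11 -lGL */
#include <stdio.h>
#include <unistd.h>
#include <X11/Xlib.h>
#include <GL/glx.h>

static int ignore_error(Display *dpy, XErrorEvent *err)
{
    /* The test program ignores the error (here it only logs it). */
    fprintf(stderr, "got error %d req %d\n", err->error_code, err->request_code);
    return 0;
}

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);
    int fbcount;
    GLXFBConfig *fbc = glXChooseFBConfig(dpy, DefaultScreen(dpy), NULL, &fbcount);
    XVisualInfo *vi = glXGetVisualFromFBConfig(dpy, fbc[0]);

    XSetWindowAttributes swa = {
        .colormap = XCreateColormap(dpy, DefaultRootWindow(dpy), vi->visual, AllocNone),
        .border_pixel = 0,
        .event_mask = ExposureMask,        /* so an Expose event gets queued */
    };
    Window win = XCreateWindow(dpy, DefaultRootWindow(dpy), 0, 0, 100, 100, 0,
                               vi->depth, InputOutput, vi->visual,
                               CWColormap | CWBorderPixel | CWEventMask, &swa);

    XSetErrorHandler(ignore_error);
    XMapWindow(dpy, win);
    sleep(1);                              /* the Expose event arrives during this sleep */

    /* Undefined GLX_CONTEXT_FLAGS_ARB bits, so the request fails with an X error. */
    int attribs[] = { GLX_CONTEXT_FLAGS_ARB, 0xffff, None };
    PFNGLXCREATECONTEXTATTRIBSARBPROC create_context_attribs =
        (PFNGLXCREATECONTEXTATTRIBSARBPROC)
        glXGetProcAddress((const GLubyte *)"glXCreateContextAttribsARB");
    create_context_attribs(dpy, fbc[0], NULL, True, attribs);

    XSync(dpy, False);                     /* processing the queued Expose event aborts here */
    return 0;
}
```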
So, at the time glXCreateContextAttribsARB is called the X client has received an event, and it receives an error immediately after. The glXCreateContextAttribsARB implementation calls __glXSendErrorForXcb, which in turn calls _XError, which calls _XSetLastRequestRead, which sets last_request_read for the connection to the sequence number of the error. The libX11 error callback defined by the test program is then called, and the error is ignored.
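To make that ordering concrete, here is a toy model of the effect on the connection state; the struct and field names are stand-ins invented for illustration, and only the sequence-number bookkeeping mirrors the description above:

```c
/* Toy model of the connection state after the error is processed; not
 * libX11 code, just an illustration of the ordering described above. */
#include <stdint.h>
#include <stdio.h>

struct toy_display {
    uint64_t last_request_read;   /* stand-in for the Display's field */
};

int main(void)
{
    struct toy_display dpy = { .last_request_read = 0 };

    uint64_t expose_sequence = 10;  /* the Expose event, still sitting in the queue */
    uint64_t error_sequence = 12;   /* the failed CreateContextAttribsARB request */

    /* __glXSendErrorForXcb -> _XError -> _XSetLastRequestRead runs first,
     * so last_request_read jumps past the queued event's sequence. */
    dpy.last_request_read = error_sequence;

    printf("last_request_read = %llu, queued event sequence = %llu\n",
           (unsigned long long)dpy.last_request_read,
           (unsigned long long)expose_sequence);
    /* When the event is finally processed, widen() will see a reference
     * that is already ahead of it (see the next sketch). */
    return 0;
}
```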
Later, within the XSync call, pending events are finally processed. When the Expose event is processed, widen is called, using last_request_read as the reference request number to reconstruct the upper dword. But last_request_read has already taken the sequence number of the error, so it is greater than the event's request number, which means that the event's sequence number will incorrectly be increased by 2^32, triggering the failing assertion.
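For reference, the widen logic is roughly the following (an approximate paraphrase of xcb_io.c, possibly differing in detail from the actual source), together with numbers matching the toy model above:

```c
/* Approximate paraphrase of widen() from libX11's xcb_io.c plus a demo of
 * the inversion; the exact source may differ slightly. */
#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>

/* Replace the low 32 bits of the reference with the narrow sequence; if the
 * result went backwards, assume the 32-bit counter wrapped and add 2^32. */
static uint64_t widen(uint64_t wide, unsigned int narrow)
{
    uint64_t new = (wide & ~((uint64_t)0xFFFFFFFFUL)) | narrow;
    if (new < wide)
        new += ((uint64_t)1) << 32;
    return new;
}

int main(void)
{
    uint64_t last_request_read = 12;    /* already advanced to the error's sequence */
    unsigned int expose_sequence = 10;  /* the queued event's 32-bit sequence */

    uint64_t widened = widen(last_request_read, expose_sequence);
    printf("widened Expose sequence: %" PRIu64 "\n", widened); /* 4294967306, not 10 */

    /* poll_for_event() then sees a sequence far beyond anything the client has
     * sent and fails the `!xcb_xlib_threads_sequence_lost' assertion. */
    return 0;
}
```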
I am not sure which is the weak link in the chain here: clearly libX11 assumes that last_request_read is always smaller than the request number of the event it is processing, but calling _XError from Mesa violates this assumption. So either a different widening algorithm should be used that does not assume that last_request_read is smaller than the event's request number (e.g., it could add 2^32 only if the sequence number would otherwise end up at least 2^31 smaller than last_request_read), or a different reference number should be used for widening, or Mesa should avoid calling _XError (but I don't know what it should do instead).
The first option seems to be the easiest to implement, but I don't know whether calling _XError from Mesa breaks other assumptions about last_request_read.
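For completeness, a sketch of what the first option could look like, using the 2^31 heuristic mentioned above (an untested idea of mine, not a proposed patch):

```c
/* Sketch of the first option: only assume the 32-bit sequence counter wrapped
 * when the apparent backwards jump is large (>= 2^31), so an event slightly
 * older than last_request_read keeps its small sequence number. Untested. */
#include <stdint.h>

static uint64_t widen_tolerant(uint64_t wide, unsigned int narrow)
{
    uint64_t new = (wide & ~((uint64_t)0xFFFFFFFFUL)) | narrow;
    /* Current logic: if (new < wide) add 2^32. With the extra check below,
     * widen_tolerant(12, 10) stays 10 instead of becoming 10 + 2^32. */
    if (new < wide && (wide - new) >= (((uint64_t)1) << 31))
        new += ((uint64_t)1) << 32;
    return new;
}
```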