Crash when winding down the stream
After porting Mutter to PipeWire 0.3, I noticed that Mutter would crash randomly during or after streaming, but never before. Interestingly, it left nothing useful behind: no backtraces, no journal entries. GNOME Shell would simply say bye and leave.
After a lot of experimentation, I finally managed to get a reasonable backtrace.
Thread 19 "gnome-shell" received signal SIG32, Real-time event 32.
[Switching to Thread 0x7f1f60fa9700 (LWP 5264)]
0x00007f1f9ce4670e in epoll_wait () from /usr/lib/libc.so.6
#0 0x00007f1f9ce4670e in epoll_wait () at /usr/lib/libc.so.6
#1 0x00007f1f90572a80 in impl_pollfd_wait (object=<optimized out>, pfd=<optimized out>, ev=0x7f1f60fa8760, n_ev=<optimized out>, timeout=<optimized out>) at ../spa/plugins/support/system.c:150
#2 0x00007f1f90571892 in loop_iterate (object=0x558bc3c55fb8, timeout=-1) at ../spa/plugins/support/loop.c:284
#3 0x00007f1f99fd1c64 in do_loop (user_data=0x558bc64dfa10) at ../src/pipewire/data-loop.c:77
#4 0x00007f1f9c18a46f in start_thread () at /usr/lib/libpthread.so.0
#5 0x00007f1f9ce463d3 in clone () at /usr/lib/libc.so.6
It's a hairy behavior to describe. Apparently, what's happening is that PipeWire eventually cancels a thread using pthread_cancel(). This seems to be called by pw_data_loop_destroy(). This is supported by the backtrace of thread 1 of gnome-shell at the time it crashes:
Thread 1 (Thread 0x7f1f98c5cc00 (LWP 3029)):
#0 0x00007f1f9c18ba67 in __pthread_clockjoin_ex () at /usr/lib/libpthread.so.0
#1 0x00007f1f99fd201e in pw_data_loop_stop (loop=loop@entry=0x558bc64dfa10) at ../src/pipewire/data-loop.c:209
#2 0x00007f1f99fd21f0 in pw_data_loop_destroy (loop=0x558bc64dfa10) at ../src/pipewire/data-loop.c:147
#3 0x00007f1f99fcd9a5 in pw_context_destroy (context=0x558bc8199c00) at ../src/pipewire/context.c:365
#4 0x00007f1f9d05b4df in meta_screen_cast_stream_src_finalize (object=0x7f1f8c009b90) at ../src/backends/meta-screen-cast-stream-src.c:984
#5 0x00007f1f9dd0ec21 in g_object_unref () at /usr/lib/libgobject-2.0.so.0
#6 0x00007f1f9d0559e1 in meta_screen_cast_stream_close (stream=0x558bc6722e90) at ../src/backends/meta-screen-cast-stream.c:153
#7 0x00007f1f9d055d3a in meta_screen_cast_stream_finalize (object=0x558bc6722e90) at ../src/backends/meta-screen-cast-stream.c:250
#8 0x00007f1f9d051787 in meta_screen_cast_monitor_stream_finalize (object=0x558bc6722e90) at ../src/backends/meta-screen-cast-monitor-stream.c:254
#9 0x00007f1f9dd0ec21 in g_object_unref () at /usr/lib/libgobject-2.0.so.0
#10 0x00007f1f9dc1fbf8 in g_list_foreach () at /usr/lib/libglib-2.0.so.0
#11 0x00007f1f9dc210fc in g_list_free_full () at /usr/lib/libglib-2.0.so.0
#12 0x00007f1f9d0548c1 in meta_screen_cast_session_close (session=0x558bc84505a0) at ../src/backends/meta-screen-cast-session.c:126
#13 0x00007f1f9d054c31 in handle_stop (skeleton=0x558bc84505a0, invocation=0x558bc89a3400) at ../src/backends/meta-screen-cast-session.c:270
#14 0x00007f1f9c17c69a in ffi_call_unix64 () at /usr/lib/libffi.so.6
#15 0x00007f1f9c17bfb6 in ffi_call () at /usr/lib/libffi.so.6
#16 0x00007f1f9dd1659e in g_cclosure_marshal_generic () at /usr/lib/libgobject-2.0.so.0
#17 0x00007f1f9dd1761a in g_closure_invoke () at /usr/lib/libgobject-2.0.so.0
#18 0x00007f1f9dcf80e8 in () at /usr/lib/libgobject-2.0.so.0
#19 0x00007f1f9dcfd8c8 in g_signal_emitv () at /usr/lib/libgobject-2.0.so.0
#20 0x00007f1f9cf7ab94 in _meta_dbus_screen_cast_session_skeleton_handle_method_call (connection=0x558bc397c670, sender=0x7f1f840aa7e0 ":1.97", object_path=0x558bc51026d0 "/org/gnome/Mutter/ScreenCast/Session/u1", interface_name=0x558bc45c7650 "org.gnome.Mutter.ScreenCast.Session", method_name=0x7f1f840400f0 "Stop", parameters=0x7f1f8431a2c0, invocation=0x558bc89a3400, user_data=0x558bc84505a0) at src/meta-dbus-screen-cast.c:2647
#21 0x00007f1f9dd8f6d6 in () at /usr/lib/libgio-2.0.so.0
#22 0x00007f1f9ddad3cf in () at /usr/lib/libgio-2.0.so.0
#23 0x00007f1f9dc1c88f in g_main_context_dispatch () at /usr/lib/libglib-2.0.so.0
#24 0x00007f1f9dc1e831 in () at /usr/lib/libglib-2.0.so.0
#25 0x00007f1f9dc1f843 in g_main_loop_run () at /usr/lib/libglib-2.0.so.0
#26 0x00007f1f9cff8ba4 in meta_run () at ../src/core/main.c:676
#27 0x0000558bc1a01342 in main (argc=1, argv=0x7ffed364f1b8) at ../src/main.c:552
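To make the pattern suggested by the two backtraces easier to see in isolation, here is a minimal sketch of one thread blocked in epoll_wait() while another thread cancels it and then joins it. This is not PipeWire's code: the names blocking_loop and cancel_blocked_thread are hypothetical, and the sketch assumes Linux with glibc, where epoll_wait() acts as a cancellation point.

```c
#include <pthread.h>
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical stand-in for a data loop thread: it blocks in epoll_wait(). */
static void *blocking_loop(void *user_data)
{
    int epfd = *(int *)user_data;
    struct epoll_event ev;

    for (;;) {
        /* Assumption: epoll_wait() is a cancellation point (true on glibc),
         * so a pending pthread_cancel() request is acted upon here. */
        epoll_wait(epfd, &ev, 1, -1);
    }
    return NULL;
}

/* Returns 1 if the worker ended through cancellation, 0 if it exited
 * some other way, and -1 if the setup failed. */
static int cancel_blocked_thread(void)
{
    pthread_t thread;
    void *result = NULL;
    int epfd = epoll_create1(0);

    if (epfd < 0)
        return -1;
    if (pthread_create(&thread, NULL, blocking_loop, &epfd) != 0) {
        close(epfd);
        return -1;
    }

    /* Give the worker a moment to block; the cancellation request would
     * stay pending until the next cancellation point even without this. */
    usleep(100 * 1000);

    pthread_cancel(thread);        /* request cancellation of the worker */
    pthread_join(thread, &result); /* wait for it, as the joining thread in the backtrace does */
    close(epfd);

    return result == PTHREAD_CANCELED ? 1 : 0;
}
```

Calling cancel_blocked_thread() from a small main() and building with -pthread should return 1. Running that program under gdb is one way to check whether the debugger reports the same SIG32 stop for the worker thread.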
As documented in man pthread_cancel:
On Linux, cancellation is implemented using signals. Under the NPTL threading implementation, the first real-time signal (i.e., signal 32) is used for this purpose. On LinuxThreads, the second real-time signal is used, if real-time signals are available, otherwise SIGUSR2 is used.
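As a small illustration of why signal 32 has no application-visible name: according to signal(7), glibc reserves the first real-time signals for its threading implementation and adjusts SIGRTMIN accordingly. The sketch below just reports the range a program gets to use. The helper names are hypothetical, and the exact values depend on the C library.

```c
#include <signal.h>
#include <stdio.h>

/* Hypothetical helper: the lowest real-time signal number the C library
 * leaves available to applications. SIGRTMIN is evaluated at run time. */
static int first_app_rt_signal(void)
{
    return SIGRTMIN;
}

/* Prints the application-visible real-time signal range. */
static void print_rt_signal_range(void)
{
    printf("SIGRTMIN=%d SIGRTMAX=%d\n", SIGRTMIN, SIGRTMAX);
}
```

On a glibc system the reported SIGRTMIN should be greater than 32, which matches the man page excerpt: signal 32 is kept for internal use such as thread cancellation.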
It seems that receiving signal 32 is documented behavior, but I'm not sure (1) why gnome-shell aborts execution immediately, or (2) where this signal should be handled.