Segfault in mtx_unlock/amdgpu_bo_slab_destroy
System information
- OS: Arch Linux
- GPU: AMD RX 480 (01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1002:67df] (rev c7))
- Kernel version: 5.11.16-arch1-1
- Mesa version: Mesa 21.0.3 (official Arch package built locally for debug symbols)
- Sway version: sway version 1.6-66721051 (Apr 24 2021, branch 'community/packages/sway')
- Firefox Developer Edition, Version 89.0b7
Describe the issue
The segfault occurs in sway, but due to the location of the segfault-causing instruction it appears to be a Mesa/amdgpu issue.
Context
The segfault occurs occasionally while running Sway (maybe once every 10 hours of runtime). I could not determine a precise action that causes the segfault.
My only theory is that the issue might be related to having a lot of tabs open in Firefox. Note that I have disabled hardware acceleration in Firefox, so I assume Mesa was only involved in compositing the browser windows (?).
Details
The segfault in the syslog:
kernel: sway:cs0[1006]: segfault at 104fc0008 ip 00007f54b2f34b61 sp 00007f54aa98e8c0 error 6 in radeonsi_dri.so[7f54b2712000+f49000]
kernel: Code: 01 d3 e0 48 98 48 39 d0 72 df 48 89 ef ff 15 c6 28 c6 00 48 8d 45 40 48 89 ef 48 89 43 28 48 8b 45 40 48 89 43 20 48 8b 45 40 <4c> 89 60 08 4c 89 65 40 5b 5d 41 5c ff 25 c5 27 c6 00 0f 1f 44 00
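For readers who want to map the kernel line back to the library, here is a minimal Python sketch that decodes it. It assumes the usual x86-64 "segfault at ... ip ... error ... in lib[base+size]" layout and the standard x86 page-fault error code bits (bit 0: page present, bit 1: write access, bit 2: user mode). The function name `decode_segfault` and the dictionary keys are made up for illustration. Note that the reported base is the start of the mapping that contains the instruction pointer, which is not necessarily the load base of the library, so the computed offset may need adjusting before it is passed to a symbolizer.

```python
import re

# Kernel segfault line copied from the syslog excerpt above.
LINE = (
    "sway:cs0[1006]: segfault at 104fc0008 ip 00007f54b2f34b61 "
    "sp 00007f54aa98e8c0 error 6 in radeonsi_dri.so[7f54b2712000+f49000]"
)

# Assumed layout of the message; all numeric fields are hexadecimal.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: split a kernel segfault line into its fields."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "library": m["lib"],
        "fault_address": int(m["addr"], 16),
        # Offset of the faulting instruction from the start of the mapping.
        "offset_in_mapping": ip - base,
        # x86 page-fault error code bits.
        "page_present": bool(err & 1),
        "write_access": bool(err & 2),
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    info = decode_segfault(LINE)
    for key, value in info.items():
        shown = hex(value) if isinstance(value, int) and not isinstance(value, bool) else value
        print(f"{key}: {shown}")
```

Under these assumptions, "error 6" reads as a user-mode write to a page that is not mapped, and the faulting instruction sits at offset 0x822b61 from the start of the reported radeonsi_dri.so mapping.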
I've attached a full backtrace below, but for clarity I'm only including the most relevant part here (this is the thread in which the segfault occurred):
Thread 1 (Thread 0x7f54aa98f640 (LWP 1006)):
#0 0x00007f54b2f34b61 in mtx_unlock (mtx=0x558da9874a08) at ../mesa-21.0.3/include/c11/threads_posix.h:274
bo = 0x104fbffe0
#1 pb_slab_free (entry=<optimized out>, slabs=0x558da9874a08) at ../mesa-21.0.3/src/gallium/auxiliary/pipebuffer/pb_slab.c:167
bo = 0x104fbffe0
#2 amdgpu_bo_slab_destroy(pb_buffer*) (_buf=0x104fbffe0) at ../mesa-21.0.3/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c:661
bo = 0x104fbffe0
#3 0x0000558da834f658 in ()
#4 0x0000558da97ca140 in ()
#5 0x0000000000000000 in ()
Regression
Due to the rarity of the segfault, I can't make any precise statements about when this started. From the package manager logs, I see that Mesa was upgraded to 21.0.1 (from 20.3.4) at the end of March on my system, but I'm not sure whether the issue occurred before that.
Log files as attachment
- Coredump analyzed using GDB (with backtrace for every thread): mesa.dump.txt