radv: regression in 23.2 causing GPU hangs
Description
The whole system hangs after 10 to 30 minutes of play
Games that crashed: Overwatch 2, Starfield, Friends vs Friends, and some others
Log files (for system lockups / game freezes / crashes)
Steps to reproduce
Start game either normally or with gamescope
Load a save, or load into a game or lobby
Wait about 10-30 min; you don't even have to do anything
System will freeze and hang
System information
System:
Host: Kohai Kernel: 6.5.5-zen1-1-zen arch: x86_64 bits: 64 compiler: gcc
v: 13.2.1 Desktop: KDE Plasma v: 5.27.8 tk: Qt v: 5.15.10 wm: Gamescope
dm: SDDM Distro: EndeavourOS base: Arch Linux
CPU:
Info: 6-core model: AMD Ryzen 5 5600X bits: 64 type: MT MCP arch: Zen 3+
rev: 0 cache: L1: 384 KiB L2: 3 MiB L3: 32 MiB
Speed (MHz): avg: 3878 high: 4641 min/max: 2200/4650 boost: enabled cores:
1: 3712 2: 4641 3: 3713 4: 4641 5: 3920 6: 3714 7: 3710 8: 4641 9: 3854
10: 2200 11: 4084 12: 3714 bogomips: 88636
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
Device-1: AMD Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] vendor: ASRock
driver: amdgpu v: kernel arch: RDNA-2 pcie: speed: 16 GT/s lanes: 16 ports:
active: DP-2,HDMI-A-1 empty: DP-1,DP-3 bus-ID: 08:00.0 chip-ID: 1002:73bf
Display: wayland server: X.org v: 1.21.1.8 with: Xwayland v: 23.2.1
compositors: 1: Gamescope 2: kwin_wayland driver: X: loaded: amdgpu
unloaded: modesetting,radeon alternate: fbdev,vesa dri: radeonsi
gpu: amdgpu d-rect: 4480x1440 display-ID: 0
Monitor-1: DP-2 pos: primary,left res: 2560x1440 size: N/A
Monitor-2: HDMI-A-1 pos: right res: 1920x1080 size: N/A
API: EGL v: 1.5 platforms: device: 0 drv: radeonsi device: 1 drv: swrast
surfaceless: drv: radeonsi wayland: drv: radeonsi x11: drv: radeonsi
inactive: gbm
API: OpenGL v: 4.6 compat-v: 4.5 vendor: amd mesa v: 23.2.1-arch1.1
glx-v: 1.4 direct-render: yes renderer: AMD Radeon RX 6800 XT (navi21 LLVM
16.0.6 DRM 3.54 6.5.5-zen1-1-zen) device-ID: 1002:73bf display-ID: :1.0
API: Vulkan v: 1.3.264 surfaces: xcb,xlib,wayland device: 0
type: discrete-gpu driver: mesa radv device-ID: 1002:73bf
It also happened with my RX 6700 XT, just much more rarely
If applicable
- Wine/Proton version: Proton GE-16
- None of these affect the crash, only the driver change does
Regression
Mesa 23.1.7 and 23.1.8 do not have this issue; I have tried to trigger it multiple times by just playing the games above.
The regression was definitely introduced before the 22nd of September, because that was when I first tried to play Friends vs Friends and Starfield with mesa-git and the hangs were already consistent. Changing kernels didn't affect it, and it still occurs as of 4bc58c9f; not even a fresh install of Arch fixed it. If you need more information, I can provide it. Just tell me what I need to do.
Sorry for this being messy; it is late where I am.
UPDATE:
RADV_DEBUG=hang output for Starfield
radv_dump.zip
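For anyone trying to capture a similar dump, here is a minimal sketch of launching a game with RADV_DEBUG=hang set. The helper names (`build_hang_env`, `run_with_hang_debug`) are hypothetical and only for illustration. It assumes RADV_DEBUG takes a comma-separated option list, so any options already set are kept.

```python
import os
import subprocess


def build_hang_env(base_env=None):
    """Return a copy of the environment with 'hang' added to RADV_DEBUG.

    Hypothetical helper: existing RADV_DEBUG options are preserved and
    'hang' is appended only if it is not already present.
    """
    env = dict(os.environ if base_env is None else base_env)
    options = [opt for opt in env.get("RADV_DEBUG", "").split(",") if opt]
    if "hang" not in options:
        options.append("hang")
    env["RADV_DEBUG"] = ",".join(options)
    return env


def run_with_hang_debug(command):
    """Run `command` (a list of arguments) with hang debugging enabled.

    Returns the exit code of the launched process.
    """
    return subprocess.run(command, env=build_hang_env()).returncode


# Example call (the game path is a placeholder, not a real binary):
# run_with_hang_debug(["gamescope", "--", "/path/to/game"])
```

For games started through Steam, the same variable can be set in the launch options as `RADV_DEBUG=hang %command%`.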