radv: GPU hang in Witcher 3 DX12 with Mesa 22.3.1
I am seeing an error similar to #7739 (closed) with Witcher 3 (DX12), run through Steam with the Proton Hotfix variant. It occurs directly after loading the game or after a teleport, so basically when a lot of geometry is loaded. It is not reliably reproducible: sometimes it happens very often, sometimes barely at all.
It crashes the entire desktop session. dmesg says:
Dez 19 14:11:30 llpc kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=162110, emitted seq=162112
Dez 19 14:11:30 llpc kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process witcher3.exe pid 3807 thread WorkSubmissionT pid 3910
Dez 19 14:11:30 llpc kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
Dez 19 14:11:30 llpc kernel: amdgpu 0000:03:00.0: amdgpu: free PSP TMR buffer
Dez 19 14:11:30 llpc kernel: amdgpu 0000:03:00.0: amdgpu: MODE1 reset
Dez 19 14:11:30 llpc kernel: amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
Dez 19 14:11:30 llpc kernel: amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset
Dez 19 14:11:31 llpc kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
Dez 19 14:11:31 llpc kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000300000).
Dez 19 14:11:31 llpc kernel: [drm] VRAM is lost due to GPU reset!
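For anyone triaging several of these logs, here is a minimal Python sketch that pulls the key fields out of the timeout messages. It assumes each kernel message sits on a single line. The function name, the dictionary keys and the `unsignaled` figure (emitted minus signaled sequence number, which I read as the number of submissions still outstanding on the ring) are my own labels, not anything amdgpu defines.

```python
import re

# Patterns for the two amdgpu_job_timedout messages quoted in this report.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
PROCESS_RE = re.compile(
    r"process (?P<process>\S+) pid (?P<pid>\d+) "
    r"thread (?P<thread>\S+) pid (?P<tid>\d+)"
)


def summarize_hang(log_text):
    """Return a dict describing the first ring timeout in log_text, or None."""
    timeout = TIMEOUT_RE.search(log_text)
    if timeout is None:
        return None
    summary = {
        "ring": timeout["ring"],
        "signaled": int(timeout["signaled"]),
        "emitted": int(timeout["emitted"]),
    }
    # Assumption: the gap is the number of fences emitted but not yet signaled.
    summary["unsignaled"] = summary["emitted"] - summary["signaled"]
    proc = PROCESS_RE.search(log_text)
    if proc is not None:
        summary["process"] = proc["process"]
        summary["pid"] = int(proc["pid"])
        summary["thread"] = proc["thread"]
    summary["vram_lost"] = "VRAM is lost due to GPU reset" in log_text
    return summary


# Sample taken from the log in this report, with wrapped lines joined.
SAMPLE = (
    "Dez 19 14:11:30 llpc kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* "
    "ring gfx_0.0.0 timeout, signaled seq=162110, emitted seq=162112\n"
    "Dez 19 14:11:30 llpc kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* "
    "Process information: process witcher3.exe pid 3807 "
    "thread WorkSubmissionT pid 3910\n"
    "Dez 19 14:11:31 llpc kernel: [drm] VRAM is lost due to GPU reset!\n"
)
```

Calling `summarize_hang(SAMPLE)` reports the ring, both sequence numbers, the offending process and whether VRAM was lost, which makes it easier to compare hangs across runs.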
ACO_DEBUG=noopt doesn't seem to help (the bug may trigger less often, but that could be coincidental). With RADV_DEBUG=llvm the game doesn't start at all and crashes immediately. The hang never occurred with the previous Mesa version, 22.2.3.
I have attached the debug output from RADV_DEBUG=hang: radv_dumps_10783_2022.12.19_18.54.47.tar.gz. It might be incomplete, since the umr_*.log files say "sh: 1: umr: not found", even though umr is installed and its setuid bit is set.
I have never developed or debugged anything in Mesa or on the GPU side, so I have no idea what to do with the debug output. Maybe someone could give me a hint on where to start, or take a look?
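The "umr: not found" message comes from the shell, which suggests that umr could not be resolved on the PATH of the process that tried to run it. One possible explanation, which is only a guess and not confirmed here, is that the game runs with a different PATH or filesystem view than the desktop session. The sketch below checks how a tool resolves under a given PATH and whether its setuid bit is set. The helper name `describe_tool` and its return format are made up for illustration.

```python
import os
import shutil
import stat


def describe_tool(name, path_env=None):
    """Report whether `name` resolves via PATH and whether it is setuid.

    path_env is a PATH-style string; None means the current process's PATH.
    """
    location = shutil.which(name, path=path_env)
    if location is None:
        return {"name": name, "found": False, "path": None, "setuid": False}
    mode = os.stat(location).st_mode
    return {
        "name": name,
        "found": True,
        "path": location,
        # S_ISUID is the setuid permission bit.
        "setuid": bool(mode & stat.S_ISUID),
    }
```

Running `describe_tool("umr")` from a terminal and then again from a script started with the game's environment would show whether the two see the same binary.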
Edit:
To reproduce:
Setup:
- Install Witcher 3 via Steam
- Install Proton Hotfix
- Launch Witcher 3 once, select DX12 in the launcher and click Play (it will probably crash; if not, just exit).
- Go to Witcher 3 Properties:
  - under "Compatibility": select "Proton Hotfix"
  - under "General": set %command% --launcher-skip as launch options
- Install the teleport mod:
  - go to Witcher 3 Properties -> Local Files -> Browse Local Files
  - create the folder mods/modTP/content/scripts/local
  - copy the file ttp.ws into the created folder
- Enable the debug console:
  - Go to $GAMEFOLDER/bin/config/base
  - Insert a new line into general.ini with the content DBGConsoleOn=true
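The two file-level setup steps (mod folder and console flag) can be scripted. This is a rough Python sketch, not a tested installer. The function names are hypothetical. Appending the line at the end of general.ini is also an assumption, since the steps above do not say where in the file it belongs.

```python
import shutil
from pathlib import Path


def install_teleport_mod(game_dir, script_file):
    """Create mods/modTP/content/scripts/local and copy the script into it."""
    target = Path(game_dir) / "mods" / "modTP" / "content" / "scripts" / "local"
    target.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(script_file, target))


def enable_debug_console(game_dir, line="DBGConsoleOn=true"):
    """Append `line` to bin/config/base/general.ini unless already present.

    Returns True if the file was changed, False if the line already existed.
    """
    ini = Path(game_dir) / "bin" / "config" / "base" / "general.ini"
    text = ini.read_text()
    if any(existing.strip() == line for existing in text.splitlines()):
        return False
    # Make sure the new entry starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    ini.write_text(text + line + "\n")
    return True
```

Both functions take the game folder (the $GAMEFOLDER from the steps above) as their first argument. Running them twice is harmless, because the folder creation and the ini edit are both idempotent.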
Reproduce
- Start Witcher 3 and load a game
- If the crash does not occur after loading, use the teleport:
  - open the map and select a location with a marker (left click)
  - open the console (usually the ~ key)
  - type in ttp and press Enter
  - it works best if you go to a crowded location (e.g. Novigrad)
- If the crash does not occur, retry teleporting to a few locations
- If the crash does not occur, restart the game and try again
- If the crash does not occur, reboot the PC
  - usually after a reboot the crash should be somewhat easily reproducible