radv: GPU hang in Space Engineers and Tony Hawk's Pro Skater on RX 6700XT
Brief summary of the problem:
A few games, both on Steam and on EGS (through Heroic), seem to cause a GPU reset. On Steam, loading a save game in Space Engineers causes a reset instantly, 100% of the time. On EGS/Heroic, opening Tony Hawk's Pro Skater 1+2 causes a GPU reset instantly, 100% of the time.
In the case of Space Engineers, changing Proton versions doesn't seem to make any difference, and I have followed the install instructions on ProtonDB, mainly to install the correct DotNet48.
In THPS, I've tried several versions of Wine, including wine-ge 7.0 rc4 Staging.
Other games I've played work fine in both storefronts.
I have also tried several versions of linux-firmware, notably 20210315.3568496 and 20210818.c46b8c3.
Hardware description:
- CPU: Ryzen 9 5950X
- GPU: Radeon 6700XT, lspci line:
0c:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 22 [Radeon RX 6700/6700 XT / 6800M] [1002:73df] (rev c1)
- System Memory: 64GB DDR4 3600 XMP profile 1
- Display(s): 3
Type of Display Connection:
- (primary) Dell Ultrasharp U2518D (2560x1440), HDMI (with a HDMI switch between multiple computers)
- Panasonic Plasma TV (1080p) HDMI (to a Yamaha RX-V667 Receiver that then outputs to the TV)
- Generic CRT (640x480) on DP to HDMI converter cable to composite converter
Output of inxi -GSCI -xx
System:
Host: rufus Kernel: 5.15.12-arch1-1 x86_64 bits: 64 compiler: gcc v: 11.1.0
Desktop: bspwm 0.9.10-33-ge22d0fa dm: startx Distro: Arch Linux
CPU:
Info: 16-core model: AMD Ryzen 9 5950X bits: 64 type: MT MCP arch: Zen 3
rev: 0 cache: L1: 1024 KiB L2: 8 MiB L3: 64 MiB
Speed (MHz): avg: 2536 high: 3076 min/max: 2200/5083 boost: enabled
cores: 1: 2698 2: 2853 3: 2873 4: 2880 5: 2871 6: 2875 7: 2874 8: 2873
9: 2196 10: 2199 11: 2197 12: 2198 13: 2196 14: 2200 15: 2199 16: 2198
17: 3076 18: 2877 19: 2878 20: 2878 21: 2874 22: 2872 23: 2877 24: 2877
25: 2197 26: 2199 27: 2198 28: 2194 29: 2197 30: 2196 31: 2197 32: 2198
bogomips: 217677
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
Device-1: AMD Navi 22 [Radeon RX 6700/6700 XT / 6800M] vendor: XFX Limited
driver: amdgpu v: kernel bus-ID: 0c:00.0 chip-ID: 1002:73df
Display: server: X.Org 1.21.1.3 compositor: picom driver: loaded: amdgpu
resolution: 1: 1920x1080~60Hz 2: 640x480~60Hz 3: 2560x1440~60Hz s-dpi: 96
OpenGL: renderer: AMD Radeon RX 6700 XT (NAVY_FLOUNDER DRM 3.42.0
5.15.12-arch1-1 LLVM 13.0.0)
v: 4.6 Mesa 21.3.3 direct render: Yes
Info:
Processes: 492 Uptime: 1h 34m Memory: 62.74 GiB used: 4.65 GiB (7.4%)
Init: systemd v: 250 Compilers: gcc: 11.1.0 clang: 13.0.0 Packages:
pacman: 1398 Shell: Bash v: 5.1.12 running-in: urxvtd inxi: 3.3.11
System information:
- Distro name and Version: Arch Linux (latest)
- Kernel version:
5.15.12-arch1-1 #1 SMP PREEMPT x86_64
- Custom kernel: N/A
- AMD official driver version: N/A Using Mesa 21.3.3 & Vulkan-Radeon (RADV)
How to reproduce the issue:
Space Engineers:
- Open Steam
- Run Space Engineers
- Load a save game
- GPU RESET
Tony Hawk's Pro Skater 1+2
- Open Heroic
- Run THPS
- GPU RESET
Log files (for system lockups / game freezes / crashes)
journalctl:
Jan 04 17:48:02 rufus kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Jan 04 17:48:02 rufus kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Jan 04 17:48:02 rufus kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=7864437, emitted seq=7864439
Jan 04 17:48:02 rufus kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process THPS12.exe pid 74073 thread THPS12.exe pid 74248
Jan 04 17:48:02 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
Jan 04 17:48:05 rufus amdfand[865]: ERROR amdfand::service > Failed to change speed to 4. Fan(FailedToChangeSpeed { value: 10, error: Write { io: Os { code>
Jan 04 17:48:06 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: failed to suspend display audio
Jan 04 17:48:06 rufus kernel: amdgpu 0000:0c:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jan 04 17:48:06 rufus kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
Jan 04 17:48:07 rufus kernel: amdgpu 0000:0c:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jan 04 17:48:07 rufus kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Jan 04 17:48:07 rufus kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
Jan 04 17:48:07 rufus kernel: [drm] free PSP TMR buffer
Jan 04 17:48:07 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: MODE1 reset
Jan 04 17:48:07 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: GPU mode1 reset
Jan 04 17:48:07 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: GPU smu mode1 reset
Jan 04 17:48:07 rufus kernel: snd_hda_intel 0000:0c:00.1: spurious response 0x0:0x0, last cmd=0x224011
Jan 04 17:48:07 rufus kernel: snd_hda_intel 0000:0c:00.1: spurious response 0x0:0x0, last cmd=0x224011
Jan 04 17:48:07 rufus kernel: snd_hda_intel 0000:0c:00.1: spurious response 0x0:0x0, last cmd=0x224011
Jan 04 17:48:07 rufus kernel: snd_hda_intel 0000:0c:00.1: spurious response 0x0:0x0, last cmd=0x224011
Jan 04 17:48:07 rufus kernel: snd_hda_intel 0000:0c:00.1: spurious response 0x0:0x0, last cmd=0x224011
Jan 04 17:48:07 rufus kernel: snd_hda_intel 0000:0c:00.1: spurious response 0x0:0x0, last cmd=0x224011
Jan 04 17:48:07 rufus kernel: snd_hda_intel 0000:0c:00.1: spurious response 0x0:0x0, last cmd=0x224011
Jan 04 17:48:07 rufus kernel: snd_hda_intel 0000:0c:00.1: spurious response 0x0:0x0, last cmd=0x224011
Jan 04 17:48:07 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset succeeded, trying to resume
Jan 04 17:48:07 rufus kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000300000).
Jan 04 17:48:07 rufus kernel: [drm] VRAM is lost due to GPU reset!
Jan 04 17:48:07 rufus kernel: [drm] PSP is resuming...
Jan 04 17:48:08 rufus kernel: [drm] reserve 0xa00000 from 0x82fe000000 for PSP TMR
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: RAS: optional ras ta ucode is not available
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: SMU is resuming...
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: SMU is resumed successfully!
Jan 04 17:48:08 rufus kernel: [drm] DMUB hardware initialized: version=0x02020003
Jan 04 17:48:08 rufus kernel: [drm] kiq ring mec 2 pipe 1 q 0
Jan 04 17:48:08 rufus kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
Jan 04 17:48:08 rufus kernel: [drm] JPEG decode initialized successfully.
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: recover vram bo from shadow start
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: recover vram bo from shadow done
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset(6) succeeded!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm] Skip scheduling IBs!
Jan 04 17:48:08 rufus kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
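
For anyone triaging similar reports, the two `amdgpu_job_timedout` lines in the log above carry the key facts: which ring hung, the signaled and emitted fence sequence numbers, and the offending process. Below is a minimal Python sketch that pulls those fields out of journal text. The function name `summarize_hang` and the regular expressions are my own illustration, written against the message format quoted in this report; they are not part of any amdgpu or Mesa tooling, and other kernel versions may word these messages differently.

```python
import re

# Patterns written against the amdgpu_job_timedout messages quoted in this
# report; the exact wording is an assumption for other kernel versions.
RING_RE = re.compile(r"ring (\S+) timeout, signaled seq=(\d+), emitted seq=(\d+)")
PROC_RE = re.compile(r"Process information: process (\S+) pid (\d+)")


def summarize_hang(lines):
    """Return a dict describing the first ring timeout found in the log lines."""
    summary = {}
    for line in lines:
        ring = RING_RE.search(line)
        if ring and "ring" not in summary:
            summary["ring"] = ring.group(1)
            summary["signaled"] = int(ring.group(2))
            summary["emitted"] = int(ring.group(3))
        proc = PROC_RE.search(line)
        if proc and "process" not in summary:
            summary["process"] = proc.group(1)
            summary["pid"] = int(proc.group(2))
    return summary


# Two lines copied from the journalctl output in this report.
sample = [
    "Jan 04 17:48:02 rufus kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* "
    "ring gfx_0.0.0 timeout, signaled seq=7864437, emitted seq=7864439",
    "Jan 04 17:48:02 rufus kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* "
    "Process information: process THPS12.exe pid 74073 thread THPS12.exe pid 74248",
]

print(summarize_hang(sample))
```

To run it against a real journal, the output of `journalctl -k` can be piped in and `sys.stdin` passed to `summarize_hang` instead of `sample`. The gap between the emitted and signaled sequence numbers (2 here) gives the number of submissions still outstanding on the gfx ring when the timeout fired.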