Graphics corruption and GPU hang with RADV/LLVM
Description
With RADV_DEBUG=llvm set, I'm getting severe graphics glitches and GPU hangs in several games.
No issue with ACO.
Log files (for system lockups / game freezes / crashes)
[ 978.276310] [drm:amdgpu_dm_commit_planes [amdgpu]] *ERROR* Waiting for fences timed out!
[ 983.406345] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=108661, emitted seq=108664
[ 983.406453] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process ACOdyssey.exe pid 25655 thread ACOdyssey.:cs0 pid 25681
[ 983.406542] amdgpu 0000:0b:00.0: amdgpu: GPU reset begin!
[ 983.787837] amdgpu 0000:0b:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
[ 983.787912] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
[ 984.029171] amdgpu 0000:0b:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
[ 984.029249] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
[ 984.270486] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
[ 984.285373] [drm] free PSP TMR buffer
[ 984.327467] amdgpu 0000:0b:00.0: amdgpu: BACO reset
[ 987.433407] amdgpu 0000:0b:00.0: amdgpu: GPU reset succeeded, trying to resume
Steps to reproduce
Nothing special: run a game with RADV_DEBUG=llvm set, and it goes wrong quickly.
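For anyone trying to reproduce this, a minimal sketch of how the variable might be set is shown here. It assumes the game is started from the same shell session; the report does not say which launcher was used, so the launch step itself is left out.

```shell
# Ask RADV to compile shaders with the LLVM backend instead of ACO.
# The export only affects this shell and the processes started from it.
export RADV_DEBUG=llvm

# Confirm the value before starting the game from this same shell.
echo "RADV_DEBUG=$RADV_DEBUG"
```

Running `unset RADV_DEBUG` in the same shell and starting the game again should give the ACO behaviour for comparison.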
System information
System: Host: jensen Kernel: 5.11.16-gentoo-x86_64 x86_64 bits: 64 compiler: N/A Desktop: KDE Plasma 5.21.4 tk: Qt 5.15.2
wm: kwin_x11 dm: SDDM Distro: Gentoo Base System release 2.7
CPU: Info: 8-Core model: AMD Ryzen 7 3800X bits: 64 type: MT MCP arch: Zen 2 L2 cache: 4096 KiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 124570
Speed: 3593 MHz min/max: 2200/3900 MHz Core speeds (MHz): 1: 3593 2: 3593 3: 3593 4: 3590 5: 3613 6: 3470 7: 3567
8: 3768 9: 3590 10: 3593 11: 3754 12: 3568 13: 3514 14: 3680 15: 4054 16: 3591
Graphics: Device-1: Advanced Micro Devices [AMD/ATI] Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT]
vendor: Sapphire Limited driver: amdgpu v: kernel bus ID: 0b:00.0 chip ID: 1002:731f
Display: x11 server: X.org 1.20.11 compositor: kwin_x11 driver: amdgpu FAILED: ati unloaded: modesetting,radeon
alternate: fbdev,vesa resolution: <xdpyinfo missing>
OpenGL: renderer: AMD Radeon RX 5700 XT (NAVI10 DRM 3.40.0 5.11.16-gentoo-x86_64 LLVM 12.0.0)
v: 4.6 Mesa 21.2.0-devel (git-ee9b744cb5) direct render: Yes
Regression
Bisected to commit ee9b744c