amdgpu "GPU fault detected: 147" when playing a Vulkan game
Brief summary of the problem:
When this crash happens I can still hear sound, but the graphics become badly glitched and I have to restart the machine.
Hardware description:
- CPU: AMD Ryzen 5 5600X
- GPU: RX 580 4GB
- System Memory: 16 GB
- Display(s): 2
- Type of Display Connection: DVI-D and HDMI
System information:
- Distro name and Version: Linux arch 5.14.9-arch2-1 #1 SMP PREEMPT Fri, 01 Oct 2021 19:03:20 +0000 x86_64 GNU/Linux
- Kernel version: 5.14.9-arch2-1
- AMD package version: xf86-video-amdgpu 21.0.0-1
How to reproduce the issue:
When playing Dota 2 using the Vulkan drivers, the graphics break at some random point on both displays, and this is not fixed until a hard restart.
Attached log: dmesg.log
Some logs I consider relevant:
Oct 07 11:44:11 arch kernel: amdgpu 0000:2b:00.0: amdgpu: GPU fault detected: 147 0x07c08402 for process dota2 pid 77403 thread dota2 pid 77403
Oct 07 11:44:11 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x000022F8
Oct 07 11:44:11 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02084002
Oct 07 11:44:11 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM fault (0x02, vmid 1, pasid 32783) at page 8952, read from 'TC7' (0x54433700) (132)
Oct 07 11:44:11 arch kernel: amdgpu 0000:2b:00.0: amdgpu: GPU fault detected: 147 0x07c08802 for process dota2 pid 77403 thread dota2 pid 77403
Oct 07 11:44:11 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x000022F8
Oct 07 11:44:11 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02088002
Oct 07 11:44:11 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM fault (0x02, vmid 1, pasid 32783) at page 8952, read from 'TC6' (0x54433600) (136)
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: GPU fault detected: 147 0x06400802 for process dota2 pid 77403 thread dota2 pid 77403
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x000070C8
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08008002
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM fault (0x02, vmid 4, pasid 32783) at page 28872, read from 'TC0' (0x54433000) (8)
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: GPU fault detected: 147 0x06400402 for process dota2 pid 77403 thread dota2 pid 77403
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x000070C8
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08008002
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM fault (0x02, vmid 4, pasid 32783) at page 28872, read from 'TC0' (0x54433000) (8)
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: GPU fault detected: 147 0x0aa04802 for process dota2 pid 77403 thread dota2 pid 77403
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0000FF54
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x04048002
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM fault (0x02, vmid 2, pasid 32783) at page 65364, read from 'TC4' (0x54433400) (72)
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: GPU fault detected: 147 0x0aa04402 for process dota2 pid 77403 thread dota2 pid 77403
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0000FF54
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x04048002
Oct 07 11:45:16 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM fault (0x02, vmid 2, pasid 32783) at page 65364, read from 'TC4' (0x54433400) (72)
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: GPU fault detected: 147 0x03580802 for process dota2 pid 77403 thread dota2 pid 77403
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0001066B
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0E008002
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM fault (0x02, vmid 7, pasid 32783) at page 67179, read from 'TC0' (0x54433000) (8)
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: GPU fault detected: 147 0x08c00802 for process dota2 pid 77403 thread dota2 pid 77403
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00006F18
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0E008002
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM fault (0x02, vmid 7, pasid 32783) at page 28440, read from 'TC0' (0x54433000) (8)
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: GPU fault detected: 147 0x0d980802 for process dota2 pid 77403 thread dota2 pid 77403
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00001DB3
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02008002
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM fault (0x02, vmid 1, pasid 32783) at page 7603, read from 'TC0' (0x54433000) (8)
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: GPU fault detected: 147 0x0d984402 for process dota2 pid 77403 thread dota2 pid 77403
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00001DB3
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02008002
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM fault (0x02, vmid 1, pasid 32783) at page 7603, read from 'TC0' (0x54433000) (8)
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: GPU fault detected: 147 0x0cf80802 for process dota2 pid 77403 thread dota2 pid 77403
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0000FF9F
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x04008002
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM fault (0x02, vmid 2, pasid 32783) at page 65439, read from 'TC0' (0x54433000) (8)
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: GPU fault detected: 147 0x0cf80402 for process dota2 pid 77403 thread dota2 pid 77403
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0000FF9F
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x04044002
Oct 07 11:45:17 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM fault (0x02, vmid 2, pasid 32783) at page 65439, read from 'TC5' (0x54433500) (68)
Oct 07 11:45:21 arch kernel: gmc_v8_0_process_interrupt: 325 callbacks suppressed
...
Oct 07 11:47:59 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0001208D
Oct 07 11:47:59 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x06084002
Oct 07 11:47:59 arch kernel: amdgpu 0000:2b:00.0: amdgpu: VM fault (0x02, vmid 3, pasid 32783) at page 73869, read from 'TC7' (0x54433700) (132)
Oct 07 11:48:05 arch kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Oct 07 11:48:10 arch kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Oct 07 11:48:10 arch kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=16740391, emitted seq=16740393
Oct 07 11:48:10 arch kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process dota2 pid 77403 thread dota2 pid 77403
Oct 07 11:48:10 arch kernel: amdgpu 0000:2b:00.0: amdgpu: GPU reset begin!
Oct 07 11:48:10 arch kernel: amdgpu 0000:2b:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Oct 07 11:48:10 arch kernel: [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Oct 07 11:48:10 arch kernel: amdgpu: cp is busy, skip halt cp
Oct 07 11:48:11 arch kernel: amdgpu: rlc is busy, skip halt rlc
Oct 07 11:48:11 arch kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Oct 07 11:48:11 arch kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Oct 07 11:48:11 arch kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Oct 07 11:48:11 arch kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Oct 07 11:48:11 arch kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Oct 07 11:48:11 arch kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Oct 07 11:48:11 arch kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Oct 07 11:48:11 arch kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Oct 07 11:48:11 arch kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Oct 07 11:48:16 arch kernel: amdgpu_cs_ioctl: 805 callbacks suppressed
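For anyone reading the raw register values: the numbers printed in each "VM fault" line (protection bits, vmid, read/write, memory client id, page) appear to be packed into the VM_CONTEXT1_PROTECTION_FAULT_STATUS and VM_CONTEXT1_PROTECTION_FAULT_ADDR values on the two lines before it. Below is a minimal Python sketch of that decoding. The bit positions are an assumption inferred from the values in this log, not taken from the amdgpu register headers, and the function names are hypothetical.

```python
# Sketch: decode the fault registers shown in the log above.
# ASSUMPTION: field positions are inferred from this log only
# (they reproduce the vmid / client id values the kernel printed);
# they are not copied from the amdgpu register headers.


def decode_fault_status(status: int) -> dict:
    """Split a VM_CONTEXT1_PROTECTION_FAULT_STATUS value into fields."""
    return {
        "protections": status & 0xFF,               # printed as 0x02 in the log
        "memory_client_id": (status >> 12) & 0xFF,  # trailing number, e.g. (132)
        "write": bool((status >> 24) & 0x1),        # False is reported as "read"
        "vmid": (status >> 25) & 0xF,
    }


def client_name(mc_client: int) -> str:
    """Turn the packed ASCII client value (e.g. 0x54433700) into 'TC7'."""
    return mc_client.to_bytes(4, "big").rstrip(b"\x00").decode("ascii")


if __name__ == "__main__":
    # (FAULT_ADDR, FAULT_STATUS, client value) triples copied from the log.
    samples = [
        (0x000022F8, 0x02084002, 0x54433700),
        (0x000070C8, 0x08008002, 0x54433000),
        (0x0000FF54, 0x04048002, 0x54433400),
        (0x0001066B, 0x0E008002, 0x54433000),
    ]
    for addr, status, client in samples:
        fields = decode_fault_status(status)
        access = "write" if fields["write"] else "read"
        print(
            f"vmid {fields['vmid']} page {addr} {access} "
            f"from {client_name(client)} ({fields['memory_client_id']})"
        )
```

With this layout the decoded fields match what the kernel printed for every fault shown here: all of them are reads with protection value 0x02, spread over several vmids that belong to the same dota2 process.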