R7 PRO 6850U: GPU drivers randomly crashing kernel
Brief summary of the problem:
While watching videos in mpv, the system eventually hangs completely and does not recover. Just before the hang, the screen becomes distorted, as if it were swapping between applications. This has happened twice in the last week (I have only had this laptop for a week). Both times, LibreWolf also had 4/6 videos loaded in different tabs. The display and input become unresponsive, and I can't switch TTYs. A forced reboot is required to make the system responsive again.
Will add more info here if the crash appears again.
Hardware description:
- CPU: R7 Pro 6850U
- GPU: 26:04:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt [Radeon 680M] [1002:1681] (rev d1)
- System Memory: 32GB
- Display(s):
Rembrandt [Radeon 680M]; Output eDP-1 'California Institute of Technology 0x1404 Unknown' (focused) Current mode: 1920x1200 @ 60.001 Hz Position: 0,0 Scale factor: 1.500000
- Type of Display Connection: eDP-1
- Display server: Wayland + Sway
System information:
- Distro name and Version: ArchLinux
- Kernel version: 6.1.2-arch1-1
- Custom kernel: N/A
- AMD official driver version: N/A
How to reproduce the issue:
- Under Wayland, start mpv and watch a video with hardware decoding set to vaapi
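The reproduction step can be sketched as a small Python launcher. This is only an illustration: the helper name `build_mpv_command` and the file name in the usage note are hypothetical, and the only mpv option assumed is `--hwdec=vaapi`, which selects VA-API hardware decoding.

```python
import subprocess


def build_mpv_command(video_path: str) -> list[str]:
    """Return the argv list for playing a file in mpv with VA-API decoding."""
    # --hwdec=vaapi asks mpv to decode through VA-API (the VCN block on this GPU).
    return ["mpv", "--hwdec=vaapi", video_path]


def play(video_path: str) -> int:
    """Run mpv and return its exit code (requires mpv to be installed)."""
    return subprocess.run(build_mpv_command(video_path)).returncode
```

Calling `play("some-video.mkv")` from a terminal inside the Sway session would start playback. The crash in this report only appeared after some time, so a single short run may not trigger it.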
Log files (for system lockups / game freezes / crashes)
journalctl log of crash:
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:24 vmid:2 pasid:32781, for process >
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x0000800109df8000 from client 0x12 (VM>
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00203830
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: VCN (0x1c)
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x3
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: RW: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:24 vmid:2 pasid:32781, for process >
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x000080010a1f4000 from client 0x12 (VM>
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00203831
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: VCN (0x1c)
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x1
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x3
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: RW: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:24 vmid:2 pasid:32781, for process >
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x000080010a1f4000 from client 0x12 (VM>
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00203831
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: VCN (0x1c)
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x1
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x3
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: RW: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:24 vmid:2 pasid:32781, for process >
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x000080010a1f5000 from client 0x12 (VM>
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: MP0 (0x0)
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: RW: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:24 vmid:2 pasid:32781, for process >
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x000080010a1f5000 from client 0x12 (VM>
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: MP0 (0x0)
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: RW: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:24 vmid:2 pasid:32781, for process >
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x000080010a1fc000 from client 0x12 (VM>
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: MP0 (0x0)
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: RW: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:24 vmid:2 pasid:32781, for process >
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x000080010a1f5000 from client 0x12 (VM>
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: MP0 (0x0)
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: RW: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:24 vmid:2 pasid:32781, for process >
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x000080010a1f5000 from client 0x12 (VM>
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: MP0 (0x0)
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: RW: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:24 vmid:2 pasid:32781, for process >
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x000080010a1f6000 from client 0x12 (VM>
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: MP0 (0x0)
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: RW: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:24 vmid:2 pasid:32781, for process >
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x000080010a1f6000 from client 0x12 (VM>
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: MP0 (0x0)
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x0
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x000080010a1f6000 from client 0x12 (VM>
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 08 10:11:07 T14 kernel: amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: MP0 (0x0)
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: RW: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:24 vmid:2 pasid:32781, for process >
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x000080010a1f6000 from client 0x12 (VM>
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: MP0 (0x0)
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x0
Jan 08 10:11:08 T14 kernel: amdgpu 0000:04:00.0: amdgpu: RW: 0x0
Jan 08 10:11:18 T14 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_dec_0 timeout, signaled seq=65499, emitted seq>
Jan 08 10:11:18 T14 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process mpv pid 24981 thread mpv:c>
Jan 08 10:11:18 T14 kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
Jan 08 10:11:18 T14 kernel: [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002
Jan 08 10:11:19 T14 kernel: [drm] Register(0) [mmUVD_RBC_RB_RPTR] failed to reach value 0x000001a0 != 0x00000140
Jan 08 10:11:19 T14 kernel: [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002
Jan 08 10:11:19 T14 kernel: amdgpu 0000:04:00.0: amdgpu: free PSP TMR buffer
Jan 08 10:11:19 T14 kernel: amdgpu 0000:04:00.0: amdgpu: MODE2 reset
Jan 08 10:11:19 T14 kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset succeeded, trying to resume
Jan 08 10:11:19 T14 kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F400A00000).
Jan 08 10:11:19 T14 kernel: [drm] PSP is resuming...
Jan 08 10:11:19 T14 kernel: [drm] reserve 0xa00000 from 0xf43e000000 for PSP TMR
Jan 08 10:11:19 T14 kernel: amdgpu 0000:04:00.0: amdgpu: RAS: optional ras ta ucode is not available
Jan 08 10:11:19 T14 kernel: amdgpu 0000:04:00.0: amdgpu: RAP: optional rap ta ucode is not available
Jan 08 10:11:19 T14 kernel: amdgpu 0000:04:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Jan 08 10:11:19 T14 kernel: amdgpu 0000:04:00.0: amdgpu: SMU is resuming...
Jan 08 10:11:19 T14 kernel: amdgpu 0000:04:00.0: amdgpu: SMU is resumed successfully!
Jan 08 10:11:19 T14 kernel: [drm] DMUB hardware initialized: version=0x0400002E
Jan 08 10:11:19 T14 kernel: [drm] kiq ring mec 2 pipe 1 q 0
Jan 08 10:11:20 T14 kernel: amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vcn_dec_0 test failed (-1>
Jan 08 10:11:20 T14 kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <vcn_v3_0> failed -110
Jan 08 10:11:20 T14 kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset(1) failed
Jan 08 10:11:20 T14 kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset end with ret = -110
Jan 08 10:11:20 T14 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -110
Jan 08 10:11:21 T14 kernel: [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002
Jan 08 10:11:21 T14 kernel: [drm] Register(0) [mmUVD_RBC_RB_RPTR] failed to reach value 0x00000010 != 0x00000000
Jan 08 10:11:21 T14 kernel: [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002
Jan 08 10:11:24 T14 kernel: [drm] Fence fallback timer expired on ring sdma0
Jan 08 10:11:30 T14 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_dec_0 timeout, signaled seq=65501, emitted seq>
Jan 08 10:11:30 T14 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process mpv pid 24981 thread mpv:c>
Jan 08 10:11:30 T14 kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
Jan 08 10:11:50 T14 systemd-logind[824]: Power key pressed short.
Jan 08 10:11:50 T14 systemd-logind[824]: Powering off...
Jan 08 10:11:50 T14 systemd-logind[824]: System is powering down.
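For anyone reading the page-fault lines above, the raw `MMVM_L2_PROTECTION_FAULT_STATUS` values can be cross-checked against the decoded fields the driver prints. The sketch below is a minimal decoder. The bit layout in `FIELDS` is an assumption, chosen because it reproduces the values shown in this log (for example 0x00203830 giving PERMISSION_FAULTS 0x3 and client ID 0x1c); it is not taken from official register documentation.

```python
# Assumed bit layout: name -> (shift, width). Hypothetical table that agrees
# with the decoded values printed next to each status word in the log.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),  # UTCL2 client ID
    "RW": (18, 1),
    "VMID": (20, 4),
}


def decode_fault_status(status: int) -> dict:
    """Split a raw fault status word into its named bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # The three distinct status words that appear in the log.
    for raw in (0x00203830, 0x00203831, 0x00000000):
        fields = decode_fault_status(raw)
        print(f"0x{raw:08x}: " + ", ".join(f"{k}=0x{v:x}" for k, v in fields.items()))
```

Under this layout the first two faults come from client 0x1c (reported as VCN) in VMID 2 with PERMISSION_FAULTS 0x3. The later entries with status 0x00000000 decode to all-zero fields, which is why the driver reports client ID 0x0 (MP0) for them.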