T14s: [AMD/ATI] Renoir (rev d3): GPU Recovery Failed: -110 – Random crashes and subsequently repeating crashes every 10 seconds
Brief summary of the problem:
The problem occurs about 1 to 2 times a day and is very frustrating.
With no specific action triggering the event (it can happen when idle, with the screen locked, or while working on the computer), the screen freezes, then turns off and on again. The screen reappears but is no longer usable (no mouse or keyboard interaction is possible). In video calls, sound continues to be transferred in both directions (I can still talk to people).
10 seconds later, the screen turns off and on again, and the whole cycle repeats until a forced reboot.
The problem seems to occur only when an external monitor is connected. (But this could also be an artifact of me almost always having an external monitor connected.)
Hardware description:
- CPU: AMD Ryzen 5 PRO 4650U with Radeon Graphics
- GPU: 06:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636] (rev d3)
- System Memory: 16G
- Display(s): Laptop Screen 1920x1080 + 4k Display at 125%
- Type of Display Connection: USB-C
System information:
- Distro name and Version: Kubuntu 23.04
- Kernel version: Linux badgastein 6.2.0-20-generic #20-Ubuntu SMP PREEMPT_DYNAMIC Thu Apr 6 07:48:48 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
- Custom kernel: N/A
- AMD official driver version: N/A
How to reproduce the issue:
Happens randomly; I have found no reliable way to reproduce it. It can happen when idle, with the screen locked, or while working on the computer, about 1 to 2 times during an 8-hour working day.
[Update:] Locking the screen using Ctrl+Alt+L (in KDE Plasma) seems to trigger the crash about 1 out of 3 times.
Attached files:
Screenshots/video files
Creating a screenshot is no longer possible.
Log files: Excerpts from syslog
Crash when using alt-tab
2023-05-16T13:42:10.585994+02:00 badgastein kernel: [16974.554330] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_high timeout, signaled seq=630797, emitted seq=630799
2023-05-16T13:42:10.586023+02:00 badgastein kernel: [16974.555082] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process kwin_wayland pid 1274 thread kwin_wayla:cs0 pid 1294
2023-05-16T13:42:10.586025+02:00 badgastein kernel: [16974.555791] amdgpu 0000:06:00.0: amdgpu: GPU reset begin!
2023-05-16T13:42:10.825972+02:00 badgastein kernel: [16974.794675] [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x117)
2023-05-16T13:42:10.853989+02:00 badgastein kernel: [16974.820805] amdgpu 0000:06:00.0: amdgpu: MODE2 reset
2023-05-16T13:42:10.854011+02:00 badgastein kernel: [16974.820874] amdgpu 0000:06:00.0: amdgpu: GPU reset succeeded, trying to resume
2023-05-16T13:42:10.854013+02:00 badgastein kernel: [16974.821125] [drm] PCIE GART of 1024M enabled.
2023-05-16T13:42:10.854016+02:00 badgastein kernel: [16974.821129] [drm] PTB located at 0x000000F41FC00000
2023-05-16T13:42:10.854018+02:00 badgastein kernel: [16974.821228] [drm] PSP is resuming...
2023-05-16T13:42:11.558009+02:00 badgastein kernel: [16975.527413] [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
2023-05-16T13:42:11.818252+02:00 badgastein kernel: [16975.786858] amdgpu 0000:06:00.0: amdgpu: RAS: optional ras ta ucode is not available
2023-05-16T13:42:11.829993+02:00 badgastein kernel: [16975.797619] amdgpu 0000:06:00.0: amdgpu: RAP: optional rap ta ucode is not available
2023-05-16T13:42:11.834411+02:00 badgastein kernel: [16975.803689] [drm] psp gfx command LOAD_TA(0x1) failed and response status is (0x7)
2023-05-16T13:42:11.834422+02:00 badgastein kernel: [16975.803913] [drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4)
2023-05-16T13:42:11.834424+02:00 badgastein kernel: [16975.803918] amdgpu 0000:06:00.0: amdgpu: Secure display: Generic Failure.
2023-05-16T13:42:11.834424+02:00 badgastein kernel: [16975.803924] amdgpu 0000:06:00.0: amdgpu: SECUREDISPLAY: query securedisplay TA failed. ret 0x0
2023-05-16T13:42:11.834426+02:00 badgastein kernel: [16975.803932] amdgpu 0000:06:00.0: amdgpu: SMU is resuming...
2023-05-16T13:42:11.838405+02:00 badgastein kernel: [16975.805548] amdgpu 0000:06:00.0: amdgpu: SMU is resumed successfully!
2023-05-16T13:42:11.838420+02:00 badgastein kernel: [16975.806089] [drm] DMUB hardware initialized: version=0x01010026
2023-05-16T13:42:12.261977+02:00 badgastein kernel: [16976.231629] [drm] kiq ring mec 2 pipe 1 q 0
2023-05-16T13:42:12.446204+02:00 badgastein kernel: [16976.415722] amdgpu 0000:06:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
2023-05-16T13:42:12.446226+02:00 badgastein kernel: [16976.415968] [drm:amdgpu_gfx_enable_kcq [amdgpu]] *ERROR* KCQ enable failed
2023-05-16T13:42:12.446227+02:00 badgastein kernel: [16976.416195] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_0> failed -110
2023-05-16T13:42:12.446230+02:00 badgastein kernel: [16976.416429] amdgpu 0000:06:00.0: amdgpu: GPU reset(2) failed
2023-05-16T13:42:12.446239+02:00 badgastein kernel: [16976.416538] amdgpu 0000:06:00.0: amdgpu: GPU reset end with ret = -110
2023-05-16T13:42:12.449983+02:00 badgastein kernel: [16976.416545] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -110
Crash with locked screen / when screen locking starts
2023-05-17T16:52:56.642121+02:00 badgastein kernel: [26556.533446] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_low timeout, signaled seq=4601766, emitted seq=4601768
2023-05-17T16:52:56.642142+02:00 badgastein kernel: [26556.534188] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process kscreenlocker_g pid 198690 thread kscreenloc:cs0 pid 198693
2023-05-17T16:52:56.642145+02:00 badgastein kernel: [26556.534905] amdgpu 0000:06:00.0: amdgpu: GPU reset begin!
2023-05-17T16:52:56.882132+02:00 badgastein kernel: [26556.773380] [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x117)
2023-05-17T16:52:56.910140+02:00 badgastein kernel: [26556.800056] amdgpu 0000:06:00.0: amdgpu: MODE2 reset
2023-05-17T16:52:56.910159+02:00 badgastein kernel: [26556.800155] amdgpu 0000:06:00.0: amdgpu: GPU reset succeeded, trying to resume
2023-05-17T16:52:56.910161+02:00 badgastein kernel: [26556.800437] [drm] PCIE GART of 1024M enabled.
2023-05-17T16:52:56.910162+02:00 badgastein kernel: [26556.800440] [drm] PTB located at 0x000000F41FC00000
2023-05-17T16:52:56.910164+02:00 badgastein kernel: [26556.800522] [drm] PSP is resuming...
2023-05-17T16:52:57.614154+02:00 badgastein kernel: [26557.503644] [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
2023-05-17T16:52:57.874230+02:00 badgastein kernel: [26557.764730] amdgpu 0000:06:00.0: amdgpu: RAS: optional ras ta ucode is not available
2023-05-17T16:52:57.886128+02:00 badgastein kernel: [26557.775605] amdgpu 0000:06:00.0: amdgpu: RAP: optional rap ta ucode is not available
2023-05-17T16:52:57.890152+02:00 badgastein kernel: [26557.781619] [drm] psp gfx command LOAD_TA(0x1) failed and response status is (0x7)
2023-05-17T16:52:57.890163+02:00 badgastein kernel: [26557.781841] [drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4)
2023-05-17T16:52:57.890165+02:00 badgastein kernel: [26557.781845] amdgpu 0000:06:00.0: amdgpu: Secure display: Generic Failure.
2023-05-17T16:52:57.890167+02:00 badgastein kernel: [26557.781851] amdgpu 0000:06:00.0: amdgpu: SECUREDISPLAY: query securedisplay TA failed. ret 0x0
2023-05-17T16:52:57.890168+02:00 badgastein kernel: [26557.781859] amdgpu 0000:06:00.0: amdgpu: SMU is resuming...
2023-05-17T16:52:57.890169+02:00 badgastein kernel: [26557.782303] amdgpu 0000:06:00.0: amdgpu: SMU is resumed successfully!
2023-05-17T16:52:57.890170+02:00 badgastein kernel: [26557.782837] [drm] DMUB hardware initialized: version=0x01010026
2023-05-17T16:52:58.330124+02:00 badgastein kernel: [26558.223215] [drm] kiq ring mec 2 pipe 1 q 0
2023-05-17T16:52:58.514838+02:00 badgastein kernel: [26558.407300] amdgpu 0000:06:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
2023-05-17T16:52:58.514869+02:00 badgastein kernel: [26558.407554] [drm:amdgpu_gfx_enable_kcq [amdgpu]] *ERROR* KCQ enable failed
2023-05-17T16:52:58.514872+02:00 badgastein kernel: [26558.407779] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_0> failed -110
2023-05-17T16:52:58.514873+02:00 badgastein kernel: [26558.407997] amdgpu 0000:06:00.0: amdgpu: GPU reset(2) failed
2023-05-17T16:52:58.514875+02:00 badgastein kernel: [26558.408029] amdgpu 0000:06:00.0: amdgpu: GPU reset end with ret = -110
2023-05-17T16:52:58.518114+02:00 badgastein kernel: [26558.408032] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -110
Crash when using Firefox
2023-05-23T11:33:06.638888+02:00 badgastein kernel: [ 5884.478730] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_low timeout, signaled seq=210400, emitted seq=210403
2023-05-23T11:33:06.638912+02:00 badgastein kernel: [ 5884.479500] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process firefox pid 6393 thread firefox:cs0 pid 6512
2023-05-23T11:33:06.638916+02:00 badgastein kernel: [ 5884.480166] amdgpu 0000:06:00.0: amdgpu: GPU reset begin!
2023-05-23T11:33:06.874920+02:00 badgastein kernel: [ 5884.713965] [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x117)
2023-05-23T11:33:06.898982+02:00 badgastein kernel: [ 5884.740526] amdgpu 0000:06:00.0: amdgpu: MODE2 reset
2023-05-23T11:33:06.899014+02:00 badgastein kernel: [ 5884.740606] amdgpu 0000:06:00.0: amdgpu: GPU reset succeeded, trying to resume
2023-05-23T11:33:06.902941+02:00 badgastein kernel: [ 5884.740943] [drm] PCIE GART of 1024M enabled.
2023-05-23T11:33:06.902965+02:00 badgastein kernel: [ 5884.740948] [drm] PTB located at 0x000000F41FC00000
2023-05-23T11:33:06.902968+02:00 badgastein kernel: [ 5884.741079] [drm] PSP is resuming...
2023-05-23T11:33:07.682924+02:00 badgastein kernel: [ 5885.523616] [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
2023-05-23T11:33:07.938908+02:00 badgastein kernel: [ 5885.780485] amdgpu 0000:06:00.0: amdgpu: RAS: optional ras ta ucode is not available
2023-05-23T11:33:07.950889+02:00 badgastein kernel: [ 5885.791549] amdgpu 0000:06:00.0: amdgpu: RAP: optional rap ta ucode is not available
2023-05-23T11:33:07.958884+02:00 badgastein kernel: [ 5885.797632] [drm] psp gfx command LOAD_TA(0x1) failed and response status is (0x7)
2023-05-23T11:33:07.958905+02:00 badgastein kernel: [ 5885.797851] [drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4)
2023-05-23T11:33:07.958907+02:00 badgastein kernel: [ 5885.797855] amdgpu 0000:06:00.0: amdgpu: Secure display: Generic Failure.
2023-05-23T11:33:07.958910+02:00 badgastein kernel: [ 5885.797861] amdgpu 0000:06:00.0: amdgpu: SECUREDISPLAY: query securedisplay TA failed. ret 0x0
2023-05-23T11:33:07.958912+02:00 badgastein kernel: [ 5885.797867] amdgpu 0000:06:00.0: amdgpu: SMU is resuming...
2023-05-23T11:33:07.958913+02:00 badgastein kernel: [ 5885.798658] amdgpu 0000:06:00.0: amdgpu: SMU is resumed successfully!
2023-05-23T11:33:07.958915+02:00 badgastein kernel: [ 5885.799154] [drm] DMUB hardware initialized: version=0x01010026
2023-05-23T11:33:08.438912+02:00 badgastein kernel: [ 5886.278102] [drm] kiq ring mec 2 pipe 1 q 0
2023-05-23T11:33:08.654209+02:00 badgastein kernel: [ 5886.494467] amdgpu 0000:06:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
2023-05-23T11:33:08.654249+02:00 badgastein kernel: [ 5886.494957] [drm:amdgpu_gfx_enable_kcq [amdgpu]] *ERROR* KCQ enable failed
2023-05-23T11:33:08.654251+02:00 badgastein kernel: [ 5886.495492] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_0> failed -110
2023-05-23T11:33:08.654253+02:00 badgastein kernel: [ 5886.496009] amdgpu 0000:06:00.0: amdgpu: GPU reset(2) failed
2023-05-23T11:33:08.654266+02:00 badgastein kernel: [ 5886.496041] [drm] Skip scheduling IBs!
2023-05-23T11:33:08.654270+02:00 badgastein kernel: [ 5886.496073] amdgpu 0000:06:00.0: amdgpu: GPU reset end with ret = -110
2023-05-23T11:33:08.654948+02:00 badgastein kernel: [ 5886.496079] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -110
Errors from all subsequent crashes repeating every 10 seconds
2023-05-23T11:33:18.918886+02:00 badgastein kernel: [ 5896.756728] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=29774, emitted seq=29776
2023-05-23T11:33:18.918899+02:00 badgastein kernel: [ 5896.757544] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
2023-05-23T11:33:18.918901+02:00 badgastein kernel: [ 5896.758275] amdgpu 0000:06:00.0: amdgpu: GPU reset begin!
2023-05-23T11:33:19.154896+02:00 badgastein kernel: [ 5896.995933] [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x117)
2023-05-23T11:33:19.182884+02:00 badgastein kernel: [ 5897.021017] amdgpu 0000:06:00.0: amdgpu: MODE2 reset
2023-05-23T11:33:19.182906+02:00 badgastein kernel: [ 5897.021130] amdgpu 0000:06:00.0: amdgpu: GPU reset succeeded, trying to resume
2023-05-23T11:33:19.182907+02:00 badgastein kernel: [ 5897.021349] [drm] PCIE GART of 1024M enabled.
2023-05-23T11:33:19.182909+02:00 badgastein kernel: [ 5897.021352] [drm] PTB located at 0x000000F41FC00000
2023-05-23T11:33:19.182910+02:00 badgastein kernel: [ 5897.021369] [drm] PSP is resuming...
2023-05-23T11:33:19.902885+02:00 badgastein kernel: [ 5897.741814] [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
2023-05-23T11:33:20.158929+02:00 badgastein kernel: [ 5897.998974] amdgpu 0000:06:00.0: amdgpu: RAS: optional ras ta ucode is not available
2023-05-23T11:33:20.170947+02:00 badgastein kernel: [ 5898.009889] amdgpu 0000:06:00.0: amdgpu: RAP: optional rap ta ucode is not available
2023-05-23T11:33:20.174875+02:00 badgastein kernel: [ 5898.016110] [drm] psp gfx command LOAD_TA(0x1) failed and response status is (0x7)
2023-05-23T11:33:20.174891+02:00 badgastein kernel: [ 5898.016328] [drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4)
2023-05-23T11:33:20.174892+02:00 badgastein kernel: [ 5898.016331] amdgpu 0000:06:00.0: amdgpu: Secure display: Generic Failure.
2023-05-23T11:33:20.174893+02:00 badgastein kernel: [ 5898.016336] amdgpu 0000:06:00.0: amdgpu: SECUREDISPLAY: query securedisplay TA failed. ret 0x0
2023-05-23T11:33:20.174893+02:00 badgastein kernel: [ 5898.016340] amdgpu 0000:06:00.0: amdgpu: SMU is resuming...
2023-05-23T11:33:20.178879+02:00 badgastein kernel: [ 5898.017119] amdgpu 0000:06:00.0: amdgpu: SMU is resumed successfully!
2023-05-23T11:33:20.178892+02:00 badgastein kernel: [ 5898.017551] [drm] DMUB hardware initialized: version=0x01010026
2023-05-23T11:33:20.626889+02:00 badgastein kernel: [ 5898.466801] [drm] kiq ring mec 2 pipe 1 q 0
2023-05-23T11:33:20.821679+02:00 badgastein kernel: [ 5898.662217] amdgpu 0000:06:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
2023-05-23T11:33:20.821697+02:00 badgastein kernel: [ 5898.662702] [drm:amdgpu_gfx_enable_kcq [amdgpu]] *ERROR* KCQ enable failed
2023-05-23T11:33:20.821700+02:00 badgastein kernel: [ 5898.663132] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_0> failed -110
2023-05-23T11:33:20.821710+02:00 badgastein kernel: [ 5898.663534] amdgpu 0000:06:00.0: amdgpu: GPU reset(3) failed
2023-05-23T11:33:20.822881+02:00 badgastein kernel: [ 5898.663642] amdgpu 0000:06:00.0: amdgpu: GPU reset end with ret = -110
2023-05-23T11:33:20.822900+02:00 badgastein kernel: [ 5898.663647] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -110
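To get an overview of which ring and which process were involved in each timeout, the syslog excerpts above could be summarized with a small script. The following is only a sketch: the regular expressions are assumptions derived from the amdgpu messages quoted in this report, and the function name `summarize_timeouts` is made up for illustration.

```python
import re
from datetime import datetime

# Assumed line layout: "<ISO timestamp> <host> kernel: [<uptime>] <message>",
# as seen in the syslog excerpts in this report.
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\S+) \S+ kernel: \[\s*[\d.]+\] "
    r"\[drm:amdgpu_job_timedout \[amdgpu\]\] \*ERROR\* "
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
# The process name can be empty (e.g. "process pid 0 thread pid 0").
PROCESS_RE = re.compile(
    r"Process information: process (?P<proc>.*?) ?pid (?P<pid>\d+) thread"
)


def summarize_timeouts(lines):
    """Return one dict per 'ring ... timeout' message found in the given lines."""
    events = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            events.append({
                "time": datetime.fromisoformat(m["ts"]),
                "ring": m["ring"],
                # Number of emitted jobs that had not yet been signaled.
                "pending": int(m["emitted"]) - int(m["signaled"]),
                "process": None,
            })
            continue
        p = PROCESS_RE.search(line)
        # Attach the process line to the most recent timeout, if still unset.
        if p and events and events[-1]["process"] is None:
            events[-1]["process"] = p["proc"] or "(unknown)"
    return events
```

For example, passing an open syslog file (on this system presumably `/var/log/syslog`) to `summarize_timeouts` yields a list of events; the differences between consecutive `time` values then show how regularly the follow-up crashes repeat.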