Frequent lockups requiring hard shutdown during video playback
Brief summary of the problem:
During video playback, whether in the browser on YouTube or occasionally in mplayer/VLC, the system locks up and can only be recovered with a hard shutdown.
Hardware description:
- CPU: AMD Ryzen 7 PRO 6850U
- GPU: 33:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt [1002:1681] (rev d1)
- System Memory: 32 GiB Micron MT62F2G32D8DR-031 WT, 2105MHz (4 sticks)
- Display(s): 1
System information:
- Ubuntu 22.04.4 LTS
- Kernel version: 6.5.0-25-generic
- Custom kernel: N/A
- AMD official driver version: xserver-xorg-video-amdgpu 22.0.0-1ubuntu0.2 (I think)
How to reproduce the issue:
Unfortunately, I haven't been able to reproduce it on demand. It happens most often while watching videos in the browser, and less often when playing them natively in mplayer or VLC.
Attached files:
Log files (for system lockups / game freezes / crashes)
Across every occurrence of this bug (more than 10 so far), the error log has been identical to the one below.
Mar 19 20:18:30 tyler-box kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=277555, emitted seq=277557
Mar 19 20:18:30 tyler-box kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 3754 thread gnome-shel:cs0 pid 3768
Mar 19 20:18:30 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: GPU reset begin!
Mar 19 20:18:30 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: MODE2 reset
Mar 19 20:18:30 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: GPU reset succeeded, trying to resume
Mar 19 20:18:30 tyler-box kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F43FC00000).
Mar 19 20:18:30 tyler-box kernel: [drm] PSP is resuming...
Mar 19 20:18:30 tyler-box kernel: [drm] reserve 0xa00000 from 0xf43e000000 for PSP TMR
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: RAS: optional ras ta ucode is not available
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: RAP: optional rap ta ucode is not available
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: SMU is resuming...
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: SMU is resumed successfully!
Mar 19 20:18:31 tyler-box kernel: [drm] DMUB hardware initialized: version=0x04000043
Mar 19 20:18:31 tyler-box kernel: [drm] kiq ring mec 2 pipe 1 q 0
Mar 19 20:18:31 tyler-box gnome-shell[3754]: amdgpu: amdgpu_cs_query_fence_status failed.
Mar 19 20:18:31 tyler-box kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
Mar 19 20:18:31 tyler-box kernel: [drm] JPEG decode initialized successfully.
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 11 on hub 0
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: recover vram bo from shadow start
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: recover vram bo from shadow done
Mar 19 20:18:31 tyler-box kernel: amdgpu 0000:33:00.0: amdgpu: GPU reset(2) succeeded!
Mar 19 20:18:31 tyler-box kernel: [drm] Skip scheduling IBs!
Mar 19 20:18:31 tyler-box kernel: [drm] Skip scheduling IBs!
Mar 19 20:18:31 tyler-box kernel: [drm] Skip scheduling IBs!
Mar 19 20:18:31 tyler-box kernel: [drm] Skip scheduling IBs!
Mar 19 20:18:31 tyler-box kernel: [drm] Skip scheduling IBs!
Mar 19 20:18:31 tyler-box kernel: [drm] Skip scheduling IBs!
Mar 19 20:18:31 tyler-box kernel: [drm] Skip scheduling IBs!
Mar 19 20:18:31 tyler-box gnome-shell[3754]: amdgpu: The CS has been rejected (-125), but the context isn't robust.
Mar 19 20:18:31 tyler-box gnome-shell[3754]: amdgpu: The process will be terminated.
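In case it helps with comparing occurrences, here is a minimal Python sketch that pulls the ring timeout events out of a log in the syslog-style format shown above. It assumes the lines look exactly like the ones in this report. The names `RingTimeout` and `find_ring_timeouts` are made up for illustration and are not part of any amdgpu tooling.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class RingTimeout(NamedTuple):
    """One 'ring ... timeout' event taken from the kernel log (hypothetical helper type)."""
    timestamp: str
    ring: str
    signaled_seq: int
    emitted_seq: int
    process: Optional[str]


# Matches lines such as:
#   Mar 19 20:18:30 host kernel: ... ring gfx_0.0.0 timeout, signaled seq=1, emitted seq=3
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:.*"
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<sig>\d+), emitted seq=(?P<emit>\d+)"
)

# Matches the follow-up line that names the process that submitted the job.
PROCESS_RE = re.compile(r"Process information: process (?P<proc>\S+) pid \d+")


def find_ring_timeouts(lines: Iterable[str]) -> List[RingTimeout]:
    """Collect ring timeout events, attaching the process name when it follows."""
    events: List[RingTimeout] = []
    for line in lines:
        timeout = TIMEOUT_RE.search(line)
        if timeout:
            events.append(
                RingTimeout(
                    timestamp=timeout["ts"],
                    ring=timeout["ring"],
                    signaled_seq=int(timeout["sig"]),
                    emitted_seq=int(timeout["emit"]),
                    process=None,
                )
            )
            continue
        proc = PROCESS_RE.search(line)
        # Only attach the process to the most recent event if it has none yet.
        if proc and events and events[-1].process is None:
            events[-1] = events[-1]._replace(process=proc["proc"])
    return events


if __name__ == "__main__":
    # Two lines copied from the log in this report.
    sample = [
        "Mar 19 20:18:30 tyler-box kernel: [drm:amdgpu_job_timedout [amdgpu]] "
        "*ERROR* ring gfx_0.0.0 timeout, signaled seq=277555, emitted seq=277557",
        "Mar 19 20:18:30 tyler-box kernel: [drm:amdgpu_job_timedout [amdgpu]] "
        "*ERROR* Process information: process gnome-shell pid 3754 "
        "thread gnome-shel:cs0 pid 3768",
    ]
    for event in find_ring_timeouts(sample):
        print(event)
```

Running the same function over a saved copy of the full log from each lockup should show whether the affected ring and process are the same every time.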