Debian Bookworm - XWayland page fault loop + [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout
Random freeze. I was doing nothing special: I left ffmpeg running, and when I came back the display was frozen, although the rest of the system kept working.
Hardware description:
- CPU: Ryzen 7 3700X
- GPU: 06:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 23 [Radeon RX 6600/6600 XT/6600M] [1002:73ff] (rev c7)
- System Memory: 16GB
- Display(s): LG 20M35
- Type of Display Connection: D-Sub + HDMI Adaptor
- Distro name and Version: Debian Bookworm (completely updated)
- Kernel version: Linux debian 6.1.0-13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1 (2023-09-29) x86_64 GNU/Linux
- AMD official driver version: Mesa 22.3.6
The kernel log (viewed via systemctl) shows a loop of:
dic 06 15:01:35 debian kernel: amdgpu 0000:06:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32771, for process Xwayland pid 2185 thread Xwayland:cs0 pid 2196)
dic 06 15:01:35 debian kernel: amdgpu 0000:06:00.0: amdgpu: in page starting at address 0x000080034691b000 from client 0x1b (UTCL2)
dic 06 15:01:35 debian kernel: amdgpu 0000:06:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
dic 06 15:01:35 debian kernel: amdgpu 0000:06:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
dic 06 15:01:35 debian kernel: amdgpu 0000:06:00.0: amdgpu: MORE_FAULTS: 0x0
dic 06 15:01:35 debian kernel: amdgpu 0000:06:00.0: amdgpu: WALKER_ERROR: 0x0
dic 06 15:01:35 debian kernel: amdgpu 0000:06:00.0: amdgpu: PERMISSION_FAULTS: 0x0
dic 06 15:01:35 debian kernel: amdgpu 0000:06:00.0: amdgpu: MAPPING_ERROR: 0x0
dic 06 15:01:35 debian kernel: amdgpu 0000:06:00.0: amdgpu: RW: 0x0
The loop ends with: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
The attached file has more information.
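In case it helps triage, here is a minimal sketch of how the repeated page-fault entries could be counted from a saved copy of the kernel log. It assumes the lines have the same shape as the excerpt above; the function name `summarize_faults` and the two regular expressions are my own illustration, not part of any amdgpu tooling.

```python
import re
from collections import Counter

# First line of each fault report, modelled on the excerpt in this report.
# The pattern is an assumption based on that excerpt only.
FAULT_RE = re.compile(
    r"\[(?P<hub>\w+)\] page fault \(src_id:(?P<src_id>\d+) ring:(?P<ring>\d+) "
    r"vmid:(?P<vmid>\d+) pasid:(?P<pasid>\d+), for process (?P<process>\S+) "
    r"pid (?P<pid>\d+) thread (?P<thread>\S+) pid (?P<tid>\d+)\)"
)

# Second line of each fault report: the faulting page address.
ADDR_RE = re.compile(r"in page starting at address (0x[0-9a-fA-F]+)")


def summarize_faults(lines):
    """Count page faults per (process, pid) and per faulting page address."""
    per_process = Counter()
    addresses = Counter()
    for line in lines:
        fault = FAULT_RE.search(line)
        if fault:
            per_process[(fault["process"], int(fault["pid"]))] += 1
            continue
        addr = ADDR_RE.search(line)
        if addr:
            addresses[addr.group(1)] += 1
    return per_process, addresses
```

It could be fed the lines of a text file saved from `journalctl -k -b`, for example, to see whether every fault comes from the same process and whether the faulting address changes between iterations of the loop.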