amdgpu: [gfxhub] page fault for process gnome-shell when using eog
The GPU crashes when eog (the GNOME image viewer) loads an image file. The outcome varies: sometimes there is only a crash report in dmesg, sometimes a recoverable GPU crash, and sometimes a full system freeze. When the crash is recoverable, the following report is written to the logs:
mai 20 22:34:21 frost kernel: gmc_v10_0_process_interrupt: 49 callbacks suppressed
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:157 vmid:1 pasid:32770, for process gnome-shell pid 2264 thread gnome-shel:cs0 pid 2288)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: in page starting at address 0x0000800132465000 from client 0x1b (UTCL2)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x0010113B
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: MORE_FAULTS: 0x1
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: WALKER_ERROR: 0x5
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: PERMISSION_FAULTS: 0x3
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: MAPPING_ERROR: 0x1
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: RW: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:157 vmid:1 pasid:32770, for process gnome-shell pid 2264 thread gnome-shel:cs0 pid 2288)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: in page starting at address 0x0000800132461000 from client 0x1b (UTCL2)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: MORE_FAULTS: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: WALKER_ERROR: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: PERMISSION_FAULTS: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: MAPPING_ERROR: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: RW: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:157 vmid:1 pasid:32770, for process gnome-shell pid 2264 thread gnome-shel:cs0 pid 2288)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: in page starting at address 0x0000800132469000 from client 0x1b (UTCL2)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: MORE_FAULTS: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: WALKER_ERROR: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: PERMISSION_FAULTS: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: MAPPING_ERROR: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: RW: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:157 vmid:1 pasid:32770, for process gnome-shell pid 2264 thread gnome-shel:cs0 pid 2288)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: in page starting at address 0x000080013246c000 from client 0x1b (UTCL2)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: MORE_FAULTS: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: WALKER_ERROR: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: PERMISSION_FAULTS: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: MAPPING_ERROR: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: RW: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:157 vmid:1 pasid:32770, for process gnome-shell pid 2264 thread gnome-shel:cs0 pid 2288)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: in page starting at address 0x0000800132470000 from client 0x1b (UTCL2)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: MORE_FAULTS: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: WALKER_ERROR: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: PERMISSION_FAULTS: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: MAPPING_ERROR: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: RW: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:157 vmid:1 pasid:32770, for process gnome-shell pid 2264 thread gnome-shel:cs0 pid 2288)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: in page starting at address 0x0000800132474000 from client 0x1b (UTCL2)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: MORE_FAULTS: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: WALKER_ERROR: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: PERMISSION_FAULTS: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: MAPPING_ERROR: 0x0
mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: RW: 0x0
mai 20 22:34:31 frost kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
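For reference, the raw GCVM_L2_PROTECTION_FAULT_STATUS value in the first fault can be unpacked into the fields the driver prints below it. The following is only a minimal sketch: the bit layout in FIELDS is an assumption, chosen because it reproduces the decoded values in the log above (MORE_FAULTS, WALKER_ERROR, PERMISSION_FAULTS, MAPPING_ERROR, RW, client ID 0x8 and vmid 1), and the helper names decode_status and decode_log_line are hypothetical, not part of any amdgpu tooling.

```python
import re

# Assumed bit layout of GCVM_L2_PROTECTION_FAULT_STATUS as (name, shift, width).
# This layout is an assumption: it was picked so that 0x0010113B decodes to the
# same field values the kernel printed in the log above.
FIELDS = [
    ("MORE_FAULTS", 0, 1),
    ("WALKER_ERROR", 1, 3),
    ("PERMISSION_FAULTS", 4, 4),
    ("MAPPING_ERROR", 8, 1),
    ("CID", 9, 9),
    ("RW", 18, 1),
    ("VMID", 20, 4),
]

# Matches the status line as it appears in the kernel log.
STATUS_RE = re.compile(r"GCVM_L2_PROTECTION_FAULT_STATUS:\s*0x([0-9A-Fa-f]+)")


def decode_status(status):
    """Split a raw status word into its named bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, shift, width in FIELDS
    }


def decode_log_line(line):
    """Decode the status word found in one log line, or return None."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return decode_status(int(match.group(1), 16))


sample = (
    "mai 20 22:34:21 frost kernel: amdgpu 0000:35:00.0: amdgpu: "
    "GCVM_L2_PROTECTION_FAULT_STATUS:0x0010113B"
)
for name, value in decode_log_line(sample).items():
    print(f"{name}: {value:#x}")
```

Run on the first status line of the log, this prints MORE_FAULTS 0x1, WALKER_ERROR 0x5, PERMISSION_FAULTS 0x3, MAPPING_ERROR 0x1, CID 0x8, RW 0x0 and VMID 0x1, matching the report. The later faults carry a status of 0x00000000, so every field decodes to zero for them.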
Hardware description:
- CPU: AMD Ryzen 9 6900HX with Radeon Graphics
- GPU: Rembrandt [Radeon 680M]
- System Memory: 32GB DDR5
- Display(s): Iiyama ProLite X3272UHS (3840x2160) Panel
- Type of Display Connection: HDMI
System information:
- Distro name and Version: Fedora 38
- Kernel version: 6.2.15-300.fc38.x86_64
- Custom kernel: N/A
- AMD official driver version: N/A
How to reproduce the issue:
Crashes appear to happen randomly, but there is a guaranteed way to trigger one in my case with gnome-shell 4x:
- Run eog on a set of pictures (low to medium resolution; they must load fast!)
- Maximize the eog window (maximized mode, not fullscreen)
- Hold down the right arrow key to quickly iterate over the pictures
- It crashes in less than one minute
Edited by Maxime Gervais