Alder Lake GPU HANG: ecode 12:1:84dffffb (Lenovo Thinkpad X1 Carbon Gen 10)
I have had a couple of GPU hangs, and for the last one I was able to capture a proper error message. The hangs happen randomly. Previously I had hangs similar to, but more frequent than, the ones reported in issue #6757. In this particular case, I was viewing a document with Evince and then switched to Nautilus.
The relevant dmesg output is as follows:
Nov 12 16:00:52 woo kernel: Asynchronous wait on fence 0000:00:02.0:gnome-shell[2785]:1fb572 timed out (hint:intel_atomic_commit_ready [i915])
Nov 12 16:00:55 woo kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:84dffffb, in nautilus [102189]
Nov 12 16:00:55 woo kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0
Nov 12 16:00:55 woo kernel: i915 0000:00:02.0: [drm] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Nov 12 16:00:55 woo kernel: i915 0000:00:02.0: [drm] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Nov 12 16:00:55 woo kernel: i915 0000:00:02.0: [drm] nautilus[102189] context reset due to GPU hang
Nov 12 16:00:55 woo kernel: i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.1.1.bin version 70.1
Nov 12 16:00:55 woo kernel: i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9
Nov 12 16:00:55 woo kernel: i915 0000:00:02.0: [drm] HuC authenticated
Nov 12 16:00:55 woo kernel: i915 0000:00:02.0: [drm] GuC submission enabled
Nov 12 16:00:55 woo kernel: i915 0000:00:02.0: [drm] GuC SLPC enabled
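For anyone comparing hang signatures across reports, here is a minimal sketch that pulls the ecode, process name and PID out of lines like the one above. The helper name list_gpu_hangs is hypothetical, and the pattern is based only on the line format shown in this log, so it may miss other variants (for example, process names containing spaces).

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print
# "<ecode> <process> <pid>" for each i915 "GPU HANG" line.
# The pattern assumes the exact format seen in the log above.
list_gpu_hangs() {
    sed -n 's/.*GPU HANG: ecode \([0-9a-fA-F:]*\), in \([^ ]*\) \[\([0-9]*\)\].*/\1 \2 \3/p'
}

# Possible usage (not run here):
#   journalctl -k | list_gpu_hangs
#   xzcat dmesg.log.xz | list_gpu_hangs
```

Matching on the ecode string alone makes it easier to check whether two reports show the same signature, although the same ecode does not necessarily mean the same root cause.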
FWIW, I think my system is new enough to include the commit mesa/mesa@0fa540ef pointed out in #6757 (comment 1624571), so that fix is probably not enough.
Information:
- System architecture: x86_64
- kernel: 6.0.7-060007-generic
- Linux distribution: Ubuntu 22.10 (kinetic)
- Machine: LENOVO_MT_21CB_BU_Think_FM_ThinkPad X1 Carbon Gen 10
- dmidecode output: dmidecode.txt
- vbios: vbios.dump
- dmesg: dmesg.log.xz
- Output of sudo cat /sys/class/drm/card0/error: error.xz
- Output of intel_reg dump: intel_reg_dump.txt
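Before attaching the error state, it can be worth checking that the kernel actually captured one. The sketch below assumes that i915 writes a "No error state collected" placeholder into the error file when nothing has been captured. The helper name has_error_state is hypothetical, and the default path is simply the one used in the list above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given error file is readable
# and does not start with the "no error state collected" placeholder.
# Assumption: i915 writes that placeholder when no hang was captured.
has_error_state() {
    f=${1:-/sys/class/drm/card0/error}
    [ -r "$f" ] || return 1
    if head -n 1 "$f" | grep -qi 'no error state collected'; then
        return 1
    fi
    return 0
}

# Possible usage, as root (not run here):
#   has_error_state && xz -c /sys/class/drm/card0/error > error.xz
```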
Some other information:
% inxi -Gzx --display
Graphics:
Device-1: Intel Alder Lake-P Integrated Graphics vendor: Lenovo driver: i915 v: kernel
arch: Gen-12.2 bus-ID: 00:02.0
Device-2: Acer Integrated RGB Camera type: USB driver: uvcvideo bus-ID: 3-8:3
Display: server: X.Org v: 1.22.1.3 with: Xwayland v: 22.1.3 driver: X: loaded: vesa
unloaded: fbdev,modesetting gpu: i915 resolution: 2560x1440~60Hz
OpenGL: renderer: Mesa Intel Graphics (ADL GT2) v: 4.6 Mesa 23.0.0-devel (git-eca63c5
2022-11-11 kinetic-oibaf-ppa) direct render: Yes
The output of intel_reg gives a warning:
% sudo intel_reg dump --all > intel_reg_dump.txt
Warning: register spec not found in '/usr/share/igt-gpu-tools/registers'. Using builtin register spec.
Also:
# echo 1 > /sys/devices/pci0000:00/0000:00:02.0/rom
# cat /sys/devices/pci0000:00/0000:00:02.0/rom
cat: '/sys/devices/pci0000:00/0000:00:02.0/rom': Input/output error
Regarding dmesg, it has some drm debugging output at the beginning, but I later disabled that once the issue I was trying to debug was fixed by a newer kernel (issue #7510, closed). In between, my screen froze without any apparent error, until now.