[skl] GPU HANG: ecode 9:0:0x85dffffb, in Xorg
Submitted by PN
Assigned to Intel 3D Bugs Mailing List
Link to original bug (#104953)
Description
Created attachment 137172 Content of /sys/class/drm/card0/error
About a week ago, my system froze for the first time with the following syslog entries:
Jan 31 11:22:42 TP kernel: [14282.025287] [drm] GPU HANG: ecode 9:0:0x86dffffd, in Xorg [870], reason: Hang on rcs0, action: reset
Jan 31 11:22:42 TP kernel: [14282.025322] drm/i915: Resetting chip after gpu hang
Jan 31 11:22:42 TP kernel: [14282.025486] [drm] RC6 on
Jan 31 11:22:46 TP kernel: [14285.246630] asynchronous wait on fence i915:kwin_x11[1506]/1:bd84 timed out
Jan 31 11:22:50 TP kernel: [14290.046737] drm/i915: Resetting chip after gpu hang
Jan 31 11:22:50 TP kernel: [14290.046964] [drm] RC6 on
Jan 31 11:23:03 TP kernel: [14303.006629] drm/i915: Resetting chip after gpu hang
Jan 31 11:23:03 TP kernel: [14303.006806] [drm] RC6 on
Jan 31 11:23:11 TP kernel: [14311.006544] drm/i915: Resetting chip after gpu hang
Jan 31 11:23:11 TP kernel: [14311.006715] [drm] RC6 on
Jan 31 11:23:19 TP kernel: [14319.006432] drm/i915: Resetting chip after gpu hang
Jan 31 11:23:19 TP kernel: [14319.006606] [drm] RC6 on
Jan 31 11:23:20 TP org.kde.kuiserver[1425]: kuiserver: Fatal IO error: client killed
The kernel back then was: Linux 4.13.0-32-generic x86_64
This occurred several times a day. I reported this bug here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1746551
I then re-installed Kubuntu 17.10 and installed a newer 4.15 mainline kernel, but the crash happened again:
Linux kernel:
Linux 4.15.0-041500-generic #201802011154 SMP Thu Feb 1 11:55:45 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Dmesg entries:
[16127.462805] [drm] GPU HANG: ecode 9:0:0x85dffffb, in Xorg [887], reason: Hang on rcs0, action: reset
[16127.462806] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[16127.462807] [drm] Please file a new bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[16127.462807] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[16127.462807] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[16127.462808] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[16127.462813] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[16135.453438] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[16143.452952] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[16157.468290] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[16171.451587] i915 0000:00:02.0: Resetting rcs0 after gpu hang
The content of /sys/class/drm/card0/error is attached as requested.
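For anyone comparing hangs across kernels, the log lines above can be summarized with a small script. This is only a rough sketch, not an official tool: the regular expressions are inferred from the messages quoted in this report, the group names `gen`, `engine` and `code` are my own guesses at what the three ecode fields mean, and `summarize` and `SAMPLE` are hypothetical names. `SAMPLE` holds a subset of the 4.15 dmesg lines.

```python
import re

# Pattern inferred from the "GPU HANG" lines quoted in this report.
# The group names gen/engine/code are assumptions about the ecode fields.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<gen>\d+):(?P<engine>\w+):(?P<code>0x[0-9a-fA-F]+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)

# Matches both "Resetting chip after gpu hang" and "Resetting rcs0 after gpu hang".
RESET_RE = re.compile(r"Resetting \w+ after gpu hang")


def summarize(log_text):
    """Return (list of parsed hang records, number of reset messages)."""
    hangs = [m.groupdict() for m in HANG_RE.finditer(log_text)]
    resets = len(RESET_RE.findall(log_text))
    return hangs, resets


# A subset of the dmesg entries from the 4.15 kernel, copied from this report.
SAMPLE = """\
[16127.462805] [drm] GPU HANG: ecode 9:0:0x85dffffb, in Xorg [887], reason: Hang on rcs0, action: reset
[16127.462813] i915 0000:00:02.0: Resetting rcs0 after gpu hang
[16135.453438] i915 0000:00:02.0: Resetting rcs0 after gpu hang
"""

if __name__ == "__main__":
    hangs, resets = summarize(SAMPLE)
    for hang in hangs:
        print(hang["code"], hang["process"], hang["pid"], hang["reason"])
    print("reset messages:", resets)
```

The same function should also accept the syslog-style lines from the 4.13 kernel, since the patterns do not depend on the timestamp prefix.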
Attachment 137172, "Content of /sys/class/drm/card0/error":
drm_card0_error