[skl] GPU HANG: ecode 9:0:0x85dffffb, in X
Submitted by rob..@..nl.gov
Assigned to Intel 3D Bugs Mailing List
Link to original bug (#105182)
Description
Created attachment 137489 Intel GPU Crash Dump
When this issue happens, the X server and gdm process terminate and you end up back at the login screen. The system itself is not hung, but you lose your desktop session.
Kernel: 3.10.0-693.11.1.el7.x86_64
Linux Distro: RHEL 7.4
Hardware: Dell XPS 13 9350
Display Connector: Just the laptop display
I have attached a GPU crash dump.
In addition, here are the relevant log entries from /var/log/messages for the last two times the GPU hung.
=====================
Feb 20 16:35:12 xecho kernel: [kern.info][250168.455912] [drm] GPU HANG: ecode 9:0:0x85dffffb, in X [1636], reason: Hang on render ring, action: reset
Feb 20 16:35:12 xecho kernel: [kern.info][250168.455916] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Feb 20 16:35:12 xecho kernel: [kern.info][250168.455917] [drm] Please file a new bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Feb 20 16:35:12 xecho kernel: [kern.info][250168.455918] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Feb 20 16:35:12 xecho kernel: [kern.info][250168.455918] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Feb 20 16:35:12 xecho kernel: [kern.info][250168.455919] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Feb 20 16:35:12 xecho kernel: [kern.notice][250168.455963] drm/i915: Resetting chip after gpu hang
Feb 20 16:35:12 xecho kernel: [kern.info][250168.456024] [drm] RC6 on
Feb 20 16:35:12 xecho kernel: [kern.info][250168.467483] [drm] GuC firmware load skipped
Feb 20 16:35:24 xecho kernel: [kern.notice][250180.401159] drm/i915: Resetting chip after gpu hang
Feb 20 16:35:24 xecho kernel: [kern.info][250180.401251] [drm] RC6 on
Feb 20 16:35:24 xecho kernel: [kern.info][250180.417580] [drm] GuC firmware load skipped
Feb 20 16:35:25 xecho gdm: [user.notice] Child process 1636 was already dead.

Feb 20 17:04:01 xecho kernel: [kern.notice][251897.385910] drm/i915: Resetting chip after gpu hang
Feb 20 17:04:01 xecho kernel: [kern.info][251897.386001] [drm] RC6 on
Feb 20 17:04:01 xecho kernel: [kern.info][251897.402075] [drm] GuC firmware load skipped
Feb 20 17:04:17 xecho kernel: [kern.notice][251913.380806] drm/i915: Resetting chip after gpu hang
Feb 20 17:04:17 xecho kernel: [kern.info][251913.380882] [drm] RC6 on
Feb 20 17:04:17 xecho kernel: [kern.info][251913.396346] [drm] GuC firmware load skipped
Feb 20 17:04:29 xecho kernel: [kern.notice][251925.384639] drm/i915: Resetting chip after gpu hang
Feb 20 17:04:29 xecho kernel: [kern.info][251925.384717] [drm] RC6 on
Feb 20 17:04:29 xecho kernel: [kern.info][251925.400945] [drm] GuC firmware load skipped
Feb 20 17:04:31 xecho gdm: [user.notice] Child process 16301 was already dead.
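For anyone triaging similar reports, here is a rough Python sketch that pulls the hang and reset events out of a syslog excerpt like the one above. It assumes one log entry per line, as in /var/log/messages, and the exact "[facility.level][uptime]" prefix seen on this system; other syslog configurations will likely need a different pattern. The names summarize, LINE_RE and HANG_RE are made up for this illustration and are not part of any i915 or syslog tooling.

```python
import re

# Assumed layout, taken from the entries in this report:
# "<Mon> <day> <HH:MM:SS> <host> kernel: [<facility.level>][<uptime>] <message>"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\skernel:\s"
    r"\[(?P<level>[\w.]+)\]\[\s*(?P<uptime>[\d.]+)\]\s(?P<msg>.*)$"
)
# Matches the "[drm] GPU HANG: ecode ..., in <process> [<pid>]" message.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\S+), in (?P<proc>\S+) \[(?P<pid>\d+)\]"
)


def summarize(lines):
    """Return (hangs, resets): parsed hang records and reset timestamps."""
    hangs, resets = [], []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip non-kernel lines such as the gdm entries
        msg = m.group("msg")
        hang = HANG_RE.search(msg)
        if hang:
            hangs.append({"time": m.group("stamp"), **hang.groupdict()})
        elif "Resetting chip after gpu hang" in msg:
            resets.append(m.group("stamp"))
    return hangs, resets
```

Run against the excerpt above, this should report one "GPU HANG" message (from 16:35:12) and five chip resets, which makes it easier to see that the second session loss at 17:04 logged resets without a new hang message.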
Attachment 137489, "Intel GPU Crash Dump":
xecho_gpu_crash_dump