Reproducible i915 GPU hang on Intel Iris Plus Graphics (Ice Lake 8x8 GT2)
Description
I am experiencing persistent GPU hangs that freeze the system for a few seconds while the kernel is "Resetting rcs0". gnome-shell then restarts, and any work in open windows is lost. Sometimes the freeze becomes a hard lock that requires a hard shutdown of the system. Both Xorg and Wayland sessions are affected.
Dec 5 03:03:30 andrew-XPS kernel: [ 76.284049] Asynchronous wait on fence i915:gnome-shell[1856]:586 timed out (hint:intel_atomic_commit_ready+0x0/0x54 [i915])
Dec 5 03:03:33 andrew-XPS kernel: [ 78.749060] i915 0000:00:02.0: Resetting rcs0 for preemption time out
Dec 5 03:03:33 andrew-XPS kernel: [ 78.749075] i915 0000:00:02.0: Xwayland[1871] context reset due to GPU hang
Dec 5 03:03:45 andrew-XPS kernel: [ 90.781256] i915 0000:00:02.0: Resetting rcs0 for preemption time out
Dec 5 03:03:45 andrew-XPS kernel: [ 90.781271] i915 0000:00:02.0: Xwayland[1871] context reset due to GPU hang
Dec 5 03:03:57 andrew-XPS kernel: [ 102.813404] i915 0000:00:02.0: Resetting rcs0 for preemption time out
Dec 5 03:03:57 andrew-XPS kernel: [ 102.813433] i915 0000:00:02.0: Xwayland[1871] context reset due to GPU hang
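To make the pattern in the log easier to see, here is a minimal Python sketch that extracts the engine reset lines and the time between them. The regular expression and the helper names (`find_engine_resets`, `reset_intervals`) are my own assumptions, written only against the message format shown in the excerpt above; other kernel versions may word these messages differently.

```python
import re

# Matches lines such as:
# "... kernel: [ 78.749060] i915 0000:00:02.0: Resetting rcs0 for preemption time out"
# The pattern is an assumption based only on the excerpt in this report.
RESET_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+i915\s+\S+:\s+"
    r"Resetting\s+(?P<engine>\w+)\s+for\s+(?P<reason>.+)$"
)


def find_engine_resets(lines):
    """Return a list of (timestamp_seconds, engine, reason) for each reset line."""
    resets = []
    for line in lines:
        match = RESET_RE.search(line.strip())
        if match:
            resets.append(
                (float(match.group("ts")), match.group("engine"), match.group("reason"))
            )
    return resets


def reset_intervals(resets):
    """Return the seconds elapsed between consecutive resets."""
    times = [ts for ts, _engine, _reason in resets]
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]


# Lines copied from the log excerpt in this report.
SAMPLE_LOG = """\
Dec 5 03:03:33 andrew-XPS kernel: [ 78.749060] i915 0000:00:02.0: Resetting rcs0 for preemption time out
Dec 5 03:03:33 andrew-XPS kernel: [ 78.749075] i915 0000:00:02.0: Xwayland[1871] context reset due to GPU hang
Dec 5 03:03:45 andrew-XPS kernel: [ 90.781256] i915 0000:00:02.0: Resetting rcs0 for preemption time out
Dec 5 03:03:57 andrew-XPS kernel: [ 102.813404] i915 0000:00:02.0: Resetting rcs0 for preemption time out
"""

if __name__ == "__main__":
    resets = find_engine_resets(SAMPLE_LOG.splitlines())
    for ts, engine, reason in resets:
        print(f"{ts:10.3f}s  {engine}  {reason}")
    print("seconds between resets:", reset_intervals(resets))
```

On the excerpt above, the resets come out roughly 12 seconds apart. To run it against a full log, pass the lines of a saved `dmesg` or `/var/log/kern.log` capture to `find_engine_resets` instead of `SAMPLE_LOG`.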
Sample i915 error dump (pre drm-tip, i.e. 5.4): i915-rsc0.dump
Reproduction steps
- Install hardinfo using your package manager
- Launch hardinfo
- Run the GPU Drawing benchmark under Benchmarks (last option)
- Wait a few seconds for the test that draws hearts and the text (I <3 hardinfo); it crashes within seconds of this test starting
This error can be triggered by any program, but this is the fastest way to reproduce it. Even using gnome-terminal can cause the GPU hang, although rarely. The biggest issue, however, is with JetBrains products, e.g. IntelliJ IDEA. I'm attaching a thread dump in case it proves useful: threadDump-20191205-132009.txt. The GPU hang happens whether the bundled JetBrains JDK or an OpenJDK release is used.
Kernels affected
- 5.3
- 5.4†
- drm-tip (2019-12-04)
† Using v5.4 as tagged in Linus' GitHub repository, and additionally a custom build with these commits reverted, having read drm/intel#161: ea0b163b13ffc52818c079adb00d55e227a6da6f 926abff21a8f29ef159a3ac893b05c6e50e043c3 f8c08d8faee5567803c8c533865296ca30286bbf 0546a29cd884fb8184731c79ab008927ca8859d0
Kernel cmdline
i915.disable_power_well=0 i915.enable_dc=0 i915.enable_fbc=0 i915.enable_guc=3 i915.enable_psr=0 i915.fastboot=1 i915.modeset=1
No combination of i915.* parameters results in a stable system, nor does setting processor.max_cstate=0 help (only acpi_idle is supported with this generation of Intel processor).
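When trying out parameter combinations like the ones above, it can help to confirm which values the loaded module actually reports. The following is a minimal Python sketch, not a definitive tool: the helper names (`parse_i915_cmdline`, `read_module_params`) are hypothetical, and it assumes the i915 module exposes its parameters under `/sys/module/i915/parameters`, which is the usual sysfs location for a loaded module.

```python
import shlex
from pathlib import Path

# Command line copied from this report.
REPORTED_CMDLINE = (
    "i915.disable_power_well=0 i915.enable_dc=0 i915.enable_fbc=0 "
    "i915.enable_guc=3 i915.enable_psr=0 i915.fastboot=1 i915.modeset=1"
)


def parse_i915_cmdline(cmdline):
    """Extract i915.<name>=<value> options from a kernel command line string."""
    params = {}
    for token in shlex.split(cmdline):
        if token.startswith("i915.") and "=" in token:
            name, value = token[len("i915."):].split("=", 1)
            params[name] = value
    return params


def read_module_params(param_dir="/sys/module/i915/parameters"):
    """Read the values reported by the loaded module.

    Returns an empty dict if the directory does not exist
    (for example, when the i915 module is not loaded).
    """
    values = {}
    root = Path(param_dir)
    if not root.is_dir():
        return values
    for entry in sorted(root.iterdir()):
        try:
            values[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters may only be readable by root; skip them.
            continue
    return values


if __name__ == "__main__":
    requested = parse_i915_cmdline(REPORTED_CMDLINE)
    effective = read_module_params()
    for name, value in requested.items():
        # "unavailable" means the parameter file was missing or unreadable.
        print(f"{name}: requested={value} reported={effective.get(name, 'unavailable')}")
```

Note that sysfs may format values differently from the command line (boolean parameters are shown as Y/N), so the two columns are printed side by side rather than compared automatically.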
Hardware report
System: Host: andrew-XPS Kernel: 5.4.0-050400-generic x86_64 bits: 64 Desktop: Gnome 3.34.1
Distro: Ubuntu 19.10 (Eoan Ermine)
Machine: Type: Convertible System: Dell product: XPS 13 7390 2-in-1 v: N/A serial: <filter>
Mobo: Dell model: 06CDVY v: A00 serial: <filter> UEFI: Dell v: 1.0.13
date: 09/17/2019
Battery: ID-1: BAT0 charge: 49.0 Wh condition: 49.0/50.0 Wh (98%)
Memory: RAM: total: 31.14 GiB used: 3.34 GiB (10.7%)
RAM Report: permissions: Unable to run dmidecode. Root privileges required.
CPU: Topology: Quad Core model: Intel Core i7-1065G7 bits: 64 type: MT MCP
L2 cache: 8192 KiB
Speed: 2824 MHz min/max: 400/3900 MHz Core speeds (MHz): 1: 401 2: 1486 3: 1812
4: 1735 5: 1819 6: 858 7: 1592 8: 1916
Graphics: Device-1: Intel driver: i915 v: kernel
Display: wayland server: X.Org 1.20.5 driver: i915 resolution: 3840x2400~60Hz
OpenGL: renderer: Mesa DRI Intel Iris Plus Graphics (Ice Lake 8x8 GT2)
v: 4.5 Mesa 19.2.1
Audio: Device-1: Intel driver: N/A
Device-2: Intel driver: snd_hda_intel
Sound Server: ALSA v: k5.4.0-050400-generic
Network: Device-1: Intel driver: iwlwifi
IF: wlp0s20f3 state: up mac: <filter>
IF-ID-1: docker0 state: down mac: <filter>
Drives: Local Storage: total: 953.87 GiB used: 47.88 GiB (5.0%)
ID-1: /dev/nvme0n1 vendor: Toshiba model: KBG40ZPZ1T02 NVMe 1024GB size: 953.87 GiB
Partition: ID-1: / size: 229.70 GiB used: 47.70 GiB (20.8%) fs: ext4 dev: /dev/nvme0n1p4
Sensors: System Temperatures: cpu: 44.0 C mobo: N/A
Fan Speeds (RPM): N/A
Info: Processes: 310 Uptime: 9m Shell: bash inxi: 3.0.36
Scaling set to 200%.