Closed
Opened Sep 25, 2019 by Bugzilla Migration User (@bugzilla-migration)

[drm] GPU HANG: ecode 9:0:0x85dffffd, in chrome [18418], reason: hang on rcs0, action: reset

Submitted by muradm

Assigned to Intel 3D Bugs Mailing List

Link to original bug (#108717)

Description

Created attachment 142445 cat /sys/class/drm/card0/error

A month ago I moved to a ThinkPad X1 Carbon 6th Gen (20KH006MRT) with a fresh Arch Linux install. Since then I have been battling GPU hangs.

The GPU hangs periodically: at least once a day, sometimes more often. Google Chrome is running (with hardware acceleration) when it happens. As a result, one of the following occurs, in no particular order:

  1. The Chrome GPU process may crash on the first hang; GNOME then crashes anyway within a few hours.
  2. GNOME may crash to a black text-mode screen, where I am still able to switch to another terminal and reboot.
  3. Everything crashes to a black screen (no text cursor) and the host stops responding to anything, including the network; a hard power cycle is then needed.

This happens regardless of whether an external monitor is attached over HDMI.

I think I have read every article and wiki page available on the subject, and I have tried many configurations of i915 and other things.

Yesterday I switched from the mainline 4.18 kernel to the testing 4.19 kernel in order to get the latest of everything. Just now the same hang happened, as in case 1 above.

journalctl (omitting other errors) =>

Nov 13 01:15:22 muradm-aln1 kernel: [drm] GPU HANG: ecode 9:0:0x85dffffd, in chrome [18418], reason: hang on rcs0, action: reset
Nov 13 01:15:22 muradm-aln1 kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Nov 13 01:15:22 muradm-aln1 kernel: [drm] Please file a new bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Nov 13 01:15:22 muradm-aln1 kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Nov 13 01:15:22 muradm-aln1 kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Nov 13 01:15:22 muradm-aln1 kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
Nov 13 01:15:22 muradm-aln1 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
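For anyone collecting several of these reports, here is a minimal sketch of pulling the fields out of the "GPU HANG" journal line so hangs can be compared across reboots. The regular expression is written against the single line quoted above, not against any documented kernel format, and the field names (gen, engine, ecode) are my reading of the three colon-separated values, so treat them as assumptions. The function name parse_gpu_hang is hypothetical.

```python
import re
from typing import Optional

# Pattern modelled on the one log line in this report; other kernel
# versions may word the message differently.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<gen>\d+):(?P<engine>\d+):(?P<ecode>0x[0-9a-fA-F]+), "
    r"in (?P<process>.+?) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line: str) -> Optional[dict]:
    """Return the hang fields as a dict, or None if the line does not match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # Numeric fields are converted; the ecode stays a hex string so it can
    # be compared verbatim with other bug reports.
    for key in ("gen", "engine", "pid"):
        info[key] = int(info[key])
    return info


SAMPLE = (
    "Nov 13 01:15:22 muradm-aln1 kernel: [drm] GPU HANG: ecode "
    "9:0:0x85dffffd, in chrome [18418], reason: hang on rcs0, action: reset"
)

if __name__ == "__main__":
    print(parse_gpu_hang(SAMPLE))
```

Feeding it the output of journalctl line by line and keeping the non-None results gives a quick list of which process and engine were involved in each hang.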

Dump attached as well.

OS: Arch Linux x86_64
Kernel: 4.19.1-arch1-1-ARCH
Host: 20KH006MRT ThinkPad X1 Carbon 6th
DE: GNOME 3.30.1
CPU: Intel i7-8550U (8) @ 4.000GHz
GPU: Intel UHD Graphics 620

Some related packages:

local/libdrm 2.4.96-1
local/libva 2.3.0-1
local/libva-intel-driver 2.2.0-1
local/libva-utils 2.3.0-1
local/linux 4.19.1.arch1-1 (base)
local/linux-api-headers 4.17.11-1
local/linux-firmware 20181026.1cb4e51-1 (base)
local/mesa 18.2.4-1
local/mesa-demos 8.4.0-1
local/qt5-wayland 5.11.2-1 (qt qt5)
local/util-linux 2.33-2 (base base-devel)
local/vulkan-icd-loader 1.1.85+2969+5abee6173-1
local/vulkan-intel 18.2.4-1
local/wayland 1.16.0-1
local/wayland-protocols 1.16-1
local/xorg-bdftopcf 1.1-1 (xorg xorg-apps)
local/xorg-server 1.20.3-1 (xorg)
local/xorg-server-common 1.20.3-1 (xorg)
local/xorg-server-xwayland 1.20.3-1 (xorg)
local/xorgproto 2018.4-1

cat /etc/modprobe.d/i915.conf
options i915 modeset=1 enable_guc=3 enable_fbc=1 fastboot=1

dmesg | grep drm == (up to the point of the hang) ==============
[ 2.654949] fb: switching to inteldrmfb from EFI VGA
[ 2.654994] [drm] Replacing VGA console driver
[ 2.657309] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 2.657310] [drm] Driver supports precise vblank timestamp query.
[ 2.659687] [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)
[ 2.666245] [drm] HuC: Loaded firmware i915/kbl_huc_ver02_00_1810.bin (version 2.0)
[ 2.677443] [drm] GuC: Loaded firmware i915/kbl_guc_ver9_39.bin (version 9.39)
[ 3.224056] [drm] Initialized i915 1.6.0 20180719 for 0000:00:02.0 on minor 0
[ 3.674308] fbcon: inteldrmfb (fb0) is primary device
[ 3.674318] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
[ 4.145904] [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
[ 31.447100] [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
[ 3377.147569] [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
[ 3389.843556] [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
[ 3391.847593] [drm] HuC: Loaded firmware i915/kbl_huc_ver02_00_1810.bin (version 2.0)
[ 3391.858472] [drm] GuC: Loaded firmware i915/kbl_guc_ver9_39.bin (version 9.39)
[ 3392.079989] [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
[ 3413.745747] [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.

Attachment 142445, "cat /sys/class/drm/card0/error":
drmi915klcrash.dmp.gz
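
The kernel message above asks for the crash dump to be attached every time. As a rough sketch of how such an attachment could be produced, the snippet below copies the error state from the path named in the log into a gzip file. The function name save_error_state and the default output file name are hypothetical, and reading the sysfs file is assumed to need root on most setups.

```python
import gzip
import shutil
from pathlib import Path

# Path taken from the "GPU crash dump saved to ..." kernel message above.
DEFAULT_SRC = "/sys/class/drm/card0/error"


def save_error_state(src: str = DEFAULT_SRC,
                     dest: str = "i915-error-state.txt.gz") -> Path:
    """Copy the i915 error state at `src` into a gzip-compressed file."""
    with open(src, "rb") as fin, gzip.open(dest, "wb") as fout:
        # Stream in chunks so a large dump is never held in memory at once.
        shutil.copyfileobj(fin, fout)
    return Path(dest)
```

Calling save_error_state() right after a hang, before rebooting, should leave a compressed file that is small enough to attach to a report.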

Reference: mesa/mesa#1770