Closed (moved) [TGL] GPU hangs in multithreading benchmarks
  • Closed (moved) Issue created by Eero Tamminen

    Setup:

    • HW: TGL-H GT1
    • OS: Ubuntu 20.04 (+ updates)
    • Kernel: drm-tip Git (commit: "2021y-03m-05d-14h-43m-32s UTC integration manifest")
    • Gfx stack: Git versions from last weekend (Mesa: 12f1e42ed3, Weston: 022ea43f9b, Xserver/Xwayland: 15a413e11d)
    • Test-case: SynMark2 v7.0.1

    The use-case is running the SynMark multithreading fullscreen test-cases (Wayland version under Weston):

    • synmark2 OglHdrBloom (roughly the same test, with fewer threads and more GPU bound)
    • synmark2 OglMultithreading (more CPU bound, more likely to trigger the issue)

    They have triggered a GPU hang every night for several days, ever since I started testing the TGL-H device:

    [ 1725.488400] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:85dffffb, in synmark2 [26040]
    [ 1725.488403] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
    [ 1725.488404] Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/intel/issues/new.
    [ 1725.488404] Please see https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs for details.
    [ 1725.488404] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
    [ 1725.488405] The GPU crash dump is required to analyze GPU hangs, so please always attach it.
    [ 1725.488405] GPU crash dump saved to /sys/class/drm/card0/error
    [ 1725.489486] i915 0000:00:02.0: [drm] Resetting rcs0 for stopped heartbeat on rcs0
    [ 1725.489531] i915 0000:00:02.0: [drm] synmark2[26040] context reset due to GPU hang
    [ 1736.936644] Asynchronous wait on fence 0000:00:02.0:synmark2[26034]:be0 timed out (hint:intel_atomic_commit_ready [i915])
    [ 1740.467713] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:85dffffb, in synmark2:gdrv0 [26041]
    [ 1740.467733] i915 0000:00:02.0: [drm] Resetting rcs0 for stopped heartbeat on rcs0
    [ 1740.467747] i915 0000:00:02.0: [drm] synmark2:gdrv0[26041] context reset due to GPU hang
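When the same tests run every night, it can help to pull the hang records out of the kernel log automatically. The following is a minimal sketch that assumes the "GPU HANG" line format is exactly as in the log above; the names `GpuHang`, `HANG_RE` and `find_gpu_hangs` are made up for illustration and are not part of any i915 tooling.

```python
import re
from typing import List, NamedTuple


class GpuHang(NamedTuple):
    """One 'GPU HANG' record from the kernel log (illustrative type)."""
    timestamp: float  # seconds since boot, from the dmesg prefix
    device: str       # PCI address, e.g. 0000:00:02.0
    ecode: str        # error code string, e.g. 12:1:85dffffb
    process: str      # name of the process blamed for the hang
    pid: int


# Assumption: the line layout matches the i915 messages quoted above.
# Process names containing spaces would not match this pattern.
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+i915 (?P<dev>\S+): \[drm\] GPU HANG: "
    r"ecode (?P<ecode>[0-9a-f:]+), in (?P<proc>\S+) \[(?P<pid>\d+)\]"
)


def find_gpu_hangs(log_text: str) -> List[GpuHang]:
    """Return every GPU hang record found in the given log text."""
    return [
        GpuHang(float(m["ts"]), m["dev"], m["ecode"], m["proc"], int(m["pid"]))
        for m in HANG_RE.finditer(log_text)
    ]
```

Feeding it the output of `dmesg` (for example, saved to a file after each nightly run) gives one record per hang, which makes it easy to see whether the first hang of the night still has the same ecode and process.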

    Note: there are further GPU hangs in later tests, but this is the first error. Last night this particular hang didn't occur, so either it's gone or it has gotten harder to trigger. If it doesn't happen again in the next couple of weeks, I'll close this.

    GPU error file is attached: i915_error_state-tgl-hdrbloom.txt
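Since later hangs follow in the same run, the dump is worth copying out of sysfs soon after the first one. A minimal sketch of that copy follows. The source path is the one printed in the kernel log above; the function name `save_error_state`, the `tag` parameter and the timestamped file naming are assumptions for illustration. Reading the sysfs file typically requires root.

```python
import shutil
import time
from pathlib import Path


def save_error_state(src="/sys/class/drm/card0/error",
                     dest_dir=".",
                     tag="i915_error_state"):
    """Copy the GPU error state to a timestamped file and return its path.

    src defaults to the path named in the kernel log; dest_dir and tag
    are hypothetical conveniences for this sketch.
    """
    out_dir = Path(dest_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    dest = out_dir / f"{tag}-{time.strftime('%Y%m%d-%H%M%S')}.txt"
    # Stream until EOF instead of relying on the file size reported
    # by the filesystem, which may not reflect the actual content length.
    with open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

Calling `save_error_state(dest_dir="hang-dumps")` right after a failing test-case would leave one file per hang that can be attached to the report.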

    Edited by Eero Tamminen
