GPU hangs twice in a QML application
I have observed two GPU hangs while using Qcm, a QML application. Details are listed below:
00:02.0 VGA compatible controller: Intel Corporation Raptor Lake-S UHD Graphics (rev 04)
01:00.0 VGA compatible controller: NVIDIA Corporation AD107M [GeForce RTX 4050 Max-Q / Mobile] (rev a1)
Operating System: Arch Linux
KDE Plasma Version: 6.0.2
KDE Frameworks Version: 6.0.0
Qt Version: 6.6.2
Kernel Version: 6.8.1-arch1-1 (64-bit)
Graphics Platform: Wayland
Processors: 32 × 13th Gen Intel® Core™ i9-13900HX
Memory: 62.5 GiB of RAM
Graphics Processor: Mesa Intel® Graphics
Manufacturer: LENOVO
Product Name: 82WK
System Version: Legion Y9000P IRX8
dmesg says:
[38127.766533] i915 0000:00:02.0: [drm] *ERROR* GT0: GUC: Engine reset failed on 0:0 (rcs0) because 0x00000000
[38127.797750] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:0020fffe, in Qcm [1141064]
[38127.797752] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[38127.797753] Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/intel/issues/new.
[38127.797753] Please see https://drm.pages.freedesktop.org/intel-docs/how-to-file-i915-bugs.html for details.
[38127.797753] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[38127.797753] The GPU crash dump is required to analyze GPU hangs, so please always attach it.
[38127.797754] GPU crash dump saved to /sys/class/drm/card1/error
[38127.798018] i915 0000:00:02.0: [drm] GT0: Resetting chip for GuC failed to reset engine mask=0x1
[38127.901798] i915 0000:00:02.0: [drm] *ERROR* GT0: rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[38127.902514] i915 0000:00:02.0: [drm] *ERROR* GT0: rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[38127.902621] i915 0000:00:02.0: [drm] Qcm[1141064] context reset due to GPU hang
[38127.902676] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/tgl_guc_70.bin version 70.20.0
[38127.902678] i915 0000:00:02.0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3
[38127.905673] i915 0000:00:02.0: [drm] GT0: HuC: authenticated for all workloads
[38127.906393] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
[38127.906394] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
[38695.598955] i915 0000:00:02.0: [drm] *ERROR* GT0: GUC: Engine reset failed on 0:0 (rcs0) because 0x00000000
[38695.629165] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:84dffffb, in Qcm [1145063]
[38695.629438] i915 0000:00:02.0: [drm] GT0: Resetting chip for GuC failed to reset engine mask=0x1
[38695.732655] i915 0000:00:02.0: [drm] *ERROR* GT0: rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[38695.733365] i915 0000:00:02.0: [drm] *ERROR* GT0: rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[38695.733463] i915 0000:00:02.0: [drm] Qcm[1145063] context reset due to GPU hang
[38695.733506] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/tgl_guc_70.bin version 70.20.0
[38695.733508] i915 0000:00:02.0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3
[38695.736168] i915 0000:00:02.0: [drm] GT0: HuC: authenticated for all workloads
[38695.737166] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
[38695.737167] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
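
For anyone triaging a longer log, the hang records above can be pulled out mechanically. The following is a minimal sketch, not part of the original report: the function name `find_gpu_hangs` and the regular expression are illustrative assumptions, based only on the format of the `GPU HANG` lines shown in this log.

```python
import re

# Assumed line format, taken from the log in this report, e.g.:
# [38127.797750] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:0020fffe, in Qcm [1141064]
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+i915\s+(?P<dev>\S+):\s+\[drm\]\s+GPU HANG:\s+"
    r"ecode\s+(?P<ecode>[0-9a-fA-F:]+),\s+in\s+(?P<proc>.+?)\s+\[(?P<pid>\d+)\]"
)


def find_gpu_hangs(dmesg_text):
    """Return one dict per 'GPU HANG' line found in the given dmesg text.

    find_gpu_hangs is a hypothetical helper name used for illustration only.
    """
    hangs = []
    for line in dmesg_text.splitlines():
        match = HANG_RE.search(line)
        if match:
            hangs.append(
                {
                    "timestamp": float(match.group("ts")),
                    "device": match.group("dev"),
                    "ecode": match.group("ecode"),
                    "process": match.group("proc"),
                    "pid": int(match.group("pid")),
                }
            )
    return hangs


if __name__ == "__main__":
    sample = (
        "[38127.797750] i915 0000:00:02.0: [drm] GPU HANG: "
        "ecode 12:1:0020fffe, in Qcm [1141064]\n"
        "[38695.629165] i915 0000:00:02.0: [drm] GPU HANG: "
        "ecode 12:1:84dffffb, in Qcm [1145063]\n"
    )
    for hang in find_gpu_hangs(sample):
        print(hang)
```

Applied to the log above, this should report two hangs in Qcm with different ecodes and PIDs. The crash dump itself still has to be read separately from the path printed in the log (/sys/class/drm/card1/error).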