[gen3] GPU hanging and killing application.
Submitted by Lucas
Assigned to Ian Romanick
Description
Created attachment 122091 The error file generated at /sys/class/drm/card0/error when reproducing the game crash.
While I was playing The Sims 3 with the Pets expansion on a 945G GPU, using the graphics drivers from oibaf's Ubuntu PPA, the game crashed when I was creating a pet dog.
Looking at dmesg, I found the following error:
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.804071] [drm] stuck on render ring
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805899] [drm] GPU HANG: ecode 4:0:0x87f5fefe, in TS3W.exe [1815], reason: Ring hung, action: reset
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805905] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805907] [drm] Please file a new bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805908] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805910] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805911] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805914] i915: render error detected, EIR: 0x00000010
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805917] i915: IPEIR: 0x00000000
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805919] i915: IPEHR: 0x780a0101
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805921] i915: INSTDONE_0: 0xffffffff
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805922] i915: INSTDONE_1: 0xbfffeff0
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805924] i915: INSTDONE_2: 0x00000000
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805925] i915: INSTDONE_3: 0x00000000
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805927] i915: INSTPS: 0x8001e023
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805929] i915: ACTHD: 0x06e84f60
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805931] i915: page table error
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805932] i915: PGTBL_ER: 0x00000001
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.806001] [drm:i915_handle_error [i915]] ERROR EIR stuck: 0x00000010, masking
Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.807019] drm/i915: Resetting chip after gpu hang
Feb 27 22:30:52 tassiana-laptop kernel: [ 3927.804065] [drm] stuck on render ring
Feb 27 22:30:52 tassiana-laptop kernel: [ 3927.805739] [drm] GPU HANG: ecode 4:0:0x87f5fefe, in TS3W.exe [1815], reason: Ring hung, action: reset
Feb 27 22:30:52 tassiana-laptop kernel: [ 3927.805903] [drm:i915_set_reset_status [i915]] ERROR gpu hanging too fast, banning!
Feb 27 22:30:52 tassiana-laptop kernel: [ 3927.805996] drm/i915: Resetting chip after gpu hang
I thought the problem might be related to oibaf's unstable version of the graphics driver, so I reverted to the original version shipped with Ubuntu 15.10 and restarted. Unfortunately, I forgot to save the error file for that crash.
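Since the dump at /sys/class/drm/card0/error lives in kernel memory and is gone after a restart, it is worth copying it right after a hang. A minimal sketch of how that could be done follows; the function name save_gpu_error and the output file naming are my own choices for illustration, and reading the file may require root:

```python
import time
from pathlib import Path

# Path reported by the kernel in the dmesg output above.
ERROR_PATH = Path("/sys/class/drm/card0/error")


def save_gpu_error(src=ERROR_PATH, dest_dir="."):
    """Copy the GPU crash dump to a timestamped file and return its path.

    save_gpu_error is a hypothetical helper, not part of any i915 tooling.
    """
    src = Path(src)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_dir) / f"gpu-error-{stamp}.txt"
    # Read the whole content explicitly instead of relying on the
    # file size, which sysfs does not always report accurately.
    dest.write_bytes(src.read_bytes())
    return dest
```

Calling save_gpu_error() with no arguments right after the hang writes the copy into the current directory.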
After restarting, I was able to reproduce the error with the same game. The error file is the attached "TS3W.exe.error". The message in dmesg was the following:
[ 685.816087] [drm] stuck on render ring
[ 685.817915] [drm] GPU HANG: ecode 4:0:0x87e5fefe, in TS3W.exe [1557], reason: Ring hung, action: reset
[ 685.817919] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 685.817921] [drm] Please file a new bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 685.817923] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 685.817925] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 685.817926] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 685.818213] drm/i915: Resetting chip after gpu hang
[ 691.816039] [drm] stuck on render ring
[ 691.817958] [drm] GPU HANG: ecode 4:0:0x87f5fefe, in TS3W.exe [1557], reason: Ring hung, action: reset
[ 691.818146] [drm:i915_set_reset_status [i915]] ERROR gpu hanging too fast, banning!
[ 691.818243] drm/i915: Resetting chip after gpu hang
Reproducing an error with a proprietary game is a big hassle, but fortunately I was able to trigger the same problem (or at least a very similar one) with the freely available tool at http://www.geeks3d.com/gputest/ (the download does not seem to work with an ad-blocker enabled).
To reproduce, download and unpack Geeks3D GpuTest, then run the Python script called gputest_gui.py:
$ python gputest_gui.py
then select PixMark Piano (OpenGL 2.1/3.0) and click on "Run benchmark".
The above steps crashed the benchmark and left the following message in dmesg:
[ 430.816064] [drm] stuck on render ring
[ 430.817671] [drm] GPU HANG: ecode 4:0:0xf98df17c, in GpuTest [1647], reason: Ring hung, action: reset
[ 430.817673] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 430.817674] [drm] Please file a new bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 430.817676] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 430.817677] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 430.817679] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 430.817844] drm/i915: Resetting chip after gpu hang
[ 436.804065] [drm] stuck on render ring
[ 436.805797] [drm] GPU HANG: ecode 4:0:0xf989e17c, in GpuTest [1647], reason: Ring hung, action: reset
[ 436.805944] [drm:i915_set_reset_status [i915]] ERROR gpu hanging too fast, banning!
[ 436.806005] drm/i915: Resetting chip after gpu hang
The corresponding error file will also be attached.
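To make the hangs from the different runs easier to compare, the "GPU HANG" lines can be pulled out of the dmesg text with a small script. This is only a sketch: the regular expression is my reading of the message layout seen in the logs above, not a format documented by the i915 driver, and find_gpu_hangs is a name made up for this example.

```python
import re

# Pattern inferred from the "GPU HANG" lines quoted in this report;
# other kernel versions may word the message differently.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\S+), in (?P<process>.+?) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def find_gpu_hangs(log_text):
    """Return one dict per GPU HANG message found in log_text."""
    hangs = []
    for match in HANG_RE.finditer(log_text):
        info = match.groupdict()
        info["pid"] = int(info["pid"])  # PID as an integer for easy grouping
        hangs.append(info)
    return hangs
```

Feeding it the saved output of dmesg gives a list of dicts with the keys ecode, process, pid, reason and action, one per reported hang.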
Attachment 122091, "The error file generated at /sys/class/drm/card0/error when reproducing the game crash.":
TS3W.exe.error