i915 0000:00:02.0: GPU HANG: ecode 9:1:0x00000000, hang on rcs0
I found the messages below in dmesg on 5.4.13, so I am following the instructions and attaching /sys/class/drm/card0/error here. The machine is still running. However, over the weekend, on 5.4.11, there was a more serious problem: the graphics card became non-responsive, although the machine was still reachable over ssh. That came after some scary ext4-related messages.
[21367.623131] i915 0000:00:02.0: GPU HANG: ecode 9:1:0x00000000, hang on rcs0
[21367.623134] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[21367.623136] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[21367.623137] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[21367.623138] The GPU crash dump is required to analyze GPU hangs, so please always attach it.
[21367.623140] GPU crash dump saved to /sys/class/drm/card0/error
[21367.624155] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
This is the content of /sys/class/drm/card0/error.
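For anyone who needs to capture the same file for a report, here is a minimal sketch of how it could be saved, assuming the default path printed in the dmesg output above. The function name `save_gpu_error` and the output file name are my own placeholders, not part of any i915 tooling, and reading the file will typically require root.

```python
import gzip


def save_gpu_error(src="/sys/class/drm/card0/error",
                   dest="i915-error.txt.gz",
                   chunk_size=1 << 16):
    """Copy the crash dump at `src` into a gzip-compressed file at `dest`.

    Returns the number of uncompressed bytes read. The names and default
    paths here are illustrative assumptions, not an official interface.
    """
    total = 0
    # sysfs files may report a size of 0, so read until EOF in chunks
    # instead of relying on the size returned by stat().
    with open(src, "rb") as fin, gzip.open(dest, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            total += len(chunk)
    return total
```

Calling `save_gpu_error()` as root should leave a compressed copy in the current directory that can then be attached to the report.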
The trace below is from a previous boot with an earlier kernel (5.4.11):
[130013.998415] __schedule+0x2e5/0x750
[130013.998417] schedule+0x49/0xd0
[130013.998420] wb_wait_for_completion+0x51/0x80
[130013.998424] ? wait_woken+0x80/0x80
[130013.998426] __writeback_inodes_sb_nr+0xa8/0xd0
[130013.998428] try_to_writeback_inodes_sb+0x60/0x80
[130013.998431] ext4_nonda_switch+0x7e/0x80
[130013.998433] ext4_da_write_begin+0x67/0x470
[130013.998437] generic_perform_write+0xba/0x1c0
[130013.998439] ? file_update_time+0x62/0x140
[130013.998442] ? fsnotify_destroy_event+0x1c/0x20
[130013.998444] __generic_file_write_iter+0x107/0x1d0
[130013.998446] ext4_file_write_iter+0xb9/0x360
[130013.998449] new_sync_write+0x125/0x1c0
[130013.998452] __vfs_write+0x29/0x40
[130013.998454] vfs_write+0xb9/0x1a0
[130013.998456] ksys_write+0x67/0xe0
[130013.998459] __x64_sys_write+0x1a/0x20
[130013.998462] do_syscall_64+0x57/0x190
[130013.998463] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[130013.998465] RIP: 0033:0x7f6b3da18317
[130013.998470] Code: Bad RIP value.
[130013.998472] RSP: 002b:00007ffded258698 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[130013.998473] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f6b3da18317
[130013.998474] RDX: 0000000000000001 RSI: 00007ffded258840 RDI: 0000000000000002
[130013.998475] RBP: 00007ffded258840 R08: 0000000000000001 R09: 0000000000000001
[130013.998476] R10: 000055d24fb9709b R11: 0000000000000246 R12: 0000000000000001
[130013.998477] R13: 00007f6b3daf25c0 R14: 0000000000000001 R15: 00007f6b3daf28a0
I suspect this is one of the known problems with i915 in the 5.4.x series. With 5.3.14 the machine stayed up for about 5 weeks.