[TGL] media VME encoding GPU hang w/o i915 error state captured
Submitted by Dmitry Rogozhkin
Assigned to Intel GFX Bugs mailing list
Link to original bug (#112377)
Description
Created attachment 146016
Full dmesg log
From https://github.com/intel/media-driver/issues/773
Run:
wget https://fate-suite.libav.org/h264-conformance/AUD_MW_E.264
sample_multi_transcode -i::h264 AUD_MW_E.264 -hw -async 1 -u 4 -o::h264 a.h264
Stack:
- https://github.com/intel/gmmlib/commit/f78be970a6c3aef6d0347159f9c3f250421af16c
- https://github.com/intel/media-driver/commit/1645d0f06597599393625168cc4445b0ae092219
- https://github.com/Intel-Media-SDK/MediaSDK/commit/2515d8fbb65979685ce086aa5c9b24786cc2cab6 + apply https://github.com/Intel-Media-SDK/MediaSDK/pull/1771
Ran with latest drm-tip kernel:
commit 5bbbc0061acc528705e593d7e01c4c9c40b208db
Merge: e67c139 883d955
Author: Lyude Paul
Date: Fri Nov 22 14:12:54 2019 -0500
Merge remote-tracking branch 'drm-intel/topic/core-for-CI' into drm-tip
Relevant part of dmesg (the full log, captured with drm.debug=0x1e, is attached):
[ 77.006161] i915 0000:00:02.0: Resetting rcs0 for preemption time out
[ 77.006204] i915 0000:00:02.0: sample_multi_tr[1951] context reset due to GPU hang
[ 77.006293] [drm:__i915_request_reset [i915]] client sample_multi_tr[1951]: gained 1 ban score, now 1
i915 error state: EMPTY
Could you please help debug this issue from the KMD standpoint?
Why is the i915 error state missing? Is this a real GPU hang or an i915 bug?
Note: the media encoder above runs on both the RCS and VCS rings, and the VCS tasks depend on the RCS ones.
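For anyone trying to reproduce the observation, here is a minimal sketch of how the reset lines and the error-state node can be checked after a run. It is only an illustration. The function names are hypothetical, the grep patterns are taken from the dmesg excerpt in this report and may not match other kernels, and it assumes the Intel GPU is card0 and that an idle error node reads "No error state collected".

```shell
#!/bin/sh
# Print only the i915 reset/hang lines from a saved dmesg log.
# Patterns come from the excerpt in this report; other kernel versions
# may word these messages differently.
filter_hang_lines() {
    grep -E 'Resetting [a-z0-9]+ for|context reset due to GPU hang|gained [0-9]+ ban score' "$1"
}

# Report whether the i915 error-state node holds a captured state.
# Assumptions: the node is /sys/class/drm/card0/error unless a path is
# given, and a node without a capture reads "No error state collected".
error_state_status() {
    node="${1:-/sys/class/drm/card0/error}"
    if [ ! -r "$node" ]; then
        echo "unreadable"
        return
    fi
    # Read the first line instead of testing the file size, since sysfs
    # nodes do not report a meaningful size.
    first=$(head -n 1 "$node" 2>/dev/null)
    if [ -z "$first" ] || printf '%s\n' "$first" | grep -qi 'no error state collected'; then
        echo "empty"
    else
        echo "captured"
    fi
}
```

After the transcode command has run, `filter_hang_lines dmesg.txt` should list the reset messages, and `error_state_status` (usually run as root) prints "empty" in the situation described here.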
**Attachment 146016**, "Full dmesg log":
dmesg.txt