[BAT][CI][GLK only] igt@* - incomplete - timeout/system hang
Submitted by Marta Löfstedt @marta
Assigned to Intel GFX Bugs mailing list
Link to original bug (#103359)
Description
CI_DRM_3265 GLK-shards igt@gem_softpin@noreloc-S3
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3265/shard-glkb3/igt@gem_softpin@noreloc-S3.html
Note bootlog start at:
<7>
[ 548.719874] [drm:edp_panel_vdd_off_sync [i915]] Turning eDP port A VDD off
i.e there has been at least one shard of tests run before this on on this boot.
last dmesg:
<6>
[ 1005.896525] PM: suspend entry (deep)
run.log:
[13/73] skip: 5, pass: 7, dmesg-warn: 1 /
FATAL: command execution failed
java.io.EOFException
there is:
[ 2603.510504] vblank not available on crtc 0, ret=-22
[ 2603.510562] ------------[ cut here ]------------
[ 2603.510585] WARNING: CPU: 2 PID: 5328 at drivers/gpu/drm/drm_vblank.c:1042 drm_wait_one_vblank+0x160/0x1b0
in this dmesg.
Blocking
- Show closed items
Activity
-
Newest first Oldest first
-
Show all activity Show comments only Show history only
- Bugzilla Migration User added CI feature: GEM feature: display/Other platform: GLK priority::medium severity::normal + 1 deleted label
added CI feature: GEM feature: display/Other platform: GLK priority::medium severity::normal + 1 deleted label
Marta Löfstedt@marta
said:https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3267/shard-glkb4/igt@kms_frontbuffer_tracking@psr-shrfb-scaledprimary.html
The backtrace has been cut-off:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3267/shard-glkb4/dmesg-1508445725_Oops_1.log
last dmesg:
<7>
[ 2884.501987] [IGT] kms_plane_multiple: exiting, ret=77
<7>
[ 2884.658294] [IGT] kms_frontbuffer_tracking: executing
then stray.
run.log:
[22/72] skip: 5, pass: 15, fail: 2 -
FATAL: command execution failed
java.io.EOFException
dmesg start at:
<7>
[ 2753.040777] [IGT] kms_frontbuffer_tracking: executing
so other shards have been run on this machine before. Marta Löfstedt@marta
said:https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3268/shard-glkb1/igt@gem_eio@in-flight-contexts.html
last demesg followed by stray:
<7>
[ 462.664336] [drm:gen8_init_common_ring [i915]] Execlists enabled for vcs0
<7>
[ 462.664534] [drm:gen8_init_common_ring [i915]] Execlists enabled for vecs0
<4>
[ 462.689948] Setting dangerous option reset - tainting kernel
otherwise no ERROR in dmesg
run-log:
[40/72] skip: 18, pass: 16, fail: 6 |
FATAL: command execution failed
java.io.EOFException Marta Löfstedt@marta
said:https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3275/shard-glkb3/igt@gem_ppgtt@blt-vs-render-ctx0.html
last dmesg:
<7>
[ 1175.713017] [IGT] gem_ppgtt: executing
<7>
[ 1175.737657] [IGT] gem_ppgtt: starting subtest blt-vs-render-ctx0
<6>
[ 1256.231064] perf: interrupt took too long (2509 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
run.log:
[33/73] skip: 11, pass: 19, fail: 3 /
FATAL: command execution failed
java.io.EOFException Marta Löfstedt@marta
said:CI_DRM_3286 shard-glkb1 igt@gem_cpu_reloc@full
First dmesg:
<5>
[ 1648.677701] owatch: Using watchdog device /dev/watchdog0
Last dmesg:
<7>
[ 2012.063887] [IGT] gem_cpu_reloc: starting subtest full
run.log:
[33/73] skip: 11, pass: 20, fail: 2 /
FATAL: command execution failed
java.io.EOFException
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2638)
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3286/shard-glkb1/igt@gem_cpu_reloc@full.html Marta Löfstedt@marta
said:CI_DRM_3286 shard-glkb2 igt@gem_tiled_partial_pwrite_pread@reads
First dmesg:
<5>
[ 806.330960] owatch: Using watchdog device /dev/watchdog0
Last dmesg:
<7>
[ 997.929771] [IGT] gem_tiled_partial_pwrite_pread: executing
[00/73] |
FATAL: command execution failed
java.io.EOFException
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3286/shard-glkb2/igt@gem_tiled_partial_pwrite_pread@reads.html Marta Löfstedt@marta
said:https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3289/shard-glkb1/igt@gem_ringfill@basic-default-hang.html
First dmesg:
<5>
[ 3379.148765] owatch: Using watchdog device /dev/watchdog0
Last dmesg:
<7>
[ 3694.039373] [IGT] gem_ringfill: executing
Followed by "stay".
runtimes log:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3289/shard-glkb1/runtimes31.txt
=> ~5 minutes
Note dmesg timestamps are not wall time, the runtime should be. Also, Arek just fixed this feature with runtime just for the tests in each shard, thanks Arek!
run.log:
[11/72] skip: 3, pass: 8 \
FATAL: command execution failed
java.io.EOFException
From this I draw the conclusion that this was a system hang. Marta Löfstedt@marta
said:I will also file incomplete timeout/system hang on fi-glk-dsi on this bug. However, note bug 103514 which is a full scenario where we start with wedged GPU and then end up in incomplete. This is not the case for this: CI_DRM_3305 fi-glk-dsi igt@gem_ctx_switch@basic-default-heavy
last dmesg:
<7>
[ 75.283189] [IGT] gem_ctx_switch: executing
Followed by stry.
run.log:
[031/289] skip: 9, pass: 22 \
FATAL: command execution failed
java.io.EOFException
CI_IGT_test runtime 175 seconds
So, this looks like system hang.
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3305/fi-glk-dsi/igt@gem_ctx_switch@basic-default-heavy.html Marta Löfstedt@marta
said:https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3308/shard-glkb4/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-blt.html
First dmesg:
<5>
[ 1593.174703] owatch: Using watchdog device /dev/watchdog0
<5>
[ 1593.174783] owatch: Watchdog /dev/watchdog0 is a software watchdog
<5>
[ 1593.177277] owatch: timeout for /dev/watchdog0 set to 370 (requested 370)
Last dmesg:
<7>
[ 1979.031808] [drm:drm_helper_hpd_irq_event] [CONNECTOR:89:DP-1] status updated from disconnected to disconnected
<7>
[ 1979.183337] [IGT] kms_frontbuffer_tracking: executing
Followed by stray.
run.log:
[55/73] skip: 12, pass: 38, dmesg-warn: 1, fail: 4 \
FATAL: command execution failed
java.io.EOFException
...
Completed CI_IGT_test CI_DRM_3308@shard-glkb4 : FAILURE
CI_IGT_test runtime 463 seconds Marta Löfstedt@marta
said:CI_DRM_3312 fi-glk-dsi igt@gem_exec_flush@basic-wb-set-default
dmesg:
<5>
[ 43.114273] owatch: Using watchdog device /dev/watchdog0
<5>
[ 43.114505] owatch: Watchdog /dev/watchdog0 is a software watchdog
<5>
[ 43.116098] owatch: timeout for /dev/watchdog0 set to 100 (requested 100)
...
<7>
[ 229.819324] [IGT] gem_exec_flush: executing
run.log
[073/289] skip: 16, pass: 57 /
FATAL: command execution failed
java.io.EOFException
....
Completed CI_IGT_test CI_DRM_3312@fi-glk-dsi : FAILURE
CI_IGT_test runtime 286 seconds
looks like system hang.
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3312/fi-glk-dsi/igt@gem_exec_flush@basic-wb-set-default.html Marta Löfstedt@marta
said:https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3328/shard-glkb3/igt@gem_exec_blt@cold-max.html
dmesg:
<5>
[ 1760.361917] owatch: Using watchdog device /dev/watchdog0
<5>
[ 1760.361996] owatch: Watchdog /dev/watchdog0 is a software watchdog
<5>
[ 1760.364687] owatch: timeout for /dev/watchdog0 set to 370 (requested 370)
...
no obvious ERROR
...
<7>
[ 1985.345189] [IGT] gem_exec_params: executing
<7>
[ 1985.372299] [IGT] gem_exec_params: starting subtest cliprects_ptr-dirt
<7>
[ 1985.372502] [IGT] gem_exec_params: exiting, ret=0
Followed by stray.
run.log:
running: igt/kms_flip/vblank-vs-modeset-suspend-interruptible
[34/73] skip: 16, pass: 16, fail: 2 -
FATAL: command execution failed
java.io.EOFException
...
Finished: FAILURE
Completed CI_IGT_test CI_DRM_3328@shard-glkb3 : FAILURE
CI_IGT_test runtime 515 seconds Marta Löfstedt@marta
said:https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3326/shard-glkb3/igt@gem_softpin@softpin.html
dmesg:
<5>
[ 3641.153567] owatch: Using watchdog device /dev/watchdog0
<5>
[ 3641.153647] owatch: Watchdog /dev/watchdog0 is a software watchdog
<5>
[ 3641.156151] owatch: timeout for /dev/watchdog0 set to 370 (requested 370)
...
no ERRORs
...
<7>
[ 3974.363963] [IGT] gem_readwrite: exiting, ret=0
<7>
[ 3974.540413] [IGT] gem_softpin: executing
<7>
[ 3974.555885] [IGT] gem_softpin: starting subtest softpin
run.log:
running: igt/gem_softpin/softpin
[60/72] skip: 15, pass: 43, fail: 2 |
FATAL: command execution failed
...
Finished: FAILURE
Completed CI_IGT_test CI_DRM_3326@shard-glkb3 : FAILURE
CI_IGT_test runtime 514 seconds Marta Löfstedt@marta
said:https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3334/shard-glkb1/igt@perf@polling.html
dmesg:
<5>
[ 2456.779388] owatch: Using watchdog device /dev/watchdog0
<5>
[ 2456.779456] owatch: Watchdog /dev/watchdog0 is a software watchdog
<5>
[ 2456.780498] owatch: timeout for /dev/watchdog0 set to 370 (requested 370)
...
<7>
[ 2705.896434] [drm:intel_power_well_disable [i915]] disabling power well 2
<7>
[ 2705.896455] [drm:drm_helper_hpd_irq_event] [CONNECTOR:89:DP-1] status updated from disconnected to disconnected
<7>
[ 2706.025517] [IGT] perf: executing
run.log
[04/72] skip: 1, pass: 2, fail: 1 |
FATAL: command execution failed
java.io.EOFException
...
Completed CI_IGT_test CI_DRM_3334@shard-glkb1 : FAILURE
CI_IGT_test runtime 339 seconds Marta Löfstedt@marta
said:https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3339/shard-glkb2/igt@kms_frontbuffer_tracking@fbc-suspend.html
dmesg:
<5>
[ 470.310793] owatch: Using watchdog device /dev/watchdog0
<5>
[ 470.310870] owatch: Watchdog /dev/watchdog0 is a software watchdog
<5>
[ 470.312856] owatch: timeout for /dev/watchdog0 set to 370 (requested 370)
...
<7>
[ 753.111566] [drm:intel_power_well_disable [i915]] disabling DC off
<7>
[ 753.111665] [drm:gen9_enable_dc5 [i915]] Enabling DC5
<7>
[ 753.111759] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 01
run.log:
[30/73] skip: 12, pass: 15, fail: 3 -
FATAL: command execution failed
java.io.EOFException
...
Completed CI_IGT_test CI_DRM_3339@shard-glkb2 : FAILURE
CI_IGT_test runtime 387 seconds Marta Löfstedt@marta
said:https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3339/shard-glkb5/igt@kms_flip@vblank-vs-modeset-suspend-interruptible.html
above is just weird, impossible to decide which pstore that belongs to this run, timestamstes indicates runtimes over 24 hours. Marta Löfstedt@marta
said:this is from same run as previous:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3339/shard-glkb5/igt@kms_busy@basic-flip-b.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3339/shard-glkb5/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html Marta Löfstedt@marta
said:https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3340/shard-glkb5/igt@pm_rpm@system-suspend.html
dmesg:
<5>
[ 1627.977850] owatch: Using watchdog device /dev/watchdog0
<5>
[ 1627.977929] owatch: Watchdog /dev/watchdog0 is a software watchdog
<5>
[ 1627.979299] owatch: timeout for /dev/watchdog0 set to 370 (requested 370)
...
<7>
[ 1713.425491] [drm:hsw_power_well_disable [i915]] power well 1 forced on (bios:1 driver:0 kvmr:0 debug:0)
<7>
[ 1713.425583] [drm:bxt_enable_dc9 [i915]] Enabling DC9
<7>
[ 1713.425623] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 08
<7>
[ 1713.425686] [drm:intel_runtime_suspend [i915]] Device suspended
run.log
[25/72] skip: 6, pass: 19 /
FATAL: command execution failed
java.io.EOFException
...
Completed CI_IGT_test CI_DRM_3340@shard-glkb5 : FAILURE
CI_IGT_test runtime 441 seconds Marta Löfstedt@marta
said:https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3341/shard-glkb5/igt@gem_workarounds@suspend-resume-fd.html
dmesg:
<5>
[ 25.911268] owatch: Using watchdog device /dev/watchdog0
<5>
[ 25.911583] owatch: Watchdog /dev/watchdog0 is a software watchdog
<5>
[ 25.912970] owatch: timeout for /dev/watchdog0 set to 370 (requested 370)
...
<7>
[ 65.759686] [IGT] gem_workarounds: executing
<7>
[ 65.782123] [IGT] gem_workarounds: starting subtest suspend-resume-fd
<6>
[ 65.828745] PM: suspend entry (deep)
run.log:
[11/72] skip: 3, pass: 7, fail: 1 \
FATAL: command execution failed
java.io.EOFException
...
Completed CI_IGT_test CI_DRM_3341@shard-glkb5 : FAILURE
CI_IGT_test runtime 548 seconds Marta Löfstedt@marta
said:https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3342/shard-glkb5/igt@drv_suspend@fence-restore-untiled.html
dmesg:
<5>
[ 27.357629] owatch: Using watchdog device /dev/watchdog0
<5>
[ 27.357971] owatch: Watchdog /dev/watchdog0 is a software watchdog
<5>
[ 27.359511] owatch: timeout for /dev/watchdog0 set to 370 (requested 370)
...
<7>
[ 147.392151] [IGT] gem_exec_blt: exiting, ret=0
<7>
[ 147.504680] [IGT] drv_suspend: executing
<7>
[ 147.529732] [IGT] drv_suspend: starting subtest fence-restore-untiled
run.log
[19/72] skip: 8, pass: 11 \
Build timed out (after 17 minutes). Marking the build as aborted.
FATAL: command execution failed
java.io.IOException
...
Completed CI_IGT_test CI_DRM_3342@shard-glkb5 : FAILURE
CI_IGT_test runtime 1028 seconds
Note: Jenkins timeout after 17 minutes. Last dmesg is at 147 seconds! Marta Löfstedt@marta
said:https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3343/shard-glkb5/igt@kms_cursor_crc@cursor-256x256-suspend.html
dmesg:
<5>
[ 469.226040] owatch: Using watchdog device /dev/watchdog0
<5>
[ 469.226117] owatch: Watchdog /dev/watchdog0 is a software watchdog
<5>
[ 469.227135] owatch: timeout for /dev/watchdog0 set to 370 (requested 370)
...
<7>
[ 722.630630] [IGT] kms_cursor_crc: starting subtest cursor-256x256-suspend
<7>
[ 722.630905] [drm:drm_mode_addfb2] [FB:98]
<7>
[ 722.645396] [drm:drm_mode_setcrtc] [CRTC:42:pipe A]
<7>
[ 722.645453] [drm:drm_mode_setcrtc] [CONNECTOR:76:eDP-1]
<7>
[ 722.661151] [drm:drm_mode_setcrtc] [CRTC:58:pipe B]
<7>
[ 722.661687] [drm:drm_mode_setcrtc] [CRTC:74:pipe C]
Folloed by "stray".
run.log:
[66/72] skip: 24, pass: 37, fail: 5 -
FATAL: command execution failed
java.io.EOFException
...
Completed CI_IGT_test CI_DRM_3343@shard-glkb5 : FAILURE
CI_IGT_test runtime 709 seconds