Equivalent query: runconfig_tag IS IN ["DRM-TIP"] AND machine_tag IS IN ["ARL-S", "MTL-P"] AND ((testsuite_name = "IGT" AND test_name IS IN ["igt@i915_selftest@live@hangcheck", "igt@i915_selftest@live@gt_mocs"])) AND ((testsuite_name = "IGT" AND status_name IS IN ["dmesg-warn"])) AND dmesg ~= '\*ERROR\* GT0: GUC: Bad context sched_state 0x0, ctx_id .*'
Equivalent query: runconfig_tag IS IN ["DRM-TIP"] AND machine_tag IS IN ["ARL-S", "MTL-P", "ARL-H", "DG2"] AND ((testsuite_name = "IGT" AND test_name IS IN ["igt@i915_selftest@live@hangcheck", "igt@i915_selftest@live@gt_mocs", "igt@i915_selftest@live@workarounds"])) AND ((testsuite_name = "IGT" AND status_name IS IN ["dmesg-warn"])) AND dmesg ~= '\*ERROR\* GT0: GUC: Bad context sched_state 0x0, ctx_id .*'
GuC to host communication is interrupt driven; the handling has 3
parts: interrupt context, tasklet and request queue worker.
During GuC reset prepare, the interrupt is disabled before the context
destroy steps start. The IRQ and the worker are flushed to finish any
in-progress message handling. The tasklet flush is missing, which
might cause 2 race conditions:
1. The tasklet runs after the IRQ is flushed and adds a request to the
   queue after the worker flush has started. This causes unexpected G2H
   message request processing while the reset prepare code has already
   destroyed the context, so an error is reported about bad context
   state. (This issue and #12303 (closed))
2. The tasklet runs after intel_guc_submission_reset_prepare and
   ct_try_receive_message starts to run, while intel_uc_reset_prepare
   has already finished the GuC sanitize and set ct->enable to false.
   This causes a warning about incorrect ct->enable state.
   (#12439 (closed))
Add the missing tasklet flush to flush all 3 parts.
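
The first race can be illustrated with a small user-space model. This is only a sketch under stated assumptions: it is not the i915 code, every name in it (g2h_model, flush_irq, run_tasklet, run_worker, model_reset_prepare) is hypothetical, and it is single-threaded, so the "late" tasklet run is simulated by calling the tasklet again after the context is destroyed. It counts how many requests the worker handles after context destruction, with and without a tasklet flush between the IRQ flush and the worker flush.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are i915 symbols. */
struct g2h_model {
    bool irq_pending;       /* message waiting in interrupt context */
    bool tasklet_scheduled; /* tasklet queued by the IRQ handler */
    int  queued_requests;   /* requests waiting for the worker */
    bool context_alive;     /* false once reset prepare destroyed it */
    int  stale_requests;    /* requests handled after context destroy */
};

/* Part 1: the IRQ handler hands the message to the tasklet. */
static void flush_irq(struct g2h_model *m)
{
    if (m->irq_pending) {
        m->irq_pending = false;
        m->tasklet_scheduled = true;
    }
}

/* Part 2: the tasklet turns the message into a worker request. */
static void run_tasklet(struct g2h_model *m)
{
    if (m->tasklet_scheduled) {
        m->tasklet_scheduled = false;
        m->queued_requests++;
    }
}

/* Part 3: the worker processes every queued request. */
static void run_worker(struct g2h_model *m)
{
    while (m->queued_requests > 0) {
        m->queued_requests--;
        if (!m->context_alive)
            m->stale_requests++; /* the "bad context state" case */
    }
}

/*
 * Model of reset prepare with one G2H message in flight.
 * Returns the number of requests processed after the context
 * was destroyed.
 */
int model_reset_prepare(bool flush_tasklet)
{
    struct g2h_model m = { .irq_pending = true, .context_alive = true };

    /* The interrupt is disabled here: no new message is ever raised. */
    flush_irq(&m);
    if (flush_tasklet)
        run_tasklet(&m);     /* the flush this model adds */
    run_worker(&m);
    m.context_alive = false; /* contexts destroyed */

    /* Anything still pending runs late, as in the first race. */
    run_tasklet(&m);
    run_worker(&m);
    return m.stale_requests;
}
```

In this model, model_reset_prepare(false) leaves the tasklet pending across the worker flush, so one request is handled against a destroyed context, while model_reset_prepare(true) drains all 3 parts in order and leaves nothing to run late.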