igt@gem_sync@basic-all - fail - Timed out waiting for children
This started happening in CI_DRM_12703, so it seems to be due to this commit:
commit 4c7b9344cadbed477372c75e3c0a8cfd542f5990
Author: Ashutosh Dixit <ashutosh.dixit@intel.com>
Date:   Fri Feb 3 07:53:09 2023 -0800

    drm/i915/hwmon: Enable PL1 power limit

    Previous documentation suggested that PL1 power limit is always
    enabled. However we now find this not to be the case on some
    platforms (such as ATSM). Therefore enable PL1 power limit during
    hwmon initialization.

    Bspec: 51864

    v2: Add Bspec reference (Gwan-gyeong)
    v3: Add Fixes tag

    Fixes: 99f55efb79114 ("drm/i915/hwmon: Power PL1 limit and TDP setting")
    Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
    Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
    Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230203155309.1042297-1-ashutosh.dixit@intel.com
    (cherry picked from commit 0349c41b05968befaffa5fbb7e73d0ee6004f610)
    Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The exact cause is not clear at this point, though. One thing to check would be the IFWI version on this system and whether it is the recommended IFWI. It seems this system was added to CI only recently?
I have reproduced the bug locally and root-caused it. The issue is that firmware is setting the default PL1 power limit to 0, which implies HW will work with minimum power and therefore at the lowest effective frequency. This means all workloads will run slower.
For example, igt@gem_sync@basic-all has a 12-second timeout and was previously completing in 3 seconds. After enabling the PL1 limit (with the PL1 limit set to 0) it now takes 15 seconds to complete and therefore times out.
Other operations such as GuC load are also showing similar timeouts, rendering the ATSM platform unusable.
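For anyone wanting to check whether a system has picked up a PL1 limit of 0, here is a minimal sketch. It assumes the i915 hwmon interface exposes the PL1 limit as a `power1_max` file (in microwatts) under `/sys/class/drm/card*/device/hwmon/hwmon*/`; the helper name `find_zero_pl1` is made up for illustration, and the exact sysfs layout should be confirmed on the target kernel.

```python
from pathlib import Path


def find_zero_pl1(sysfs_root="/sys/class/drm"):
    """Return paths of hwmon power1_max files whose value is 0.

    Assumption: power1_max holds the PL1 limit in microwatts.
    """
    zero = []
    pattern = "card*/device/hwmon/hwmon*/power1_max"
    for f in sorted(Path(sysfs_root).glob(pattern)):
        try:
            value = int(f.read_text().strip())
        except (OSError, ValueError):
            # Unreadable or non-numeric file: skip rather than guess.
            continue
        if value == 0:
            zero.append(str(f))
    return zero


if __name__ == "__main__":
    for path in find_zero_pl1():
        print(f"PL1 limit is 0 at {path}")
```

An empty result only means no zero-valued `power1_max` file was found under the assumed paths; it does not prove the limit is sane.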
A CI Bug Log filter associated to this bug has been updated by Gundlakarthik.
Description: DG2 ATS_M: igt@gem_sync@basic-all - fail - Timed out waiting for children
Equivalent query: runconfig_tag IS IN ["DRM-TIP"] AND machine_tag IS IN ["DG2", "ATSM-HW"] AND ((testsuite_name = "IGT" AND test_name IS IN ["igt@gem_sync@basic-store-all", "igt@gem_sync@basic-each", "igt@gem_sync@basic-all"])) AND ((testsuite_name = "IGT" AND status_name IS IN ["fail"])) AND stderr ~= 'Timed out waiting for children'
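To make the filter logic easier to follow, the query above can be read as a plain predicate over a single test result. This is only a sketch: the record layout (a dict with `runconfig_tags`, `machine_tags`, `testsuite_name`, `test_name`, `status_name`, `stderr`) is hypothetical, and it assumes `IS IN` means "at least one tag is in the list" and `~=` means a regular-expression search. It is not the CI Bug Log implementation.

```python
import re

FILTER_TESTS = {
    "igt@gem_sync@basic-store-all",
    "igt@gem_sync@basic-each",
    "igt@gem_sync@basic-all",
}
FILTER_MACHINE_TAGS = {"DG2", "ATSM-HW"}
STDERR_PATTERN = re.compile(r"Timed out waiting for children")


def matches_filter(result):
    """Return True if a (hypothetical) result record matches the filter."""
    return (
        "DRM-TIP" in result["runconfig_tags"]
        and bool(FILTER_MACHINE_TAGS & set(result["machine_tags"]))
        and result["testsuite_name"] == "IGT"
        and result["test_name"] in FILTER_TESTS
        and result["status_name"] == "fail"
        and STDERR_PATTERN.search(result["stderr"]) is not None
    )
```

A failing igt@gem_sync@basic-all run on a DG2 machine with the timeout message in stderr would match; the same run with status "pass", or on a machine without either tag, would not.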