I could not repro this issue locally: igt@perf_pmu@frequency always passes on a DG2. In CI the failure is seen occasionally on a DG2 system, though even on that system it was not seen for a while, resulting in the bug being closed and later reopened.
Also, in these local runs the freq measured by PMU using a command such as:
sudo ./perf stat -e i915_0000_4d_00.0/requested-frequency/ -I 1000
matches the set freq as long as there are no idle periods (for the effect of idle periods on the divergence of PMU and sysfs freqs see #7025 (closed)).
The failure in igt@perf_pmu@frequency in this bug is as follows: the test sets min == max freq, runs a spinner and then compares the requested freq measured by PMU against the min == max freq that was set. The test fails when the PMU measured freq is outside a tolerance of the set freq.
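To make the pass/fail condition concrete, here is a minimal sketch of the tolerance check, mirroring the shape of the failed assertion reported by CI. The function name and the 5% tolerance used in the example are assumptions for illustration; the actual test is written in C and may use a different tolerance.

```python
def within_tolerance(measured_mhz: float, expected_mhz: float, tolerance: float) -> bool:
    """Return True if measured is within +/- tolerance (a fraction) of expected.

    Same shape as the assertion in the CI log:
    measured <= (1.0 + tolerance) * expected and
    measured >= (1.0 - tolerance) * expected
    """
    upper = (1.0 + tolerance) * expected_mhz
    lower = (1.0 - tolerance) * expected_mhz
    return lower <= measured_mhz <= upper


if __name__ == "__main__":
    # Hypothetical numbers: min == max freq set to 1000 MHz, 5% tolerance assumed.
    print(within_tolerance(1020.0, 1000.0, 0.05))
    print(within_tolerance(1100.0, 1000.0, 0.05))
```

With these assumed numbers, a PMU reading of 1020 MHz is accepted and a reading of 1100 MHz is rejected.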
I am considering an enhancement to the test which should minimize such sporadic failures. Note that the PMU measured freq is compared against the set freq, not against an "actual" freq, whatever this "actual" freq might be. So my proposal would be to estimate this "actual" freq by sampling the sysfs over a period (say every 5 ms, which is the same as the sampling period used by PMU) and then compare the PMU measured freq against the freq measured by sampling the sysfs (rather than against the freq set).
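A rough sketch of the sampling idea, assuming the requested freq can be read through some callable: poll it at a fixed interval while the spinner runs and average the samples, then use that average as the reference for the PMU value. All names here (`mean_sampled_freq`, `read_sysfs_mhz`, the injected `sleep`) are hypothetical helpers, not IGT functions, and the sysfs path is deliberately left as a parameter.

```python
import time
from statistics import fmean
from typing import Callable


def read_sysfs_mhz(path: str) -> float:
    """Read a single frequency value (in MHz) from a sysfs-style text file."""
    with open(path) as f:
        return float(f.read().strip())


def mean_sampled_freq(
    read_freq_mhz: Callable[[], float],
    n_samples: int,
    interval_s: float = 0.005,  # 5 ms, the sampling period mentioned above
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Average n_samples readings taken interval_s apart."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    samples = []
    for i in range(n_samples):
        samples.append(float(read_freq_mhz()))
        if i + 1 < n_samples:
            sleep(interval_s)  # no sleep needed after the last sample
    return fmean(samples)
```

The reader is passed in as a callable so that the same loop works with a real sysfs file (for example `lambda: read_sysfs_mhz(path)`) or with a fake source in a unit test. The PMU measured freq would then be compared against the returned mean rather than against the set freq.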
I am planning to submit a patch based on this approach for this bug.
Rather than sampling the sysfs every 5 ms, the patch samples the sysfs just once, after the PMU measurement but while the spinner is still running, to estimate the requested freq.
The root cause of this issue, as well as of #6786 (closed), has been traced to the breakage of the i915 freq ABI caused by enabling SLPC efficient freq in 95ccf312a1e4f. That ABI breakage needs to be resolved before any fixes for these issues can be merged.
Equivalent query: runconfig_tag IS IN ["DRM-TIP"] AND machine_tag IS IN ["DG1", "DG2", "ADL-P"] AND ((testsuite_name = "IGT" AND test_name IS IN ["igt@perf_pmu@frequency"])) AND ((testsuite_name = "IGT" AND status_name IS IN ["fail"])) AND stderr ~= 'Failed assertion: \(double\)\(min\[0\]\) <= \(1.0 \+ \(tolerance\)\) \* \(double\)\(min_freq\) && \(double\)\(min\[0\]\) >= \(1.0 - \(tolerance\)\) \* \(double\)\(min_freq\)'
Equivalent query: runconfig_tag IS IN ["DRM-TIP"] AND machine_tag IS IN ["DG1", "DG2", "RPL_S", "ADL-P"] AND ((testsuite_name = "IGT" AND test_name IS IN ["igt@perf_pmu@frequency"])) AND ((testsuite_name = "IGT" AND status_name IS IN ["fail"])) AND stderr ~= 'Failed assertion: \(double\)\(min\[0\]\) <= \(1.0 \+ \(tolerance\)\) \* \(double\)\(min_freq\) && \(double\)\(min\[0\]\) >= \(1.0 - \(tolerance\)\) \* \(double\)\(min_freq\)'