I can reproduce the issue repeatedly, but I cannot capture the page-fault crash in dmesg: the system resets before anything is logged. So I am not sure whether this crash is the same as the page fault in ACPI idle.
As far as I can see, ACPI idle GPF crash logs are available for CI_DRM_5180 and CI_DRM_5184.
Where can I get the vmlinux kernel object file and the System.map file for CI_DRM_5180 and CI_DRM_5184? They would help to debug the page fault in the acpi_idle_enter code.
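For reference, a minimal sketch of how a System.map file can be used to turn a raw faulting address into symbol+offset form. The function names and the sample addresses below are made up for illustration and do not come from the CI_DRM_5180 or CI_DRM_5184 builds.

```python
import bisect

def load_system_map(text):
    """Parse System.map text into a sorted list of (address, name) pairs."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _sym_type, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols

def resolve(symbols, address):
    """Return 'symbol+0xoffset' for the nearest symbol at or below address."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, address) - 1
    if i < 0:
        return None  # address lies below the first symbol
    base, name = symbols[i]
    return f"{name}+{hex(address - base)}"

# Hypothetical System.map excerpt; the addresses are invented.
SAMPLE_MAP = """\
ffffffff81000000 T _text
ffffffff815a1230 t acpi_idle_enter
ffffffff815a1400 t acpi_idle_enter_s2idle
"""

if __name__ == "__main__":
    table = load_system_map(SAMPLE_MAP)
    print(resolve(table, 0xffffffff815a1250))
```

This only works directly when the addresses in the log are not randomized; with KASLR enabled, raw addresses are shifted and will not match System.map without correcting for the offset.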
The ICL CPU ID family patches are not yet merged into the upstream mainline kernel, so ICL hardware still uses the ACPI idle driver. This ACPI idle page fault will not occur with the intel_idle driver, so the issue should be fixed once the ICL CPU ID patches are public.
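To confirm which idle driver a given machine is actually using, the kernel exposes it in sysfs at /sys/devices/system/cpu/cpuidle/current_driver. A small sketch of reading it follows; the helper name and the sysfs_root parameter are only there for illustration and testing.

```python
from pathlib import Path

def current_cpuidle_driver(sysfs_root="/sys"):
    """Return the active cpuidle driver name, or None if it cannot be read."""
    path = Path(sysfs_root) / "devices/system/cpu/cpuidle/current_driver"
    try:
        return path.read_text().strip()
    except OSError:
        # File missing or unreadable, e.g. cpuidle not available here.
        return None

if __name__ == "__main__":
    print(current_cpuidle_driver())
```

On an affected ICL machine this would be expected to report acpi_idle rather than intel_idle, assuming the analysis above is right.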
Setting the priority to highest, as the failure is seen in BAT.
(In reply to CI Bug Log from comment 12)
> A CI Bug Log filter associated to this bug has been updated:
>
> shard-iclb6 fi-icl-u3 shard-iclb1: igt@pm_rpm@* - incomplete
> fi-icl-y shard-iclb6 fi-icl-u3 shard-iclb1: igt@pm_rpm@* - incomplete
>
> New failures caught by the filter:
>
> * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5590/fi-icl-y/igt@pm_rpm@module-reload.html
No information is available in this log. Since the result is incomplete, it is either a kernel crash or a hang, but the logs contain no panic or hang information.
Anshuman, have there been any recent changes to address this bug? Last seen in drmtip_244 (5 days, 11 hours / 87 runs ago) on fi-icl-y.
Apart from fi-icl-y, it was last seen on the shards in IGT_4878_full (1 week, 6 days / 168 runs ago).
Seems like the common denominator is that i915_gem_wait_for_idle() hangs (or never completes), which triggers a hang, which reboots the machine because CI sets panic=1.
There is an assumption that a page fault happening in the ACPI idle driver might be responsible for this, but I'm CC:ing Francesco to verify that the logic is sound with his engineers, while waiting for the ACPI driver fixes to be ready and landed in Linux.
Daniel Vetter suggests that we should write a patch for core-for-CI that disables the ACPI driver for ICL only. Any takers?