regression: waking up after hibernation causes [drm] *ERROR* Fault errors on pipe A: 0x00000080
Similar issues (based on the error message) include #6421 (closed), #6305 (closed), #2017 (closed).
Steps to reproduce the issue.
My bisection gives me:
- Good commit: e1a4bbb6e837d4f4605dffa9eccce722fc59b9cc. You might need to set CONFIG_DRM_I915_SELFTEST for it to build, or tweak a header (example).
- Bad commit: 39a2bd34c933b00f7c7ada923c212b3ff826fb5d. It looks like a simple refactor but requires a bit too much context, so I'm filing a bug before trying to dig more.
Note that I couldn't test these commits exactly as is: they are based on 5.16-rc2 and were merged in 5.18-rc1. I have other graphical issues around 5.16-rc2 that prevented me from reproducing the bug there, but the bisection clearly pointed to the merge commit 647bfd26bf054313305ea9c2c4a1c71f3bbfee63 as bad. Simply reverting the merge and applying its commits again (with fixes from the merge) was enough for me to finish the bisection.
Before attempting the bisection I was able to confirm 5.17.15 was the last working version and 5.18 the first bad version.
Reproducing is easy enough:
- Boot the laptop, start a graphical environment (currently I use Sway if it matters). It doesn't happen when using the console.
- Hibernate (configured to hibernate when the lid gets closed, no suspend-then-hibernate).
- Boot again and the screen will stay black; switch to another console and get spammed with:
[drm] *ERROR* Fault errors on pipe A: 0x00000080
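To tell a good boot from a bad one during bisection without reading the console, the message can be counted in the kernel log. This is only a minimal sketch: the function name count_pipe_faults is made up for illustration, and it assumes the log text is supplied on standard input (for example from dmesg or a saved dmesg file).

```shell
# Hypothetical helper (not part of any tool): count the
# "Fault errors on pipe" lines in a kernel log read from stdin.
count_pipe_faults() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when there are none, so "|| true" keeps the function's status at 0.
    grep -c 'Fault errors on pipe' || true
}

# Example use after resuming (may need root, depending on dmesg_restrict):
#   dmesg | count_pipe_faults
```

A result of 0 after resuming would suggest the kernel under test is good; anything else matches the behaviour described above.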
How often do the steps listed above trigger the issue? For example: always, 1 out of 3 times.
Always when waking up from hibernation after the bad commit.
Which platforms and features are affected (if you can).
Not sure about the 'features' part of this question; for the platforms, see below.
The following information about your system:
- System architecture: ("uname -m"):
x86_64
- Kernel version: ("uname -r"). Again, please consider using latest drm-tip from http://cgit.freedesktop.org/drm-tip: 5.16.0-rc5 when on top of the bad commit. I also tried drm-tip, but the machine doesn't stop when entering hibernation and the Nvidia driver doesn't build (so I disabled it). However, I do seem to be able to resume after a forced shutdown (known issue?).
- Linux distribution: NixOS, stable channel (22.05 at f922a077d8ca521a879fc656fee09e24f34e62a8) and unstable channel when I tried drm-tip.
- Machine or mother board model (use dmidecode if needed): dmidecode.txt
- Display connector: (such as HDMI, DP, eDP, ...): eDP
swaymsg --type get_outputs --pretty > get_outputs.txt
- A full dmesg with debug information and/or a GPU crash dump:
  - Under Sway (broken, see the dmesg after waking up):
    - before hibernating:
    - after hibernating:
  - Under the console (working, in case it's useful for diffing):
    - before hibernating:
    - after hibernating:
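For anyone diffing the Sway and console logs, the leading timestamps differ on every line and drown out the real differences. A small sketch of one way to strip them first, assuming the default dmesg "[seconds.microseconds]" prefix; the function name and the file names in the usage comment are placeholders, not the names of the attachments:

```shell
# Hypothetical helper: remove the leading "[   12.345678] " timestamp
# from each dmesg line read on stdin; other lines pass through unchanged.
strip_dmesg_timestamps() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] ?//'
}

# Example use (bash process substitution, placeholder file names):
#   diff <(strip_dmesg_timestamps < console-after.txt) \
#        <(strip_dmesg_timestamps < sway-after.txt)
```

Boot-to-boot noise will still show up in the diff, so this only narrows things down.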
Other attachments if relevant:
- screenshot or photo (a picture is worth a thousand words); I could take a video if it helps.
- output of "xrandr --verbose" for display mode issue; see swaymsg -t get_outputs above.
- intel_reg_dumper output (see the guide) and VBIOS dump (see the guide) for display issues; the guide references intel_gpu_dump but I couldn't find it. Likewise, intel_reg_dumper appears to have been renamed to intel_reg dump (see above)?
- for GPU hang, get the last batch buffer (see the guide); sometimes I do get small GPU hangs but I don't believe they're related to this issue (they also happen on good kernels IIRC). Just in case:
cat /sys/class/drm/card1/error > error.txt
- for suspend/resume problems, refer to the guide. I followed the hibernation section, hibernated from the console, and attached dumps.
Note I do not have access to https://01.org/linuxgraphics/documentation/intel-gpu-dump-tool-guide (linked from https://01.org/linuxgraphics/documentation/development/how-debug-suspend-resume-issues) but I used https://01.org/temp-linuxgraphics/documentation/intel-gpu-dump-tool-guide instead.