OLED Display on Thinkpad X1 Extreme Gen3 blank after suspend (Link Training failed)
I'm trying to debug an issue where an OLED 4K display on a Thinkpad X1 Extreme Gen3 remains blank, with the backlight off, after suspend. This happens most of the time, although occasionally the display does come back properly after suspend. It happens regardless of whether Xorg is running.
I can ssh into the machine, and it's responsive, but switching VTs (or anything else I've tried) doesn't cause the display to unblank.
Kernel: 5.12.0-rc2
Distro: Arch Linux
I've attached dmesg logs (suspend-bad.log) with DRM debugging enabled (might be overly verbose, so let me know how to generate better logs). The suspend itself happens at log line 4876.
I'm trying to get a log from when the display properly recovers from sleep to compare.
Let me know if there are any other debugging steps I can perform here.
Edit: Got a log of a successful wakeup. suspend-good.log. The suspend event happens at line 5330 in this log.
Edit 2: Found something suspicious in the logs. From a good resume:
kernel: i915 0000:00:02.0: [drm:intel_dp_link_train_phy [i915]] [CONNECTOR:95:eDP-1] Link Training passed at link rate = 540000, lane count = 4, at DPRX
From a bad resume:
kernel: i915 0000:00:02.0: [drm:intel_dp_link_train_phy [i915]] [CONNECTOR:95:eDP-1] Link Training failed at link rate = 540000, lane count = 4, at DPRX
It looks like the message immediately preceding that failure is:
i915 0000:00:02.0: [drm:intel_dp_link_train_phy [i915]] Max Voltage Swing reached
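For anyone who wants to compare their own logs the same way, here is a rough sketch of a script that pulls the link training results out of a dmesg capture. It assumes the message format is exactly as in the two lines quoted above; the function name and the tuple layout are just mine for illustration.

```python
import re
import sys

# Matches the i915 debug message quoted above, e.g.
# "[CONNECTOR:95:eDP-1] Link Training passed at link rate = 540000, lane count = 4, at DPRX"
LINK_TRAINING_RE = re.compile(
    r"\[CONNECTOR:(?P<id>\d+):(?P<name>[^\]]+)\] Link Training "
    r"(?P<result>passed|failed) at link rate = (?P<rate>\d+), "
    r"lane count = (?P<lanes>\d+)"
)


def link_training_results(lines):
    """Return (line number, connector, result, link rate, lane count) tuples."""
    results = []
    for lineno, line in enumerate(lines, start=1):
        match = LINK_TRAINING_RE.search(line)
        if match:
            results.append((
                lineno,
                match["name"],
                match["result"],
                int(match["rate"]),
                int(match["lanes"]),
            ))
    return results


if __name__ == "__main__":
    # Usage: python link_training.py suspend-good.log suspend-bad.log
    for path in sys.argv[1:]:
        with open(path, errors="replace") as log:
            for entry in link_training_results(log):
                print(path, *entry)
```

Running it over both suspend-good.log and suspend-bad.log should make it easy to see at which line the two resumes diverge.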
Edit 3: I think I discovered the issue, and was able to resolve it with a hacky patch. I'm reaching out to the intel-gfx mailing list to get some help turning it into a real fix.
I noticed the following during DP link training. A good resume and a bad resume appear to proceed in the same way until I see the following DPCD read:
From a bad resume:
[drm:drm_dp_dpcd_read [drm_kms_helper]] AUX A/DDI A/PHY A: 0x00202 AUX -> (ret= 6) 00 00 00 00 22 22
From a good resume:
[drm:drm_dp_dpcd_read [drm_kms_helper]] AUX A/DDI A/PHY A: 0x00202 AUX -> (ret= 6) 77 77 81 01 22 22
My thought here is that there's some kind of race or timing issue: we read the link status at DPCD 0x00202 before the panel has fully populated it. My workaround was to add an additional, unconditional sleep to the link training function. After applying that patch, everything has been working. i915-x1-extreme-gen3-hack.patch
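To make the two 0x00202 reads above easier to interpret, here is a small decoder sketch. It assumes the standard DPCD link status layout (0x202/0x203 per-lane status nibbles, 0x204 lane align status, 0x206/0x207 adjust requests) as defined in the kernel's DRM DP helper header; the function names and the dictionary layout are my own and not anything from i915.

```python
# Per-lane status bits in DPCD 0x202 (lanes 0-1) and 0x203 (lanes 2-3).
LANE_CR_DONE = 0x1
LANE_CHANNEL_EQ_DONE = 0x2
LANE_SYMBOL_LOCKED = 0x4
# Bit 0 of DPCD 0x204 (lane align status).
INTERLANE_ALIGN_DONE = 0x01


def parse_dump(text):
    """Turn a hex dump such as '77 77 81 01 22 22' into bytes."""
    return bytes(int(token, 16) for token in text.split())


def decode_link_status(status, lane_count=4):
    """Decode the six bytes read from DPCD 0x202 into per-lane flags."""
    if len(status) != 6:
        raise ValueError("expected 6 bytes starting at DPCD 0x202")
    lanes = []
    for lane in range(lane_count):
        shift = 4 * (lane % 2)  # odd lanes live in the high nibble
        nibble = (status[lane // 2] >> shift) & 0xF
        # Bytes 4 and 5 are 0x206/0x207: adjust requests for lanes 0-1 and 2-3.
        adjust = (status[4 + lane // 2] >> shift) & 0xF
        lanes.append({
            "lane": lane,
            "cr_done": bool(nibble & LANE_CR_DONE),
            "eq_done": bool(nibble & LANE_CHANNEL_EQ_DONE),
            "symbol_locked": bool(nibble & LANE_SYMBOL_LOCKED),
            "requested_vswing": adjust & 0x3,
            "requested_preemph": (adjust >> 2) & 0x3,
        })
    return {
        "lanes": lanes,
        "interlane_align_done": bool(status[2] & INTERLANE_ALIGN_DONE),
    }


if __name__ == "__main__":
    dumps = (("bad", "00 00 00 00 22 22"), ("good", "77 77 81 01 22 22"))
    for label, dump in dumps:
        decoded = decode_link_status(parse_dump(dump))
        for lane in decoded["lanes"]:
            print(label, lane)
        print(label, "interlane_align_done =", decoded["interlane_align_done"])
```

Decoded this way, the bad read shows no lane with clock recovery done and no inter-lane alignment, while the good read shows all four lanes with clock recovery, channel EQ, and symbol lock, plus alignment done. The adjust request bytes are identical in both reads.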