[drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
Description of problem:
Computer tends to freeze from time to time for short periods,
especially when starting Steam.
It is impossible to play games, especially "Nier: Automata".
The freezes seem to correlate with listing appearing in the logs:
[ 2700.822335] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[ 2700.822393] [drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
[ 2702.251273] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[ 2702.251330] [drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
[ 2703.570188] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[ 2703.570248] [drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
[ 2705.083146] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[ 2705.083204] [drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
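To check how closely the freezes track these messages, the gaps between consecutive failures can be pulled out of a saved dmesg. This is only a rough sketch: the function names are made up for illustration, and the regular expression assumes the exact message format shown in the listing above.

```python
import re

# Matches a dmesg timestamp followed by the intel_dp_detect LSPCON failure line.
# Assumption: the message text is exactly as it appears in the listing above.
LSPCON_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+\[drm:intel_dp_detect \[i915\]\] \*ERROR\* LSPCON init failed"
)

def lspcon_failure_times(dmesg_text):
    """Return the timestamps (seconds since boot) of each LSPCON init failure."""
    return [float(m.group(1)) for m in LSPCON_RE.finditer(dmesg_text)]

def intervals(times):
    """Return the gaps in seconds between consecutive timestamps."""
    return [round(b - a, 3) for a, b in zip(times, times[1:])]

sample = """\
[ 2700.822335] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[ 2700.822393] [drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
[ 2702.251273] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[ 2702.251330] [drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
[ 2703.570188] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[ 2703.570248] [drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
[ 2705.083146] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[ 2705.083204] [drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
"""

if __name__ == "__main__":
    times = lspcon_failure_times(sample)
    print(len(times), "failures, gaps in seconds:", intervals(times))
```

On the sample above the failures come roughly every 1.3 to 1.5 seconds, which would fit a periodic re-detect rather than a one-off probe at boot.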
Version-Release number of selected component (if applicable):
Kernel: 5.14.14-200.fc34.x86_64
How reproducible:
On system startup, this listing appears:
[ 0.000000] Command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.14.14-200.fc34.x86_64 root=UUID=a5ac22bc-1581-415d-88f5-e5de3d96fdec ro rootflags=subvol=root rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
[ 0.175505] Kernel command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.14.14-200.fc34.x86_64 root=UUID=a5ac22bc-1581-415d-88f5-e5de3d96fdec ro rootflags=subvol=root rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
[ 2.110669] i915 0000:00:02.0: [drm] VT-d active for gfx access
[ 2.110673] fb0: switching to inteldrmfb from EFI VGA
[ 2.114904] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)
[ 2.134746] i915 0000:00:02.0: [drm] [ENCODER:102:DDI B/PHY B] is disabled/in DSI mode with an ungated DDI clock, gate it
[ 2.134748] i915 0000:00:02.0: [drm] [ENCODER:118:DDI C/PHY C] is disabled/in DSI mode with an ungated DDI clock, gate it
[ 2.134750] i915 0000:00:02.0: [drm] [ENCODER:128:DDI D/PHY D] is disabled/in DSI mode with an ungated DDI clock, gate it
[ 3.963368] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[ 3.963462] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[ 5.164189] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[ 5.164272] [drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
It can also be triggered by running the Steam application.
Actual results:
System freezes briefly from time to time.
Games are unplayable.
Expected results:
System does not freeze.
Games are playable.
Additional info:
The games being unplayable may be related to something else.
However, the brief freezes are certainly related to drm.
I have the same error on Dell Precision 7530 with Intel Xeon E-2186M (Coffee Lake)
[drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
[drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
[drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
[drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
[drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
[drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[drm:intel_dp_detect [i915]] *ERROR* LSPCON init failed on port D
On the Ubuntu kernel (or packages) it works fine:
Linux ThinkPad-P15-Gen-1 5.10.0-1053-oem #55-Ubuntu SMP Sun Dec 12 01:58:07 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
I have this problem on Linux laptop.mydomain.net 6.5.6 #1 SMP PREEMPT_DYNAMIC Sat Oct 7 23:41:01 CST 2023 x86_64 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz GenuineIntel GNU/Linux.
I don't think I have rights to reopen this, but it is definitely present on my machine.
I do not see the problem any more. Which does not mean that the problem does not exist, obviously, and I have only tested it for a few hours, but wine programs seem to run normally, and I do not see errors in dmesg any more.
UPD1: no, sorry, the problem is still there.
The errors are not in dmesg, and in dmesg nothing now indicates an error, but the freezes are still present, and wine programs still crash/fail to start.
Wine's log says:
Oct 11 11:09:31 0114:fixme:ntdll:NtQuerySystemInformation info_class SYSTEM_PERFORMANCE_INFORMATION
Oct 11 11:09:31 0114:fixme:wbemprox:client_security_SetBlanket 7A680F58, 00D28748, 10, 0, (null), 3, 3, 00000000, 0
Oct 11 11:09:31 0114:fixme:wbemprox:client_security_Release 7A680F58
Oct 11 11:09:31 0114:err:d3d:wined3d_caps_gl_ctx_create Failed to create a window.
Oct 11 11:09:31 0114:err:d3d:wined3d_adapter_gl_init Failed to get a GL context for adapter 03EDF068.
Oct 11 11:09:32 00f8:fixme:ras:RasEnumEntriesW (00000000,(null),03F20C30,046AEE00,03F124E4),stub!
Oct 11 11:09:32 00e8:fixme:ras:RasConnectionNotificationW (FFFFFFFF,00000184,0x00000003),stub!
Oct 11 11:09:32 00e8:fixme:ras:RasConnectionNotificationW (FFFFFFFF,00000184,0x00000003),stub!
Oct 11 11:09:32 00f8:fixme:ras:RasEnumEntriesW (00000000,(null),03F20C30,046AEBF4,03F124E4),stub!
Oct 11 11:09:32 00e8:fixme:ras:RasConnectionNotificationW (FFFFFFFF,00000184,0x00000003),stub!
Oct 11 11:09:32 00e4:fixme:ras:RasEnumEntriesW (00000000,(null),03F20C30,03C8F474,03E0B584),stub!
Oct 11 11:09:32 00e4:fixme:ras:RasEnumEntriesW (00000000,(null),03F20C30,03C8F300,03F10004),stub!
Oct 11 11:09:32 00e8:fixme:ras:RasConnectionNotificationW (FFFFFFFF,00000184,0x00000003),stub!
Oct 11 11:09:32 00e8:fixme:ras:RasConnectionNotificationW (FFFFFFFF,00000184,0x00000003),stub!
Oct 11 11:09:32 00e4:fixme:ras:RasEnumEntriesW (00000000,(null),03F20C30,03C8F4D0,03F119A4),stub!
Oct 11 11:09:32 00e4:fixme:ras:RasEnumEntriesW (00000000,(null),03F20C30,03C8F730,03E0BC2C),stub!
Oct 11 11:09:32 00e8:fixme:ras:RasConnectionNotificationW (FFFFFFFF,00000184,0x00000003),stub!
Oct 11 11:09:32 00e8:fixme:ras:RasConnectionNotificationW (FFFFFFFF,00000184,0x00000003),stub!
Oct 11 11:09:32 00e4:fixme:ras:RasEnumEntriesW (00000000,(null),03F20C30,03C8F730,03E0BC2C),stub!
Oct 11 11:09:32 00e4:fixme:ras:RasEnumEntriesW (00000000,(null),03F20C30,03C8F730,03F144D4),stub!
Oct 11 11:09:32 00e8:fixme:ras:RasConnectionNotificationW (FFFFFFFF,00000184,0x00000003),stub!
Oct 11 11:09:32 00e8:fixme:ras:RasConnectionNotificationW (FFFFFFFF,00000184,0x00000003),stub!
Oct 11 11:09:32 00e4:fixme:ras:RasEnumEntriesW (00000000,(null),03F20C30,03C8F730,03F144D4),stub!
Oct 11 11:09:32 00e4:fixme:ras:RasEnumEntriesW (00000000,(null),03F31390,03C8F730,03E0B9C4),stub!
Oct 11 11:09:32 00e8:fixme:ras:RasConnectionNotificationW (FFFFFFFF,00000184,0x00000003),stub!
Oct 11 11:09:32 00e4:fixme:ras:RasEnumEntriesW (00000000,(null),03F21388,03C8F730,03F1284C),stub!
Oct 11 11:09:32 00e8:fixme:ras:RasConnectionNotificationW (FFFFFFFF,00000184,0x00000003),stub!
Oct 11 11:09:33 [1011/110931:FATAL:temp_window.cc(43)] Check failed: hwnd_.
Oct 11 11:09:33 Backtrace:
Oct 11 11:09:33 (No symbol) [0x058B0701]
Oct 11 11:09:33 (No symbol) [0x057C50CF]
Oct 11 11:09:33 (No symbol) [0x057AB7D1]
Oct 11 11:09:33 (No symbol) [0x05081A90]
Oct 11 11:09:33 (No symbol) [0x1095CDBD]
Oct 11 11:09:33 (No symbol) [0x10258D77]
Oct 11 11:09:33 (No symbol) [0x1025672B]
Oct 11 11:09:33 (No symbol) [0x10973E55]
Oct 11 11:09:33 (No symbol) [0x10973CFA]
Oct 11 11:09:33 (No symbol) [0x1096F53E]
Oct 11 11:09:33 (No symbol) [0x1096FB6D]
Oct 11 11:09:33 (No symbol) [0x1029C2A5]
Oct 11 11:09:33 (No symbol) [0x10259676]
Oct 11 11:09:33 (No symbol) [0x10976C24]
Oct 11 11:09:33 WINPROC_wrapper [0x7E7618FC+28]
Oct 11 11:09:33 (No symbol) [0x7E761DF6]
Oct 11 11:09:33 (No symbol) [0x7E763D09]
Oct 11 11:09:33 User32CallWindowProc [0x7E763EBF+207]
Oct 11 11:09:33 (No symbol) [0x7EF554B5]
Oct 11 11:09:33 (No symbol) [0xDEADBABE]
Oct 11 11:09:33
The problem appears "semi-randomly", with high probability. On some reboots the wine programs manage to start successfully, and no freeze is observed later on.
@oneacik These errors seem to come from somewhere outside of the i915 driver and are not related to the PCON issue reported. Please report the non-i915 issues in the appropriate channel.
How can this issue be debugged further? The freezes consistently coincide with those error messages, and even though the messages are not present with the patch, the root cause still seems to be there.
Do you mean that the driver is behaving correctly, but its legitimate behaviour is triggering errors in other subsystems?
[ 34.158216] [drm:lspcon_init [i915]] No LSPCON detected, found unknown
[ 34.158324] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
In that setup the issue seemed to be that no external display was connected, and the probing for LSPCON was giving repeated "No LSPCON detected" messages.
Currently the lspcon init and lspcon probe are intermingled, and we seem to be getting the error message for init failed even when there is nothing connected.
I am not sure whether the freeze is due to repeated probing and detection failure or whether there is something else in your case.
CONFIG_DRM_I915=m
CONFIG_DRM_I915_FORCE_PROBE=""
CONFIG_DRM_I915_CAPTURE_ERROR=y
CONFIG_DRM_I915_COMPRESS_ERROR=y
CONFIG_DRM_I915_USERPTR=y
CONFIG_DRM_I915_GVT_KVMGT=m
CONFIG_DRM_I915_PXP=y
# drm/i915 Debugging
CONFIG_DRM_I915_WERROR=y
CONFIG_DRM_I915_DEBUG=y
CONFIG_DRM_I915_DEBUG_MMIO=y
CONFIG_DRM_I915_DEBUG_GEM=y
CONFIG_DRM_I915_DEBUG_GEM_ONCE=y
CONFIG_DRM_I915_ERRLOG_GEM=y
CONFIG_DRM_I915_TRACE_GEM=y
CONFIG_DRM_I915_TRACE_GTT=y
CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS=y
# CONFIG_DRM_I915_SW_FENCE_CHECK_DAG is not set
# CONFIG_DRM_I915_DEBUG_GUC is not set
CONFIG_DRM_I915_SELFTEST=y
# CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS is not set
# CONFIG_DRM_I915_DEBUG_VBLANK_EVADE is not set
CONFIG_DRM_I915_DEBUG_RUNTIME_PM=y
# end of drm/i915 Debugging
# drm/i915 Profile Guided Optimisation
CONFIG_DRM_I915_REQUEST_TIMEOUT=20000
CONFIG_DRM_I915_FENCE_TIMEOUT=10000
CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND=250
CONFIG_DRM_I915_HEARTBEAT_INTERVAL=2500
CONFIG_DRM_I915_PREEMPT_TIMEOUT=640
CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE=7500
CONFIG_DRM_I915_MAX_REQUEST_BUSYWAIT=8000
CONFIG_DRM_I915_STOP_TIMEOUT=100
CONFIG_DRM_I915_TIMESLICE_DURATION=1
# end of drm/i915 Profile Guided Optimisation
CONFIG_DRM_I915_GVT=y
CONFIG_SND_HDA_I915=y
and the problem disappeared. I added CONFIG_DRM_I915_DEBUG=y and everything that depends on it.
Anyway, the log file is attached, and now the debug too.
When I have time, I will try to turn off those debug options and see which one causes the freezes to disappear.
This really looks like a race condition though. The debug build just makes the correct "thread" win.
Thanks @Lockywolf for the logs. Thanks @jani for the config help.
As I was suspecting before, the setup doesn't have anything connected to the LSPCON port.
So the driver probes the LSPCON port a couple of times and then moves on.
My patches just separate the LSPCON init and probe calls, to avoid unnecessary ERROR messages when there is nothing connected to the LSPCON port.
I am not sure whether it was this change that helped.
To check if the freeze is indeed due to probing on LSPCON port multiple times, I can perhaps prepare a patch to just ignore LSPCON and see if it helps.
Okay, so I have done some more config bisecting, and the freezes seem to be removed by one debug setting:
# diff -u config.2024-01-16.freezes.lwf config.2024-01-16.no-freezes.lwf
--- config.2024-01-16.freezes.lwf 2024-01-26 11:43:32.252314574 +0800
+++ config.2024-01-16.no-freezes.lwf 2024-01-26 13:10:17.801149196 +0800
@@ -6568,7 +6568,7 @@
 CONFIG_DRM_I915_DEBUG_GEM=y
 CONFIG_DRM_I915_DEBUG_GEM_ONCE=y
 CONFIG_DRM_I915_ERRLOG_GEM=y
-# CONFIG_DRM_I915_TRACE_GEM is not set
+CONFIG_DRM_I915_TRACE_GEM=y
 # CONFIG_DRM_I915_TRACE_GTT is not set
 CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS=y
 # CONFIG_DRM_I915_SW_FENCE_CHECK_DAG is not set
Whatever that "TRACE_GEM" is, it seemingly forces some kind of ordering in the kernel, so that the deadlock either disappears or becomes invisible.
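For anyone repeating this kind of config bisection, comparing two kernel configs symbol by symbol can be easier to read than a raw diff. Below is a minimal sketch, assuming the usual .config syntax ("CONFIG_FOO=value" and "# CONFIG_FOO is not set"). The function names are hypothetical, and a symbol that is absent is treated the same as one that is not set.

```python
def parse_kconfig(text):
    """Map each CONFIG_ symbol to its value; 'is not set' lines map to None."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # Drop the leading "# " and the trailing " is not set".
            options[line[2:-len(suffix)]] = None
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options

def config_diff(old_text, new_text):
    """Return {symbol: (old_value, new_value)} for every symbol that differs."""
    old, new = parse_kconfig(old_text), parse_kconfig(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }

# The two fragments from the diff above, reduced to the relevant lines.
freezes = """\
CONFIG_DRM_I915_ERRLOG_GEM=y
# CONFIG_DRM_I915_TRACE_GEM is not set
# CONFIG_DRM_I915_TRACE_GTT is not set
"""
no_freezes = """\
CONFIG_DRM_I915_ERRLOG_GEM=y
CONFIG_DRM_I915_TRACE_GEM=y
# CONFIG_DRM_I915_TRACE_GTT is not set
"""

if __name__ == "__main__":
    for name, (before, after) in config_diff(freezes, no_freezes).items():
        print(name, before, "->", after)
```

Halving the set of differing symbols between a "freezes" and a "no freezes" config on each rebuild is what narrows things down to a single option like this one.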
On my side, with the same trouble, I have installed the « MATE Optimus » applet.
Whenever I select the « NVIDIA (On-Demand) » mode, the problem occurs and makes the whole system unpleasant to use.
Switching to « NVIDIA (Performance Mode) » avoids the freezing...
System Information
  Manufacturer: LENOVO
  Product Name: 20SUS2LM0D
  Version: ThinkPad P15 Gen 1
  Serial Number: PF2J52X3
  UUID: 5c56ca4c-1e9b-11b2-a85c-97876b49c73d
  Wake-up Type: Power Switch
  SKU Number: LENOVO_MT_20SU_BU_Think_FM_ThinkPad P15 Gen 1
  Family: ThinkPad P15 Gen 1
and now I can have both Intel and nvidia active, and wine running programs.
And no *ERROR* LSPCON init failed on port D problems.
No LSPCON (whatever it is) == no problems.
Maybe at the same time I lose HDMI, or HDMI 2.0, or DP, or whatnot, but at least I can work.
Since we found that this problem is unrelated to the patches, I'm not using them in order to keep the codebase as close to the mainline as possible.
Moreover, these messages are highly correlated with the freezes, so I like having them: they help me distinguish the cases when the system freezes due to this issue from those when it freezes due to something else (e.g. excessive I/O).
I suggest renaming this issue to something like "random system freezes somehow related to LSPCON".
Anyway, patching out HAS_LSPCON removes this issue entirely. Both messages and freezes.
Here's the deal. We're interested in fixing any LSPCON initialization issues. But we're not interested in debugging any issues when an out-of-tree module (in this case nvidia.ko) is present. We have no way of knowing what it does.
For example, since disabling LSPCON i915 side appears to fix all the issues, it's possible there's a mux of some sort, and there's no cooperation between the drivers to handle that. But there can never be as long as the other driver is proprietary.
That said, there are a few things to do. First, be sure you're using pure UEFI boot, not CSM or legacy. Second, once you're sure of the first, attach /sys/kernel/debug/dri/0/i915_vbt.
Also doesn't hurt to attach a fresh dmesg all the way from boot, running a recent kernel, with drm.debug=0xe log_buf_len=4M ignore_loglevel kernel parameters set.
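As a side note on what drm.debug=0xe selects: the value is a bitmask of DRM debug categories. The sketch below decodes it; the bit assignments are as I recall them from include/drm/drm_print.h (an assumption worth checking against the kernel version in use), and the function name is made up for illustration.

```python
# Assumed bit assignments for the drm.debug module parameter,
# per include/drm/drm_print.h; verify against your kernel tree.
DRM_DEBUG_BITS = {
    0x001: "CORE",
    0x002: "DRIVER",
    0x004: "KMS",
    0x008: "PRIME",
    0x010: "ATOMIC",
    0x020: "VBL",
    0x040: "STATE",
    0x080: "LEASE",
    0x100: "DP",
    0x200: "DRMRES",
}

def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by a drm.debug mask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]

if __name__ == "__main__":
    print(decode_drm_debug(0xE))  # ['DRIVER', 'KMS', 'PRIME']
```

So, under that assumption, 0xe asks for driver, modesetting and PRIME messages without the much noisier core category, and log_buf_len=4M keeps the extra output from overflowing the kernel log buffer.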
Well, libglvnd was at some point developed, despite proprietary drivers.
This is about upstream kernel support, not about random userspace libraries. As soon as you load an out-of-tree proprietary module, the kernel is tainted, and all bets are off.
It's impossible to make conclusions about what's going on when you have no idea what the proprietary module is doing in kernel space, and no way to change its behaviour. So that's where the support ends.
I understand you, I also logically cannot explain to myself the connection between this error and Wine. In theory, this should affect other applications, not just Wine.
Currently we see the LSPCON init failure messages even where no external display is connected. The probing for LSPCON gives repeated "No LSPCON detected" messages. Currently the lspcon init and lspcon probe are intermingled, and we seem to be getting the error message for init failed even when there is nothing connected to the HDMI port.
In such a case, I don't think the freezes can be related to LSPCON; it's just the error messages that might be creating the confusion.
I have these errors in the syslog file too, but I don't freeze. Or perhaps I don't freeze, because Kubuntu doesn't restore the whole session after login, so maybe it stopped somewhere.
UPD: One more experiment: nvidia completely purged (the card remains), and the same command:
time for (( i=0 ; i<100 ; i++ )) ; do xrandr &>/dev/null ; done
real 1m27.646s
user 0m0.099s
sys 0m0.217s
The freezes are all still there; the system was not usable during those 1m27s.
I am 90% convinced that Nvidia is unrelated to this issue.
Firstly, the "open kernel" is little more than "forward every request to the GSP", and almost does not influence the kernel.
Secondly, the issue, albeit to a much lesser degree, appears on the system with no nvidia whatsoever.
While wine apps cannot start with the nvidia module present, I am almost sure that they crash due to some hardware probing timing out rather than due to the nvidia module misbehaving.
I am attaching dmesg recorded while that bash for loop was running.
Please tell me if I can provide any more debugging data.
UPD2:
with LSPCON patched away:
time for (( i=0 ; i<100 ; i++ )) ; do xrandr &>/dev/null ; done
real 0m39.230s
user 0m0.101s
sys 0m0.173s
Well, this is more than twice as fast as with LSPCON enabled, and the system was not freezing, even though there was a slight slowdown in responsiveness.
(This is kernel 6.10.11)
UPD3:
For the sake of clarity of testing, I have rebooted the kernel without the logging/debugging enabled, and found that the numbers match almost exactly. 37 seconds without lspcon and 90 seconds with lspcon.
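To put the two runs side by side, the "real" values from the bash `time` output can be converted to seconds and divided by the 100 iterations. This is only a worked-arithmetic sketch: the helper name is made up, and the parser assumes the "XmY.YYYs" format shown in the listings above.

```python
import re

# bash's `time` keyword prints durations like "1m27.646s".
TIME_RE = re.compile(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s")

def parse_real(value):
    """Convert a bash `time` duration such as '1m27.646s' to seconds."""
    match = TIME_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))

RUNS = 100  # iterations of the xrandr loop

with_lspcon = parse_real("1m27.646s")     # nvidia purged, LSPCON enabled
without_lspcon = parse_real("0m39.230s")  # LSPCON patched away

if __name__ == "__main__":
    print(f"per xrandr call with LSPCON:    {with_lspcon / RUNS:.3f} s")
    print(f"per xrandr call without LSPCON: {without_lspcon / RUNS:.3f} s")
    print(f"slowdown factor:                {with_lspcon / without_lspcon:.2f}x")
```

That works out to roughly 0.88 s versus 0.39 s per xrandr call, a factor of about 2.2, so each detect cycle with LSPCON enabled costs on the order of half a second extra on this machine.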
Not going into details, but detect with LSPCON may be tricky and slow and hit timeouts and retries. Looping xrandr is not interesting per se.
The real question is, in a normal usage scenario, why would the driver be doing detect so much that it would impact user experience? Spurious hotplugs? Userspace being silly and doing getconnector too much?
I have the same lspcon errors, which occur fairly consistently upon reopening the laptop lid after it has been shut for a little while (>1 hr) and the system has been in s2idle. The mouse stops moving, nothing is responsive, and a hard reset is required. This is with a Lenovo ThinkPad P17 Gen 1 FHD display running Manjaro Linux with Gnome. It's booting with UEFI and the firmware was updated yesterday using fwupd.
I don't know how to apply the workaround involving HAS_LSPCON suggested by @Lockywolf because I don't have display/intel_display_device.h in the location reported:
[user@user-lenovomws170 ~]$ ls /lib/modules/6.12.4-1-MANJARO/kernel/drivers/gpu/drm/i915/
i915.ko.zst kvmgt.ko.zst
^Update: it seems to happen even more often when Slack is either running before entering s2idle or opened after coming out of s2idle.
Also, sometimes, after freezing, the CapsLock button starts flashing.
To anybody more versed in this topic, please let me know if I can provide any further information that may be helpful.