Due to an influx of spam, we have had to impose restrictions on new accounts. Please see this wiki page for instructions on how to get full permissions. Sorry for the inconvenience.
Admin message
The migration is almost done, at least the rest should happen in the background. There are still a few technical difference between the old cluster and the new ones, and they are summarized in this issue. Please pay attention to the TL:DR at the end of the comment.
S3 suspend broken on AMD Ryzen Pro 7 5850U (regression in kernel 6.1)
As discussed in #2171 (comment 1726788), my laptop doesn't suspend correctly anymore on 6.1 kernels. This is with the laptop "standalone" (on battery, no power plug, no dongles).
It does suspend correctly with s0ix ("Windows mode"); this bug is about how it used to work in 6.0 kernels with S3 ("Linux mode") and is now broken.
Hardware description: Thinkpad X13 Gen 2a
Motherboard: 20XHCTO1WW
BIOS version: R1NET51W (1.21)
CPU: AMD Ryzen 7 PRO 5850U with Radeon Graphics
GPU: Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638]
System Memory: 32GiB
Display: primary Panel
Type of Display Connection: eDP
System information:
Distro name and Version: Debian sid (as of 2023-01-19)
Kernel version: Linux turnagra 6.1.0-2-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.6-1 (2023-01-08) x86_64 GNU/Linux, aka 6.1.6-1 with only Debian patches
Custom kernel: N/A
amdgpu firmware version: 20221214
AMD official driver version: N/A
How to reproduce the issue:
Boot on 6.1.6-1
Unlock cryptsetup
Log in KDE Plasma Wayland session
Verify that amd_pmc is loaded
Hit power button, select "Suspend" (allows to see the screen instead of lid closing)
At that point, the screen goes dark, but blinks back to life once, before going dark again. The power button led stays solidly on (= doesn't go "breathing").
By hitting the button again, the kernel is working (Alt-SysRQ-R works), but the Wayland session cannot become useable.
Here is a test kernel with a quirk (with guessed DMI strings from what I saw on the X13 Gen2 that I think should cover it (not Gen2a)). Please test it in both modes.
Confirmed; the wake up from s0ix works by fiddling the touchpad
Well then that's something we still need to explain. Why does this failure only happen in S3 mode. If we quirk it to fix S3 mode, we're going to break it being able to wake from s2idle mode.
Don't worry about testing without amd_pmc. We need that for s0ix, if it's missing it's an invalid test. If it's (manually) loaded for S3 it doesn't do anything.
Can I please see your /sys/kernel/debug/gpio file while booted to S3 mode without my quirk in place?
Thanks for creating the Issue! I can confirm the same issue on my machine. I don't have the exact same Lenovo Laptop, but it seems to be built very similar.
Bisecting came to the conclusion that the following commit is the culprit:
➜ linux-mainline git:(b38f2d5d9615) ✗ git bisect badb38f2d5d9615cf991fb68626e70b042cb8b6dc3e is the first bad commitcommit b38f2d5d9615cf991fb68626e70b042cb8b6dc3eAuthor: Raul E Rangel <rrangel@chromium.org>Date: Thu Sep 29 10:19:11 2022 -0600 i2c: acpi: Use ACPI wake capability bit to set wake_irq
Hardware description: ThinkPad T14s Gen 2a
CPU: AMD Ryzen 7 PRO 5850U with Radeon Graphics
System Memory: 32GB
Display: Built-in Display. No external displays connected.
Type of Display Connection: eDP
System information:
Distro name and Version: Arch Linux
Kernel version: v6.2-rc4
Custom kernel: -
AMD official driver version: -
Log files (for system lockups / game freezes / crashes)
I'm using Lenovo Xiaoxin pro 14 2021 (this model seems to be only available in Chinese market, the same with yoga 14s) with amd ryzen 5800H, 6.1.6 linux zen kernel, nixos.
Previously it was up to the individual drivers. Some of them called device_init_wakeup(dev, true) to always enable wake ups. It looks like the i2c-hid driver never enabled wakeups by default though. So now it's uncovering all the misconfigured ACPI tables.
Yeah; it's really a question of how big of a quirk list we want to try to build versus doing that.
It looks to me that we probably have 4 models affected now.
I wonder if the EC nominally controls the rail to the touchpad in your design? It looks like there are some ACPI table problems, so perhaps the _DSM call didn't get to notify the EC of s0i3 entry.
Are you on the latest BIOS already? If not; can you please upgrade to it so we can see if that _DSM thing is fixed.
S3 and s0ix testing was done with 6.2-rc4.
Well that's pretty odd then to me. Can you please share two acpidumps, one while you're in S3 mode and one while you're in s2idle mode. Lenovo does change the tables with different modes, and maybe the bug only exists in S3 mode.
Are you on the latest BIOS already? If not; can you please upgrade to it so we can see if that _DSM thing is fixed.
Yes, updated the BIOS after first encountering this issue. Didn't change anything.
Well that's pretty odd then to me. Can you please share two acpidumps, one while you're in S3 mode and one while you're in s2idle mode. Lenovo does change the tables with different modes, and maybe the bug only exists in S3 mode.
I think it's a red herring due to a mistake in the debug output not resetting when debounce is off, the register value (last field) is the same on both.
So the _PRW method is used to convey the wake GPIOs via an _AEI table, or via a GPE. This method also defines the sleep state the system can be in so the device can still wake the system. i.e., for S3 use GPE X or GPIO Y. Everything is clearly defined. One thing that always bugged me is how undocumented SharedAndWake keyword is. It doesn't have any constraints. I'm starting to wonder if SharedAndWake was specifically added for the Modern Standby (S0ix) case, and shouldn't be used for any other scenarios. Those other scenarios should be covered by the _PRW case. I'm wondering if the proper fix is to remove _S3 from the pinctrl. Thoughts?
This is actually something I was thinking about too when I was looking earlier. There's another part of the documentation that indicates it is intended to be used for S3 or low power idle though.
The “Wake” designation indicates that the interrupt is capable of waking the system from a low-power idle state or a system sleep state
The thing is the /sys/kernel/debug/gpio output showed that wake from S3 wasn't actually enabled in pinctrl-amd. Because of this - I suspect the "wake" must have been an interrupt that occurred before the SOC got into S3.
On HW-reduced ACPI platforms, wakeup is an attribute of connected interrupts. Interrupts that are designed to wake the processor or the entire platform are defined as wake-capable. Wake-capable interrupts, when enabled by OSPM, wake the system when they assert.
For HW-Reduced ACPI platforms that do not support wake on GPIO-signaled or Interrupt-signaled events, the EventInfo structure is an Integer with value of zero, and is ignored by OSPM. Therefore, _PRW is only required on such platforms if power resources for wakeup must be managed by OSPM (e.g. the _PRW provides a list of Power Resources). Instead, for a device to wake the system, its interrupt must be wake-capable and enabled by the driver. See Interrupt-based Wake Events.
If a connected interrupt is wake-capable (capable of waking the system from a low-power idle state, then it must be configured as ExclusiveAndWake or SharedAndWake; for more information, see Device Power Management.
There is an SxW object for each supported system power state, Sx. Because SoC platforms are always in S0, the only object of interest here is _S0W. This object specifies the platform's ability to wake from a low-power idle state in response to a device's wake signal. The object is used by Windows to determine the target D-state for the device during system low-power idle.
For most SoC platforms, devices are aggressively power-managed to the D3 state when idle, and the system is capable of waking from low-power idle while the device is in this state. For such a system, the _S0W object returns 3 (or 4, if it also supports D3cold).
Wake-capable interrupts (_CRS)
The resource description for a device indicates that the device is capable of detecting and signaling a wake event by marking an interrupt as "wake-capable" (either ExclusiveAndWake or SharedAndWake). Windows and device drivers provide special handling of such interrupts to ensure that they are enabled when the device transitions to a low-power state.
Power Resources for Wake (_PRW)
In some cases, additional power resources must be turned on for a device to be enabled for wake. In this case, the device can provide the _PRW object to list those additional power resources.
_PRW is also used to define the wake capability for traditional (full-ACPI hardware) PC platforms.
All of this makes it sound like SharedAndWake is only used to signal S0ix wake capabilities. I think we need to teach the dev_pm_set_wake_irq to only enable the wake IRQ when entering an S0ix state (if the system is ACPI). Or the other option is to move management of the Wake interrupt into the ACPI subsystem (just like _PRW wake is handled today). This will remove the need for the drivers to register the Wake IRQ (for ACPI systems).
Checking your idea is cheap and that's a cleaner solution if it works.
Having though about this idea over night, I think it's going to break the _AEI based GPIOs. I'm assuming acpi-gpio subsystem will eventually call irq_set_wake on the GPIO controller. :/
Having though about this idea over night, I think it's going to break the _AEI based GPIOs. I'm assuming acpi-gpio subsystem will eventually call irq_set_wake on the GPIO controller. :/
What's worth noting is in the implementations I've seen _AEI based GPIOs are disabled when the system is set to S3. If you compare the ACPI dumps above from those Lenovo systems (specifically ssdt5.dat) you'll see they only use _AEI for S2idle mode. That is likely implementation dependent though and I don't see a reason _AEI couldn't be used in a design with S3.
All of this makes it sound like SharedAndWake is only used to signal S0ix wake capabilities. I think we need to teach the dev_pm_set_wake_irq to only enable the wake IRQ when entering an S0ix state (if the system is ACPI). Or the other option is to move management of the Wake interrupt into the ACPI subsystem (just like _PRW wake is handled today). This will remove the need for the drivers to register the Wake IRQ (for ACPI systems).
Would you mind taking a stab at threading that needle? Depending upon how complex it turns out to be, maybe one of the other solutions outlined here can be temporarily put into place for fixing this in 6.2/6.1, and reverted for 6.3 with that.
So I've tested this patchline on top of Debian's 6.1.7, and can indeed confirm that in both S3/Linux and S0ix/Windows BIOS configuration, suspend and restore work fine now. Awesome, thanks!