Bisected: "drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11" causes the laptop to hang on reboot or poweroff instead of restarting or powering off completely
Brief summary of the problem:
After upgrading to Linux kernel 6.8.7, my laptop no longer restarts completely. When I reboot, the services shut down according to the logs and the screen even turns off, but the power LED stays on and nothing else happens, no matter how long I wait. I have to hold the power button to turn the laptop off and then power it on manually. This happens every single time, but only if the system has been suspended during the current session.
Hardware description:
- CPU:
model: AMD Ryzen 7 7840HS with Radeon 780M Graphics bits: 64
type: MT MCP arch: Zen 4 gen: 5 level: v4 note: check built: 2022+
process: TSMC n5 (5nm) family: 0x19 (25) model-id: 0x74 (116) stepping: 1
microcode: 0xA704101
- GPU:
AMD Phoenix1 vendor: Lenovo driver: amdgpu v: kernel arch: RDNA-3
code: Phoenix process: TSMC n4 (4nm) built: 2023+ pcie: gen: 4 speed: 16 GT/s
lanes: 16 ports: active: eDP-1 empty: DP-1, DP-2, DP-3, DP-4, DP-5, DP-6,
HDMI-A-1, Writeback-1 bus-ID: 64:00.0 chip-ID: 1002:15bf class-ID: 0300
- System Memory:
Channel-A DIMM 0 type: LPDDR5 detail: synchronous unbuffered
(unregistered) size: 8 GiB speed: 6400 MT/s volts: curr: 0.5 min: 0.5 Total 32 GiB
- Display(s):
model: BOE Display 0x0ac1 built: 2021 res: 2560x1600
System information:
- Distro name and Version: Arch Linux
- Kernel version: Linux 6.8.7-arch1-2 #1 SMP PREEMPT_DYNAMIC Fri, 19 Apr 2024 09:51:31 +0000 x86_64 GNU/Linux
How to reproduce the issue:
- Boot the laptop
- Put the laptop to sleep
- Wake up the laptop
- Try to reboot
Attached files:
Log files (for system lockups / game freezes / crashes)
- Dmesg log dmesg-6.8.7.txt (Log with affected Kernel, dmesg boot log after power on)
What was found during bisection testing:
$ git bisect good
7521329e54931ede9e042bbf5f4f812b5bc4a01d is the first bad commit
commit 7521329e54931ede9e042bbf5f4f812b5bc4a01d
Author: Tim Huang <Tim.Huang@amd.com>
Date: Wed Mar 27 13:10:37 2024 +0800
drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11
commit 31729e8c21ecfd671458e02b6511eb68c2225113 upstream.
While doing multiple S4 stress tests, GC/RLC/PMFW get into
an invalid state resulting into hard hangs.
Adding a GFX reset as workaround just before sending the
MP1_UNLOAD message avoids this failure.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Mario Limonciello <superm1@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
$ git bisect log
git bisect start
# status: waiting for both good and bad commits
# bad: [12dadc409c2bd8538c6ee0e56e191efde6d92007] Linux 6.8.7
git bisect bad 12dadc409c2bd8538c6ee0e56e191efde6d92007
# good: [1f7d392571dfec1c47b306a32bbe60be05a51160] Linux 6.8.6
git bisect good 1f7d392571dfec1c47b306a32bbe60be05a51160
# bad: [aed5666b128bcfc8c8fe991a83d7a8306689d090] net/mlx5: Correctly compare pkt reformat ids
git bisect bad aed5666b128bcfc8c8fe991a83d7a8306689d090
# bad: [739f8127f0b4984a6a5c33632c628ef367161467] net: openvswitch: fix unwanted error log on timeout policy probing
git bisect bad 739f8127f0b4984a6a5c33632c628ef367161467
# bad: [9ce46530d18b42d5906ad9d169080ee6c7d2dad3] ARM: OMAP2+: fix N810 MMC gpiod table
git bisect bad 9ce46530d18b42d5906ad9d169080ee6c7d2dad3
# good: [c4a18b842dcd235744d744a3a04f14489baf1c58] ARM: dts: imx7s-warp: Pass OV2680 link-frequencies
git bisect good c4a18b842dcd235744d744a3a04f14489baf1c58
# bad: [9e9bb74a93b7daa32313ccaefd0edc529d40daf8] platform/chrome: cros_ec_uart: properly fix race condition
git bisect bad 9e9bb74a93b7daa32313ccaefd0edc529d40daf8
# good: [9e502ddc22d542a8f96bd4e9f298f0929699b7c5] ring-buffer: Only update pages_touched when a new page is touched
git bisect good 9e502ddc22d542a8f96bd4e9f298f0929699b7c5
# bad: [7521329e54931ede9e042bbf5f4f812b5bc4a01d] drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11
git bisect bad 7521329e54931ede9e042bbf5f4f812b5bc4a01d
# good: [e4cb8382fff6706436b66eafd9c0ee857ff0a9f5] Bluetooth: Fix memory leak in hci_req_sync_complete()
git bisect good e4cb8382fff6706436b66eafd9c0ee857ff0a9f5
# first bad commit: [7521329e54931ede9e042bbf5f4f812b5bc4a01d] drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11
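For anyone who wants to re-check the result, the log above can be saved to a file and fed back to Git. The snippet below is only a small sketch: the helper name `first_bad_commit` and the file name `bisect.log` are hypothetical and not part of the original report. The helper prints the hash recorded on the "first bad commit" line of a saved log.

```shell
# Print the hash reported as "first bad commit" in a saved `git bisect log`.
# Assumption: the log was saved to a file; the name bisect.log is hypothetical.
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]\{40\}\)].*/\1/p' "$1"
}

# Usage, inside a linux-stable checkout that contains v6.8.6 and v6.8.7:
#   git bisect replay bisect.log   # re-apply the recorded good/bad marks
#   first_bad_commit bisect.log    # print the hash the bisection converged on
```

`git bisect replay` only re-applies the recorded marks; it does not rebuild or retest any kernel.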
Kernel 6.8.7 with 7521329e54931ede9e042bbf5f4f812b5bc4a01d reverted works fine on my machine.
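A revert like this can be done roughly as follows. This is a sketch under assumptions, not the exact procedure used for the test: the wrapper name `revert_commit` is hypothetical, and it assumes the current directory is a linux-stable checkout at the v6.8.7 tag. Building and installing the resulting kernel depends on the distribution and is not shown.

```shell
# Revert a single commit in the current Git checkout without opening an editor.
# The function name is hypothetical; `git revert --no-edit` is the real command.
revert_commit() {
    git revert --no-edit "$1"
}

# Usage, assuming the current directory is a linux-stable checkout:
#   git checkout v6.8.7
#   revert_commit 7521329e54931ede9e042bbf5f4f812b5bc4a01d
#   # then rebuild and install the kernel as usual for your distribution
```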