dmesg "amdgpu: failed to write reg" and lightdm slowness in Linux 5.14.15
Brief summary of the problem:
With a vanilla 5.14.15 kernel, I boot my laptop and try to log in using lightdm. The system starts normally up to the point where systemd starts lightdm, but from then on everything is extremely slow: it takes about 15 minutes before the login form appears. I was not able to start a Wayland or X11 session, although I can't rule out that it would eventually work, just as slowly.
The rest of the system is responsive: I can log in via ssh, and from that side alone I would not suspect that anything was wrong.
This did not happen with the 5.14.14 kernel, and I have bisected the issue to the following commit:
2602e9cc283ad1d61a71ad18af2b772346d368bc is the first bad commit
commit 2602e9cc283ad1d61a71ad18af2b772346d368bc
Author: Yifan Zhang <yifan1.zhang@amd.com>
Date: Tue Sep 28 15:42:35 2021 +0800
drm/amdgpu: init iommu after amdkfd device init
[ Upstream commit 714d9e4574d54596973ee3b0624ee4a16264d700 ]
This patch is to fix clinfo failure in Raven/Picasso:
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.2 AMD-APP (3364.0)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback
Platform Name: AMD Accelerated Parallel Processing
Number of devices: 0
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
Please let me know if there's anything else I can provide.
Hardware description:
- CPU: AMD Ryzen 5 3550H with Radeon Vega Mobile Gfx
- GPU:
- 01:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X] [1002:67ef] (rev e5)
- 05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Picasso [1002:15d8] (rev c2)
- System Memory: 31 GiB
- Display(s): ASUS TUF Gaming FX505DY internal screen and external screen
- Type of Display Connection: internal and HDMI
System information:
- Distro name and Version: Debian 11.1
- Kernel version: Linux HOSTNAME 5.14.14+ #1 SMP Thu Oct 28 19:31:32 CEST 2021 x86_64 GNU/Linux
- Custom kernel: Vanilla Kernel for bisecting between 5.14.14 and 5.14.15
Attached files:
Log files (for system lockups / game freezes / crashes)
After a while, running dmesg repeatedly shows the following errors, which are not included in the logs above.
[ 31.587323] amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
[ 33.027443] snd_hda_intel 0000:05:00.1: Too big adjustment 128
[ 35.796075] snd_hda_intel 0000:05:00.6: Too big adjustment 128
[ 35.808866] snd_hda_intel 0000:05:00.6: Too big adjustment 128
[ 35.834825] snd_hda_intel 0000:05:00.6: Too big adjustment 128
[ 51.707336] amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
[ 52.307649] [drm] PCIE GART of 256M enabled (table at 0x000000F400000000).
[ 52.678136] [drm] UVD and UVD ENC initialized successfully.
[ 52.778128] [drm] VCE initialized successfully.
[ 73.007591] amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
[ 93.095475] amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
[ 113.155323] amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
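In the excerpt above, the "failed to write reg" messages recur roughly every 20 seconds and alternate between two register pairs. For anyone who wants to check the same pattern in their own logs, here is a minimal Python sketch that pulls these lines out of saved dmesg output and prints the gaps between them. The helper names (`parse_failures`, `gaps`) are made up for illustration, and the regular expression assumes the exact message format shown in this report; it says nothing about the cause.

```python
import re

# Matches lines in the format seen in this report, e.g.:
# [ 31.587323] amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
# (Assumption: other kernels may word or timestamp the message differently.)
PATTERN = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+amdgpu\s+(?P<dev>\S+):\s+amdgpu:\s+"
    r"failed to write reg (?P<reg>[0-9a-f]+) wait reg (?P<wait>[0-9a-f]+)"
)


def parse_failures(lines):
    """Return (timestamp, device, reg, wait_reg) for each matching line."""
    found = []
    for line in lines:
        m = PATTERN.search(line)
        if m:
            found.append((float(m["ts"]), m["dev"], m["reg"], m["wait"]))
    return found


def gaps(failures):
    """Seconds between consecutive failures, rounded to 0.1 s."""
    times = [f[0] for f in failures]
    return [round(b - a, 1) for a, b in zip(times, times[1:])]


# Sample input copied from the dmesg excerpt in this report.
SAMPLE = """\
[ 31.587323] amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
[ 33.027443] snd_hda_intel 0000:05:00.1: Too big adjustment 128
[ 51.707336] amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
[ 73.007591] amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
[ 93.095475] amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
[ 113.155323] amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
""".splitlines()

if __name__ == "__main__":
    failures = parse_failures(SAMPLE)
    print(len(failures), "failures; gaps (s):", gaps(failures))
```

To run it against a live system, the output of `dmesg` can be saved to a file and its lines passed to `parse_failures` in place of `SAMPLE`.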