atombios stuck in loop
Brief summary of the problem:
Since Linux kernel 6.3, my AMD Vega M GPU (hybrid graphics with Intel HD 630) has been hanging. I have stayed on 6.1 LTS and tested each new kernel as it was released; all of them are broken, up to the current 6.9.1.
When loading certain applications, it can take a couple of minutes for them to finally appear on screen. Sometimes the problem does not manifest, but most of the time it does. I have not been able to determine what makes it work correctly on those occasions.
Note: lspci and lshw do not list the AMD GPU once the issue has occurred.
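For anyone trying to confirm the same symptom, here is a minimal sketch that checks whether the dGPU is still visible in sysfs and what its runtime PM status is. It assumes the GPU sits at 0000:01:00.0, as on my machine. The function name `pci_device_status` and the `sysfs_root` parameter are made up for illustration; only the sysfs files `vendor`, `device` and `power/runtime_status` are standard kernel attributes.

```python
from pathlib import Path


def pci_device_status(bdf="0000:01:00.0", sysfs_root="/sys/bus/pci/devices"):
    """Report whether a PCI device is present in sysfs, plus its runtime PM status.

    bdf is the domain:bus:device.function address (0000:01:00.0 is an
    assumption taken from this report; adjust for other machines).
    """
    dev = Path(sysfs_root) / bdf
    if not dev.is_dir():
        # The device directory is gone, which matches lspci not listing it.
        return {"present": False}

    def read(rel):
        # Attributes may be missing or unreadable; return None in that case.
        try:
            return (dev / rel).read_text().strip()
        except OSError:
            return None

    return {
        "present": True,
        "vendor": read("vendor"),
        "device": read("device"),
        "runtime_status": read("power/runtime_status"),
    }
```

Calling `pci_device_status()` with no arguments reads the real sysfs tree. On an affected boot I would expect either `present` to be false or `runtime_status` to show something other than `active`, but I have not verified which one it is.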
Hardware description:
- CPU: Intel(R) Core(TM) i7-8705G
- GPU: 01:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Polaris 22 XL [Radeon RX Vega M GL] [1002:694e] (rev c0)
- System Memory: 16GB DDR
- Display(s): Built-in 4K LCD panel
- Type of Display Connection: Internal
- GPU lshw:
  *-display
       description: Display controller
       product: Polaris 22 XL [Radeon RX Vega M GL] [1002:694E]
       vendor: Advanced Micro Devices, Inc. [AMD/ATI] [1002]
       physical id: 0
       bus info: pci@0000:01:00.0
       logical name: /dev/fb0
       version: c0
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi bus_master cap_list rom fb
       configuration: depth=32 driver=amdgpu latency=0 mode=3840x2160 visual=truecolor xres=3840 yres=2160
       resources: iomemory:2f0-2ef iomemory:2f0-2ef irq:140 memory:2fe0000000-2fefffffff memory:2ff0000000-2ff01fffff ioport:e000(size=256) memory:de400000-de43ffff memory:de440000-de45ffff
  *-display
       description: VGA compatible controller
       product: HD Graphics 630 [8086:591B]
       vendor: Intel Corporation [8086]
       physical id: 2
       bus info: pci@0000:00:02.0
       logical name: /dev/fb0
       version: 04
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress msi pm vga_controller bus_master cap_list rom fb
       configuration: depth=32 driver=i915 latency=0 resolution=3840,2160
       resources: iomemory:2f0-2ef iomemory:2f0-2ef irq:139 memory:2ffe000000-2ffeffffff memory:2f80000000-2f8fffffff ioport:f000(size=64) memory:c0000-dffff
System information:
- Distro name and Version: Arch Linux
- Kernel version: 6.9.1-arch1-1
- AMD official driver version: N/A
How to reproduce the issue:
Boot the system with kernel 6.3 or newer. (I can't remember whether 6.2 is affected.)
Log files (for system lockups / game freezes / crashes)
amdgpu dmesg log
[ 4.305119] [drm] amdgpu kernel modesetting enabled.
[ 4.305136] amdgpu: vga_switcheroo: detected switching method \_SB_.PCI0.GFX0.ATPX handle
[ 4.305243] amdgpu: ATPX version 1, functions 0x00000033
[ 4.305314] amdgpu: ATPX Hybrid Graphics
[ 4.305525] amdgpu: Virtual CRAT table created for CPU
[ 4.305535] amdgpu: Topology: Add CPU node
[ 4.305657] amdgpu 0000:01:00.0: enabling device (0006 -> 0007)
[ 4.371973] amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ATRM
[ 4.371977] amdgpu: ATOM BIOS: 406913.180126.01
[ 4.372186] amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[ 4.372221] amdgpu 0000:01:00.0: BAR 2 [mem 0x2ff0000000-0x2ff01fffff 64bit pref]: releasing
[ 4.372223] amdgpu 0000:01:00.0: BAR 0 [mem 0x2fe0000000-0x2fefffffff 64bit pref]: releasing
[ 4.372230] amdgpu 0000:01:00.0: BAR 0 [mem 0x2fe0000000-0x2fefffffff 64bit pref]: assigned
[ 4.372240] amdgpu 0000:01:00.0: BAR 2 [mem 0x2ff0000000-0x2ff01fffff 64bit pref]: assigned
[ 4.372252] amdgpu 0000:01:00.0: amdgpu: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
[ 4.372254] amdgpu 0000:01:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
[ 4.372358] [drm] amdgpu: 4096M of VRAM memory ready
[ 4.372359] [drm] amdgpu: 7871M of GTT memory ready.
[ 4.375657] amdgpu: hwmgr_sw_init smu backed is vegam_smu
[ 4.862633] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[ 4.862643] kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
[ 4.862728] amdgpu: Virtual CRAT table created for GPU
[ 4.862802] amdgpu: Topology: Add dGPU node [0x694e:0x1002]
[ 4.862803] kfd kfd: amdgpu: added device 1002:694e
[ 4.862814] amdgpu 0000:01:00.0: amdgpu: SE 4, SH per SE 1, CU per SH 6, active_cu_number 20
[ 5.021695] amdgpu 0000:01:00.0: amdgpu: Using BOCO for runtime pm
[ 5.021985] [drm] Initialized amdgpu 3.57.0 20150101 for 0000:01:00.0 on minor 0
[ 19.366089] amdgpu 0000:01:00.0: not ready 1023ms after resume; waiting
[ 20.406053] amdgpu 0000:01:00.0: not ready 2047ms after resume; waiting
[ 22.539408] amdgpu 0000:01:00.0: not ready 4095ms after resume; waiting
[ 26.806048] amdgpu 0000:01:00.0: not ready 8191ms after resume; waiting
[ 35.126018] amdgpu 0000:01:00.0: not ready 16383ms after resume; waiting
[ 52.545838] amdgpu 0000:01:00.0: not ready 32767ms after resume; waiting
[ 86.670365] amdgpu 0000:01:00.0: not ready 65535ms after resume; giving up
[ 86.670399] amdgpu 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ 106.734628] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs aborting
[ 106.735033] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing C6FC (len 403, WS 20, PS 0) @ 0xC815
[ 106.735328] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing C410 (len 114, WS 0, PS 8) @ 0xC46D
[ 106.735620] amdgpu 0000:01:00.0: amdgpu: amdgpu asic init failed
[ 106.735643] [drm] PCIE GART of 256M enabled (table at 0x000000F400000000).
[ 110.945363] amdgpu 0000:01:00.0: amdgpu: last message was failed ret is 0
[ 114.095689] amdgpu 0000:01:00.0: amdgpu: last message was failed ret is 0
[ 117.246437] amdgpu 0000:01:00.0: amdgpu: last message was failed ret is 0
[ 120.396166] amdgpu 0000:01:00.0: amdgpu: last message was failed ret is 0
[ 123.549304] amdgpu 0000:01:00.0: amdgpu: last message was failed ret is 0
[ 125.652615] amdgpu: SMU load firmware failed
[ 125.652617] amdgpu: fw load failed
[ 125.652617] amdgpu: smu firmware loading failed
[ 125.652619] amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_resume failed (-22).
[ 131.018751] amdgpu 0000:01:00.0: not ready 1023ms after resume; waiting
[ 132.058781] amdgpu 0000:01:00.0: not ready 2047ms after resume; waiting
[ 134.245529] amdgpu 0000:01:00.0: not ready 4095ms after resume; waiting
[ 138.511344] amdgpu 0000:01:00.0: not ready 8191ms after resume; waiting
[ 146.829892] amdgpu 0000:01:00.0: not ready 16383ms after resume; waiting
[ 163.467642] amdgpu 0000:01:00.0: not ready 32767ms after resume; waiting
[ 197.598980] amdgpu 0000:01:00.0: not ready 65535ms after resume; giving up
[ 200.532219] amdgpu 0000:01:00.0: not ready 1023ms after resume; waiting
[ 201.572105] amdgpu 0000:01:00.0: not ready 2047ms after resume; waiting
[ 203.785311] amdgpu 0000:01:00.0: not ready 4095ms after resume; waiting
[ 208.051891] amdgpu 0000:01:00.0: not ready 8191ms after resume; waiting
[ 216.371791] amdgpu 0000:01:00.0: not ready 16383ms after resume; waiting
[ 233.437999] amdgpu 0000:01:00.0: not ready 32767ms after resume; waiting
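The "not ready ... after resume" lines in the log above repeat in cycles: the reported wait roughly doubles from 1023 ms up to 65535 ms, the kernel gives up, and a new cycle starts. To make those cycles easier to count in a longer dmesg capture, here is a rough sketch that groups the lines into resume attempts. The regular expression is written against the message format shown in this log only, and `resume_attempts` is a helper name I made up, not anything from the kernel or amdgpu.

```python
import re

# Matches lines such as:
# [ 19.366089] amdgpu 0000:01:00.0: not ready 1023ms after resume; waiting
NOT_READY = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+amdgpu (?P<bdf>[0-9a-f:.]+): "
    r"not ready (?P<ms>\d+)ms after resume; (?P<action>waiting|giving up)"
)


def resume_attempts(log_text):
    """Group 'not ready' messages into resume attempts.

    A new attempt is assumed to start whenever the reported wait does not
    increase compared with the previous message (for example when it drops
    back to 1023 ms).
    """
    attempts = []
    prev_ms = None
    for m in NOT_READY.finditer(log_text):
        ms = int(m["ms"])
        if prev_ms is None or ms <= prev_ms:
            attempts.append(
                {"start": float(m["ts"]), "waits_ms": [], "gave_up": False}
            )
        current = attempts[-1]
        current["waits_ms"].append(ms)
        current["gave_up"] = m["action"] == "giving up"
        prev_ms = ms
    return attempts
```

Because it uses `finditer` over the whole text, it works whether the log has one entry per line or has been pasted as a single run. Feeding it the output of `dmesg` should show one entry per failed wake-up of the dGPU, each with `gave_up` set once the 65535 ms step is reached.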