AMD GPU screen blanking for seconds with a warning
I run Fedora 40 on a ThinkPad T14 Gen3, which comes with an AMD Ryzen 7 PRO 6850U with Radeon Graphics. My monitor is connected via the ThinkPad dock over a USB-C connection.
On F40, I saw these screen blankings about once a day, which was not enough to be a problem.
Yesterday, I updated to F41 with the 6.11.5-300.fc41 kernel. The screen blanking shot up to maybe 30x per minute, with 1-2s blanking each time, effectively giving me an unusable display.
I booted with the older F40 kernel on the F41 distro -- 6.11.4-200.fc40, and that one's stable, but I saw one blanking event in the last hour. dmesg paste below on this F40 kernel:
I have a similar problem with Kernel 6.11.5 and maybe mesa 24.2.5
I also get these messages here:
[ 7350.595556] usb 4-2: new SuperSpeed USB device number 4 using xhci_hcd
[ 7350.617744] usb 4-2: New USB device found, idVendor=05e3, idProduct=0620, bcdDevice=93.07
[ 7350.617747] usb 4-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 7350.617749] usb 4-2: Product: USB3.2 Hub
[ 7350.617751] usb 4-2: Manufacturer: GenesysLogic
[ 7350.639272] hub 4-2:1.0: USB hub found
[ 7350.639539] hub 4-2:1.0: 3 ports detected
[ 7350.726459] usb 3-2: new high-speed USB device number 7 using xhci_hcd
[ 7350.859226] usb 3-2: New USB device found, idVendor=05e3, idProduct=0610, bcdDevice=93.07
[ 7350.859230] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 7350.859233] usb 3-2: Product: USB2.1 Hub
[ 7350.859234] usb 3-2: Manufacturer: GenesysLogic
[ 7350.916908] hub 3-2:1.0: USB hub found
[ 7350.917476] hub 3-2:1.0: 3 ports detected
[ 7351.289462] usb 3-2.1: new full-speed USB device number 8 using xhci_hcd
[ 7351.424113] usb 3-2.1: New USB device found, idVendor=0b05, idProduct=1933, bcdDevice= 1.10
[ 7351.424117] usb 3-2.1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 7351.424119] usb 3-2.1: Product: ROG Gaming Display Aura Device
[ 7351.525643] hid-generic 0003:0B05:1933.000D: hiddev96,hidraw4: USB HID v1.11 Device [ROG Gaming Display Aura Device] on usb-0000:10:00.3-2.1/input0
[ 7352.569263] amdgpu 0000:0e:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn32_program_compbuf_size line:138
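For anyone comparing logs across machines, here is a minimal Python sketch that pulls the REG_WAIT timeout messages out of a kernel log dump. The regular expression is based only on the message format quoted in this thread; the function name `find_reg_wait_timeouts` is made up for illustration, and other kernel versions may format the message differently.

```python
import re

# Pattern derived from the message as quoted in this thread, e.g.
# "REG_WAIT timeout 1us * 100 tries - dcn32_program_compbuf_size line:138".
# Assumption: other kernels print the message in the same shape.
REG_WAIT_RE = re.compile(
    r"REG_WAIT timeout (?P<delay>\d+)us \* (?P<tries>\d+) tries - "
    r"(?P<func>\w+) line:(?P<line>\d+)"
)


def find_reg_wait_timeouts(log_text):
    """Return one dict per REG_WAIT timeout message found in log_text."""
    hits = []
    for m in REG_WAIT_RE.finditer(log_text):
        hits.append({
            "func": m.group("func"),
            "line": int(m.group("line")),
            "delay_us": int(m.group("delay")),
            "tries": int(m.group("tries")),
        })
    return hits


sample = (
    "[ 7352.569263] amdgpu 0000:0e:00.0: [drm] REG_WAIT timeout "
    "1us * 100 tries - dcn32_program_compbuf_size line:138"
)
for hit in find_reg_wait_timeouts(sample):
    print(hit["func"], hit["line"])
```

Feeding it the saved output of dmesg or journalctl -k should make it easier to see which function (dcn31 or dcn32 variant) and which source line each reporter is hitting.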
To me it feels a bit like power saving gone haywire: after I move my mouse the display comes back on, even though I have disabled turning off the screen when idle in GNOME.
Same setup as the original poster with the same issue. I tried booting kernels 6.11.4 through 6.11.6 and got the same error each time. In addition, the screen doesn't wake up from sleep on Wayland, but I can still access the system by switching to TTY3-4.
I suspect the cause is the updated mesa package or a firmware update on the docking station. @amitshah I have updated my docking station to the latest firmware; have you ruled this out by booting the system without it connected to the ThinkPad dock?
Just had this happen on my new Lenovo P16s AMD Gen 2 w. AMD Ryzen 7 PRO 7840U running EndeavourOS (Arch) w. kernel 6.11.6-arch1-1 and w. Lenovo USB-C Dock 40AY.
Mesa:
mesa 1:24.2.6-1
The stack trace is identical, however I can't post it as gitlab freedesktop is blocking it as spam so I attached it and my system config as files.
I'm not sure it is relevant, but inxi --full is reporting one of my three displays wrong, although they are working fine in KDE. Specifically my laptop panel (3) is 3840x2400, NOT 1920x1200.
FYI both myself and @alvarezt.raul have this issue on 6.11.6, so it does not address the issue for us. Are you still using Fedora? I am on Arch, and @alvarezt.raul does not say in his post.
I tested kernel 6.11.6 and for my 7900GRE the issue seems to be resolved, but my video card was not hit by the kernel panic, just by the initial message here:
Hi, I'm having the same issue as well on a Thinkpad x13 gen4 running Arch Linux kernel 6.11.6
Interestingly, I'm also having an issue with PSR, which causes the display to freeze. I wonder if those two issues could be related? Yesterday I disabled PSR as indicated there, but then today I started to have the same problem as people here.
Same warning WARNING: CPU: 1 PID: 84262 at drivers/gpu/drm/amd/amdgpu/../display/dc/hubbub/dcn31/dcn31_hubbub.c:151 dcn31_program_compbuf_size+0xd1/0x230 [amdgpu] on my Thinkpad T14s AMD Gen3.
$ uname --all
Linux t14s 6.11.7-300.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov 8 19:23:10 UTC 2024 x86_64 GNU/Linux
I just updated my Ayaneo KUN (using a 7840U APU) from a 6.10.x kernel to 6.11.10, and I'm now encountering similar errors and warnings in the kernel log
The first line from the kernel log excerpt (see below) appears once I enabled the external display I have attached to the device. The device is connected to a JSAUX dock, and from there via DisplayPort to a display.
I use Sway as my desktop environment. The external display is disabled by default. Everything is fine as long as it stays off. Once I enable it, the system becomes terribly sluggish: mouse cursor movement stutters and keyboard input is repeated multiple times.
A quick check using htop shows no userspace process that could be responsible for this. According to htop the CPU usage is below 5% in this scenario. Adding the kernel threads to the statistics doesn't change this value.
I'm going back to the 6.10.x kernel for now, since the system is not usable in this state.
EDIT: What I don't see here is this screen blanking phenomenon. I still think it's the same underlying issue.
The same on Devuan with the kernel 6.11.10 (and pretty much every kernel in the 6.10 and 6.11 branches):
Linux -------- 6.11.10-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.11.10-1 (2024-11-23) x86_64 GNU/Linux
amd-firmware version: 20240909-2
dmesg:
I did a bit of digging to see if anything in dcn31_program_compbuf_size() or surrounding code changed from v6.10.x to v6.11.x. Turns out that it's not much really.
First one just moves code around. Second one at least happens in the same file, which is dcn31_hubbub.c. I'm not sure if I'm going to find some time to bisect this before the new year (probably going to be busy until Christmas and during the holiday...)
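In case someone else wants to repeat that comparison, below is a small sketch of how the git invocation could be built from the path printed in the WARNING line. It assumes a checkout of the mainline kernel tree with the v6.10 and v6.11 tags present; `git_log_command` is a hypothetical helper name, not an existing tool.

```python
import posixpath


def git_log_command(warning_path, old_tag, new_tag):
    """Build a `git log` argument list for one file between two tags."""
    # The WARNING line prints the path with a "../" component in it;
    # normalise it so it matches the path inside the source tree.
    path = posixpath.normpath(warning_path)
    return ["git", "log", "--oneline", f"{old_tag}..{new_tag}", "--", path]


cmd = git_log_command(
    "drivers/gpu/drm/amd/amdgpu/../display/dc/hubbub/dcn31/dcn31_hubbub.c",
    "v6.10",
    "v6.11",
)
# Print the command so it can be pasted into a shell inside the kernel tree.
print(" ".join(cmd))
```

Since one of the changes only moved code around, the file may have lived at a different path in the older tag, so a plain path-limited log might miss history from before the move. Treat the output as a starting point for a bisect rather than a complete list.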
I'm new here, but wanted to mention that this appears to be resolved for me with kernel 6.12.1. In my case, I started noticing this with the later 6.11 kernels. I have a ThinkPad P14s Gen 4 connected to a Dell UltraSharp using DP over USB-C, which also handles power. I would get this issue every time I powered up, but it is now gone.
kernel.org upstream 6.12.1 has minimal changes and no AMD-specific changes. So either there are other patches in void-linux or you should see the same in 6.12.0 I think.
Definitely still happening on my Thinkpad with 6.12.1-arch1-1 as packaged by Arch.
I've found I can reliably trigger it because it's tied to the external screens. If I boot with the external screens plugged in but turned off, it's fine at first and still fine when I turn on the first external screen, but it immediately throws the kernel warning when I turn on the second external screen. The stack trace includes drm_mode_atomic_ioctl, so it may be linked to initializing that screen, i.e. the initial mode switch.
Getting a very similar problem here with kernel 6.12.3, on a Ryzen 6850U: dmesg.txt.
Symptoms: noticeable system slowdown, eventually resulting in a blank screen and an unusable state.
FYI: I had partial success with fixing my problem by applying the patches from this issue report: #3720 (closed)
Some issues remain, but I can at least work with the machine again.
I get the same, running Fedora Silverblue 41, GNOME 47.2, kernel 6.11.11, on a Framework 13 with AMD Ryzen 5 7640U w/ Radeon 760M Graphics (according to fastfetch). Connected to USB hub via USB-C port for power and display output.
Happens intermittently. I'd say about once or twice a day.
dmesg output as follows:
[ 2605.458018] amdgpu 0000:c1:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=285075, emitted seq=285077
[ 2605.458026] amdgpu 0000:c1:00.0: amdgpu: Process information: process librewolf pid 5387 thread librewolf:cs0 pid 5466
[ 2605.458031] amdgpu 0000:c1:00.0: amdgpu: GPU reset begin!
[ 2607.551011] amdgpu 0000:c1:00.0: amdgpu: MES failed to respond to msg=REMOVE_QUEUE
[ 2607.551019] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 2607.782687] [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
[ 2607.784465] amdgpu 0000:c1:00.0: amdgpu: Dumping IP State
[ 2607.784923] amdgpu 0000:c1:00.0: amdgpu: Dumping IP State Completed
[ 2607.784929] amdgpu 0000:c1:00.0: amdgpu: MODE2 reset
[ 2607.813796] amdgpu 0000:c1:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 2607.814535] [drm] PCIE GART of 512M enabled (table at 0x000000801FD00000).
[ 2607.814658] amdgpu 0000:c1:00.0: amdgpu: SMU is resuming...
[ 2607.815678] amdgpu 0000:c1:00.0: amdgpu: SMU is resumed successfully!
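This log is a GPU ring timeout followed by a reset, which looks different from the REG_WAIT warning in the original report. To tell the two apart when going through long logs, a rough Python sketch like the following could list each ring timeout with the process the driver blamed. The patterns are taken from the lines quoted in this thread only, and `summarize_ring_timeouts` is a made-up name; process names containing spaces would not be captured correctly.

```python
import re

# Both patterns mirror the amdgpu messages quoted in this thread.
RING_TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, signaled seq=(?P<signaled>\d+), "
    r"emitted seq=(?P<emitted>\d+)"
)
PROCESS_RE = re.compile(
    r"Process information: process (?P<process>\S+) pid (?P<pid>\d+)"
)


def summarize_ring_timeouts(log_text):
    """Return one dict per ring timeout, with the blamed process if found."""
    events = []
    for m in RING_TIMEOUT_RE.finditer(log_text):
        event = {
            "ring": m.group("ring"),
            "signaled": int(m.group("signaled")),
            "emitted": int(m.group("emitted")),
            "process": None,
            "pid": None,
        }
        # Assumption: the "Process information" line follows the timeout
        # line, as it does in the logs posted here, so search after it.
        p = PROCESS_RE.search(log_text, m.end())
        if p:
            event["process"] = p.group("process")
            event["pid"] = int(p.group("pid"))
        events.append(event)
    return events


sample = (
    "amdgpu: ring gfx_0.0.0 timeout, signaled seq=285075, emitted seq=285077\n"
    "amdgpu: Process information: process librewolf pid 5387 "
    "thread librewolf:cs0 pid 5466\n"
)
for event in summarize_ring_timeouts(sample):
    print(event["ring"], event["process"], event["pid"])
```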
Experiencing similar behaviour also on Fedora 41 on Tuxedo Pulse 14 Gen3 HW when connecting to HP Z32 monitor using usb-c. Apart from the trace and logs, display seems to be working okay.
Dec 16 11:47:33 f kernel: amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn31_program_compbuf_size line:141
Dec 16 11:47:33 f kernel: ------------[ cut here ]------------
Dec 16 11:47:33 f kernel: WARNING: CPU: 1 PID: 2736 at drivers/gpu/drm/amd/amdgpu/../display/dc/hubbub/dcn31/dcn31_hubbub.c:151 dcn31_program_compbuf_size+0xd1/0x230 [am>
Dec 16 11:47:33 f kernel: Modules linked in: overlay hid_logitech_hidpp uhid uinput rfcomm snd_seq_dummy snd_hrtimer nft_masq nft_reject_ipv4 act_csum cls_u32 sch_htb nf>
Dec 16 11:47:33 f kernel: kvm gpio_keys snd_pcm mc bluetooth wmi_bmof snd_timer i2c_piix4 rapl cfg80211 pcspkr snd i2c_smbus soundcore k10temp rfkill amd_pmc joydev soc>
Dec 16 11:47:33 f kernel: CPU: 1 UID: 1000 PID: 2736 Comm: KMS thread Not tainted 6.12.4-200.fc41.x86_64 #1
Dec 16 11:47:33 f kernel: Hardware name: TUXEDO TUXEDO Pulse 14 Gen3/R14FA1, BIOS 8.15 05/30/2024
Dec 16 11:47:33 f kernel: RIP: 0010:dcn31_program_compbuf_size+0xd1/0x230 [amdgpu]
Dec 16 11:47:33 f kernel: Code: 00 48 8b 43 28 8b 88 b0 01 00 00 48 8b 43 20 0f b6 50 6c 48 8b 43 18 8b b0 14 01 00 00 e8 d7 24 15 00 85 c0 0f 85 33 01 00 00 <0f> 0b 48 >
Dec 16 11:47:33 f kernel: RSP: 0018:ffffba3cc68f73c8 EFLAGS: 00010202
Dec 16 11:47:33 f kernel: RAX: 0000000000000001 RBX: ffff95a8c73cf400 RCX: 000000000000001f
Dec 16 11:47:33 f kernel: RDX: 0000000000000000 RSI: 000000000000397a RDI: ffff95a8db480000
Dec 16 11:47:33 f kernel: RBP: 0000000000000004 R08: ffffba3cc68f73cc R09: ffffba3cc68f7340
Dec 16 11:47:33 f kernel: R10: 0000000000000000 R11: 0000000000000700 R12: ffff95ac5c140000
Dec 16 11:47:33 f kernel: R13: ffff95a8c73cf400 R14: ffff95a8dc400000 R15: 0000000000000001
Dec 16 11:47:33 f kernel: FS: 00007f4dfbfab6c0(0000) GS:ffff95ae43a80000(0000) knlGS:0000000000000000
Dec 16 11:47:33 f kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 16 11:47:33 f kernel: CR2: 00007ff13a9aaf30 CR3: 000000016ca42000 CR4: 0000000000f50ef0
Dec 16 11:47:33 f kernel: PKRU: 55555554
Dec 16 11:47:33 f kernel: Call Trace:
Dec 16 11:47:33 f kernel:
Dec 16 11:47:33 f kernel: ? dcn31_program_compbuf_size+0xd1/0x230 [amdgpu]
Dec 16 11:47:33 f kernel: ? __warn.cold+0x93/0xfa
Dec 16 11:47:33 f kernel: ? dcn31_program_compbuf_size+0xd1/0x230 [amdgpu]
Dec 16 11:47:33 f kernel: ? report_bug+0xff/0x140
Dec 16 11:47:33 f kernel: ? handle_bug+0x58/0x90
Dec 16 11:47:33 f kernel: ? exc_invalid_op+0x17/0x70
Dec 16 11:47:33 f kernel: ? asm_exc_invalid_op+0x1a/0x20
Dec 16 11:47:33 f kernel: ? dcn31_program_compbuf_size+0xd1/0x230 [amdgpu]
Dec 16 11:47:33 f kernel: ? dcn31_program_compbuf_size+0xc9/0x230 [amdgpu]
Dec 16 11:47:33 f kernel: dcn20_optimize_bandwidth+0xef/0x250 [amdgpu]
Dec 16 11:47:33 f kernel: dc_commit_state_no_check+0xfab/0x1990 [amdgpu]
Dec 16 11:47:33 f kernel: dc_commit_streams+0x178/0x610 [amdgpu]
Dec 16 11:47:33 f kernel: amdgpu_dm_atomic_commit_tail+0x721/0x4750 [amdgpu]
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? dc_stream_get_scanoutpos+0x8b/0x100 [amdgpu]
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? __pfx_amdgpu_crtc_get_scanout_position+0x10/0x10 [amdgpu]
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? amdgpu_crtc_get_scanout_position+0x28/0x40 [amdgpu]
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? drm_crtc_vblank_helper_get_vblank_timestamp_internal+0x15d/0x390
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? wait_for_completion_timeout+0x13b/0x170
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? drm_crtc_get_last_vbltimestamp+0x53/0x90
Dec 16 11:47:33 f kernel: commit_tail+0xac/0x160
Dec 16 11:47:33 f kernel: drm_atomic_helper_commit+0x11a/0x140
Dec 16 11:47:33 f kernel: drm_atomic_commit+0xa6/0xe0
Dec 16 11:47:33 f kernel: ? __pfx___drm_printfn_info+0x10/0x10
Dec 16 11:47:33 f kernel: drm_mode_atomic_ioctl+0xaaa/0xd00
Dec 16 11:47:33 f kernel: ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
Dec 16 11:47:33 f kernel: drm_ioctl_kernel+0xad/0x100
Dec 16 11:47:33 f kernel: drm_ioctl+0x288/0x540
Dec 16 11:47:33 f kernel: ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
Dec 16 11:47:33 f kernel: amdgpu_drm_ioctl+0x4b/0x80 [amdgpu]
Dec 16 11:47:33 f kernel: __x64_sys_ioctl+0x91/0xd0
Dec 16 11:47:33 f kernel: do_syscall_64+0x82/0x160
Dec 16 11:47:33 f kernel: ? mutex_lock+0x12/0x30
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? drm_mode_createblob_ioctl+0xf1/0x120
Dec 16 11:47:33 f kernel: ? __pfx_drm_mode_createblob_ioctl+0x10/0x10
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? __check_object_size+0x58/0x230
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? drm_ioctl+0x2b7/0x540
Dec 16 11:47:33 f kernel: ? __pfx_drm_mode_createblob_ioctl+0x10/0x10
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? __pm_runtime_suspend+0x69/0xc0
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? amdgpu_drm_ioctl+0x6e/0x80 [amdgpu]
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? syscall_exit_to_user_mode+0x10/0x210
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? do_syscall_64+0x8e/0x160
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? __pm_runtime_suspend+0x69/0xc0
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? amdgpu_drm_ioctl+0x6e/0x80 [amdgpu]
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? syscall_exit_to_user_mode+0x10/0x210
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? do_syscall_64+0x8e/0x160
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? syscall_exit_to_user_mode+0x10/0x210
Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Dec 16 11:47:33 f kernel: ? do_syscall_64+0x8e/0x160
Dec 16 11:47:33 f kernel: ? exc_page_fault+0x7e/0x180
Dec 16 11:47:33 f kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
Dec 16 11:47:33 f kernel: RIP: 0033:0x7f4e256fc5ad
Dec 16 11:47:33 f kernel: Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d >
Dec 16 11:47:33 f kernel: RSP: 002b:00007f4dfbfa9970 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Dec 16 11:47:33 f kernel: RAX: ffffffffffffffda RBX: 00007f4de4039b30 RCX: 00007f4e256fc5ad
Dec 16 11:47:33 f kernel: RDX: 00007f4dfbfa9a10 RSI: 00000000c03864bc RDI: 000000000000000c
Dec 16 11:47:33 f kernel: RBP: 00007f4dfbfa99c0 R08: 00000000000001b0 R09: 0000000000000001
Dec 16 11:47:33 f kernel: R10: 0000000000000013 R11: 0000000000000246 R12: 00007f4dfbfa9a10
Dec 16 11:47:33 f kernel: R13: 00000000c03864bc R14: 000000000000000c R15: 00007f4de40276f0
Dec 16 11:47:33 f kernel:
Dec 16 11:47:33 f kernel: ---[ end trace 0000000000000000 ]---
This message also occurs when the external display is connected:
Dec 16 11:47:34 f kernel: amdgpu 0000:03:00.0: [drm] *ERROR* lttpr_caps phy_repeater_cnt is 0x0, forcing it to 0x80.
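The call trace above is long mostly because of the entries prefixed with "?", which the kernel marks as addresses found on the stack but not confirmed by the unwinder. A short Python sketch like this one can reduce a journal-formatted trace to the confirmed frames, which makes the path from drm_mode_atomic_ioctl down to dcn20_optimize_bandwidth easier to see. It assumes the "kernel: " journal prefix used in the trace posted here, and `reliable_frames` is a hypothetical helper name.

```python
import re

# A frame looks like "symbol+0xoff/0xsize", optionally prefixed with "? ".
FRAME_RE = re.compile(
    r"^(?P<guess>\? )?(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(journal_lines):
    """Return symbols of call-trace frames that are not marked with '?'."""
    frames = []
    for line in journal_lines:
        # Drop the journal prefix ("Dec 16 11:47:33 f kernel: ") if present.
        _, sep, rest = line.partition("kernel: ")
        text = (rest if sep else line).strip()
        m = FRAME_RE.match(text)
        # Lines such as "RIP: 0010:..." or register dumps do not match,
        # because the pattern is anchored at the start of the message.
        if m and not m.group("guess"):
            frames.append(m.group("symbol"))
    return frames


sample_lines = [
    "Dec 16 11:47:33 f kernel: RIP: 0010:dcn31_program_compbuf_size+0xd1/0x230 [amdgpu]",
    "Dec 16 11:47:33 f kernel: ? dcn31_program_compbuf_size+0xc9/0x230 [amdgpu]",
    "Dec 16 11:47:33 f kernel: dcn20_optimize_bandwidth+0xef/0x250 [amdgpu]",
    "Dec 16 11:47:33 f kernel: ? srso_alias_return_thunk+0x5/0xfbef5",
    "Dec 16 11:47:33 f kernel: drm_mode_atomic_ioctl+0xaaa/0xd00",
]
for symbol in reliable_frames(sample_lines):
    print(symbol)
```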
I'm using a PC with a 7700X iGPU and two monitors, one connected via HDMI and the other via DP. I've been observing a dcn30_dpp.c:534 call trace on every boot since the system was upgraded from 6.11.2 to 6.11.3:
In recent months, I started to encounter an even worse problem. The system occasionally (about 2 or 3 times a week) enters an unstable state in which the screens become very slow to respond. Usually the DP screen becomes nearly unusable while the HDMI screen still works. Then after a while the screens freeze completely. I cannot switch to a TTY, but the kernel still responds to magic SysRq. Here is the new log that was produced during the last freeze:
Dec 20 15:54:07 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32794)
Dec 20 15:54:07 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: in process firefox pid 3652 thread firefox:cs0 pid 3733
Dec 20 15:54:07 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: in page starting at address 0x0000000042b70000 from client 0x1b (UTCL2)
Dec 20 15:54:07 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00401430
Dec 20 15:54:07 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
Dec 20 15:54:07 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: MORE_FAULTS: 0x0
Dec 20 15:54:07 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: WALKER_ERROR: 0x0
Dec 20 15:54:07 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: PERMISSION_FAULTS: 0x3
Dec 20 15:54:07 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: MAPPING_ERROR: 0x0
Dec 20 15:54:07 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: RW: 0x0
Dec 20 15:54:17 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: Dumping IP State
Dec 20 15:54:17 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: Dumping IP State Completed
Dec 20 15:54:17 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring gfx_0.1.0 timeout, signaled seq=693277, emitted seq=693279
Dec 20 15:54:17 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: Process information: process kwin_wayland pid 1838 thread kwin_wayla:cs0 pid 1874
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: GPU reset begin!
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: MODE2 reset
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: GPU reset succeeded, trying to resume
Dec 20 15:54:18 cvhc-tomato kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F4FFC00000).
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: PSP is resuming...
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: reserve 0xa00000 from 0xf4fe000000 for PSP TMR
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: RAS: optional ras ta ucode is not available
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: RAP: optional rap ta ucode is not available
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: SMU is resuming...
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: SMU is resumed successfully!
Dec 20 15:54:18 cvhc-tomato kernel: [drm] DMUB hardware initialized: version=0x05001C00
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* lttpr_caps phy_repeater_cnt is 0x0, forcing it to 0x80.
Dec 20 15:54:18 cvhc-tomato kernel: [drm] kiq ring mec 2 pipe 1 q 0
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring gfx_0.1.0 uses VM inv eng 1 on hub 0
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 4 on hub 0
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 5 on hub 0
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 12 on hub 0
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring sdma0 uses VM inv eng 13 on hub 0
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
Dec 20 15:54:18 cvhc-tomato kernel: amdgpu 0000:11:00.0: amdgpu: GPU reset(2) succeeded!
Dec 20 15:54:18 cvhc-tomato kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Dec 20 15:55:40 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
Dec 20 15:57:32 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
Dec 20 15:58:53 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
Dec 20 15:59:11 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
Dec 20 16:00:34 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
Dec 20 16:01:22 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
Dec 20 16:01:45 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
Dec 20 16:02:11 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
Dec 20 16:03:14 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [CRTC:83:crtc-1] flip_done timed out
Dec 20 16:03:14 cvhc-tomato kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:83:crtc-1] hw_done or flip_done timed out
Dec 20 16:03:24 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [CRTC:79:crtc-0] flip_done timed out
Dec 20 16:03:24 cvhc-tomato kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:79:crtc-0] hw_done or flip_done timed out
Dec 20 16:04:00 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* flip_done timed out
Dec 20 16:04:00 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [CRTC:79:crtc-0] commit wait timed out
Dec 20 16:04:10 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* flip_done timed out
Dec 20 16:04:10 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [CRTC:83:crtc-1] commit wait timed out
Dec 20 16:04:20 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* flip_done timed out
Dec 20 16:04:20 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [CONNECTOR:93:HDMI-A-1] commit wait timed out
Dec 20 16:04:30 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* flip_done timed out
Dec 20 16:04:30 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [CONNECTOR:108:DP-2] commit wait timed out
Dec 20 16:04:38 cvhc-tomato kernel: sysrq: Keyboard mode set to system default
Dec 20 16:04:40 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* flip_done timed out
Dec 20 16:04:40 cvhc-tomato kernel: amdgpu 0000:11:00.0: [drm] *ERROR* [PLANE:52:plane-2] commit wait timed out
Dec 20 16:04:48 cvhc-tomato kernel: sysrq: Terminate All Tasks
Same problem here, with the difference that I'm running a laptop connected to a USB-C dock, which disconnects. Also, as mentioned earlier, my laptop's screen doesn't wake up after it has gone to standby. Funnily enough, the screens on my dock do wake up. The errors are similar:
Jan 22 07:46:48 inw-nb-9001 kernel: usb 8-1: USB disconnect, device number 8
Jan 22 07:46:48 inw-nb-9001 kernel: usb 8-1.1: USB disconnect, device number 9
Jan 22 07:46:48 inw-nb-9001 kernel: r8152-cfgselector 8-1.2: USB disconnect, device number 10
Jan 22 07:46:48 inw-nb-9001 kernel: r8152 8-1.2:1.0 enp229s0f3u1u2: Stop submitting intr, status -108
Jan 22 07:46:49 inw-nb-9001 kernel: amdgpu 0000:e4:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn31_program_compbuf_size line:141
Jan 22 07:46:49 inw-nb-9001 kernel: [drm] DM_MST: starting TM on aconnector: 00000000e2164c63 [id: 107]
Jan 22 07:46:49 inw-nb-9001 kernel: [drm] DM_MST: DP12, 2-lane link detected
Jan 22 07:46:49 inw-nb-9001 kernel: usb 8-1: new SuperSpeed USB device number 11 using xhci_hcd
Jan 22 07:46:49 inw-nb-9001 kernel: usb 8-1: New USB device found, idVendor=04b4, idProduct=6504, bcdDevice=50.00
Jan 22 07:46:49 inw-nb-9001 kernel: usb 8-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
Laptop is a HP Elitebook 645 G11 running Fedora 41
CPU: AMD Ryzen 7 7735U with Radeon Graphics
GPU: Radeon 680M
Kernel: Linux 6.12.9-200.fc41.x86_64 x86_64
Edit: After some tests, it turned out the USB disconnects were unrelated: the dock was incompatible with my laptop.