Lenovo z16 random amdgpu hangs [amdgpu 0000:67:00.0: [drm] *ERROR* [CONNECTOR:78:eDP-1] commit wait timed out]
Description:
I'm experiencing random (possibly GPU-related?) crashes that freeze the graphics. Sometimes I can access a TTY, sometimes it recovers after waiting a few minutes, and sometimes it requires a hard reboot.
I haven't tracked down an exact way to trigger the crash/freeze. It does seem to happen more frequently when watching a video or using a video application like Zoom. That said, it has also happened while just doing simple web browsing and while copying files off a USB disk. I'm currently not using any external monitors. Both the BIOS and AMD microcode are up to date.
I haven't yet found a reliable way to make my system completely stable, but using ryzenadj --max-performance does seem to help significantly. Otherwise, different combinations of kernel params listed in other bug tickets don't seem to help.
System Info:
OS: Arch
kernel: 6.2.1-arch1-1, 6.2.2-arch1-1
Hardware
model: Lenovo z16
RAM: 32GB
CPU: AMD Ryzen 9 PRO 6950H with Radeon Graphics
GPU: Radeon 680M
Logs
Starts with:
Mar 04 07:31:37 ethics-gradient kernel: amdgpu 0000:67:00.0: [drm] *ERROR* [CRTC:67:crtc-0] flip_done timed out
Mar 04 07:32:25 ethics-gradient kernel: amdgpu 0000:67:00.0: [drm] *ERROR* flip_done timed out
Mar 04 07:32:25 ethics-gradient kernel: amdgpu 0000:67:00.0: [drm] *ERROR* [CRTC:67:crtc-0] commit wait timed out
Mar 04 07:32:35 ethics-gradient kernel: amdgpu 0000:67:00.0: [drm] *ERROR* flip_done timed out
Mar 04 07:32:35 ethics-gradient kernel: amdgpu 0000:67:00.0: [drm] *ERROR* [CONNECTOR:78:eDP-1] commit wait timed out
Mar 04 07:32:45 ethics-gradient kernel: amdgpu 0000:67:00.0: [drm] *ERROR* flip_done timed out
Mar 04 07:32:45 ethics-gradient kernel: amdgpu 0000:67:00.0: [drm] *ERROR* [PLANE:55:plane-3] commit wait timed out
Mar 04 07:32:55 ethics-gradient kernel: amdgpu 0000:67:00.0: [drm] *ERROR* flip_done timed out
Mar 04 07:32:55 ethics-gradient kernel: amdgpu 0000:67:00.0: [drm] *ERROR* [PLANE:65:plane-5] commit wait timed out
Mar 04 07:32:55 ethics-gradient kernel: WARNING: CPU: 10 PID: 1089 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7893 amdgpu_dm_atomic_commit_tail+0x2c87/0x2d00 [amdgpu]
Mar 04 07:32:55 ethics-gradient kernel: Modules linked in: algif_hash af_alg snd_seq_dummy snd_hrtimer snd_seq snd_seq_device ccm michael_mic nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib ses nft_reject_inet enclosure nf_reject_ipv4 scsi_transport_sas nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables libcrc32c nfnetlink uas btusb btrtl btbcm uvcvideo btintel btmtk videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 bluetooth videodev snd_soc_dmic snd_soc_acp6x_mach snd_acp6x_pdm_dma vfat videobuf2_common qrtr_mhi snd_sof_amd_rembrandt snd_ctl_led ecdh_generic mc usb_storage fat snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp qrtr snd_sof amdgpu ath11k_pci snd_sof_utils ath11k snd_soc_core intel_rapl_msr qmi_helpers snd_compress snd_hda_codec_realtek snd_hda_codec_generic ac97_bus intel_rapl_common snd_hda_codec_hdmi snd_pcm_dmaengine edac_mce_amd mac80211 snd_pci_ps snd_hda_intel snd_rpl_pci_acp6x snd_intel_dspcfg snd_acp_pci
Mar 04 07:32:55 ethics-gradient kernel: snd_hda_scodec_cs35l41_spi snd_intel_sdw_acpi drm_buddy kvm_amd snd_hda_codec libarc4 gpu_sched snd_pci_acp6x snd_hda_core drm_ttm_helper kvm snd_pci_acp5x snd_hwdep cfg80211 ttm snd_rn_pci_acp3x think_lmi snd_hda_scodec_cs35l41_i2c thinkpad_acpi snd_hda_scodec_cs35l41 irqbypass drm_display_helper psmouse pcspkr firmware_attributes_class snd_pcm snd_acp_config snd_hda_cs_dsp_ctls wmi_bmof ledtrig_audio snd_soc_acpi cs_dsp rapl cec k10temp snd_timer snd_soc_cs35l41_lib rfkill snd_pci_acp3x thunderbolt mhi i2c_piix4 snd soundcore amd_pmf platform_profile serial_multi_instantiate amd_pmc acpi_cpufreq acpi_tad joydev mousedev mac_hid fuse loop dm_mod ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_sensor_hub crct10dif_pclmul crc32_pclmul crc32c_intel nvme sdhci_pci polyval_clmulni cqhci polyval_generic ucsi_acpi wacom serio_raw gf128mul nvme_core sdhci typec_ucsi video atkbd ghash_clmulni_intel usbhid hid_multitouch libps2 vivaldi_fmap sha512_ssse3 aesni_intel
Mar 04 07:32:55 ethics-gradient kernel: crypto_simd cryptd roles xhci_pci mmc_core ccp sp5100_tco amd_sfh xhci_pci_renesas nvme_common i8042 typec serio wmi i2c_hid_acpi i2c_hid pkcs8_key_parser crypto_user
Mar 04 07:32:55 ethics-gradient kernel: CPU: 10 PID: 1089 Comm: systemd-logind Not tainted 6.2.1-arch1-1 #1 826b345887e8fd845ab37a52cb3a6655383f6b60
Mar 04 07:32:55 ethics-gradient kernel: Hardware name: LENOVO 21D4000KUS/21D4000KUS, BIOS N3GET47W (1.27 ) 12/08/2022
Mar 04 07:32:55 ethics-gradient kernel: RIP: 0010:amdgpu_dm_atomic_commit_tail+0x2c87/0x2d00 [amdgpu]
Mar 04 07:32:55 ethics-gradient kernel: Code: 00 50 e8 8c bf 3d d4 4c 8b 9d 08 fd ff ff 48 83 c4 18 83 bd 10 fd ff ff 02 77 26 c7 85 10 fd ff ff 02 00 00 00 e9 f4 fd ff ff <0f> 0b e9 c6 f5 ff ff 0f 0b e9 55 f5 ff ff 0f 0b 0f 0b e9 d5 f5 ff
Mar 04 07:32:55 ethics-gradient kernel: RSP: 0018:ffffb06d01b53528 EFLAGS: 00010002
Mar 04 07:32:55 ethics-gradient kernel: RAX: 0000000000000286 RBX: 0000000000000286 RCX: 0000000000000020
Mar 04 07:32:55 ethics-gradient kernel: RDX: 0000000000000001 RSI: 0000000000000297 RDI: ffff9a3465c40178
Mar 04 07:32:55 ethics-gradient kernel: RBP: ffffb06d01b53888 R08: ffffb06d01b53454 R09: 0000000000000002
Mar 04 07:32:55 ethics-gradient kernel: R10: 0000000000000001 R11: 0000000000000000 R12: ffff9a34657d9118
Mar 04 07:32:55 ethics-gradient kernel: R13: 0000000000000000 R14: ffff9a35880b6e00 R15: ffff9a34657d9000
Mar 04 07:32:55 ethics-gradient kernel: FS: 00007f90e8b804c0(0000) GS:ffff9a3b7e880000(0000) knlGS:0000000000000000
Mar 04 07:32:55 ethics-gradient kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 04 07:32:55 ethics-gradient kernel: CR2: 000026680236afe0 CR3: 000000010ecec000 CR4: 0000000000750ee0
Mar 04 07:32:55 ethics-gradient kernel: PKRU: 55555554
Mar 04 07:32:55 ethics-gradient kernel: Call Trace:
Mar 04 07:32:55 ethics-gradient kernel: <TASK>
Mar 04 07:32:55 ethics-gradient kernel: commit_tail+0x94/0x130
Mar 04 07:32:55 ethics-gradient kernel: drm_atomic_helper_commit+0x116/0x140
Mar 04 07:32:55 ethics-gradient kernel: drm_atomic_commit+0x9a/0xd0
Mar 04 07:32:55 ethics-gradient kernel: ? __pfx___drm_printfn_info+0x10/0x10
Mar 04 07:32:55 ethics-gradient kernel: drm_client_modeset_commit_atomic+0x206/0x250
Mar 04 07:32:55 ethics-gradient kernel: drm_client_modeset_commit_locked+0x5a/0x160
Mar 04 07:32:55 ethics-gradient kernel: drm_fb_helper_set_par+0x7f/0xe0
Mar 04 07:32:55 ethics-gradient kernel: fb_set_var+0x204/0x430
Mar 04 07:32:55 ethics-gradient kernel: ? update_load_avg+0x7e/0x780
Mar 04 07:32:55 ethics-gradient kernel: ? __flush_work.isra.0+0x1aa/0x280
Mar 04 07:32:55 ethics-gradient kernel: ? update_load_avg+0x7e/0x780
Mar 04 07:32:55 ethics-gradient kernel: fbcon_blank+0x213/0x310
Mar 04 07:32:55 ethics-gradient kernel: do_unblank_screen+0xab/0x150
Mar 04 07:32:55 ethics-gradient kernel: complete_change_console+0x54/0x120
Mar 04 07:32:55 ethics-gradient kernel: vt_ioctl+0x10ec/0x13c0
Mar 04 07:32:55 ethics-gradient kernel: ? kernel_termios_to_user_termios_1+0x13/0x20
Mar 04 07:32:55 ethics-gradient kernel: ? tty_mode_ioctl+0x3ae/0x670
Mar 04 07:32:55 ethics-gradient kernel: tty_ioctl+0x292/0x890
Mar 04 07:32:55 ethics-gradient kernel: ? __seccomp_filter+0x32a/0x4f0
Mar 04 07:32:55 ethics-gradient kernel: ? __seccomp_filter+0x32a/0x4f0
Mar 04 07:32:55 ethics-gradient kernel: __x64_sys_ioctl+0x94/0xd0
Mar 04 07:32:55 ethics-gradient kernel: do_syscall_64+0x5f/0x90
Mar 04 07:32:55 ethics-gradient kernel: ? handle_mm_fault+0x103/0x300
Mar 04 07:32:55 ethics-gradient kernel: ? do_user_addr_fault+0x1e0/0x6a0
Mar 04 07:32:55 ethics-gradient kernel: ? exc_page_fault+0x74/0x170
Mar 04 07:32:55 ethics-gradient kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc
Mar 04 07:32:55 ethics-gradient kernel: RIP: 0033:0x7f90e871553f
Mar 04 07:32:55 ethics-gradient kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
Mar 04 07:32:55 ethics-gradient kernel: RSP: 002b:00007ffd53d582e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Mar 04 07:32:55 ethics-gradient kernel: RAX: ffffffffffffffda RBX: 0000000000000018 RCX: 00007f90e871553f
Mar 04 07:32:55 ethics-gradient kernel: RDX: 0000000000000001 RSI: 0000000000005605 RDI: 0000000000000018
Mar 04 07:32:55 ethics-gradient kernel: RBP: 0000000000000000 R08: 00007ffd53d582e0 R09: 00007ffd53d58320
Mar 04 07:32:55 ethics-gradient kernel: R10: 0000562da904c090 R11: 0000000000000246 R12: 0000562da904ad20
Mar 04 07:32:55 ethics-gradient kernel: R13: 00007ffd53d583b8 R14: 00007ffd53d583c0 R15: 0000562da904ad20
Mar 04 07:32:55 ethics-gradient kernel: </TASK>
Mar 04 07:32:55 ethics-gradient kernel: ---[ end trace 0000000000000000 ]---
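
In case it helps anyone comparing against their own logs, here is a rough sketch for tallying the amdgpu DRM errors by object type and message. It assumes journal lines in the same format as the ones above; the function name `summarize_drm_errors` and the regular expression are my own for illustration and are not part of any tool.

```python
import re
from collections import Counter

# Matches amdgpu DRM error lines in the journal format shown above.
# The object tag (e.g. [CRTC:67:crtc-0]) is optional, since some lines omit it.
# Assumption: this pattern is hand-written for these logs, not taken from any tool.
DRM_ERROR = re.compile(
    r"amdgpu (?P<pci>[0-9a-f:.]+): \[drm\] \*ERROR\* "
    r"(?:\[(?P<kind>[A-Z]+):(?P<id>\d+):(?P<name>[^\]]+)\] )?"
    r"(?P<msg>.+)$"
)


def summarize_drm_errors(lines):
    """Count (object kind, message) pairs across amdgpu DRM error lines."""
    counts = Counter()
    for line in lines:
        m = DRM_ERROR.search(line)
        if m:
            # Lines without an object tag are grouped under "-".
            counts[(m.group("kind") or "-", m.group("msg").strip())] += 1
    return counts


# First three error lines from the log above, used as sample input.
sample = [
    "Mar 04 07:31:37 ethics-gradient kernel: amdgpu 0000:67:00.0: [drm] *ERROR* [CRTC:67:crtc-0] flip_done timed out",
    "Mar 04 07:32:25 ethics-gradient kernel: amdgpu 0000:67:00.0: [drm] *ERROR* flip_done timed out",
    "Mar 04 07:32:25 ethics-gradient kernel: amdgpu 0000:67:00.0: [drm] *ERROR* [CRTC:67:crtc-0] commit wait timed out",
]

if __name__ == "__main__":
    for (kind, msg), n in summarize_drm_errors(sample).items():
        print(f"{n:3d}  {kind:10s} {msg}")
```

The same function should work on any iterable of lines, for example the saved output of `journalctl -k` read from a file.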