Unrecoverable GPU freeze on Picasso after resume from hibernation (flip_done timed out)
Brief summary of the problem:
amdgpu freezes and stays unrecoverably stuck after the laptop resumes from hibernation. The kernel log shows flip_done timeouts.
Hardware description:
- CPU: AMD Ryzen 7 PRO 3700U w/ Radeon Vega Mobile Gfx
- GPU: Picasso
- System Memory: 16 GB
- Display(s): Built-in panel of the Lenovo ThinkPad T495s 20NK000GMZ
- Type of Display Connection: eDP
System information:
- Distro name and Version: Arch Linux
- Kernel version: 5.12.12
- AMD package version: No package
How to reproduce the issue:
- hibernate the laptop
- leave it hibernated for approximately 10 hours
- attempt to resume
Attached files:
Kernel log (dmesg) from my journal:
Jun 28 10:16:21 archbook kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:55:plane-3] flip_done timed out
Jun 28 10:16:21 archbook kernel: ------------[ cut here ]------------
Jun 28 10:16:21 archbook kernel: WARNING: CPU: 5 PID: 1 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7960 amdgpu_dm_atomic_commit_tail+0x25c1/0x2630 [amdgpu]
Jun 28 10:16:21 archbook kernel: Modules linked in: udp_diag tcp_diag inet_diag cmac ccm ch341 uas usb_storage iptable_mangle iptable_raw xt_connmark nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_mark ip6table_mangle xt_comment xt_addrtype ip6table_raw ip6_tables joydev mousedev wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libblake2s blake2s_x86_64 libcurve25519_generic uvcvideo libchacha libblake2s_generic ip6_udp_tunnel udp_tunnel videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc iwlmvm amdgpu intel_rapl_msr intel_rapl_common mac80211 edac_mce_amd kvm_amd libarc4 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi kvm iwlwifi snd_hda_intel snd_intel_dspcfg irqbypass snd_intel_sdw_acpi gpu_sched crct10dif_pclmul crc32_pclmul snd_hda_codec i2c_algo_bit ghash_clmulni_intel drm_ttm_helper btusb aesni_intel snd_hda_core btrtl btbcm ttm snd_hwdep btintel crypto_simd cryptd snd_rn_pci_acp3x bluetooth nls_iso8859_1 rapl
Jun 28 10:16:21 archbook kernel: drm_kms_helper snd_pcm cfg80211 wmi_bmof snd_pci_acp3x k10temp sp5100_tco ecdh_generic cec tpm_crb psmouse ecc i2c_piix4 syscopyarea snd_timer sysfillrect ccp sysimgblt fb_sys_fops r8169 thinkpad_acpi ipmi_devintf realtek ucsi_acpi platform_profile tpm_tis mdio_devres ledtrig_audio typec_ucsi tpm_tis_core ipmi_msghandler libphy snd tpm typec rfkill rng_core roles soundcore mac_hid pinctrl_amd i2c_scmi acpi_cpufreq drm crypto_user fuse acpi_call(OE) agpgart ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 serio_raw atkbd libps2 crc32c_intel sdhci_pci cqhci sdhci xhci_pci mmc_core xhci_pci_renesas wmi i8042 serio video vfat fat
Jun 28 10:16:21 archbook kernel: CPU: 5 PID: 1 Comm: systemd Tainted: G W OE 5.12.12-arch1-1 #1
Jun 28 10:16:21 archbook kernel: Hardware name: LENOVO 20NK000GMZ/20NK000GMZ, BIOS R12ET49W(1.19 ) 01/06/2020
Jun 28 10:16:21 archbook kernel: RIP: 0010:amdgpu_dm_atomic_commit_tail+0x25c1/0x2630 [amdgpu]
Jun 28 10:16:21 archbook kernel: Code: ff ff 01 c7 85 1c fd ff ff 37 00 00 00 c7 85 24 fd ff ff 20 00 00 00 e8 fd 04 13 00 e9 05 fb ff ff 0f 0b e9 30 f9 ff ff 0f 0b <0f> 0b e9 a0 f9 ff ff 0f 0b e9 b9 f9 ff ff 49 8b 06 41 0f b6 8e 2d
Jun 28 10:16:21 archbook kernel: RSP: 0018:ffffb6098006b620 EFLAGS: 00010002
Jun 28 10:16:21 archbook kernel: RAX: 0000000000000002 RBX: 00000000000012e4 RCX: ffff93760f0bb918
Jun 28 10:16:21 archbook kernel: RDX: 0000000000000001 RSI: 0000000000000297 RDI: ffff937614940178
Jun 28 10:16:21 archbook kernel: RBP: ffffb6098006b9b8 R08: 0000000000000005 R09: 0000000000000000
Jun 28 10:16:21 archbook kernel: R10: ffffb6098006b580 R11: ffffb6098006b584 R12: 0000000000000286
Jun 28 10:16:21 archbook kernel: R13: ffff93760f0bb800 R14: ffff937625825e00 R15: ffff9378ae0ef680
Jun 28 10:16:21 archbook kernel: FS: 00007f25dac4ca40(0000) GS:ffff9378b0b40000(0000) knlGS:0000000000000000
Jun 28 10:16:21 archbook kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 28 10:16:21 archbook kernel: CR2: 00007f6f3c002be0 CR3: 0000000102c7c000 CR4: 00000000003506e0
Jun 28 10:16:21 archbook kernel: Call Trace:
Jun 28 10:16:21 archbook kernel: commit_tail+0x94/0x120 [drm_kms_helper]
Jun 28 10:16:21 archbook kernel: drm_atomic_helper_commit+0x113/0x140 [drm_kms_helper]
Jun 28 10:16:21 archbook kernel: drm_client_modeset_commit_atomic+0x1fc/0x240 [drm]
Jun 28 10:16:21 archbook kernel: drm_client_modeset_commit_locked+0x56/0x150 [drm]
Jun 28 10:16:21 archbook kernel: drm_fb_helper_pan_display+0xdc/0x210 [drm_kms_helper]
Jun 28 10:16:21 archbook kernel: fb_pan_display+0x83/0x100
Jun 28 10:16:21 archbook kernel: bit_update_start+0x1a/0x40
Jun 28 10:16:21 archbook kernel: fbcon_switch+0x353/0x4f0
Jun 28 10:16:21 archbook kernel: csi_J+0x24c/0x260
Jun 28 10:16:21 archbook kernel: do_con_write+0x14ad/0x2370
Jun 28 10:16:21 archbook kernel: ? terminate_walk+0x61/0xf0
Jun 28 10:16:21 archbook kernel: con_write+0x10/0x30
Jun 28 10:16:21 archbook kernel: n_tty_write+0x156/0x520
Jun 28 10:16:21 archbook kernel: ? __wake_up_sync_key+0x20/0x20
Jun 28 10:16:21 archbook kernel: file_tty_write.constprop.0+0x1af/0x310
Jun 28 10:16:21 archbook kernel: ? n_tty_poll+0x1f0/0x1f0
Jun 28 10:16:21 archbook kernel: new_sync_write+0x159/0x1f0
Jun 28 10:16:21 archbook kernel: vfs_write+0x1ff/0x290
Jun 28 10:16:21 archbook kernel: ksys_write+0x67/0xe0
Jun 28 10:16:21 archbook kernel: do_syscall_64+0x33/0x40
Jun 28 10:16:21 archbook kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Jun 28 10:16:21 archbook kernel: RIP: 0033:0x7f25db5e86ff
Jun 28 10:16:21 archbook kernel: Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 69 fd ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 bc fd ff ff 48
Jun 28 10:16:21 archbook kernel: RSP: 002b:00007ffca45f7e80 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
Jun 28 10:16:21 archbook kernel: RAX: ffffffffffffffda RBX: 000000000000000a RCX: 00007f25db5e86ff
Jun 28 10:16:21 archbook kernel: RDX: 000000000000000a RSI: 00007f25db953622 RDI: 000000000000002e
Jun 28 10:16:21 archbook kernel: RBP: 00007f25db953622 R08: 0000000000000000 R09: 0000000000000000
Jun 28 10:16:21 archbook kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000002e
Jun 28 10:16:21 archbook kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jun 28 10:16:21 archbook kernel: ---[ end trace e09f70e4b7f9cf19 ]---
Jun 28 10:16:21 archbook kernel: ------------[ cut here ]------------
Jun 28 10:16:21 archbook kernel: WARNING: CPU: 5 PID: 1 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7560 amdgpu_dm_atomic_commit_tail+0x25c8/0x2630 [amdgpu]
Jun 28 10:16:21 archbook kernel: Modules linked in: udp_diag tcp_diag inet_diag cmac ccm ch341 uas usb_storage iptable_mangle iptable_raw xt_connmark nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_mark ip6table_mangle xt_comment xt_addrtype ip6table_raw ip6_tables joydev mousedev wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libblake2s blake2s_x86_64 libcurve25519_generic uvcvideo libchacha libblake2s_generic ip6_udp_tunnel udp_tunnel videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc iwlmvm amdgpu intel_rapl_msr intel_rapl_common mac80211 edac_mce_amd kvm_amd libarc4 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi kvm iwlwifi snd_hda_intel snd_intel_dspcfg irqbypass snd_intel_sdw_acpi gpu_sched crct10dif_pclmul crc32_pclmul snd_hda_codec i2c_algo_bit ghash_clmulni_intel drm_ttm_helper btusb aesni_intel snd_hda_core btrtl btbcm ttm snd_hwdep btintel crypto_simd cryptd snd_rn_pci_acp3x bluetooth nls_iso8859_1 rapl
Jun 28 10:16:21 archbook kernel: drm_kms_helper snd_pcm cfg80211 wmi_bmof snd_pci_acp3x k10temp sp5100_tco ecdh_generic cec tpm_crb psmouse ecc i2c_piix4 syscopyarea snd_timer sysfillrect ccp sysimgblt fb_sys_fops r8169 thinkpad_acpi ipmi_devintf realtek ucsi_acpi platform_profile tpm_tis mdio_devres ledtrig_audio typec_ucsi tpm_tis_core ipmi_msghandler libphy snd tpm typec rfkill rng_core roles soundcore mac_hid pinctrl_amd i2c_scmi acpi_cpufreq drm crypto_user fuse acpi_call(OE) agpgart ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 serio_raw atkbd libps2 crc32c_intel sdhci_pci cqhci sdhci xhci_pci mmc_core xhci_pci_renesas wmi i8042 serio video vfat fat
Jun 28 10:16:21 archbook kernel: CPU: 5 PID: 1 Comm: systemd Tainted: G W OE 5.12.12-arch1-1 #1
Jun 28 10:16:21 archbook kernel: Hardware name: LENOVO 20NK000GMZ/20NK000GMZ, BIOS R12ET49W(1.19 ) 01/06/2020
Jun 28 10:16:21 archbook kernel: RIP: 0010:amdgpu_dm_atomic_commit_tail+0x25c8/0x2630 [amdgpu]
Jun 28 10:16:21 archbook kernel: Code: ff ff 37 00 00 00 c7 85 24 fd ff ff 20 00 00 00 e8 fd 04 13 00 e9 05 fb ff ff 0f 0b e9 30 f9 ff ff 0f 0b 0f 0b e9 a0 f9 ff ff <0f> 0b e9 b9 f9 ff ff 49 8b 06 41 0f b6 8e 2d 01 00 00 48 c7 c6 98
Jun 28 10:16:21 archbook kernel: RSP: 0018:ffffb6098006b620 EFLAGS: 00010086
Jun 28 10:16:21 archbook kernel: RAX: 0000000000000001 RBX: 00000000000012e4 RCX: ffff93760f0bb918
Jun 28 10:16:21 archbook kernel: RDX: 0000000000000001 RSI: 0000000000000297 RDI: ffff937614940178
Jun 28 10:16:21 archbook kernel: RBP: ffffb6098006b9b8 R08: 0000000000000005 R09: 0000000000000000
Jun 28 10:16:21 archbook kernel: R10: ffffb6098006b580 R11: ffffb6098006b584 R12: 0000000000000286
Jun 28 10:16:21 archbook kernel: R13: ffff93760f0bb800 R14: ffff937625825e00 R15: ffff9378ae0ef680
Jun 28 10:16:21 archbook kernel: FS: 00007f25dac4ca40(0000) GS:ffff9378b0b40000(0000) knlGS:0000000000000000
Jun 28 10:16:21 archbook kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 28 10:16:21 archbook kernel: CR2: 00007f6f3c002be0 CR3: 0000000102c7c000 CR4: 00000000003506e0
Jun 28 10:16:21 archbook kernel: Call Trace:
Jun 28 10:16:21 archbook kernel: commit_tail+0x94/0x120 [drm_kms_helper]
Jun 28 10:16:21 archbook kernel: drm_atomic_helper_commit+0x113/0x140 [drm_kms_helper]
Jun 28 10:16:21 archbook kernel: drm_client_modeset_commit_atomic+0x1fc/0x240 [drm]
Jun 28 10:16:21 archbook kernel: drm_client_modeset_commit_locked+0x56/0x150 [drm]
Jun 28 10:16:21 archbook kernel: drm_fb_helper_pan_display+0xdc/0x210 [drm_kms_helper]
Jun 28 10:16:21 archbook kernel: fb_pan_display+0x83/0x100
Jun 28 10:16:21 archbook kernel: bit_update_start+0x1a/0x40
Jun 28 10:16:21 archbook kernel: fbcon_switch+0x353/0x4f0
Jun 28 10:16:21 archbook kernel: csi_J+0x24c/0x260
Jun 28 10:16:21 archbook kernel: do_con_write+0x14ad/0x2370
Jun 28 10:16:21 archbook kernel: ? terminate_walk+0x61/0xf0
Jun 28 10:16:21 archbook kernel: con_write+0x10/0x30
Jun 28 10:16:21 archbook kernel: n_tty_write+0x156/0x520
Jun 28 10:16:21 archbook kernel: ? __wake_up_sync_key+0x20/0x20
Jun 28 10:16:21 archbook kernel: file_tty_write.constprop.0+0x1af/0x310
Jun 28 10:16:21 archbook kernel: ? n_tty_poll+0x1f0/0x1f0
Jun 28 10:16:21 archbook kernel: new_sync_write+0x159/0x1f0
Jun 28 10:16:21 archbook kernel: vfs_write+0x1ff/0x290
Jun 28 10:16:21 archbook kernel: ksys_write+0x67/0xe0
Jun 28 10:16:21 archbook kernel: do_syscall_64+0x33/0x40
Jun 28 10:16:21 archbook kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Jun 28 10:16:21 archbook kernel: RIP: 0033:0x7f25db5e86ff
Jun 28 10:16:21 archbook kernel: Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 69 fd ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 bc fd ff ff 48
Jun 28 10:16:21 archbook kernel: RSP: 002b:00007ffca45f7e80 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
Jun 28 10:16:21 archbook kernel: RAX: ffffffffffffffda RBX: 000000000000000a RCX: 00007f25db5e86ff
Jun 28 10:16:21 archbook kernel: RDX: 000000000000000a RSI: 00007f25db953622 RDI: 000000000000002e
Jun 28 10:16:21 archbook kernel: RBP: 00007f25db953622 R08: 0000000000000000 R09: 0000000000000000
Jun 28 10:16:21 archbook kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000002e
Jun 28 10:16:21 archbook kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jun 28 10:16:21 archbook kernel: ---[ end trace e09f70e4b7f9cf1a ]---
Jun 28 10:16:31 archbook kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:67:crtc-0] flip_done timed out
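
To pull only the flip_done timeouts out of a long journal dump like the one above, something along these lines may help. It is only a sketch: the function name `find_flip_timeouts` is made up for illustration, and the regular expression is based solely on the two error lines in this log, so other kernel versions may format the message differently.

```python
import re

# Pattern derived from the two "*ERROR* ... flip_done timed out" lines in this
# report; it is an assumption that other kernels use the same wording.
FLIP_RE = re.compile(
    r"\*ERROR\* \[(?P<kind>PLANE|CRTC):(?P<id>\d+):(?P<name>[^\]]+)\] flip_done timed out"
)


def find_flip_timeouts(lines):
    """Return (kind, object id, object name) for every flip_done timeout line."""
    hits = []
    for line in lines:
        match = FLIP_RE.search(line)
        if match:
            hits.append((match.group("kind"), int(match.group("id")), match.group("name")))
    return hits


# The two error lines from the log above, copied verbatim.
sample = [
    "Jun 28 10:16:21 archbook kernel: [drm:drm_atomic_helper_wait_for_dependencies "
    "[drm_kms_helper]] *ERROR* [PLANE:55:plane-3] flip_done timed out",
    "Jun 28 10:16:31 archbook kernel: [drm:drm_atomic_helper_wait_for_flip_done "
    "[drm_kms_helper]] *ERROR* [CRTC:67:crtc-0] flip_done timed out",
]

for kind, obj_id, name in find_flip_timeouts(sample):
    print(kind, obj_id, name)
```

Fed the full journal text line by line, this reports which plane and CRTC timed out without having to scroll through the register dumps and module lists.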