[Navi] GPU hang when doing nothing.
Brief summary of the problem:
The GPU hung for unknown reasons while the computer was locked and idle. I woke up this morning to find the system completely hung. According to the logs, the fault occurred shortly after I locked the computer last night.
Hardware description:
- CPU: Ryzen 3900X
- GPU: Navi 5700XT
- System Memory: 32GB
- Display(s): 2x 1440p@144 Hz, 1x 4k@60
- Type of Display Connection: 3x DP 1.4
System information:
- Distro name and Version: Arch
- Kernel version: 5.7.5
- Custom kernel: No
- AMD package version: None
Extra info:
- DE/WM: sway version v1.5-rc2-ea3ba203
How to reproduce the issue:
Unknown. If I learn how to reproduce this, I will attempt to do so.
Attached files:
Jul 07 22:21:01 mami kernel: general protection fault, probably for non-canonical address 0xae18417d2d3224cd: 0000 [#1] PREEMPT SMP NOPTI
Jul 07 22:21:01 mami kernel: CPU: 20 PID: 1415 Comm: kworker/u64:24 Not tainted 5.7.5-arch1-1 #1
Jul 07 22:21:01 mami kernel: Hardware name: System manufacturer System Product Name/Pro WS X570-ACE, BIOS 1201 11/18/2019
Jul 07 22:21:01 mami kernel: Workqueue: events_unbound commit_work [drm_kms_helper]
Jul 07 22:21:01 mami kernel: RIP: 0010:amdgpu_dm_atomic_commit_tail+0x2aa/0x2310 [amdgpu]
Jul 07 22:21:01 mami kernel: Code: 4f 08 8b 81 e0 02 00 00 41 83 c5 01 44 39 e8 0f 87 46 ff ff ff 48 83 bd f0 fc ff ff 00 0f 84 03 01 00 00 48 8b bd f0 fc ff ff <80> bf b0 01 00 00 01 0f 86 ac 00 00 00 48 b9 00 00 00 00 01 00 00
Jul 07 22:21:01 mami kernel: RSP: 0018:ffffb2f8c1693af8 EFLAGS: 00010282
Jul 07 22:21:01 mami kernel: RAX: 0000000000000006 RBX: ffffa01b9c546800 RCX: ffffa01bda907000
Jul 07 22:21:01 mami kernel: RDX: ffffa01bc6f07000 RSI: ffffffffc14a11e0 RDI: ae18417d2d3224cd
Jul 07 22:21:01 mami kernel: RBP: ffffb2f8c1693e60 R08: 0000000000000001 R09: 0000000000000001
Jul 07 22:21:01 mami kernel: R10: 0000000000000018 R11: 0000000000000018 R12: 0000000000000000
Jul 07 22:21:01 mami kernel: R13: 0000000000000006 R14: ffffa01b9c541000 R15: ffffa01b64b69480
Jul 07 22:21:01 mami kernel: FS: 0000000000000000(0000) GS:ffffa01bdef00000(0000) knlGS:0000000000000000
Jul 07 22:21:01 mami kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 07 22:21:01 mami kernel: CR2: 00007f80a1248000 CR3: 00000007ed6a6000 CR4: 0000000000340ee0
Jul 07 22:21:01 mami kernel: Call Trace:
Jul 07 22:21:01 mami kernel: ? prep_new_page+0x8a/0xb0
Jul 07 22:21:01 mami kernel: commit_tail+0x94/0x130 [drm_kms_helper]
Jul 07 22:21:01 mami kernel: process_one_work+0x1da/0x3d0
Jul 07 22:21:01 mami kernel: worker_thread+0x4d/0x3e0
Jul 07 22:21:01 mami kernel: ? rescuer_thread+0x3f0/0x3f0
Jul 07 22:21:01 mami kernel: kthread+0x13e/0x160
Jul 07 22:21:01 mami kernel: ? __kthread_bind_mask+0x60/0x60
Jul 07 22:21:01 mami kernel: ret_from_fork+0x22/0x40
Jul 07 22:21:01 mami kernel: Modules linked in: xfs cfg80211 rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache nct6775 hwmon_vid raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c md_mod amdgpu sunrpc snd_hda_codec_realtek wireguard curve25519_x86_64 snd_hda_codec_generic libchacha20poly1305 chacha_x86_64 ledtrig_audio poly1305_x86_64 snd_hda_codec_hdmi libblake2s nls_iso8859_1 blake2s_x86_64 nls_cp437 ip6_udp_tunnel udp_tunnel gpu_sched libcurve25519_generic vfat libchacha snd_hda_intel fat ttm libblake2s_generic snd_intel_dspcfg drm_kms_helper snd_hda_codec edac_mce_amd kvm_amd snd_hda_core snd_hwdep cec eeepc_wmi joydev kvm snd_pcm asus_wmi rc_core snd_timer snd battery syscopyarea sp5100_tco sparse_keymap sysfillrect mousedev rfkill k10temp sysimgblt wmi_bmof mxm_wmi i2c_piix4 irqbypass input_leds pcspkr fb_sys_fops soundcore tpm_crb tpm_tis evdev tpm_tis_core mac_hid tpm acpi_cpufreq drm agpgart ip_tables x_tables ext4 crc32c_generic crc16
Jul 07 22:21:01 mami kernel: mbcache jbd2 dm_crypt hid_logitech_hidpp hid_logitech_dj hid_generic usbhid hid crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ccp glue_helper xhci_pci crypto_simd cryptd igb rng_core xhci_hcd sr_mod cdrom i2c_algo_bit dca wmi pinctrl_amd dm_mirror dm_region_hash dm_log dm_mod xpad ff_memless sg bonding
Jul 07 22:21:01 mami kernel: ---[ end trace 70c241f6f08723e9 ]---
Jul 07 22:21:01 mami kernel: RIP: 0010:amdgpu_dm_atomic_commit_tail+0x2aa/0x2310 [amdgpu]
Jul 07 22:21:01 mami kernel: Code: 4f 08 8b 81 e0 02 00 00 41 83 c5 01 44 39 e8 0f 87 46 ff ff ff 48 83 bd f0 fc ff ff 00 0f 84 03 01 00 00 48 8b bd f0 fc ff ff <80> bf b0 01 00 00 01 0f 86 ac 00 00 00 48 b9 00 00 00 00 01 00 00
Jul 07 22:21:01 mami kernel: RSP: 0018:ffffb2f8c1693af8 EFLAGS: 00010282
Jul 07 22:21:01 mami kernel: RAX: 0000000000000006 RBX: ffffa01b9c546800 RCX: ffffa01bda907000
Jul 07 22:21:01 mami kernel: RDX: ffffa01bc6f07000 RSI: ffffffffc14a11e0 RDI: ae18417d2d3224cd
Jul 07 22:21:01 mami kernel: RBP: ffffb2f8c1693e60 R08: 0000000000000001 R09: 0000000000000001
Jul 07 22:21:01 mami kernel: R10: 0000000000000018 R11: 0000000000000018 R12: 0000000000000000
Jul 07 22:21:01 mami kernel: R13: 0000000000000006 R14: ffffa01b9c541000 R15: ffffa01b64b69480
Jul 07 22:21:01 mami kernel: FS: 0000000000000000(0000) GS:ffffa01bdef00000(0000) knlGS:0000000000000000
Jul 07 22:21:01 mami kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 07 22:21:01 mami kernel: CR2: 00007f80a1248000 CR3: 00000007ed6a6000 CR4: 0000000000340ee0
Jul 08 08:03:15 mami kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:64:crtc-1] hw_done or flip_done timed out
Jul 08 08:05:23 mami kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:64:crtc-1] hw_done or flip_done timed out
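Note on the faulting address: the oops says the fault is "probably for non-canonical address 0xae18417d2d3224cd", which is the value in RDI, and the marked instruction bytes (<80> bf b0 01 00 00 01) decode to a byte compare at offset 0x1b0 from RDI. That suggests a garbage pointer being dereferenced in amdgpu_dm_atomic_commit_tail, though I have not confirmed this against the source. Below is a minimal sketch that checks whether an address is canonical. It assumes 4-level paging (48-bit virtual addresses); the is_canonical helper is my own illustration, not something from the kernel.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if addr is a canonical x86-64 virtual address.

    Assumption: 4-level paging, i.e. 48-bit virtual addresses, so
    bits 63..47 must all be copies of bit 47 (all zeros or all ones).
    """
    top = addr >> (va_bits - 1)               # bits 63..47 of the address
    all_ones = (1 << (64 - va_bits + 1)) - 1  # 17 one-bits when va_bits is 48
    return top == 0 or top == all_ones


rdi = 0xAE18417D2D3224CD  # RDI from the oops: the address that faulted
rbx = 0xFFFFA01B9C546800  # RBX from the oops: looks like a normal kernel pointer

print(hex(rdi), is_canonical(rdi))
print(hex(rbx), is_canonical(rbx))
```

With these values the check reports RDI as non-canonical and RBX as canonical, which matches what the kernel message says about the faulting address.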