[amdgpu] kernel crash when trying to suspend or resume (black screen)
Brief summary of the problem:
I suspended the system as I do every day, but the computer kept running (wasting energy); only the screen went black. When I tried to resume the system the next day, the screen stayed black.
Only about 80% of RAM was in use, so this is somewhat different from #2635, which otherwise showed similar behavior.
Frequency: Rare (twice in the last few weeks)
I can't remember having such issues with kernel 6.6.6, so this may be a new regression in 6.7.0.
Hardware description:
- CPU: Intel i9-9900K
- GPU: AMD Radeon RX 6600 XT (Navi 23)
- System Memory: G.Skill Ripjaws 32 GB DDR4 RAM (3200 MHz)
- Display(s): Dell UP2716D 27" 2.5K IPS LED Display
- Type of Display Connection: DP
System information:
- Distro name and Version: NixOS 23.11.3019.8bf65f17d807
- Kernel version: 6.7.0
- Custom kernel: no
- AMD official driver version: none (using the in-kernel amdgpu driver)
How to reproduce the issue:
- suspend
- resume
Attached files:
Log files (for system lockups / game freezes / crashes)
system journal:
suspend
Feb 24 01:11:40 gaming ModemManager[2010]: <info> [sleep-monitor-systemd] system is about to suspend
Feb 24 01:11:40 gaming systemd[1]: Starting Network Manager Script Dispatcher Service...
Feb 24 01:11:40 gaming dbus-daemon[1200]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Feb 24 01:11:40 gaming systemd[1]: Started Network Manager Script Dispatcher Service.
Feb 24 01:11:40 gaming avahi-daemon[1197]: Withdrawing address record for fe80::9528:9ad5:bd32:243f on enp6s0.
Feb 24 01:11:40 gaming avahi-daemon[1197]: Leaving mDNS multicast group on interface enp6s0.IPv6 with address fe80::9528:9ad5:bd32:243f.
Feb 24 01:11:40 gaming avahi-daemon[1197]: Interface enp6s0.IPv6 no longer relevant for mDNS.
Feb 24 01:11:40 gaming avahi-daemon[1197]: Withdrawing address record for 10.0.4.12 on enp6s0.
Feb 24 01:11:40 gaming avahi-daemon[1197]: Leaving mDNS multicast group on interface enp6s0.IPv4 with address 10.0.4.12.
Feb 24 01:11:40 gaming avahi-daemon[1197]: Interface enp6s0.IPv4 no longer relevant for mDNS.
Feb 24 01:11:40 gaming systemd-resolved[1068]: enp6s0: Bus client reset search domain list.
Feb 24 01:11:40 gaming systemd-resolved[1068]: enp6s0: Bus client set default route setting: no
Feb 24 01:11:40 gaming systemd-resolved[1068]: enp6s0: Bus client reset DNS server list.
Feb 24 01:11:40 gaming systemd-resolved[1068]: Switching to fallback DNS server 1.1.1.1#cloudflare-dns.com.
Feb 24 01:11:40 gaming kernel: r8169 0000:06:00.0 enp6s0: Link is Down
Feb 24 01:11:40 gaming systemd[1]: Starting Pre-Sleep Actions...
Feb 24 01:11:40 gaming systemd[1]: pre-sleep.service: Deactivated successfully.
Feb 24 01:11:40 gaming systemd[1]: Finished Pre-Sleep Actions.
Feb 24 01:11:40 gaming systemd[1]: Reached target Sleep.
Feb 24 01:11:40 gaming systemd[1]: Starting Restart Syncthing after resume...
Feb 24 01:11:40 gaming systemd[1]: Starting System Suspend...
Feb 24 01:11:40 gaming systemd-sleep[490361]: Entering sleep state 'suspend'...
Feb 24 01:11:40 gaming kernel: PM: suspend entry (deep)
Feb 24 01:11:40 gaming kernel: Filesystems sync: 0.026 seconds
Feb 24 01:11:41 gaming kernel: Freezing user space processes
Feb 24 01:11:41 gaming kernel: Freezing user space processes completed (elapsed 0.471 seconds)
Feb 24 01:11:41 gaming kernel: OOM killer disabled.
Feb 24 01:11:41 gaming kernel: Freezing remaining freezable tasks
Feb 24 01:11:41 gaming kernel: Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
Feb 24 01:11:41 gaming kernel: printk: Suspending console(s) (use no_console_suspend to debug)
Feb 24 01:11:41 gaming kernel: systemd-sleep: vmalloc error: size 0, failed to allocate pages, mode:0xdc2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_ZERO), nodemask=(null),cpuset=systemd-suspend.service,mems_allowed=0
Feb 24 01:11:41 gaming kernel: CPU: 0 PID: 490361 Comm: systemd-sleep Not tainted 6.7.0 #1-NixOS
Feb 24 01:11:41 gaming kernel: Hardware name: Gigabyte Technology Co., Ltd. Z390 UD/Z390 UD, BIOS F10 11/05/2021
Feb 24 01:11:41 gaming kernel: Call Trace:
Feb 24 01:11:41 gaming kernel: <TASK>
Feb 24 01:11:41 gaming kernel: dump_stack_lvl+0x47/0x60
Feb 24 01:11:41 gaming kernel: warn_alloc+0x165/0x1e0
Feb 24 01:11:41 gaming kernel: ? alloc_pages_mpol+0x95/0x1f0
Feb 24 01:11:41 gaming kernel: __vmalloc_node_range+0x832/0x8b0
Feb 24 01:11:41 gaming kernel: ? ttm_sg_tt_init+0x81/0xb0 [ttm]
Feb 24 01:11:41 gaming kernel: kvmalloc_node+0xa6/0xd0
Feb 24 01:11:41 gaming kernel: ? ttm_sg_tt_init+0x81/0xb0 [ttm]
Feb 24 01:11:41 gaming kernel: ttm_sg_tt_init+0x81/0xb0 [ttm]
Feb 24 01:11:41 gaming kernel: amdgpu_ttm_tt_create+0xd1/0xf0 [amdgpu]
Feb 24 01:11:41 gaming kernel: ttm_tt_create+0x67/0xa0 [ttm]
Feb 24 01:11:41 gaming kernel: ttm_bo_handle_move_mem+0x77/0x170 [ttm]
Feb 24 01:11:41 gaming kernel: ttm_mem_evict_first+0x20f/0x550 [ttm]
Feb 24 01:11:41 gaming kernel: ttm_resource_manager_evict_all+0xa7/0x1d0 [ttm]
Feb 24 01:11:41 gaming kernel: ? __pfx_pci_pm_prepare+0x10/0x10
Feb 24 01:11:41 gaming kernel: amdgpu_device_prepare+0x4e/0xd0 [amdgpu]
Feb 24 01:11:41 gaming kernel: pci_pm_prepare+0x31/0x70
Feb 24 01:11:41 gaming kernel: dpm_prepare+0x266/0x440
Feb 24 01:11:41 gaming kernel: dpm_suspend_start+0x1e/0x90
Feb 24 01:11:41 gaming kernel: suspend_devices_and_enter+0x165/0x960
Feb 24 01:11:41 gaming kernel: pm_suspend+0x25e/0x590
Feb 24 01:11:41 gaming kernel: state_store+0x6c/0xd0
Feb 24 01:11:41 gaming kernel: kernfs_fop_write_iter+0x11f/0x200
Feb 24 01:11:41 gaming kernel: vfs_write+0x23a/0x400
Feb 24 01:11:41 gaming kernel: ksys_write+0x6f/0xf0
Feb 24 01:11:41 gaming kernel: do_syscall_64+0x43/0xf0
Feb 24 01:11:41 gaming kernel: entry_SYSCALL_64_after_hwframe+0x6f/0x77
Feb 24 01:11:41 gaming kernel: RIP: 0033:0x7f2af6118e34
Feb 24 01:11:41 gaming kernel: Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 35 05 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
Feb 24 01:11:41 gaming kernel: RSP: 002b:00007ffc8c61b7a8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
Feb 24 01:11:41 gaming kernel: RAX: ffffffffffffffda RBX: 00007ffc8c61b880 RCX: 00007f2af6118e34
Feb 24 01:11:41 gaming kernel: RDX: 0000000000000004 RSI: 00007ffc8c61b880 RDI: 0000000000000004
Feb 24 01:11:41 gaming kernel: RBP: 0000000000000004 R08: 0000000000000005 R09: 0000000000000004
Feb 24 01:11:41 gaming kernel: R10: 000000000000000b R11: 0000000000000202 R12: 0000000000000004
Feb 24 01:11:41 gaming kernel: R13: 000055a94f609330 R14: 00007f2af61eff20 R15: 00000000fffffff7
Feb 24 01:11:41 gaming kernel: </TASK>
Feb 24 01:11:41 gaming kernel: Mem-Info:
Feb 24 01:11:41 gaming kernel: active_anon:4178420 inactive_anon:1036578 isolated_anon:0
active_file:177527 inactive_file:990101 isolated_file:0
unevictable:158 dirty:31 writeback:0
slab_reclaimable:262697 slab_unreclaimable:171912
mapped:447198 shmem:476992 pagetables:68901
sec_pagetables:0 bounce:0
kernel_misc_reclaimable:0
free:100572 free_pcp:214 free_cma:0
Feb 24 01:11:42 gaming kernel: Node 0 active_anon:16713680kB inactive_anon:4146312kB active_file:710108kB inactive_file:3960404kB unevictable:632kB isolated(anon):0kB isolated(file):0kB mapped:1788792kB dirty:124kB writeback:0kB shmem:1907968kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:0kB writeback_tmp:0kB kernel>
Feb 24 01:11:42 gaming kernel: Node 0 DMA free:11264kB boost:0kB min:28kB low:40kB high:52kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Feb 24 01:11:42 gaming kernel: lowmem_reserve[]: 0 840 31950 31950 31950
Feb 24 01:11:42 gaming kernel: Node 0 DMA32 free:131232kB boost:5244kB min:7020kB low:7880kB high:8740kB reserved_highatomic:0KB active_anon:72292kB inactive_anon:55552kB active_file:208kB inactive_file:331784kB unevictable:0kB writepending:0kB present:979780kB managed:914244kB mlocked:0kB bounce:0kB free_pcp:248kB>
Feb 24 01:11:42 gaming kernel: lowmem_reserve[]: 0 0 31109 31109 31109
Feb 24 01:11:42 gaming kernel: Node 0 Normal free:259792kB boost:194224kB min:259996kB low:291852kB high:323708kB reserved_highatomic:0KB active_anon:16641444kB inactive_anon:4090592kB active_file:709900kB inactive_file:3628620kB unevictable:632kB writepending:124kB present:32473088kB managed:31862240kB mlocked:632>
Feb 24 01:11:42 gaming kernel: lowmem_reserve[]: 0 0 0 0 0
Feb 24 01:11:42 gaming kernel: Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 2*4096kB (M) = 11264kB
Feb 24 01:11:42 gaming kernel: Node 0 DMA32: 2232*4kB (UME) 2310*8kB (UME) 1533*16kB (UME) 1060*32kB (UME) 529*64kB (UME) 90*128kB (UME) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 131232kB
Feb 24 01:11:42 gaming kernel: Node 0 Normal: 16098*4kB (UME) 11717*8kB (UME) 6354*16kB (UME) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 259792kB
Feb 24 01:11:42 gaming kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Feb 24 01:11:42 gaming kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Feb 24 01:11:42 gaming kernel: 1855016 total pagecache pages
Feb 24 01:11:42 gaming kernel: 210396 pages in swap cache
Feb 24 01:11:42 gaming kernel: Free swap = 27227836kB
Feb 24 01:11:42 gaming kernel: Total swap = 33554424kB
Feb 24 01:11:42 gaming kernel: 8367213 pages RAM
Feb 24 01:11:42 gaming kernel: 0 pages HighMem/MovableOnly
Feb 24 01:11:42 gaming kernel: 169252 pages reserved
Feb 24 01:11:42 gaming kernel: 0 pages cma reserved
Feb 24 01:11:42 gaming kernel: [TTM] Failed allocating page table
Feb 24 01:11:42 gaming kernel: [TTM] Buffer eviction failed
Feb 24 01:11:42 gaming kernel: [drm] evicting device resources failed
Feb 24 01:11:42 gaming kernel: amdgpu 0000:03:00.0: PM: device_prepare(): pci_pm_prepare+0x0/0x70 returns -12
Feb 24 01:11:42 gaming kernel: amdgpu 0000:03:00.0: PM: not prepared for power transition: code -12
Feb 24 01:11:42 gaming kernel: PM: Some devices failed to suspend, or early wake event detected
Feb 24 01:11:42 gaming kernel: OOM killer enabled.
Feb 24 01:11:42 gaming kernel: Restarting tasks ... done.
Feb 24 01:11:42 gaming kernel: random: crng reseeded on system resumption
Feb 24 01:11:42 gaming kernel: PM: suspend exit
Feb 24 01:11:42 gaming kernel: PM: suspend entry (s2idle)
Feb 24 01:11:42 gaming bluetoothd[1198]: Controller resume with wake event 0x0
Feb 24 01:11:42 gaming kernel: Filesystems sync: 0.074 seconds
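A note for anyone triaging the suspend log above: although overall RAM usage was around 80%, the per-zone lines show the Normal zone's free memory slightly below its min watermark when the TTM allocation failed. The following is a minimal Python sketch that does this arithmetic on the logged values. The names `ZONE_LINES`, `ZONE_RE` and `zone_headroom` are hypothetical and exist only for this illustration; the numbers are copied from the log. This is only a reading of the logged figures, not a confirmed root cause.

```python
import re

# Zone summary lines copied from the suspend log above, truncated to the
# fields used here (free and min watermark, both in kB).
ZONE_LINES = """\
Node 0 DMA free:11264kB boost:0kB min:28kB low:40kB high:52kB
Node 0 DMA32 free:131232kB boost:5244kB min:7020kB low:7880kB high:8740kB
Node 0 Normal free:259792kB boost:194224kB min:259996kB low:291852kB high:323708kB
"""

# Hypothetical helper regex: captures node, zone name, free, boost and min.
ZONE_RE = re.compile(r"Node (\d+) (\w+) free:(\d+)kB boost:(\d+)kB min:(\d+)kB")


def zone_headroom(text):
    """Return {zone: (free_kB, min_kB, free_kB - min_kB)} for each zone line."""
    result = {}
    for match in ZONE_RE.finditer(text):
        zone = match.group(2)
        free_kb = int(match.group(3))
        min_kb = int(match.group(5))
        result[zone] = (free_kb, min_kb, free_kb - min_kb)
    return result


if __name__ == "__main__":
    for zone, (free_kb, min_kb, headroom) in zone_headroom(ZONE_LINES).items():
        status = "below min" if headroom < 0 else "above min"
        print(f"{zone}: free={free_kb} kB, min={min_kb} kB, "
              f"headroom={headroom} kB ({status})")
```

With the values from this log, only the Normal zone comes out with negative headroom, which may help explain why the eviction failed with -12 despite RAM not being full.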
resume
Feb 24 13:25:19 gaming kernel: Freezing user space processes
Feb 24 13:25:19 gaming kernel: Freezing user space processes completed (elapsed 0.311 seconds)
Feb 24 13:25:19 gaming kernel: OOM killer disabled.
Feb 24 13:25:19 gaming kernel: Freezing remaining freezable tasks
Feb 24 13:25:19 gaming kernel: Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
Feb 24 13:25:19 gaming kernel: printk: Suspending console(s) (use no_console_suspend to debug)
Feb 24 13:25:19 gaming kernel: serial 00:01: disabled
Feb 24 13:25:19 gaming kernel: sd 4:0:0:0: [sda] Synchronizing SCSI cache
Feb 24 13:25:19 gaming kernel: ata5.00: Entering standby power mode
Feb 24 13:25:19 gaming kernel: intel_pch_thermal 0000:00:12.0: CPU-PCH is cool [49C] after 2600 ms delay
Feb 24 13:25:19 gaming kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
Feb 24 13:25:19 gaming kernel: [drm] PSP is resuming...
Feb 24 13:25:19 gaming kernel: serial 00:01: activated
Feb 24 13:25:19 gaming kernel: nvme nvme0: Shutdown timeout set to 8 seconds
Feb 24 13:25:19 gaming kernel: mei_me 0000:00:16.0: running w/o dma ring
Feb 24 13:25:19 gaming kernel: nvme nvme0: 16/0/0 default/read/poll queues
Feb 24 13:25:19 gaming kernel: [drm] reserve 0xa00000 from 0x81fd000000 for PSP TMR
Feb 24 13:25:19 gaming kernel: amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
Feb 24 13:25:19 gaming kernel: amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Feb 24 13:25:19 gaming kernel: amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
Feb 24 13:25:19 gaming kernel: amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x0000000f, smu fw if version = 0x00000013, smu fw program = 0, version = 0x003b2f00 (59.47.0)
Feb 24 13:25:19 gaming kernel: amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
Feb 24 13:25:19 gaming kernel: amdgpu 0000:03:00.0: amdgpu: use vbios provided pptable
Feb 24 13:25:19 gaming kernel: ata3: SATA link down (SStatus 4 SControl 300)
Feb 24 13:25:19 gaming kernel: ata1: SATA link down (SStatus 4 SControl 300)
Feb 24 13:25:19 gaming kernel: ata6: SATA link down (SStatus 4 SControl 300)
Feb 24 13:25:19 gaming kernel: ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb 24 13:25:19 gaming kernel: ata4: SATA link down (SStatus 4 SControl 300)
Feb 24 13:25:19 gaming kernel: ata2: SATA link down (SStatus 4 SControl 300)
Feb 24 13:25:19 gaming kernel: ata5.00: supports DRM functions and may not be fully accessible
Feb 24 13:25:19 gaming kernel: ata5.00: supports DRM functions and may not be fully accessible
Feb 24 13:25:19 gaming kernel: ata5.00: configured for UDMA/133
Feb 24 13:25:19 gaming kernel: amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000036 SMN_C2PMSG_82:0x00000000
Feb 24 13:25:19 gaming kernel: amdgpu 0000:03:00.0: amdgpu: RunDcBtc failed!
Feb 24 13:25:19 gaming kernel: amdgpu 0000:03:00.0: amdgpu: Failed to setup smc hw!
Feb 24 13:25:19 gaming kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <smu> failed -62
Feb 24 13:25:19 gaming kernel: amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_resume failed (-62).
Feb 24 13:25:19 gaming kernel: amdgpu 0000:03:00.0: PM: dpm_run_callback(): pci_pm_resume+0x0/0xe0 returns -62
Feb 24 13:25:19 gaming kernel: amdgpu 0000:03:00.0: PM: failed to resume async: error -62
Feb 24 13:25:19 gaming kernel: OOM killer enabled.
Feb 24 13:25:19 gaming kernel: Restarting tasks ... done.
Feb 24 13:25:19 gaming kernel: random: crng reseeded on system resumption
Feb 24 13:25:19 gaming kernel: PM: suspend exit
Feb 24 13:25:19 gaming kernel: snd_hda_intel 0000:03:00.1: Refused to change power state from D0 to D3hot
Feb 24 13:25:19 gaming rtkit-daemon[1896]: The canary thread is apparently starving. Taking action.
Feb 24 13:25:19 gaming systemd-resolved[1068]: Clock change detected. Flushing caches.
Feb 24 13:25:19 gaming rtkit-daemon[1896]: Demoting known real-time threads.
Feb 24 13:25:19 gaming systemd-sleep[490361]: System returned from sleep state.
Feb 24 13:25:19 gaming rtkit-daemon[1896]: Demoted 0 threads.
Feb 24 13:25:19 gaming systemd[1]: Started Enable internet service.
Feb 24 13:25:19 gaming bluetoothd[1198]: Controller resume with wake event 0x0
Feb 24 13:25:19 gaming systemd[1]: Starting Refresh fwupd metadata and update motd...
Feb 24 13:25:19 gaming systemd[1]: Started Logrotate Service.
Feb 24 13:25:19 gaming systemd[1]: Starting restic-backups-nas.service...
Feb 24 13:25:19 gaming systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.
Feb 24 13:25:19 gaming rtkit-daemon[1896]: Recovering from system lockup, not allowing further RT threads.
Feb 24 13:25:19 gaming systemd-logind[1314]: Operation 'sleep' finished.
Feb 24 13:25:19 gaming ModemManager[2010]: <info> [sleep-monitor-systemd] system is resuming
Feb 24 13:25:22 gaming ModemManager[2010]: <info> [base-manager] couldn't check support for device '/sys/devices/pci0000:00/0000:00:1c.2/0000:06:00.0': not supported by any plugin
Feb 24 13:25:22 gaming kernel: r8169 0000:06:00.0 enp6s0: Link is Up - 1Gbps/Full - flow control rx/tx
Feb 24 13:25:22 gaming avahi-daemon[1197]: Joining mDNS multicast group on interface enp6s0.IPv6 with address fe80::9528:9ad5:bd32:243f.
Feb 24 13:25:22 gaming avahi-daemon[1197]: New relevant interface enp6s0.IPv6 for mDNS.
Feb 24 13:25:22 gaming avahi-daemon[1197]: Registering new address record for fe80::9528:9ad5:bd32:243f on enp6s0.*.
Feb 24 13:25:23 gaming kernel: snd_hda_intel 0000:03:00.1: Refused to change power state from D0 to D3hot
Feb 24 13:25:26 gaming avahi-daemon[1197]: Joining mDNS multicast group on interface enp6s0.IPv4 with address 10.0.4.12.
Feb 24 13:25:26 gaming avahi-daemon[1197]: New relevant interface enp6s0.IPv4 for mDNS.
Feb 24 13:25:26 gaming avahi-daemon[1197]: Registering new address record for 10.0.4.12 on enp6s0.IPv4.
Feb 24 13:25:26 gaming systemd-resolved[1068]: enp6s0: Bus client set search domain list to: lan
Feb 24 13:25:26 gaming systemd-resolved[1068]: enp6s0: Bus client set default route setting: yes
Feb 24 13:25:26 gaming systemd-resolved[1068]: enp6s0: Bus client set DNS server list to: 10.0.0.1
Feb 24 13:25:26 gaming dbus-daemon[1200]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.10' (uid=0 pid=1350 comm="/nix/store/q5b4iwjcyvcvw1a3w8d20rbxwn7lcw1n-networ" label="kernel")
Feb 24 13:25:26 gaming systemd[1]: Starting Network Manager Script Dispatcher Service...
Feb 24 13:25:26 gaming dbus-daemon[1200]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Feb 24 13:25:26 gaming systemd[1]: Started Network Manager Script Dispatcher Service.
Feb 24 13:25:29 gaming kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=747321, emitted seq=747324
Feb 24 13:25:29 gaming kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
Feb 24 13:25:29 gaming kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
Feb 24 13:25:29 gaming kernel: amdgpu 0000:03:00.0: amdgpu: Failed to disallow df cstate
Feb 24 13:25:29 gaming kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Feb 24 13:25:29 gaming kernel: #PF: supervisor read access in kernel mode
Feb 24 13:25:29 gaming kernel: #PF: error_code(0x0000) - not-present page
Feb 24 13:25:29 gaming kernel: PGD 0 P4D 0
Feb 24 13:25:29 gaming kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Feb 24 13:25:29 gaming kernel: CPU: 5 PID: 490450 Comm: kworker/u32:75 Not tainted 6.7.0 #1-NixOS
Feb 24 13:25:29 gaming kernel: Hardware name: Gigabyte Technology Co., Ltd. Z390 UD/Z390 UD, BIOS F10 11/05/2021
Feb 24 13:25:29 gaming kernel: Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
Feb 24 13:25:29 gaming kernel: RIP: 0010:dc_resource_state_copy_construct+0x27/0x180 [amdgpu]
Feb 24 13:25:29 gaming kernel: Code: 90 90 90 66 0f 1f 00 0f 1f 44 00 00 41 56 41 55 41 54 49 89 f4 55 31 ed 53 48 8b 87 08 5b 00 00 48 89 fb 44 8b b6 48 b5 03 00 <48> 8b 00 48 8b 00 80 b8 7f 01 00 00 00 74 07 48 8b ae c0 aa 03 00
Feb 24 13:25:29 gaming kernel: RSP: 0018:ffffb8d447ebfc08 EFLAGS: 00010246
Feb 24 13:25:29 gaming kernel: RAX: 0000000000000000 RBX: ffffb8d445ab1000 RCX: 0000000000000005
Feb 24 13:25:29 gaming kernel: RDX: 0000000000000010 RSI: ffff9633d1a80000 RDI: ffffb8d445ab1000
Feb 24 13:25:29 gaming kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Feb 24 13:25:29 gaming kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff9633d1a80000
Feb 24 13:25:29 gaming kernel: R13: ffff9633ceb40000 R14: 0000000000000001 R15: 0000000000000000
Feb 24 13:25:29 gaming kernel: FS: 0000000000000000(0000) GS:ffff963b5d880000(0000) knlGS:0000000000000000
Feb 24 13:25:29 gaming kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 24 13:25:29 gaming kernel: CR2: 0000000000000000 CR3: 00000001ffc20002 CR4: 00000000003706f0
Feb 24 13:25:29 gaming kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 24 13:25:29 gaming kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Feb 24 13:25:29 gaming kernel: Call Trace:
Feb 24 13:25:29 gaming kernel: <TASK>
Feb 24 13:25:29 gaming kernel: ? __die+0x23/0x70
Feb 24 13:25:29 gaming kernel: ? page_fault_oops+0x17d/0x4b0
Feb 24 13:25:29 gaming kernel: ? exc_page_fault+0x6e/0x160
Feb 24 13:25:29 gaming kernel: ? asm_exc_page_fault+0x26/0x30
Feb 24 13:25:29 gaming kernel: ? dc_resource_state_copy_construct+0x27/0x180 [amdgpu]
Feb 24 13:25:29 gaming kernel: dm_suspend+0x131/0x1e0 [amdgpu]
Feb 24 13:25:29 gaming kernel: amdgpu_device_ip_suspend_phase1+0x6e/0xe0 [amdgpu]
Feb 24 13:25:29 gaming kernel: amdgpu_device_ip_suspend+0x29/0x70 [amdgpu]
Feb 24 13:25:29 gaming kernel: amdgpu_device_pre_asic_reset+0xd3/0x2a0 [amdgpu]
Feb 24 13:25:29 gaming kernel: amdgpu_device_gpu_recover+0x438/0xda0 [amdgpu]
Feb 24 13:25:29 gaming kernel: amdgpu_job_timedout+0x186/0x270 [amdgpu]
Feb 24 13:25:29 gaming kernel: ? sched_clock_cpu+0xf/0x190
Feb 24 13:25:29 gaming kernel: drm_sched_job_timedout+0x77/0x110 [gpu_sched]
Feb 24 13:25:29 gaming kernel: process_one_work+0x173/0x340
Feb 24 13:25:29 gaming kernel: worker_thread+0x27b/0x3a0
Feb 24 13:25:29 gaming kernel: ? __pfx_worker_thread+0x10/0x10
Feb 24 13:25:29 gaming kernel: kthread+0xd4/0x100
Feb 24 13:25:29 gaming kernel: ? __pfx_kthread+0x10/0x10
Feb 24 13:25:29 gaming kernel: ret_from_fork+0x31/0x50
Feb 24 13:25:29 gaming kernel: ? __pfx_kthread+0x10/0x10
Feb 24 13:25:29 gaming kernel: ret_from_fork_asm+0x1b/0x30
Feb 24 13:25:29 gaming kernel: </TASK>
Feb 24 13:25:29 gaming kernel: Modules linked in: nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype overlay qrtr af_packet rfcomm xt_CHECKSUM xt_MASQUERADE ipt_REJECT nf_reject_ipv4 nft_chain_nat cmac algif_hash algif_skcipher af_alg bnep msr snd_sof_pci_intel_cnl snd_sof_intel_hda_common snd_soc_hdac_hda soundw>
Feb 24 13:25:29 gaming kernel: sha256_ssse3 sha1_ssse3 snd_rawmidi aesni_intel snd_hda_core snd_seq_device mc crypto_simd cryptd snd_hwdep cmdlinepart r8169 iTCO_wdt snd_pcm spi_nor intel_pmc_bxt realtek rapl mtd mei_hdcp mei_pxp watchdog ee1004 mdio_devres snd_timer ecdh_generic intel_cstate intel_uncore rfkill s>
Feb 24 13:25:29 gaming kernel: usbhid hid sd_mod ahci libahci xhci_pci xhci_pci_renesas xhci_hcd libata nvme nvme_core usbcore scsi_mod t10_pi crc32c_intel crc64_rocksoft crc64 crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common usb_common scsi_common rtc_cmos dm_mod dax amdgpu i2c_algo_bit drm_ttm_help>
Feb 24 13:25:29 gaming kernel: CR2: 0000000000000000
Feb 24 13:25:29 gaming kernel: ---[ end trace 0000000000000000 ]---
Feb 24 13:25:29 gaming kernel: RIP: 0010:dc_resource_state_copy_construct+0x27/0x180 [amdgpu]
Feb 24 13:25:29 gaming kernel: Code: 90 90 90 66 0f 1f 00 0f 1f 44 00 00 41 56 41 55 41 54 49 89 f4 55 31 ed 53 48 8b 87 08 5b 00 00 48 89 fb 44 8b b6 48 b5 03 00 <48> 8b 00 48 8b 00 80 b8 7f 01 00 00 00 74 07 48 8b ae c0 aa 03 00
Feb 24 13:25:29 gaming kernel: RSP: 0018:ffffb8d447ebfc08 EFLAGS: 00010246
Feb 24 13:25:29 gaming kernel: RAX: 0000000000000000 RBX: ffffb8d445ab1000 RCX: 0000000000000005
Feb 24 13:25:29 gaming kernel: RDX: 0000000000000010 RSI: ffff9633d1a80000 RDI: ffffb8d445ab1000
Feb 24 13:25:29 gaming kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Feb 24 13:25:29 gaming kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff9633d1a80000
Feb 24 13:25:29 gaming kernel: R13: ffff9633ceb40000 R14: 0000000000000001 R15: 0000000000000000
Feb 24 13:25:29 gaming kernel: FS: 0000000000000000(0000) GS:ffff963b5d880000(0000) knlGS:0000000000000000
Feb 24 13:25:29 gaming kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 24 13:25:29 gaming kernel: CR2: 0000000000000000 CR3: 00000001ffc20002 CR4: 00000000003706f0
Feb 24 13:25:29 gaming kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 24 13:25:29 gaming kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Feb 24 13:25:29 gaming kernel: note: kworker/u32:75[490450] exited with irqs disabled
Feb 24 13:25:34 gaming kernel: amdgpu: Move buffer fallback to memcpy unavailable
Feb 24 13:25:34 gaming kernel: [drm:amdgpu_cs_parser_bos.isra.0 [amdgpu]] *ERROR* amdgpu_vm_validate_pt_bos() failed.
Feb 24 13:25:34 gaming .io.elementary.[2670]: DisplayWidget.vala:152: Unknown network state, cannot show the good icon: NETWORK_STATE_FAILED
Feb 24 13:25:34 gaming .io.elementary.[2670]: NotificationEntry.vala:79: Unable to mask image: Failed to open file “/tmp/.com.google.Chrome.zNAhV4”: No such file or directory
Feb 24 13:25:36 gaming systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.
Feb 24 13:25:49 gaming kernel: amdgpu: Move buffer fallback to memcpy unavailable
Feb 24 13:25:49 gaming kernel: [drm:amdgpu_cs_parser_bos.isra.0 [amdgpu]] *ERROR* amdgpu_vm_validate_pt_bos() failed.
Feb 24 13:25:56 gaming signal-desktop.desktop[335987]: [335987:0224/132556.044242:ERROR:connection.cc(46)] X connection error received.
Feb 24 13:25:56 gaming at-spi-bus-launcher[2310]: X connection to :0 broken (explicit kill or server shutdown).
Feb 24 13:25:56 gaming systemd[2210]: bamfdaemon.service: Main process exited, code=exited, status=1/FAILURE
...
Feb 24 13:25:56 gaming pantheon-org.gnome.SettingsDaemon.Power.desktop[490771]: Cannot open display:
Feb 24 13:25:56 gaming pantheon-org.gnome.SettingsDaemon.MediaKeys.desktop[490755]: Cannot open display:
Feb 24 13:25:56 gaming .gala-daemon-wr[490792]: cannot open display: :0
Feb 24 13:25:56 gaming .bamfdaemon-wra[490791]: cannot open display: :0
Feb 24 13:25:56 gaming .io.elementary.[490784]: cannot open display: :0
Feb 24 13:25:56 gaming pantheon-org.gnome.SettingsDaemon.XSettings.desktop[490761]: Cannot open display:
Feb 24 13:25:56 gaming systemd[2210]: bamfdaemon.service: Main process exited, code=exited, status=1/FAILURE
Feb 24 13:25:56 gaming systemd[2210]: bamfdaemon.service: Failed with result 'exit-code'.
Feb 24 13:25:56 gaming systemd[2210]: Failed to start BAMF Application Matcher Framework.
Feb 24 13:27:26 gaming systemd[1]: session-2.scope: Stopping timed out. Killing.
Feb 24 13:27:26 gaming systemd[2210]: bamfdaemon.service: start operation timed out. Terminating.
Feb 24 13:27:26 gaming systemd[1]: session-2.scope: Killing process 1925 (lightdm) with signal SIGKILL.
Feb 24 13:27:26 gaming systemd[1]: session-2.scope: Killing process 2230 (.gnome-session-) with signal SIGKILL.
Feb 24 13:27:26 gaming systemd[1]: session-2.scope: Killing process 2196 (pool-spawner) with signal SIGKILL.
Feb 24 13:27:26 gaming systemd[1]: session-2.scope: Killing process 2295 (gmain) with signal SIGKILL.
Feb 24 13:27:26 gaming systemd[1]: session-2.scope: Killing process 2300 (n/a) with signal SIGKILL.
Feb 24 13:27:26 gaming systemd[1]: session-2.scope: Killing process 2570 (gala:disk$0) with signal SIGKILL.
Feb 24 13:27:26 gaming systemd[2210]: bamfdaemon.service: Failed with result 'timeout'.
Feb 24 13:27:26 gaming systemd[2210]: Failed to start BAMF Application Matcher Framework.
Feb 24 13:27:26 gaming systemd[2210]: bamfdaemon.service: Scheduled restart job, restart counter is at 3.
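To summarize the two failures in the logs above: the suspend attempt fails with -12 and the later resume fails with -62. Below is a minimal Python sketch that pulls the amdgpu PM error lines out of a journal excerpt and names the error codes. The names `ERRNO_NAMES`, `PM_ERR_RE` and `pm_failures` are hypothetical and used only for illustration. The errno names are hardcoded from the generic Linux errno table (12 = ENOMEM, 62 = ETIME), so the sketch does not depend on the host OS.

```python
import re

# Linux errno values (asm-generic); hardcoded on purpose because errno
# numbering differs between operating systems.
ERRNO_NAMES = {12: "ENOMEM", 62: "ETIME"}

# Hypothetical pattern for kernel PM lines such as
# "amdgpu 0000:03:00.0: PM: ... returns -12" or "... error -62".
PM_ERR_RE = re.compile(r"amdgpu (\S+): PM: .*?(?:returns|error|code) -(\d+)\s*$")

# Lines copied from the journal above, plus one unrelated line.
SAMPLE = [
    "Feb 24 01:11:42 gaming kernel: amdgpu 0000:03:00.0: PM: device_prepare(): pci_pm_prepare+0x0/0x70 returns -12",
    "Feb 24 01:11:42 gaming kernel: amdgpu 0000:03:00.0: PM: not prepared for power transition: code -12",
    "Feb 24 13:25:19 gaming kernel: amdgpu 0000:03:00.0: PM: dpm_run_callback(): pci_pm_resume+0x0/0xe0 returns -62",
    "Feb 24 13:25:19 gaming kernel: amdgpu 0000:03:00.0: PM: failed to resume async: error -62",
    "Feb 24 13:25:19 gaming kernel: PM: suspend exit",
]


def pm_failures(lines):
    """Return (pci_device, errno_number, errno_name) for each amdgpu PM error line."""
    found = []
    for line in lines:
        match = PM_ERR_RE.search(line)
        if match:
            code = int(match.group(2))
            found.append((match.group(1), code, ERRNO_NAMES.get(code, "unknown")))
    return found


if __name__ == "__main__":
    for device, code, name in pm_failures(SAMPLE):
        print(f"{device}: -{code} ({name})")
```

Read this way, the sequence is an out-of-memory failure while preparing the GPU for suspend, followed about twelve hours later by a timeout while resuming the SMU block, after which the GPU reset path hits the NULL pointer dereference.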