Annoying GPU hangs continue on Vega 20 with kernel 5.4 + mesa 9.3.0 + llvm 9.0.0: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Submitted by Mikhail Gavrilov
Assigned to Default DRI bug account
Link to original bug (#111803)
Description
Created attachment 145490
dmesg
Annoying GPU hangs continue on Vega 20 with kernel 5.4 + mesa 9.3.0 + llvm 9.0.0.
To reproduce, it is enough to launch the game Supraland (from the Steam store) on a machine that is under memory pressure.
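Since the hang only shows up under memory pressure, it may help to confirm that state before launching the game. The following is a minimal sketch, not part of the original report: it reads the `MemTotal` and `MemAvailable` fields of `/proc/meminfo`, and the helper names and the 10% threshold are arbitrary assumptions chosen for illustration.

```python
import os


def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values (kB for sized fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_fraction(fields):
    """Fraction of total RAM that the kernel estimates is still available."""
    return fields["MemAvailable"] / fields["MemTotal"]


def under_pressure(fields, threshold=0.10):
    # threshold is a hypothetical cut-off for illustration, not a kernel constant
    return available_fraction(fields) < threshold


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    print(f"available: {available_fraction(info):.1%}, "
          f"under pressure: {under_pressure(info)}")
```

Running it right before starting Supraland gives a rough indication of whether the reproduction conditions described above are met.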
[48662.086736] INFO: task OnlineA-nstance:153979 blocked for more than 122 seconds.
[48662.086740] Not tainted 5.4.0-0.rc0.git4.1a.fc32.x86_64 #1
[48662.086743] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[48662.086746] OnlineA-nstance D12600 153979 153907 0x80004002
[48662.086753] Call Trace:
[48662.086760] ? __schedule+0x307/0x950
[48662.086770] schedule+0x40/0xc0
[48662.086775] schedule_timeout+0x289/0x3c0
[48662.086782] ? mark_held_locks+0x50/0x80
[48662.086787] ? _raw_spin_unlock_irqrestore+0x4b/0x60
[48662.086792] ? lockdep_hardirqs_on+0xf0/0x180
[48662.086803] dma_fence_wait_any_timeout+0x208/0x275
[48662.086881] amdgpu_sa_bo_new+0x44b/0x510 [amdgpu]
[48662.086982] amdgpu_ib_get+0x31/0x80 [amdgpu]
[48662.087075] amdgpu_job_alloc_with_ib+0x46/0x70 [amdgpu]
[48662.087081] ? find_held_lock+0x32/0x90
[48662.087154] amdgpu_vm_sdma_prepare+0x30/0x90 [amdgpu]
[48662.087243] amdgpu_vm_bo_update_mapping+0x7b/0xe0 [amdgpu]
[48662.087318] amdgpu_vm_clear_freed+0xd5/0x1d0 [amdgpu]
[48662.087395] amdgpu_gem_object_close+0x159/0x1b0 [amdgpu]
[48662.087407] ? lockdep_hardirqs_on+0xf0/0x180
[48662.087432] drm_gem_object_release_handle+0x30/0x90 [drm]
[48662.087447] ? drm_gem_object_handle_put_unlocked+0xa0/0xa0 [drm]
[48662.087453] idr_for_each+0x5e/0xd0
[48662.087459] ? mark_held_locks+0x50/0x80
[48662.087477] drm_gem_release+0x1c/0x30 [drm]
[48662.087492] drm_file_free.part.0+0x22e/0x270 [drm]
[48662.087509] drm_release+0xab/0xe0 [drm]
[48662.087517] __fput+0xdd/0x270
[48662.087525] task_work_run+0x93/0xd0
[48662.087533] do_exit+0x349/0xcd0
[48662.087539] ? find_held_lock+0x32/0x90
[48662.087548] do_group_exit+0x47/0xb0
[48662.087554] get_signal+0x17e/0xcb0
[48662.087565] do_signal+0x36/0x680
[48662.087580] exit_to_usermode_loop+0x8d/0x120
[48662.087588] syscall_return_slowpath+0x205/0x330
[48662.087594] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[48662.087599] RIP: 0033:0x7f0b10b4ffaa
[48662.087606] Code: Bad RIP value.
[48662.087610] RSP: 002b:00007f0ae77fdc40 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[48662.087615] RAX: fffffffffffffdfc RBX: 00000000000051ac RCX: 00007f0b10b4ffaa
[48662.087619] RDX: 0000000000000000 RSI: 0000000000000189 RDI: 00007f0b0ebf1170
[48662.087622] RBP: 00007f0b0ebf1148 R08: 0000000000000000 R09: 00000000ffffffff
[48662.087626] R10: 00007f0ae77fdd48 R11: 0000000000000246 R12: 0000000000000000
[48662.087629] R13: 00007f0b0ebf1120 R14: 00007f0b0ebf1170 R15: 00007f0ae77fdc80
[48662.087646]
Showing all locks held in the system:
[48662.087662] 1 lock held by khungtaskd/96:
[48662.087665] #0: ffffffff8d693760 (rcu_read_lock){....}, at: debug_show_all_locks+0x15/0x174
[48662.087738] 1 lock held by CPU 0/KVM/3098:
[48662.087833] 2 locks held by dnf/104312:
[48662.087836] #0: ffff8d88dacc80a0 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x24/0x50
[48662.087844] #1: ffffa1088052a2f0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xe3/0x980
[48662.088002] 3 locks held by kworker/15:0/152888:
[48662.088005] #0: ffff8d8936c21548 ((wq_completion)events){+.+.}, at: process_one_work+0x1e9/0x5a0
[48662.088012] #1: ffffa1088d61fe50 ((work_completion)(&(&bdev->wq)->work)){+.+.}, at: process_one_work+0x1e9/0x5a0
[48662.088018] #2: ffff8d892bf5c9f8 (reservation_ww_class_mutex){+.+.}, at: ttm_bo_delayed_delete+0x8d/0x200 [ttm]
[48662.088032] 3 locks held by OnlineA-nstance/153979:
[48662.088035] #0: ffffffffc0303070 (drm_global_mutex){+.+.}, at: drm_release+0x2c/0xe0 [drm]
[48662.088054] #1: ffffa1088d457b30 (reservation_ww_class_acquire){+.+.}, at: amdgpu_gem_object_close+0xce/0x1b0 [amdgpu]
[48662.088126] #2: ffff8d892bf5c9f8 (reservation_ww_class_mutex){+.+.}, at: ttm_eu_reserve_buffers+0x349/0x620 [ttm]
[48662.088146] =============================================
Attachment 145490, "dmesg":
dmesg.txt
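For anyone triaging the attached log, a short script can pull out the hung-task reports and count the fence timeouts. This is only a sketch under stated assumptions: the regular expression is modelled on the "INFO: task ... blocked for more than ... seconds" line quoted above, the function names are hypothetical, and the file path is taken from the command line.

```python
import re
import sys

# Modelled on the first line of a hung-task report, e.g.
# "[48662.086736] INFO: task OnlineA-nstance:153979 blocked for more than 122 seconds."
HUNG_TASK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

# Message text taken from the title of this report.
FENCE_TIMEOUT_MSG = "Waiting for fences timed out"


def find_hung_tasks(dmesg_text):
    """Return one dict per hung-task report found in the log text."""
    reports = []
    for line in dmesg_text.splitlines():
        m = HUNG_TASK_RE.search(line)
        if m:
            reports.append({
                "timestamp": float(m.group("ts")),
                "comm": m.group("comm"),
                "pid": int(m.group("pid")),
                "blocked_secs": int(m.group("secs")),
            })
    return reports


def count_fence_timeouts(dmesg_text):
    """Count log lines that contain the fence timeout error message."""
    return sum(1 for line in dmesg_text.splitlines() if FENCE_TIMEOUT_MSG in line)


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as f:
        text = f.read()
    for report in find_hung_tasks(text):
        print(report)
    print("fence timeouts:", count_fence_timeouts(text))
```

Saved under a name of your choosing, it would be run with the path to the saved log as its only argument.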