radv: GPU hang with Remnant: From The Ashes
System information
- OS: Arch Linux
- GPU: 0b:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] (rev c3)
- Kernel version: 5.8.1-arch1-1
- Mesa version: OpenGL version string: 4.6 (Compatibility Profile) Mesa 20.1.5
- Xserver version (if applicable): X.Org X Server 1.20.8
- Desktop manager and compositor: i3
If applicable
- Wine/Proton version: 5.9-GE-5-ST (also reproduced with 5.0-9)
Describe the issue
Remnant: From the Ashes (currently free on EGS) hangs the GPU shortly after launch. Just past the logos, before the menu appears, the display goes dark and the GPU fans spin up to full speed. This happens with every combination of Sway / i3 and ACO / AMDGPU LLVM. The kernel log from one occurrence follows:
22:29:03: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=105872, emitted seq=105875
22:29:03: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Remnant-Win64-S pid 2512 thread Remnant-Wi:cs0 pid 2598
22:29:03: amdgpu 0000:0b:00.0: amdgpu: GPU reset begin!
22:29:04: snd_hda_intel 0000:0b:00.1: can't change power state from D0 to D3hot (config space inaccessible)
22:29:04: [drm:dm_gpureset_toggle_interrupts [amdgpu]] crtc 0 - vupdate irq disabling: r=0
22:29:04: [drm:dm_gpureset_toggle_interrupts [amdgpu]] crtc 1 - vupdate irq disabling: r=0
22:29:04: [drm:dc_commit_state [amdgpu]] dc_commit_state: 0 streams
22:29:04: amdgpu: [powerplay] Failed message: 0x24, input parameter: 0x0, error code: 0xffffffff
22:29:04: amdgpu: [powerplay] Failed message: 0x9, input parameter: 0xf4, error code: 0xffffffff
22:29:04: amdgpu: [powerplay] Failed message: 0xa, input parameter: 0xa0b000, error code: 0xffffffff
22:29:04: amdgpu: [powerplay] Failed message: 0xe, input parameter: 0x0, error code: 0xffffffff
22:29:04: amdgpu: [powerplay] Failed message: 0x42, input parameter: 0x1, error code: 0xffffffff
22:29:04: amdgpu: [powerplay] Failed message: 0x24, input parameter: 0x0, error code: 0xffffffff
22:29:05: [drm:handle_cursor_update [amdgpu]] handle_cursor_update: crtc_id=0 with size 128 to 128
22:29:24: [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs aborting
22:29:24: [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing E028 (len 824, WS 0, PS 0) @ 0xE1A8
22:29:24: [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing DEE2 (len 326, WS 0, PS 0) @ 0xDFD2
22:29:24: [drm:dce110_link_encoder_disable_output [amdgpu]] *ERROR* dce110_link_encoder_disable_output: Failed to execute VBIOS command table!
22:29:24: ------------[ cut here ]------------
22:29:24: WARNING: CPU: 3 PID: 130 at drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_link_encoder.c:1100 dce110_link_encoder_disable_output+0x141/0x150 [amdgpu]
22:29:24: Modules linked in: rfcomm fuse xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter tun br>
22:29:24: evdev wmi mac_hid v4l2loopback(OE) videodev mc snd_mixer_oss snd soundcore acpi_cpufreq cpufreq_powersave cpufreq_conservative cpufreq_userspace drm crypto_user agpgart ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq hid_generic usbhid hid dm_crypt cbc>
22:29:24: CPU: 3 PID: 130 Comm: kworker/3:2 Tainted: G OE 5.8.1-arch1-1 #1
22:29:24: Hardware name: System manufacturer System Product Name/ROG STRIX X570-I GAMING, BIOS 2407 07/01/2020
22:29:24: Workqueue: events drm_sched_job_timedout [gpu_sched]
22:29:24: RIP: 0010:dce110_link_encoder_disable_output+0x141/0x150 [amdgpu]
22:29:24: Code: 44 24 38 65 48 2b 04 25 28 00 00 00 75 20 48 83 c4 40 5b 5d 41 5c c3 48 c7 c6 c0 cb 4e c1 48 c7 c7 28 b4 56 c1 e8 af a9 24 ff <0f> 0b eb d0 e8 86 35 68 c3 66 0f 1f 44 00 00 0f 1f 44 00 00 41 57
22:29:24: RSP: 0018:ffff9f6d00633a50 EFLAGS: 00010246
22:29:24: RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000000
22:29:24: RDX: 0000000000000000 RSI: ffffffff85369a67 RDI: 00000000ffffffff
22:29:24: RBP: ffff98680aa75f00 R08: 00000000000029da R09: 0000000000000001
22:29:24: R10: 0000000000000000 R11: 0000000000000001 R12: ffff9f6d00633a54
22:29:24: R13: 0000000000000000 R14: 0000000000000004 R15: ffff9868082f3300
22:29:24: FS: 0000000000000000(0000) GS:ffff98680eac0000(0000) knlGS:0000000000000000
22:29:24: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
22:29:24: CR2: 000014a76591cb70 CR3: 00000006ec81a000 CR4: 0000000000340ee0
22:29:24: Call Trace:
22:29:24: disable_link+0x3b/0xa0 [amdgpu]
22:29:24: core_link_disable_stream+0xea/0x220 [amdgpu]
22:29:24: dce110_reset_hw_ctx_wrap+0xbe/0x240 [amdgpu]
22:29:24: dce110_apply_ctx_to_hw+0x4f/0x570 [amdgpu]
22:29:24: ? hwmgr_handle_task+0x98/0xf0 [amdgpu]
22:29:24: ? pp_dpm_dispatch_tasks+0x45/0x60 [amdgpu]
22:29:24: ? dm_pp_apply_display_requirements+0x19e/0x1c0 [amdgpu]
22:29:24: dc_commit_state+0x323/0x970 [amdgpu]
22:29:24: ? dce112_validate_bandwidth+0x75/0x1c0 [amdgpu]
22:29:24: ? dc_rem_all_planes_for_stream+0xcb/0x110 [amdgpu]
22:29:24: amdgpu_dm_commit_zero_streams+0x12d/0x140 [amdgpu]
22:29:24: ? dce110_vblank_set+0x70/0xa0 [amdgpu]
22:29:24: dm_suspend+0x9a/0xb0 [amdgpu]
22:29:24: amdgpu_device_ip_suspend_phase1+0x83/0xe0 [amdgpu]
22:29:24: ? amdgpu_fence_process+0x4d/0x140 [amdgpu]
22:29:24: amdgpu_device_ip_suspend+0x1c/0x60 [amdgpu]
22:29:24: amdgpu_device_gpu_recover.cold+0x653/0xfd4 [amdgpu]
22:29:24: amdgpu_job_timedout+0x121/0x140 [amdgpu]
22:29:24: drm_sched_job_timedout+0x64/0xe0 [gpu_sched]
22:29:24: process_one_work+0x1da/0x3d0
22:29:24: worker_thread+0x4d/0x3d0
22:29:24: ? rescuer_thread+0x410/0x410
22:29:24: kthread+0x142/0x160
22:29:24: ? __kthread_bind_mask+0x60/0x60
22:29:24: ret_from_fork+0x22/0x30
22:29:24: ---[ end trace 447de3890a156791 ]---
22:29:26: IPv6: MLD: clamping QRV from 1 to 2!
22:29:44: [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs aborting
22:29:44: [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing C1F2 (len 62, WS 0, PS 0) @ 0xC20E
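For anyone comparing this against other hang reports, the first line of the log can be read mechanically. The sketch below is only an illustration, not part of the driver or of any Mesa tooling: the helper name `parse_ring_timeout` is hypothetical. It also assumes that "signaled seq" is the last fence the ring completed and "emitted seq" is the last one submitted, so their difference is the number of submissions still outstanding when the timeout fired.

```python
import re

# Matches the amdgpu ring-timeout message as it appears in the log above.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)

def parse_ring_timeout(line):
    """Return (ring, signaled, emitted, pending), or None if the line does not match.

    'pending' is emitted - signaled; reading it as the number of outstanding
    submissions is an assumption, not something stated in the log itself.
    """
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None
    signaled = int(m.group("signaled"))
    emitted = int(m.group("emitted"))
    return m.group("ring"), signaled, emitted, emitted - signaled

line = ("22:29:03: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, "
        "signaled seq=105872, emitted seq=105875")
print(parse_ring_timeout(line))
```

Under that reading, the gfx ring had three submissions in flight when the job timed out.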
Full logs
Edited by Arkadiusz Hiler