Starship Troopers: Extermination causes an unrecoverable ring gfx_low timeout on a Vega 56
Brief summary of the problem:
When launching Starship Troopers: Extermination via Steam, the game starts and shows the first video sequence, but as soon as I click the button acknowledging the Early Access warning, my display goes black and the GPU fans spin up to 100%. Only a hard reset makes the system usable again.
I tried different Proton versions with the same result. Since a game should never be able to crash the whole system, I'm writing this report.
The log mentions "GPU reset begin!", but the reset never completes (or waiting 20-30 s wasn't enough); the log later reports "GPU Recovery Failed: -22".
Hardware description:
- CPU: AMD Ryzen 7 2700X Eight-Core Processor
- GPU: [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] [1002:687f] (rev c3)
- System Memory: 32 GB
- Display(s): DisplayPort-0 connected primary 2560x1440+0+0
System information:
- Distro name and Version: Debian Unstable
- Kernel version: 6.7.9-amd64
How to reproduce the issue:
- Start Starship Troopers: Extermination through Steam
- Wait for intro to end (or skip it)
- Click on the Acknowledge button about being Early Access.
The intro runs at a ridiculously high frame rate of over 1000 fps.
Attached files:
Log files (for system lockups / game freezes / crashes)
Mär 19 12:32:55 powerbroom kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_low timeout, signaled seq=195569, emitted seq=195571
Mär 19 12:32:55 powerbroom kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process GameThread pid 7230 thread dxvk-submit pid 7327
Mär 19 12:32:55 powerbroom kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
Mär 19 12:32:55 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x9, input parameter: 0xf4, error code: 0xffffffff
Mär 19 12:32:55 powerbroom kernel: amdgpu: [powerplay] Failed message: 0xa, input parameter: 0x103000, error code: 0xffffffff
Mär 19 12:32:55 powerbroom kernel: amdgpu: [powerplay] Failed message: 0xe, input parameter: 0x0, error code: 0xffffffff
Mär 19 12:32:55 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x42, input parameter: 0x1, error code: 0xffffffff
Mär 19 12:32:55 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x24, input parameter: 0x0, error code: 0xffffffff
Mär 19 12:32:55 powerbroom kernel: amdgpu 0000:0c:00.0: [drm] REG_WAIT timeout 10us * 3000 tries - dce110_stream_encoder_dp_blank line:936
Mär 19 12:33:15 powerbroom assert_20240319123314_38.dmp[7596]: Uploading dump (out-of-process)
/tmp/dumps/assert_20240319123314_38.dmp
Mär 19 12:33:15 powerbroom kernel: [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs aborting
Mär 19 12:33:15 powerbroom kernel: [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing E028 (len 824, WS 0, PS 0) @ 0xE1A8
Mär 19 12:33:15 powerbroom kernel: [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing DEE2 (len 326, WS 0, PS 0) @ 0xDFD2
Mär 19 12:33:15 powerbroom kernel: amdgpu 0000:0c:00.0: [drm] *ERROR* dce110_link_encoder_disable_output: Failed to execute VBIOS command table!
Mär 19 12:33:17 powerbroom assert_20240319123314_38.dmp[7596]: Finished uploading minidump (out-of-process): success = yes
Mär 19 12:33:17 powerbroom assert_20240319123314_38.dmp[7596]: response: CrashID=bp-d1dec3e5-fc7a-4e64-80fb-880002240319
Mär 19 12:33:17 powerbroom assert_20240319123314_38.dmp[7596]: file ''/tmp/dumps/assert_20240319123314_38.dmp'', upload yes: ''CrashID=bp-d1dec3e5-fc7a-4e64-80fb-880002240319''
Mär 19 12:33:29 powerbroom systemd-logind[1327]: Power key pressed short.
Mär 19 12:33:35 powerbroom kernel: [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs aborting
Mär 19 12:33:35 powerbroom kernel: [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing C1F2 (len 62, WS 0, PS 0) @ 0xC20E
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x4c, input parameter: 0x1, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x4c, input parameter: 0x3, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x9, input parameter: 0xf4, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0xa, input parameter: 0x103000, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0xe, input parameter: 0x0, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x42, input parameter: 0x1, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x24, input parameter: 0x0, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu 0000:0c:00.0: [drm] *ERROR* Failed to get VBLANK!
Mär 19 12:33:35 powerbroom kernel: ------------[ cut here ]------------
Mär 19 12:33:35 powerbroom kernel: amdgpu 0000:0c:00.0: drm_WARN_ON_ONCE(cur_vblank != vblank->last)
Mär 19 12:33:35 powerbroom kernel: WARNING: CPU: 5 PID: 1561 at drivers/gpu/drm/drm_vblank.c:354 drm_update_vblank_count+0x2f7/0x3c0 [drm]
Mär 19 12:33:35 powerbroom kernel: Modules linked in: snd_seq_dummy snd_hrtimer snd_seq xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge wireguard stp libchacha20poly1305 llc chacha_x86_64 poly1305_x86_64 nf_tables curve25519_x86_64 libcurve25519>
Mär 19 12:33:35 powerbroom kernel: crc16 mbcache jbd2 crc32c_generic dm_crypt dm_mod hid_roccat_arvo hid_roccat hid_roccat_common hid_generic amdgpu drm_exec amdxcp drm_buddy gpu_sched video i2c_algo_bit usbhid drm_suballoc_helper drm_display_helper hid cec sd_mod rc_core drm_ttm_helper crc32_pclmul ttm crc32c_intel ahci drm_kms_helper nvme xhci_pci ghash_clmu>
Mär 19 12:33:35 powerbroom kernel: CPU: 5 PID: 1561 Comm: Xorg Not tainted 6.7.9-amd64 #1 Debian 6.7.9-1
Mär 19 12:33:35 powerbroom kernel: Hardware name: System manufacturer System Product Name/TUF X470-PLUS GAMING, BIOS 5861 08/10/2021
Mär 19 12:33:35 powerbroom kernel: RIP: 0010:drm_update_vblank_count+0x2f7/0x3c0 [drm]
Mär 19 12:33:35 powerbroom kernel: Code: 48 8b 5f 50 48 85 db 75 03 48 8b 1f e8 d2 6f 55 f2 48 c7 c1 c0 f9 79 c0 48 89 da 48 c7 c7 8a 29 7a c0 48 89 c6 e8 09 c4 d7 f1 <0f> 0b e9 4b fe ff ff 48 8b 4c 24 18 e9 31 fe ff ff 31 f6 48 85 db
Mär 19 12:33:35 powerbroom kernel: RSP: 0018:ffffae0acac17910 EFLAGS: 00010086
Mär 19 12:33:35 powerbroom kernel: RAX: 0000000000000000 RBX: ffff92ff019b4e90 RCX: 0000000000000027
Mär 19 12:33:35 powerbroom kernel: RDX: ffff93060eb61408 RSI: 0000000000000001 RDI: ffff93060eb61400
Mär 19 12:33:35 powerbroom kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: ffffae0acac17798
Mär 19 12:33:35 powerbroom kernel: R10: 0000000000000003 R11: ffff93062f2fcfe8 R12: 0000000000000000
Mär 19 12:33:35 powerbroom kernel: R13: ffff92ff28fa3028 R14: 0000000000000003 R15: 0000000000000000
Mär 19 12:33:35 powerbroom kernel: FS: 00007fd79920cac0(0000) GS:ffff93060eb40000(0000) knlGS:0000000000000000
Mär 19 12:33:35 powerbroom kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mär 19 12:33:35 powerbroom kernel: CR2: 00007efd317010a0 CR3: 0000000138cd4000 CR4: 00000000003506f0
Mär 19 12:33:35 powerbroom kernel: Call Trace:
Mär 19 12:33:35 powerbroom kernel: <TASK>
Mär 19 12:33:35 powerbroom kernel: ? drm_update_vblank_count+0x2f7/0x3c0 [drm]
Mär 19 12:33:35 powerbroom kernel: ? __warn+0x81/0x130
Mär 19 12:33:35 powerbroom kernel: ? drm_update_vblank_count+0x2f7/0x3c0 [drm]
Mär 19 12:33:35 powerbroom kernel: ? report_bug+0x171/0x1a0
Mär 19 12:33:35 powerbroom kernel: ? srso_return_thunk+0x5/0x5f
Mär 19 12:33:35 powerbroom kernel: ? console_unlock+0x78/0x120
Mär 19 12:33:35 powerbroom kernel: ? handle_bug+0x3c/0x80
Mär 19 12:33:35 powerbroom kernel: ? exc_invalid_op+0x17/0x70
Mär 19 12:33:35 powerbroom kernel: ? asm_exc_invalid_op+0x1a/0x20
Mär 19 12:33:35 powerbroom kernel: ? drm_update_vblank_count+0x2f7/0x3c0 [drm]
Mär 19 12:33:35 powerbroom kernel: drm_vblank_enable+0x14c/0x180 [drm]
Mär 19 12:33:35 powerbroom kernel: drm_vblank_get+0x9a/0xe0 [drm]
Mär 19 12:33:35 powerbroom kernel: amdgpu_dm_atomic_commit_tail+0x2a7a/0x3950 [amdgpu]
Mär 19 12:33:35 powerbroom kernel: commit_tail+0x94/0x130 [drm_kms_helper]
Mär 19 12:33:35 powerbroom kernel: drm_atomic_helper_commit+0x11a/0x140 [drm_kms_helper]
Mär 19 12:33:35 powerbroom kernel: drm_atomic_commit+0x9a/0xd0 [drm]
Mär 19 12:33:35 powerbroom kernel: ? __pfx___drm_printfn_info+0x10/0x10 [drm]
Mär 19 12:33:35 powerbroom kernel: drm_mode_obj_set_property_ioctl+0x157/0x3d0 [drm]
Mär 19 12:33:35 powerbroom kernel: ? __pfx_drm_mode_obj_set_property_ioctl+0x10/0x10 [drm]
Mär 19 12:33:35 powerbroom kernel: drm_ioctl_kernel+0xd6/0x180 [drm]
Mär 19 12:33:35 powerbroom kernel: drm_ioctl+0x26d/0x4b0 [drm]
Mär 19 12:33:35 powerbroom kernel: ? __pfx_drm_mode_obj_set_property_ioctl+0x10/0x10 [drm]
Mär 19 12:33:35 powerbroom kernel: amdgpu_drm_ioctl+0x4e/0x90 [amdgpu]
Mär 19 12:33:35 powerbroom kernel: __x64_sys_ioctl+0x97/0xd0
Mär 19 12:33:35 powerbroom kernel: do_syscall_64+0x64/0x120
Mär 19 12:33:35 powerbroom kernel: ? srso_return_thunk+0x5/0x5f
Mär 19 12:33:35 powerbroom kernel: ? syscall_exit_to_user_mode+0x22/0x40
Mär 19 12:33:35 powerbroom kernel: ? srso_return_thunk+0x5/0x5f
Mär 19 12:33:35 powerbroom kernel: ? do_syscall_64+0x70/0x120
Mär 19 12:33:35 powerbroom kernel: entry_SYSCALL_64_after_hwframe+0x6e/0x76
Mär 19 12:33:35 powerbroom kernel: RIP: 0033:0x7fd79951b5cb
Mär 19 12:33:35 powerbroom kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00
Mär 19 12:33:35 powerbroom kernel: RSP: 002b:00007ffe590b17f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Mär 19 12:33:35 powerbroom kernel: RAX: ffffffffffffffda RBX: 0000563040944790 RCX: 00007fd79951b5cb
Mär 19 12:33:35 powerbroom kernel: RDX: 00007ffe590b1880 RSI: 00000000c01864ba RDI: 0000000000000011
Mär 19 12:33:35 powerbroom kernel: RBP: 00007ffe590b1880 R08: 0000000000000000 R09: 0000000000000000
Mär 19 12:33:35 powerbroom kernel: R10: 0000000000000001 R11: 0000000000000246 R12: 00000000c01864ba
Mär 19 12:33:35 powerbroom kernel: R13: 0000000000000011 R14: 0000000000000001 R15: 0000563040940040
Mär 19 12:33:35 powerbroom kernel: </TASK>
Mär 19 12:33:35 powerbroom kernel: ---[ end trace 0000000000000000 ]---
Mär 19 12:33:35 powerbroom kernel: [drm] Timeout wait for RLC serdes 0,0
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x5, input parameter: 0x800000, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x22, input parameter: 0x0, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x25, input parameter: 0x0, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x30, input parameter: 0x0, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x9, input parameter: 0xf4, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0xa, input parameter: 0x103000, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0xe, input parameter: 0x0, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x5, input parameter: 0x10000, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x5, input parameter: 0x4000, error code: 0xffffffff
Mär 19 12:33:35 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x5, input parameter: 0x8000, error code: 0xffffffff
Mär 19 12:33:36 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x5, input parameter: 0x8000000, error code: 0xffffffff
Mär 19 12:33:36 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x5, input parameter: 0x400, error code: 0xffffffff
Mär 19 12:33:36 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x5, input parameter: 0x1000000, error code: 0xffffffff
Mär 19 12:33:36 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x5, input parameter: 0x30f, error code: 0xffffffff
Mär 19 12:33:36 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x5, input parameter: 0x800, error code: 0xffffffff
Mär 19 12:33:36 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x5, input parameter: 0x1000, error code: 0xffffffff
Mär 19 12:33:36 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x5, input parameter: 0x2000, error code: 0xffffffff
Mär 19 12:33:36 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x5, input parameter: 0x80000, error code: 0xffffffff
Mär 19 12:33:36 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x5, input parameter: 0x40, error code: 0xffffffff
Mär 19 12:33:36 powerbroom kernel: amdgpu: [powerplay] Failed message: 0x5, input parameter: 0x10000000, error code: 0xffffffff
Mär 19 12:33:36 powerbroom kernel: [drm:psp_ring_cmd_submit [amdgpu]] *ERROR* ring_buffer_start = 000000005e1e8a89; ring_buffer_end = 000000001cb4b467; write_frame = 000000003dee510a
Mär 19 12:33:36 powerbroom kernel: [drm:psp_ring_cmd_submit [amdgpu]] *ERROR* write_frame is pointing to address out of bounds
Mär 19 12:33:36 powerbroom kernel: [drm:psp_suspend [amdgpu]] *ERROR* Failed to terminate asd
Mär 19 12:33:36 powerbroom kernel: [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <psp> failed -22
Mär 19 12:33:36 powerbroom kernel: amdgpu 0000:0c:00.0: amdgpu: MODE1 reset
Mär 19 12:33:36 powerbroom kernel: amdgpu 0000:0c:00.0: amdgpu: GPU mode1 reset
Mär 19 12:33:36 powerbroom kernel: amdgpu 0000:0c:00.0: amdgpu: GPU psp mode1 reset
Mär 19 12:33:36 powerbroom kernel: [drm] psp is not working correctly before mode1 reset!
Mär 19 12:33:36 powerbroom kernel: amdgpu 0000:0c:00.0: amdgpu: GPU mode1 reset failed
Mär 19 12:33:36 powerbroom kernel: amdgpu 0000:0c:00.0: amdgpu: ASIC reset failed with error, -22 for drm dev, 0000:0c:00.0
Mär 19 12:33:36 powerbroom kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset(2) failed
Mär 19 12:33:36 powerbroom kernel: [drm] Skip scheduling IBs!
Mär 19 12:33:36 powerbroom kernel: snd_hda_intel 0000:0c:00.1: Unable to change power state from D3cold to D0, device inaccessible
Mär 19 12:33:36 powerbroom kernel: snd_hda_intel 0000:0c:00.1: CORB reset timeout#2, CORBRP = 65535
Mär 19 12:33:36 powerbroom kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset end with ret = -22
Mär 19 12:33:36 powerbroom kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -22
Mär 19 12:33:46 powerbroom kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring page1 timeout, signaled seq=3222, emitted seq=3224
Mär 19 12:33:46 powerbroom kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
Mär 19 12:33:46 powerbroom kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
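In case it helps with triage, here is a minimal Python sketch for condensing a log like the one above into its key events: which rings timed out, and how long each GPU reset took before it ended. The function names (`seconds_of_day`, `summarize`) are made up for illustration, the patterns are copied from the amdgpu messages above and may not match other kernel versions, and the sketch assumes all timestamps fall on the same day.

```python
import re

# Patterns copied from the amdgpu messages in this report (assumption:
# other kernel versions may word these messages differently).
RING_TIMEOUT = re.compile(r"ring (\S+) timeout, signaled seq=(\d+), emitted seq=(\d+)")
RESET_BEGIN = "GPU reset begin!"
RESET_END = re.compile(r"GPU reset end with ret = (-?\d+)")
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2})")


def seconds_of_day(line):
    """Return the first HH:MM:SS in the line as seconds since midnight, or None."""
    m = TIMESTAMP.search(line)
    if m is None:
        return None
    hours, minutes, seconds = (int(x) for x in m.groups())
    return hours * 3600 + minutes * 60 + seconds


def summarize(lines):
    """Return (ring timeouts, finished resets, whether a reset is still pending).

    Ring timeouts are (ring, signaled seq, emitted seq) tuples; finished
    resets are (duration in seconds, return code) tuples.
    """
    timeouts = []
    resets = []
    begin = None
    for line in lines:
        m = RING_TIMEOUT.search(line)
        if m:
            timeouts.append((m.group(1), int(m.group(2)), int(m.group(3))))
        if RESET_BEGIN in line:
            begin = seconds_of_day(line)
        m = RESET_END.search(line)
        if m and begin is not None:
            resets.append((seconds_of_day(line) - begin, int(m.group(1))))
            begin = None
    return timeouts, resets, begin is not None
```

Saved to a text file (for example a hypothetical `journal.txt`), the log can be passed in as `summarize(open("journal.txt", encoding="utf-8"))`. For the log above this yields two ring timeouts (`gfx_low` and `page1`), one reset that ended after 41 seconds with return code -22, and a second reset that had begun but not ended when the log stops.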
Please let me know if you need anything else, and many thanks for looking into this.