Battlefield 1 and other games crashing with *ERROR* ring gfx_low timeout!
UPDATE: The original crash report was a year old and I still get similar crashes in other games, so I am updating this post with fresh logs and information. I can rule out a hardware issue, as Windows 11 with Adrenalin 22.12.2 runs Battlefield 1 fine for many hours.
System information:
❯ inxi -GSC -xx
System:
Host: klx99 Kernel: 6.1.3-3.2-cachyos-bore-lto arch: x86_64 bits: 64
compiler: clang v: 16.0.0 Desktop: KDE Plasma v: 5.26.80 tk: Qt v: 5.15.7
wm: kwin_x11 dm: SDDM Distro: CachyOS
CPU:
Info: 18-core model: Intel Xeon E5-2696 v3 bits: 64 type: MT MCP
arch: Haswell rev: 2 cache: L1: 1.1 MiB L2: 4.5 MiB L3: 45 MiB
Speed (MHz): avg: 2387 high: 3796 min/max: 1200/2301 boost: enabled cores:
1: 2301 2: 2301 3: 2301 4: 2301 5: 2301 6: 2301 7: 2301 8: 2301 9: 2301
10: 2301 11: 2301 12: 2301 13: 3523 14: 2301 15: 2301 16: 2301 17: 2301
18: 2301 19: 2301 20: 2301 21: 2301 22: 2301 23: 2301 24: 2301 25: 2301
26: 2301 27: 2301 28: 2301 29: 2301 30: 2301 31: 2301 32: 2301 33: 2301
34: 2698 35: 2301 36: 3796 bogomips: 165744
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Graphics:
Device-1: AMD Vega 10 XL/XT [Radeon RX 56/64] vendor: Micro-Star MSI
driver: amdgpu v: kernel arch: GCN-5 pcie: speed: 8 GT/s lanes: 16 ports:
active: DP-3 empty: DP-1,DP-2,HDMI-A-1 bus-ID: 04:00.0 chip-ID: 1002:687f
Display: x11 server: X.Org v: 21.1.99 with: Xwayland v: 22.1.7
compositor: kwin_x11 driver: X: loaded: amdgpu unloaded: modesetting
alternate: fbdev,vesa dri: radeonsi gpu: amdgpu display-ID: :0 screens: 1
Screen-1: 0 s-res: 2560x1440 s-dpi: 96
Monitor-1: DP-3 mapped: DisplayPort-2 model: HP X27q res: 2560x1440
dpi: 109 diag: 685mm (27")
API: OpenGL v: 4.6 Mesa 23.0.0-devel (git-32925bf708) renderer: AMD
Radeon RX Vega (vega10 LLVM 16.0.0 DRM 3.49 6.1.3-3.2-cachyos-bore-lto)
direct render: Yes
crash3.log - with Total War: Troy (DX12)
[ +0,000003] ------------[ cut here ]------------
[ +0,000001] refcount_t: underflow; use-after-free.
[ +0,000012] WARNING: CPU: 6 PID: 769 at lib/refcount.c:28 refcount_warn_saturate+0x9b/0xe0
[ +0,000005] Modules linked in: intel_rapl_msr
[ +0,000002] amdgpu 0000:05:00.0: amdgpu: GPU reset(2) succeeded!
[ +0,000001] mei_wdt mousedev intel_rapl_common sb_edac x86_pkg_temp_thermal r8169 intel_powerclamp i2c_i801 coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd rapl intel_cstate intel_uncore i2c_smbus realtek lpc_ich mdio_devres snd_hda_codec_realtek mei_me amdgpu snd_hda_codec_generic libphy ledtrig_audio mei snd_hda_codec_hdmi drm_buddy snd_hda_intel drm_ttm_helper snd_intel_dspcfg ttm snd_hda_codec gpu_sched drm_display_helper snd_hwdep snd_hda_core cec snd_pcm drm_kms_helper snd_timer sysimgblt syscopyarea snd sysfillrect fb_sys_fops soundcore igb wmi vfat fat acpi_cpufreq cfg80211 razerkbd(O) usbhid sch_cake usbip_host usbip_core pkcs8_key_parser crypto_user zram ext4 mbcache crc16 jbd2 crc32c_intel xhci_pci xhci_pci_renesas
[ +0,000033] CPU: 6 PID: 769 Comm: gfx Tainted: G O 6.0.8-3.1-cachyos-bore-lto #1 a32778937e548d5c3f330ce464c1250c470fe7df
[ +0,000002] Hardware name: LENOVO GAMING TF/X99-TF Gaming, BIOS CX99DE26 10/10/2020
[ +0,000001] RIP: 0010:refcount_warn_saturate+0x9b/0xe0
[ +0,000003] Code: c7 c7 b4 ba 1c 95 e8 e4 99 b3 ff 0f 0b c3 80 3d a4 36 2d 01 00 75 99 c6 05 9b 36 2d 01 01 48 c7 c7 a0 b6 24 95 e8 c5 99 b3 ff <0f> 0b c3 80 3d 86 36 2d 01 00 0f 85 76 ff ff ff c6 05 79 36 2d 01
[ +0,000001] RSP: 0018:ffffa96d4908be58 EFLAGS: 00010282
[ +0,000002] RAX: 0000000000000026 RBX: ffff897897409678 RCX: 0000000000000001
[ +0,000001] RDX: c0000000ffffbfff RSI: 0000000000000000 RDI: ffff89877f79c608
[ +0,000001] RBP: ffff89871b4540a8 R08: 0000000000003fff R09: ffff8987bff0c7c0
[ +0,000000] R10: 000000000000bffd R11: 0000000000000004 R12: ffffa96d4908be70
[ +0,000001] R13: ffff89818bca9380 R14: ffff897897409778 R15: ffff898379dfc800
[ +0,000001] FS: 0000000000000000(0000) GS:ffff89877f780000(0000) knlGS:0000000000000000
[ +0,000001] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0,000001] CR2: 000000011220b006 CR3: 000000042900e004 CR4: 00000000001706e0
[ +0,000001] Call Trace:
[ +0,000002] <TASK>
[ +0,000001] ? drm_sched_main+0x69/0x2e0 [gpu_sched cd365bf7e7149d14ecdf8febffd2335c36596c8b]
[ +0,000005] ? init_wait_entry+0x40/0x40
[ +0,000003] ? kthread+0x1f4/0x240
[ +0,000003] ? drm_sched_job_timedout+0xe0/0xe0 [gpu_sched cd365bf7e7149d14ecdf8febffd2335c36596c8b]
[ +0,000003] ? kthreadd+0x2c0/0x2c0
[ +0,000002] ? ret_from_fork+0x1f/0x30
[ +0,000002] </TASK>
[ +0,000001] ---[ end trace 0000000000000000 ]---
crash2.log - with Battlefield 1 (DX12)
crash.log - with Battlefield 1 (DX12)
The following is also logged when such a crash happens; it might be a related or a separate issue:
[ 8079.028290] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 8079.035291] amdgpu 0000:05:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:216 vmid:2 pasid:32776, for process plasmashell pid 1468 thread plasmashel:cs0 pid 1522)
[ 8079.035295] amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000800000028000 from IH client 0x1b (UTCL2)
[ 8079.035298] amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x002009B0
[ 8079.035300] amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: CPF (0x4)
[ 8079.035301] amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x0
[ 8079.035302] amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0
[ 8079.035302] amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0xb
[ 8079.035303] amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x1
[ 8079.035304] amdgpu 0000:05:00.0: amdgpu: RW: 0x0
[ 8089.671248] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_low timeout, signaled seq=1217300, emitted seq=1217302
[ 8089.671462] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process plasmashell pid 1468 thread plasmashel:cs0 pid 1522
This log is from Sniper Elite 5 (Vulkan): crash4.log