[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
Brief summary of the problem:
While playing games with Wine through Lutris, my GPU randomly crashes and the following error appears in the kernel log: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
It is not easily reproducible: sometimes it happens a few minutes after I start playing, other times I can play for a long time without it happening. However, it does happen daily. I am able to kill Wine and the system recovers. This is happening on Ubuntu 22.04, but I also tried 24.04 and both Debian testing and unstable. The issue is the same on all of those distros and kernels.
Hardware description:
- CPU: AMD Ryzen 7 5800X 8-Core Processor
- GPU: 28:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 31 [Radeon RX 7900 XT/7900 XTX/7900M] [1002:744c] (rev ce)
- System Memory: 2x8GB BLS8G4D32AESTK.M8FE 3200 MHz
- Display(s): Samsung 34WQHD
- Type of Display Connection: HDMI
System information:
- Distro name and Version: Ubuntu 24.04
- Kernel version: Linux Ryzen5800x 6.8.0-31-generic #31-Ubuntu SMP PREEMPT_DYNAMIC Sat Apr 20 00:40:06 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
- Custom kernel: N/A
- AMD official driver version: libdrm-amdgpu1/noble,now 2.4.120-2build1 amd64
How to reproduce the issue:
Play a game with Wine on Lutris and wait until the timeout happens.
Attached files:
dmesg logs attached. If you need any other logs, please let me know.
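For anyone going through the dmesg output, here is a minimal sketch of how the timeout lines could be pulled out of a saved log. It assumes the message format is exactly the one quoted in the summary; the function name `find_ring_timeouts` and the file name `dmesg.txt` are made up for illustration and are not part of any tool.

```python
import re

# Pattern for the amdgpu job-timeout message quoted in this report.
# Assumption: other kernels print the same prefix and "ring <name> timeout".
TIMEOUT_RE = re.compile(
    r"\[drm:amdgpu_job_timedout \[amdgpu\]\] \*ERROR\* "
    r"ring (?P<ring>\S+) timeout(?P<rest>.*)"
)


def find_ring_timeouts(log_text):
    """Return a list of (ring_name, soft_recovered) for each timeout line."""
    hits = []
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            soft_recovered = "soft recovered" in match.group("rest")
            hits.append((match.group("ring"), soft_recovered))
    return hits
```

If the log was saved to a file first (for example `dmesg.txt`, a hypothetical name), calling `find_ring_timeouts(open("dmesg.txt").read())` would list which ring timed out each time and whether the driver reported a soft recovery.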