RX550 amdgpu *ERROR* ring gfx timeout on 6.2.12
Apr 29 22:42:01 fedora kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=210834, emitted seq=210836
Apr 29 22:42:01 fedora kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process sway pid 2389 thread sway:cs0 pid 2578
Apr 29 22:42:01 fedora kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
Apr 29 22:42:01 fedora kernel: amdgpu: cp is busy, skip halt cp
Apr 29 22:42:01 fedora kernel: amdgpu: rlc is busy, skip halt rlc
Apr 29 22:42:01 fedora kernel: amdgpu 0000:03:00.0: amdgpu: BACO reset
Apr 29 22:42:02 fedora kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
Apr 29 22:42:02 fedora kernel: [drm] PCIE GART of 256M enabled (table at 0x000000F400380000).
Apr 29 22:42:02 fedora kernel: [drm] VRAM is lost due to GPU reset!
Apr 29 22:42:08 fedora kernel: amdgpu: SMU load firmware failed
Apr 29 22:42:08 fedora kernel: amdgpu: fw load failed
Apr 29 22:42:08 fedora kernel: amdgpu: smu firmware loading failed
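For anyone reading the excerpt above: as far as I understand it, `signaled seq` is the last fence the ring completed and `emitted seq` is the last one submitted, so the difference is how many jobs were still outstanding when the timeout fired. Below is a minimal sketch that pulls those numbers out of a log line. The regular expression and the helper name `parse_ring_timeout` are my own for illustration and are not part of any amdgpu tooling.

```python
import re

# Pattern matched against the timeout message format seen in the log above.
TIMEOUT_RE = re.compile(
    r"\*ERROR\* ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(line):
    """Return ring name and fence sequence numbers, or None if no match."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    signaled = int(match.group("signaled"))
    emitted = int(match.group("emitted"))
    return {
        "ring": match.group("ring"),
        "signaled": signaled,
        "emitted": emitted,
        # Jobs submitted but not yet completed (assumed interpretation).
        "pending": emitted - signaled,
    }


line = (
    "Apr 29 22:42:01 fedora kernel: [drm:amdgpu_job_timedout [amdgpu]] "
    "*ERROR* ring gfx timeout, signaled seq=210834, emitted seq=210836"
)
print(parse_ring_timeout(line))
```

For the line from my log this reports 2 pending jobs on the gfx ring.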
Brief summary of the problem:
At seemingly random times, all displays either go black or show heavy artifacting. The system then either locks up completely (requiring a hard reset) or, if I am lucky, only the amdgpu driver crashes, which lets me reboot the system gracefully over SSH.
I have tried the following kernel versions and encountered the issue on every one of them:
- 5.16.0
- 5.18.19
- 6.2.2
- 6.2.9
- 6.2.11
- 6.2.12
Hardware description:
- CPU: 13th Gen Intel(R) Core(TM) i5-13600K
- Motherboard: ASRock Z690 Phantom Gaming 4
- System Memory: 32GiB
- GPU: AD102 Gigabyte GeForce RTX 4090 (using vfio-pci driver, not used for host displays)
- GPU: Lexa PRO Radeon RX 550
- Display: Output DP-1 'Goldstar Company Ltd 24M35 406NDDM1R122' Current mode: 1920x1080 @ 60.000 Hz
- Display: Output DP-2 'Ancor Communications Inc VE247 DALMQS062365' Current mode: 1920x1080 @ 60.000 Hz
- Display: Output DP-3 'Goldstar Company Ltd LG ULTRAGEAR 011NTTQG8145' Current mode: 2560x1440 @ 59.951 Hz
System information:
- Distro name and Version: Fedora 36
- Kernel version: 6.2.12
- Custom kernel: N/A
- Display Server: Wayland
- Window Manager: Sway
How to reproduce the issue:
This issue is not easily reproducible, but it seems to be triggered by applications that use video acceleration, such as Discord or Firefox. If I leave the system powered on but unused for a couple of days, it usually does not crash. However, when I am actively using Discord or Firefox, I can easily get 1 or 2 crashes a day.
Note: I previously ran the same RX 550 on a Ryzen 3800X and did not encounter any crashes. I have also tried a different Z690 motherboard with my Intel 13600K, but the crashes persisted. I am fairly certain at this point that the issue results from pairing the RX 550 with an Intel 13th gen CPU.
Attached files:
I have attached multiple dmesg logs; the crash messages can be seen near the end of each log.
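To jump straight to the relevant part of a log, something like the following sketch may help. It only does substring matching on messages that appear in the excerpt at the top of this report; the marker list and the function name `find_crash_lines` are my own choices, not an official list of amdgpu error strings.

```python
# Substrings taken from the crash excerpt in this report (not exhaustive).
CRASH_MARKERS = (
    "ring gfx timeout",
    "GPU reset begin!",
    "VRAM is lost due to GPU reset!",
    "SMU load firmware failed",
)


def find_crash_lines(lines):
    """Return (line_number, text) pairs for lines containing a crash marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in CRASH_MARKERS):
            hits.append((number, line.rstrip("\n")))
    return hits


sample = [
    "Apr 29 22:42:01 fedora kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!",
    "Apr 29 22:42:01 fedora kernel: amdgpu: cp is busy, skip halt cp",
    "Apr 29 22:42:08 fedora kernel: amdgpu: SMU load firmware failed",
]
for number, text in find_crash_lines(sample):
    print(number, text)
```

To run it on one of the attachments, pass an open file object instead of `sample`, for example `find_crash_lines(open("dmesg.log"))`, where `dmesg.log` is a placeholder file name.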
Do let me know if there are other logs I can provide to better troubleshoot.
Thanks in advance for any help.