*ERROR* ring gfx_0.0.0 timeout when using firefox, chrome or icaclient when dpm performance level = auto
Brief summary of the problem:
I got a brand-new AMD Radeon RX 6700 XT card and started seeing the following random crashes when using a browser or icaclient (the Citrix client):
[ 85.861734] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=13365, emitted seq=13367
[ 85.862162] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process kwin_x11 pid 819 thread kwin_x11:cs0 pid 838
The display hangs or becomes glitched. Sound keeps working. The rest of the system stays responsive; I can switch to a terminal and recover.
DE: Xorg / KDE Plasma (tested Xwayland as well).
- CPU: Intel(R) Core(TM) i9-9900KF
- GPU: 03:00.0 VGA compatible controller : Advanced Micro Devices, Inc. [AMD/ATI] Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M] [1002:73df] (rev c1)
- System Memory: 32GB
- Display(s): LG 32UD89 - 4k@60fps (Freesync: Extended)
- Type of Display Connection: DP
- Distro name and Version: Arch Linux
- Kernel version: 5.17.2-arch3-1
- Custom kernel: no
- AMD official driver version: Mesa 22.0.1-3, vulkan-radeon 22.0.1-3
How to reproduce the issue:
Crashes are random. They usually happen several times a day when browsing or using icaclient. They never happen during gaming or other GPU-intensive work.
I have managed to find a remedy:
echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level
This command makes the crashes disappear (tested for 3 weeks, so it's 100% confirmed). I suppose it is a power-saving issue, then.
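For anyone who wants to try the same workaround and switch back later, here is a minimal sketch of two helpers around that sysfs file. The function names `dpm_get` and `dpm_set` are made up for illustration, and the default path assumes the GPU is `card0` as on my system; the card index may differ elsewhere. Only the levels `auto`, `low` and `high` are accepted by this sketch.

```shell
#!/bin/sh
# Sketch only: helper names and the default card0 path are assumptions.

# Print the current DPM performance level of the device directory in $1
# (defaults to card0, as used in this report).
dpm_get() {
    cat "${1:-/sys/class/drm/card0/device}/power_dpm_force_performance_level"
}

# Write a new level ($1) to the device directory in $2.
# Writing to the real sysfs file requires root.
dpm_set() {
    level=$1
    dev=${2:-/sys/class/drm/card0/device}
    case "$level" in
        auto|low|high) ;;
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
    printf '%s\n' "$level" > "$dev/power_dpm_force_performance_level"
}
```

As root, `dpm_set high` applies the workaround and `dpm_set auto` restores the default behaviour. The value written to sysfs does not survive a reboot, so it has to be reapplied after each boot.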
Log files (for system lockups / game freezes / crashes)
- Dmesg log (full log): dmesg.log
- umr -R gfx_0.0.0: partial_umr_gfx_0.0.0; a fault happened during collection:
[ 171.047397] BUG: unable to handle page fault for address: ffffb34e820ffffc