RX5600XT *ERROR* ring gfx_0.0.0 timeout
Summary of the problem:
Have a nice day, everyone. I'm using a translator to describe this problem. There are a lot of similar threads here, and the problem still doesn't seem to be resolved. My configuration: NixOS, Xeon E5-2690 v4, 32 GB RAM, 3 monitors. None of the kernel parameters I tried solves the problem, and neither does forcing the high GPU performance level.
- This is most likely not a hardware problem, since FurMark and other stress tests can run for more than 30 minutes without the error.
- The error may not appear for a long time under the same scenarios. Once it was gone for several days; judging by forum posts, for some people it stays away for more than a week.
- Once the error has appeared, it is easily reproduced under a certain load (in my case games: Dota 2 or other DXVK games).
- Overheating is excluded: I replaced the stock cooling system, and under load the temperature does not exceed 65 degrees. When the error occurs it can be as low as 50 degrees.
- GPU core and memory frequencies most likely do not affect this. I tried several VBIOS images from different manufacturers, including one with a 130 W TDP and rather modest parameters (https://www.techpowerup.com/vgabios/220802/gigabyte-rx5600xt-6144-200422). The result is the same.
- Six months earlier I had an RX 5500 (Navi 14) and did not have this problem. Perhaps this is a Navi 10 problem?
- I tried different kernels: 5.4, 5.15, 6.1, 6.5, 6.6.
I had to install Windows (for the first time in more than 5 years), and here are my observations.
- It is generally more stable, but there is also a TDR error, and it seems to be the same error.
- The application (game) stops working, but the desktop and the driver keep working; I only need to close the application. I'm not an expert, but I think this is just a difference in how the graphics subsystem is implemented.
On the other hand, there are hundreds of thousands of similar cards; does this really happen to everyone? Or are these problems limited to certain defective units?
Hardware description:
- CPU: Intel E5-2690v4
- GPU: RX 5600 XT
- System Memory: 32 GB Ram
- Display(s): 3 displays (60hz+165hz+75hz)
- Type of Display Connection: DP+DP+HDMI
How to reproduce the issue: Play games. It is completely random; I have never seen the error while just using the desktop.
System information:
- Distro name and Version: NixOS stable 23.05 / unstable
- Kernel version: 6.5.11-xanmod1, 5.15, 6.1
- AMD official driver version: OpenSource driver from kernel (amdgpu)
mydmesg.log
Logs: mydmesg_pcie_aspm_w_off_amdgpu.aspm_0_amdgpu.ppfeaturemask_0xfff73fff.log
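
For anyone going through the attached logs, here is a minimal Python sketch for pulling the ring timeout lines out of a saved dmesg file. It assumes the message looks like the one in the title (`*ERROR* ring gfx_0.0.0 timeout`). The function name `find_ring_timeouts` is made up for illustration, and the log path has to be passed on the command line.

```python
import re
import sys

# Assumption: the kernel message has the form shown in the issue title,
# e.g. "*ERROR* ring gfx_0.0.0 timeout". The ring name is captured.
RING_TIMEOUT = re.compile(r"\*ERROR\* ring (\S+) timeout")


def find_ring_timeouts(lines):
    """Return (line_number, ring_name, text) for each ring timeout line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = RING_TIMEOUT.search(line)
        if match:
            hits.append((number, match.group(1), line.rstrip("\n")))
    return hits


# Only runs when a log path is given, e.g.: python find_timeouts.py mydmesg.log
if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]
    with open(path, errors="replace") as log:
        for number, ring, text in find_ring_timeouts(log):
            print(f"{path}:{number}: [{ring}] {text}")
```

This only filters lines; it does not interpret them. It may still help when comparing how often the timeout shows up across logs taken with different kernel parameters.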