System lockup with Vega10 amdgpu: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout
Submitted by sam
Assigned to Default DRI bug account
Link to original bug (#106921)
Description
Created attachment 140164
dmesg w/mesa 18.0.2-1.fc28
On Vega10 hardware (in my case, an RX Vega 64), the whole system regularly locks up, and I have to force a reboot with either the power switch on the PC or SysRq. The system itself is still running, since I can ssh in from a separate machine and retrieve logs, run commands, etc., but all keyboard and mouse input ceases.
This has happened during a variety of activities, including:
- Playing games through Steam (Half-Life 2, Portal 2, Terraria tested)
- Playing non-Steam games (SuperTuxKart, GNOME Mines)
- Idle GNOME 3 desktop (no applications running)
- Browsing the web with Firefox 60.0.1
I have had this occur with:
Kernel: 4.16.14-300.fc28.x86_64 (from Fedora repos), 4.17.0 & 4.18.0-git5.1 (from kernel-vanilla repositories linked on Fedora wiki)
Mesa: 18.0.2-1.fc28 (from Fedora repos), 18.2.0-0.11.git41dabdc.fc28 (from che/mesa copr repo)
linux-firmware: 20180525-85.git7518922b.fc28 (from Fedora repos), with amdgpu/vega10_vce.bin replaced with newest version from git master.
OS: Fedora 28 Workstation
I am attaching a few dmesgs, each of which runs from boot to the point where the bug occurs.
**Attachment 140164**, "dmesg w/mesa 18.0.2-1.fc28":
dmesg-18.0.2-1.fc28.txt
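For anyone comparing the attached logs, here is a minimal sketch for locating the timeout in a saved dmesg. It only looks for the error text from the bug title; the `[seconds.micros]` timestamp prefix is an assumption about the usual dmesg format, and `find_ring_timeouts` is a hypothetical helper name, not part of any tool.

```python
import re

# Error text as it appears in the bug title; the ring name ("gfx" there)
# is captured so timeouts on other rings would be reported too.
TIMEOUT_RE = re.compile(r"amdgpu_job_timedout.*\*ERROR\* ring (\S+) timeout")

# Assumed dmesg timestamp prefix, e.g. "[  123.456789] ".
TIMESTAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")


def find_ring_timeouts(lines):
    """Return (timestamp_or_None, ring_name) for each ring timeout line."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if not match:
            continue
        ts = TIMESTAMP_RE.match(line)
        # Lines without a timestamp prefix are still reported, with None.
        hits.append((float(ts.group(1)) if ts else None, match.group(1)))
    return hits
```

To use it, pass an open log file (for example the attached dmesg-18.0.2-1.fc28.txt) as `lines`; the first timestamp returned marks where the lockup begins in that log.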