[radeon, rv370] Running piglit shaders@glsl-vs-raytrace-bug26691 test causes hard lockup & reboot
Running the piglit festsuite (git-11ee10ba04) via ./piglit run -1 quick -l verbose -s --dmesg
on a Radeon X600 causes the X600 to hard lockup & reboot. On my system this happens with kernel 5.15.11, 5.16.0, mesa 21.3.4 and mesa 22 (git-8b3d9472).
I had a closer look and found out that shaders@glsl-vs-raytrace-bug26691 causes the lockup. Running ./piglit/bin/glsl-vs-raytrace-bug26691 -auto -fbo
as a single test works sometimes the 1st time, but re-running it a 2nd or a 3rd time always causes the lockup:
[...]
[ 518.794824] radeon: wait for empty RBBM fifo failed! Bad things might happen.
[ 519.110152] Failed to wait GUI idle while programming pipes. Bad things might happen.
[ 519.111220] radeon 0000:07:00.0: Saved 59 dwords of commands on ring 0.
[ 519.111247] radeon 0000:07:00.0: (r300_asic_reset:426) RBBM_STATUS=0x8411C100
[ 519.616733] radeon 0000:07:00.0: (r300_asic_reset:445) RBBM_STATUS=0x8401C100
[ 520.118160] radeon 0000:07:00.0: (r300_asic_reset:457) RBBM_STATUS=0x8400C100
[ 520.118231] radeon 0000:07:00.0: failed to reset GPU
[ 520.319694] pcieport 0000:00:03.1: AER: Corrected error received: 0000:00:03.1
[ 520.319723] pcieport 0000:00:03.1: PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
[ 520.319729] pcieport 0000:00:03.1: device [1022:1483] error status/mask=00002000/00004000
[ 520.319735] pcieport 0000:00:03.1: [13] NonFatalErr
[ 520.722345] pcieport 0000:00:03.1: AER: Corrected error received: 0000:00:03.1
For regular desktop usage the X600 seems ok so far. Unfortunately I can't get a backtrace as running the test via gdb also leads to a lockup & reboot before interaction is possible.
Some data about the system:
$ inxi -b
System:
Host: prototype Kernel: 5.16.0-Zen3 x86_64 bits: 64 Desktop: Openbox 3.6.1
Distro: Gentoo Base System release 2.7
Machine:
Type: Desktop Mobo: ASRock model: B450M Steel Legend
serial: <superuser/root required> UEFI: American Megatrends v: P4.20
date: 08/03/2021
CPU:
Info: 16-Core AMD Ryzen 9 5950X [MT MCP] speed: 3685 MHz
min/max: 2200/3400 MHz
Graphics:
Device-1: AMD RV370 [Radeon X600/X600 SE] driver: radeon v: kernel
Display: x11 server: X.Org 1.20.14 driver: ati,radeon
unloaded: fbdev,modesetting resolution: 1920x1080~60Hz
OpenGL: renderer: ATI RV370 v: 2.1 Mesa 22.0.0-devel (git-8b3d947267)
Network:
Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
driver: r8169
# lspci
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Starship/Matisse IOMMU
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
00:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:05.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 61)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 7
01:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550 NVMe SSD (rev 01)
02:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller (rev 01)
02:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller (rev 01)
02:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge (rev 01)
03:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
03:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
03:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV370 [Radeon X600/X600 SE]
07:00.1 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] RV380 [Radeon X300/X550/X1050 Series] (Secondary)
08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function
09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP
09:00.1 Encryption controller: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP
09:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
# lspci -s 07:00.0 -vv
07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV370 [Radeon X600/X600 SE] (prog-if 00 [VGA controller])
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] RV370 [Radeon X600/X600 SE]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 59
IOMMU group: 2
Region 0: Memory at e8000000 (64-bit, prefetchable) [size=128M]
Region 2: Memory at fce30000 (64-bit, non-prefetchable) [size=64K]
Region 4: I/O ports at e000 [size=256]
Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] Express (v1) Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <256ns, L1 <4us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE- FLReset- SlotPowerLimit 75.000W
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <256ns, L1 <2us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s (ok), Width x16 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee01000 Data: 0022
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 04000001 0000200f 07070000 b8cdf5fd
Kernel driver in use: radeon
Kernel modules: radeon
dmesg_5160_zen3_v0.txt config_5160_zen3 rv370_22.0.0git-8b3d947267.tar.bz2