r300 powerpc crashes when using 3D
Brief summary of the problem:
PowerPC systems with r300-class Radeon GPUs crash when running anything 3D-related. I have tried many different xorg.conf configurations and radeon module parameters, to no avail.
I am happy to build and test patches.
I have so far tested on two systems (Radeon 9600 and Mobility Radeon 9550); behaviour on both is identical. I have also tested two sets of package versions from Debian unstable: the first set is a couple of years old, the second is very recent. Behaviour differs slightly between the two sets:
First package versions tested
xserver-xorg-video-radeon=19.1.0-2 firmware-amd-graphics=20200918-1 libdrm-radeon1=2.4.103-1 mesa=20.0.2-1
X starts fine. glxgears runs fine. glxinfo reports hardware acceleration enabled. Running something more intensive results in a crash: kodi crashes after a minute or two, and openarena generally crashes after the initial animation or just after the menu is shown. The system is unresponsive after the crash, ssh connections are dropped, and the machine has to be rebooted. In /var/log/messages, viewed over ssh, we can sometimes see:
radeon 0000:00:10.0: GPU lockup (current fence id 0x0000000000001298 last fence id 0x000000000000129c on ring 0)
radeon: wait for empty RBBM fifo failed! Bad things might happen.
Failed to wait GUI idle while programming pipes. Bad things might happen.
Second package versions tested
xserver-xorg-video-radeon=19.1.0-3 firmware-amd-graphics=20221109-2 libdrm-radeon1=2.4.114-1 mesa=22.2.4-1
X starts fine. glxinfo reports hardware acceleration enabled. Running anything that uses 3D results in the same crash as with the older packages above, with the same kernel output, but much more quickly: glxgears and openarena both crash instantly, without drawing anything except a window. kodi sometimes works for a short time.
However, on one occasion the system stayed alive, with ssh still active, while the X session froze and the display mode was reset every 5 to 10 seconds. In that case it is impossible to recover the graphics session, and the following loops continually in /var/log/messages:
[ 253.927650] radeon 0000:00:10.0: ring 0 stalled for more than 10196msec
[ 253.927684] radeon 0000:00:10.0: GPU lockup (current fence id 0x000000000000087e last fence id 0x0000000000000880 on ring 0)
[ 254.065322] Failed to wait GUI idle while programming pipes. Bad things might happen.
[ 254.069528] radeon 0000:00:10.0: Saved 59 dwords of commands on ring 0.
[ 254.069559] radeon 0000:00:10.0: (r300_asic_reset:426) RBBM_STATUS=0x84110140
[ 254.567341] radeon 0000:00:10.0: (r300_asic_reset:445) RBBM_STATUS=0x80010140
[ 255.061136] radeon 0000:00:10.0: (r300_asic_reset:457) RBBM_STATUS=0x00000140
[ 255.061168] radeon 0000:00:10.0: GPU reset succeed
[ 255.061174] radeon 0000:00:10.0: GPU reset succeeded, trying to resume
[ 255.072909] debugfs: File 'r100_mc_info' in directory '0' already present!
[ 255.072957] [drm] radeon: 1 quad pipes, 1 Z pipes initialized
[ 255.072965] [drm] PCI GART of 512M enabled (table at 0x0000000002C00000).
[ 255.072982] radeon 0000:00:10.0: WB enabled
[ 255.072992] radeon 0000:00:10.0: fence driver on ring 0 use gpu addr 0x0000000078000000
[ 255.073004] debugfs: File 'r100_cp_ring_info' in directory '0' already present!
[ 255.073009] debugfs: File 'r100_cp_csq_fifo' in directory '0' already present!
[ 255.073093] debugfs: File 'radeon_ring_gfx' in directory '0' already present!
[ 255.073099] [drm] radeon: ring at 0x0000000078001000
[ 255.209686] [drm:r100_ring_test [radeon]] *ERROR* radeon: ring test failed (scratch(0x15E8)=0xCAFEDEAD)
[ 255.209942] [drm:r100_cp_init [radeon]] *ERROR* radeon: cp isn't working (-22).
[ 255.210024] radeon 0000:00:10.0: failed initializing CP (-22).
[ 265.267508] radeon 0000:00:10.0: ring 0 stalled for more than 10248msec
[ 265.267542] radeon 0000:00:10.0: GPU lockup (current fence id 0x000000000000087e last fence id 0x0000000000000880 on ring 0)
[ 265.274188] radeon 0000:00:10.0: Saved 192635 dwords of commands on ring 0.
[ 265.278385] radeon 0000:00:10.0: GPU reset succeeded, trying to resume
[ 265.278429] debugfs: File 'r100_mc_info' in directory '0' already present!
[ 265.278457] [drm] radeon: 1 quad pipes, 1 Z pipes initialized
[ 265.278465] [drm] PCI GART of 512M enabled (table at 0x0000000002C00000).
[ 265.278477] radeon 0000:00:10.0: WB enabled
[ 265.278486] radeon 0000:00:10.0: fence driver on ring 0 use gpu addr 0x0000000078000000
[ 265.278495] debugfs: File 'r100_cp_ring_info' in directory '0' already present!
[ 265.278501] debugfs: File 'r100_cp_csq_fifo' in directory '0' already present!
[ 265.278584] debugfs: File 'radeon_ring_gfx' in directory '0' already present!
[ 265.278590] [drm] radeon: ring at 0x0000000078001000
[ 265.415280] [drm:r100_ring_test [radeon]] *ERROR* radeon: ring test failed (scratch(0x15E8)=0xCAFEDEAD)
[ 265.415525] [drm:r100_cp_init [radeon]] *ERROR* radeon: cp isn't working (-22).
[ 265.415608] radeon 0000:00:10.0: failed initializing CP (-22).
[ 275.488778] radeon 0000:00:10.0: ring 0 stalled for more than 10232msec
[ 275.488813] radeon 0000:00:10.0: GPU lockup (current fence id 0x000000000000087e last fence id 0x0000000000000880 on ring 0)
[ 275.493557] radeon 0000:00:10.0: Saved 123067 dwords of commands on ring 0.
[ 275.498395] radeon 0000:00:10.0: GPU reset succeeded, trying to resume
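To make the repeating pattern in the excerpt above easier to check across a full syslog, here is a minimal Python sketch that extracts the fence ids from each "GPU lockup" line. The function names `parse_lockups` and `fence_stuck` are hypothetical helpers, not part of any driver tooling, and the regular expression assumes the exact message format quoted in this report.

```python
import re

# Assumes the "GPU lockup" message format exactly as quoted in this report.
LOCKUP_RE = re.compile(
    r"GPU lockup \(current fence id (0x[0-9a-fA-F]+) "
    r"last fence id (0x[0-9a-fA-F]+) on ring (\d+)\)"
)


def parse_lockups(lines):
    """Return a (current_fence, last_fence, ring) tuple per GPU lockup line."""
    results = []
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            current = int(match.group(1), 16)
            last = int(match.group(2), 16)
            ring = int(match.group(3))
            results.append((current, last, ring))
    return results


def fence_stuck(lockups):
    """True if at least two lockups were seen and all share one current fence id."""
    current_ids = {current for current, _, _ in lockups}
    return len(lockups) > 1 and len(current_ids) == 1
```

Passing the lines of the attached syslog to `parse_lockups` and then calling `fence_stuck` on the result would show whether the current fence id ever advances between reset attempts. In the three lockups quoted above it stays at 0x87e each time.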
Hardware description:
Tested on two systems; behaviour is identical.
System 1: Apple iBook G4 12" 1.33 GHz
- CPU: PowerPC 7447A
- GPU: 0000:00:10.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] RV360/M12 [Mobility Radeon 9550] [1002:4e56] (rev 80)
- System Memory: 1.5GB
- Display(s): Laptop LCD
- Type of Display Connection: LVDS
System 2: Apple eMac 1.42GHz
- CPU: PowerPC 7447A
- GPU: 0000:00:10.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] RV350 [Radeon 9550/9600/X1050 Series] [1002:4150]
- System Memory: 2GB
- Display(s): CRT
- Type of Display Connection: VGA
System information:
- Distro name and Version: Debian unstable
- Kernel version: 6.0.0-4-powerpc. Behaviour similar on 5.5.0-2-powerpc
How to reproduce the issue:
Install Debian with packages: xserver-xorg-video-radeon=19.1.0-3 firmware-amd-graphics=20221109-2 libdrm-radeon1=2.4.114-1 mesa=22.2.4-1. Run glxgears. It is useful to have an ssh session running to catch kernel output.
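When watching kernel output over ssh, it can help to cut the stream down to GPU-related lines. The following is a minimal Python sketch of such a filter. The names `is_gpu_line` and `filter_gpu_lines` are hypothetical, and the pattern is an assumption based only on the messages quoted in this report, so it may miss other relevant lines.

```python
import re

# Assumed pattern: lines mentioning radeon or [drm, plus the
# "Bad things might happen" warnings quoted in this report.
INTERESTING = re.compile(r"radeon|\[drm|Bad things might happen")


def is_gpu_line(line):
    """Return True if the kernel log line looks GPU-related."""
    return bool(INTERESTING.search(line))


def filter_gpu_lines(lines):
    """Yield GPU-related lines with the trailing newline removed."""
    for line in lines:
        if is_gpu_line(line):
            yield line.rstrip("\n")
```

One way to use it is to iterate `filter_gpu_lines(sys.stdin)` in a small script and print each line with `flush=True`, feeding it from `dmesg --follow` on the test machine over ssh so that the last lines before a hard lockup are still seen on the remote side.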
Log files (for system lockups / game freezes / crashes)
- Dmesg log: From after boot, before the problem. dmesg
- Syslog: syslog where I have stripped out the previous and next boots. This is from the case where the video mode is continually reset, without a complete system lockup. syslog
- Xorg log: From after the problem, recovered after reboot. Xorg.0.log