32-bit armhf mesa/panthor can lock up RK3588 SoC
System information
64-bit host system
System:
Host: reform Kernel: 6.8-rc1+unreleased-arm64 arch: aarch64 bits: 64
compiler: gcc v: 13.2.0
Desktop: GNOME v: N/A tk: GTK v: 3.24.41 wm: Sway dm: N/A Distro: Debian
GNU/Linux trixie/sid
CPU:
Info: quad core (2-mt/2-st) model: N/A variant-1: cortex-a76
variant-2: cortex-a55 bits: 64 type: MST AMCP arch: ARMv8 rev: 0 cache:
L1: 768 KiB L2: 2.5 MiB L3: 3 MiB
Speed: N/A min/max: N/A cores: No per core speed data found. bogomips: N/A
Features: Use -f option to see features
Graphics:
Device-1: display-subsystem driver: rockchip_drm v: N/A bus-ID: N/A
chip-ID: rockchip:display-subsystem
Device-2: rk3588-mali driver: panthor v: kernel bus-ID: N/A
chip-ID: rockchip:fb000000
Device-3: rk3588-dw-hdmi driver: dwhdmi_rockchip v: N/A bus-ID: N/A
chip-ID: rockchip:fde80000
Device-4: rk3588-dw-hdmi driver: dwhdmi_rockchip v: N/A bus-ID: N/A
chip-ID: rockchip:fdea0000
Display: wayland server: X.org v: 1.21.1.11 with: Xwayland v: 23.2.4
compositor: Sway v: 1.9 driver: X: loaded: modesetting dri: meson
gpu: cdn-dp,dw-mipi-dsi-rockchip,dwhdmi-rockchip,innohdmi-rockchip,rockchip-dp,rockchip-drm,rockchip-lvds,rockchip-vop,rockchip-vop2
display-ID: 1
Monitor-1: HDMI-A-1 model: ChiMei InnoLux 0x1239 res: 1920x1080 dpi: 177
diag: 317mm (12.5")
Monitor-2: HDMI-A-2 size-res: N/A
API: EGL v: 1.5 platforms: device: 0 egl: 1.4 drv: panthor device: 1
drv: swrast gbm: egl: 1.4 drv: rockchip surfaceless: egl: 1.4 drv: panthor
wayland: egl: 1.4 drv: panthor x11: egl: 1.4 drv: panthor
API: OpenGL v: 4.5 compat-v: 3.3 vendor: mesa v: N/A glx-v: 1.4
direct-render: yes renderer: Mali-G610 (Panfrost)
device-ID: ffffffff:ffffffff
API: Vulkan v: 1.3.275 surfaces: xcb,xlib,wayland device: 0 type: cpu
driver: N/A device-ID: 10005:0000
Describe the issue
I am trying to run x86 Linux games through box86 on RK3588 (Collabora kernel, Mesa git with panthor). The two games I want to play reliably cause a complete lockup of the SoC when they use the GPU through mesa/panthor. So far, the issue has not occurred with box64 and 64-bit games.
I managed to narrow this down to the armhf build of Mesa. To get panthor working for armhf binaries in the first place, I built the same Mesa as on the host Debian system, but in a Debian armhf chroot, and then copied the resulting libraries into the host system's /usr/lib/arm-linux-gnueabihf/. The required dependencies are installed via apt's multiarch feature.
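As a sanity check for the setup above, the copied libraries can be verified to be 32-bit ELF objects. The following is a minimal sketch, not part of the original setup: the helper name `elf_class` is hypothetical, and the `dri/` subdirectory is an assumption about where this particular Mesa build places its driver libraries.

```shell
# Print "32", "64", or "unknown" depending on the ELF class of a file.
# Bytes 0-3 of an ELF file are the magic (7f 45 4c 46); byte 4 is
# EI_CLASS (01 = 32-bit, 02 = 64-bit).
elf_class() {
    header=$(od -An -tx1 -N5 "$1" 2>/dev/null | tr -d ' \n')
    case "$header" in
        7f454c4601) echo 32 ;;
        7f454c4602) echo 64 ;;
        *) echo unknown ;;
    esac
}

# Assumed location of the armhf driver libraries; adjust to the actual build.
for lib in /usr/lib/arm-linux-gnueabihf/dri/*.so; do
    [ -e "$lib" ] || continue   # skip if the glob matched nothing
    printf '%s: ELF class %s\n' "$lib" "$(elf_class "$lib")"
done
```

Every library under the armhf path should report class 32; a 64 there would point to a mix-up while copying.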
The games in question (Dex and Va11-Halla) work fine with LIBGL_ALWAYS_SOFTWARE=1, but they lock up when running on the real GPU. So I recorded an apitrace of Dex, with MESA_GL_VERSION_OVERRIDE=3.1 and MESA_GLSL_VERSION_OVERRIDE=310 set. This apitrace is attached. When I play it back with the 64-bit apitrace on the real GPU, the playback works fine, but there are a bunch of messages like:
2830: message: shader compiler issue 1: MESA_SHADER_FRAGMENT shader: 13 inst, 0.500000 cycles, 0.000000 fma, 0.125000 cvt, 0.062500 sfu, 0.500000 v, 0.250000 t, 0.000000 ls, 8 quadwords, 2 threads, 0 loops, 0:0 spills:fills
2830: message: shader compiler issue 1: MESA_SHADER_POSITION shader: 10 inst, 3.000000 cycles, 0.093750 fma, 0.000000 cvt, 0.062500 sfu, 0.000000 v, 0.000000 t, 3.000000 ls, 8 quadwords, 2 threads, 0 loops, 0:0 spills:fills
2830: message: shader compiler issue 1: MESA_SHADER_VARYING shader: 3 inst, 3.000000 cycles, 0.000000 fma, 0.000000 cvt, 0.000000 sfu, 0.000000 v, 0.000000 t, 3.000000 ls, 8 quadwords, 2 threads, 0 loops, 0:0 spills:fills
3035: message: shader compiler issue 1: MESA_SHADER_POSITION shader: 30 inst, 3.000000 cycles, 0.343750 fma, 0.062500 cvt, 0.062500 sfu, 0.000000 v, 0.000000 t, 3.000000 ls, 16 quadwords, 2 threads, 0 loops, 0:0 spills:fills
3035: message: shader compiler issue 1: MESA_SHADER_VARYING shader: 0 inst, 0.000000 cycles, 0.000000 fma, 0.000000 cvt, 0.000000 sfu, 0.000000 v, 0.000000 t, 0.000000 ls, 0 quadwords, 2 threads, 0 loops, 0:0 spills:fills
3035: message: shader compiler issue 1: MESA_SHADER_FRAGMENT shader: 8 inst, 0.093750 cycles, 0.000000 fma, 0.093750 cvt, 0.000000 sfu, 0.000000 v, 0.000000 t, 0.000000 ls, 8 quadwords, 2 threads, 0 loops, 0:0 spills:fills
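For anyone reproducing the capture and the 64-bit replay, here is a minimal sketch of the two steps. The helper names `with_gl_overrides` and `drop_shader_stats` are hypothetical, as are the file names in the usage comments. The filter assumes that the messages quoted above are per-shader statistics rather than errors, which is my reading of their content and not something the replay confirms.

```shell
# Run a command with the GL/GLSL version overrides from this report
# applied to that command only.
with_gl_overrides() {
    env MESA_GL_VERSION_OVERRIDE=3.1 MESA_GLSL_VERSION_OVERRIDE=310 "$@"
}

# Drop lines that look like the per-shader statistics quoted above and
# pass everything else through. "|| true" keeps the exit status at 0
# when grep filters out every line.
drop_shader_stats() {
    grep -v 'shader compiler issue [0-9]*: MESA_SHADER_[A-Z_]* shader:' || true
}

# Hypothetical usage ("./Dex" and "Dex.trace" are placeholder names):
#   with_gl_overrides apitrace trace --api gl ./Dex
#   apitrace replay Dex.trace 2>&1 | drop_shader_stats
```

Filtering the statistics makes any remaining warning or error in the replay output easier to spot.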
I then installed apitrace:armhf in my chroot and ran it from outside the chroot. Replaying the trace with this 32-bit apitrace locks up the system. So there is an OpenGL command sequence that can cause this lockup through panthor on armhf.
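Because the lockup takes down the whole SoC, any replay output that has not reached the disk is lost. The sketch below is one way to keep it. The helper name `run_logged` and the file names in the usage comment are hypothetical, and this is a debugging aid I am suggesting, not something used to produce the results above.

```shell
# Run a command and append each line of its output (stdout and stderr)
# to a log file, calling sync after every line so that the tail of the
# log survives a hard lockup. This is slow, and the exit status of the
# command itself is not preserved.
run_logged() {
    run_logged_file=$1
    shift
    "$@" 2>&1 | while IFS= read -r run_logged_line; do
        printf '%s\n' "$run_logged_line" >> "$run_logged_file"
        sync
    done
}

# Hypothetical usage with the armhf apitrace ("Dex.trace" is a placeholder):
#   run_logged replay-armhf.log apitrace replay Dex.trace
```

After a reboot, the last lines of the log show how far the replay got before the system stopped responding.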