rusticl/llvmpipe: LLVM ERROR in function: cs_co_variant, on aarch64
A small gift for @karolherbst: he invited me to try rusticl/llvmpipe on the Raspberry Pi 3 CPU after I got successful results with PoCL.
So I built the same LLVM I use on amd64 (an arbitrary recent commit, d7d586e5) and current Mesa main (commit 3ef88cd0).
Here is some system information and the clinfo output:
$ uname -a
Linux raspberrypi 5.15.61-v8+ #1579 SMP PREEMPT Fri Aug 26 11:16:44 BST 2022 aarch64 GNU/Linux
$ lscpu | grep -E '^Model name:' | sed -e 's/ \+/ /g'
Model name: Cortex-A53
$ clinfo --list
Platform #0: rusticl
`-- Device #0: llvmpipe (LLVM 16.0.0, 128 bits)
It should be noted that all LuxMark3 kernels for the default LuxBall scene compiled completely, that's good!
The application crashed just as it started to render (the black background was already displayed).
Here is the crash log:
[LuxCore] [72.079] [PathOCLBaseRenderThread::0] Kernels compilation time: 50339ms
LLVM ERROR: Cannot select: 0x7f2c8e1df0: v4f32 = fp_extend 0x7f2c8e22c0
0x7f2c8e22c0: v4i16 = and 0x7f2c6953f0, 0x7f2c695620
0x7f2c6953f0: v4i16 = or 0x7f2c935c80, 0x7f2c357d90
0x7f2c935c80: v4i16 = and 0x7f2c697920, 0x7f2c3c5fb0
0x7f2c697920: v4i16,ch = CopyFromReg 0x7f2c0ba0c8, Register:v4i16 %501
0x7f2c355a60: v4i16 = Register %501
0x7f2c3c5fb0: v4i16,ch = CopyFromReg 0x7f2c0ba0c8, Register:v4i16 %489
0x7f2c3c6e20: v4i16 = Register %489
0x7f2c357d90: v4i16 = and 0x7f2c35a640, 0x7f2c695460
0x7f2c35a640: v4i16,ch = CopyFromReg 0x7f2c0ba0c8, Register:v4i16 %480
0x7f2c8e45a0: v4i16 = Register %480
0x7f2c695460: v4i16,ch = CopyFromReg 0x7f2c0ba0c8, Register:v4i16 %490
0x7f2c8e4840: v4i16 = Register %490
0x7f2c695620: v4i16 = xor 0x7f2c2d8ec0, 0x7f2c9385f0
0x7f2c2d8ec0: v4i16 = truncate 0x7f2c938660
0x7f2c938660: v4i32 = and 0x7f2c2edac0, 0x7f2c8ebf10
0x7f2c2edac0: v4i32 = and 0x7f2c9356d0, 0x7f2c3c6480
0x7f2c9356d0: v4i32,ch = CopyFromReg 0x7f2c0ba0c8, Register:v4i32 %457
0x7f2c695850: v4i32 = Register %457
0x7f2c3c6480: v4i32 = xor 0x7f2c8e26b0, 0x7f2c2d9470
0x7f2c8e26b0: v4i32 = or 0x7f2c2ee4d0, 0x7f2c695a10
0x7f2c2ee4d0: v4i32,ch = CopyFromReg 0x7f2c0ba0c8, Register:v4i32 %458
0x7f2c695700: v4i32 = Register %458
0x7f2c695a10: v4i32,ch = CopyFromReg 0x7f2c0ba0c8, Register:v4i32 %481
0x7f2c8e4920: v4i32 = Register %481
0x7f2c2d9470: v4i32 = BUILD_VECTOR Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>
0x7f2c29b340: i32 = Constant<-1>
0x7f2c29b340: i32 = Constant<-1>
0x7f2c29b340: i32 = Constant<-1>
0x7f2c29b340: i32 = Constant<-1>
0x7f2c8ebf10: v4i32,ch = CopyFromReg 0x7f2c0ba0c8, Register:v4i32 %118
0x7f2c696030: v4i32 = Register %118
0x7f2c9385f0: v4i16 = BUILD_VECTOR Constant:i32<65535>, Constant:i32<65535>, Constant:i32<65535>, Constant:i32<65535>
0x7f2c3c6170: i32 = Constant<65535>
0x7f2c3c6170: i32 = Constant<65535>
0x7f2c3c6170: i32 = Constant<65535>
0x7f2c3c6170: i32 = Constant<65535>
In function: cs_co_variant
I guess that to get more information I would have to rebuild LLVM with Debug enabled. However, the non-Debug LLVM build on the rpi3, limited to a single job to fit in the 1 GB of RAM, took two days, so a rebuild is not for today. That's all I have for now.
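One possibly cheaper option than a full Debug build would be a Release build with assertions enabled, restricted to the AArch64 backend. This is a minimal sketch and not the configuration used for the build above: the `../llvm` source path, the Ninja generator and the target list are assumptions, and the project list (clang and whatever else rusticl needs) is left out and would have to match the existing build.

```shell
# Sketch only: paths, generator and target list are assumptions.
# Run from an empty build directory placed next to the llvm/ source directory.
cmake -G Ninja ../llvm \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DLLVM_TARGETS_TO_BUILD=AArch64 \
  -DLLVM_PARALLEL_LINK_JOBS=1

# A single compile job at a time, to stay within the 1 GB of RAM.
ninja -j1
```

Assertions do not add debug symbols, but they can make LLVM stop earlier, closer to the place where the invalid node is created.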