[r600, CAICOS] Crash in r600::AluInstr::split when linking SMAA shaders
System information
- OS: Gentoo Linux
- GPU: 0030:01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Caicos PRO [Radeon HD 7450] [1002:677b]
- Kernel version: Linux hakua 5.4.242 #1 SMP Fri May 12 19:40:12 CEST 2023 ppc64 POWER9, altivec supported PowerNV T2P9D01 REV 1.00 GNU/Linux
- Mesa version: OpenGL version string: 3.2 (Compatibility Profile) Mesa 23.0.3
- Xserver version (if applicable): X.Org X Server 1.21.1.8
- Desktop manager and compositor: XFCE
Describe the issue
After upgrading to KiCAD 7, I found that starting either the schematic or layout editor immediately crashed with a segfault. After some investigation I found that the crash was in the Mesa GLSL compiler, and would only happen if the "Accelerated graphics" setting was set to "Fast Antialiasing" (changing it to "High Quality Antialiasing" made the issue go away). So it turns out that the problem was caused by the particular shaders used by that rendering mode. They are based on SMAA, which is a rather large code base, but I was able to prune the shader code down to a somewhat manageable size while still producing the same crash, and turn it into a stand-alone reproduction program: mesabug2.cpp
When running this program I reliably get a crash in r600::AluInstr::split with the following backtrace:
Thread 1 "mesabug2" received signal SIGSEGV, Segmentation fault.
0x00007ffff61d2478 in r600::AluInstr::split (this=0x1009cdd70, vf=...) at ../mesa-23.0.3/src/gallium/drivers/r600/sfn/sfn_instr_alu.cpp:762
762 auto r = old_src->as_register();
(gdb) bt
#0 0x00007ffff61d2478 in r600::AluInstr::split (this=0x1009cdd70, vf=...) at ../mesa-23.0.3/src/gallium/drivers/r600/sfn/sfn_instr_alu.cpp:762
#1 0x00007ffff61336e0 in r600::CollectInstructions::visit (this=0x7fffffffc270, instr=0x1009cdd70)
at ../mesa-23.0.3/src/gallium/drivers/r600/sfn/sfn_scheduler.cpp:59
#2 0x00007ffff61ca2b8 in r600::AluInstr::accept (this=<optimized out>, visitor=...) at ../mesa-23.0.3/src/gallium/drivers/r600/sfn/sfn_instr_alu.cpp:192
#3 0x00007ffff6132db4 in r600::CollectInstructions::visit (this=0x7fffffffc270, instr=0x1009ccfc0)
at ../mesa-23.0.3/src/gallium/drivers/r600/sfn/sfn_scheduler.cpp:69
#4 0x00007ffff61c7b18 in r600::Block::accept (this=<optimized out>, visitor=...) at ../mesa-23.0.3/src/gallium/drivers/r600/sfn/sfn_instr.cpp:328
#5 0x00007ffff61379fc in r600::BlockSheduler::schedule_block (this=this@entry=0x7fffffffc800, in_block=..., out_blocks=std::__cxx11::list = {...},
vf=...) at ../mesa-23.0.3/src/gallium/drivers/r600/sfn/sfn_scheduler.cpp:284
#6 0x00007ffff6139114 in r600::BlockSheduler::run (this=this@entry=0x7fffffffc800, shader=shader@entry=0x100910820)
at ../mesa-23.0.3/src/gallium/drivers/r600/sfn/sfn_scheduler.cpp:266
#7 0x00007ffff61396d4 in r600::schedule (original=0x100910820) at ../mesa-23.0.3/src/gallium/drivers/r600/sfn/sfn_scheduler.cpp:231
#8 0x00007ffff610ca60 in r600_shader_from_nir (rctx=0x1000696c0, pipeshader=0x10099f150, key=0x7fffffffcf00)
at ../mesa-23.0.3/src/gallium/drivers/r600/sfn/sfn_nir.cpp:1003
#9 0x00007ffff61a1650 in r600_pipe_shader_create (ctx=0x1000696c0, shader=0x10099f150, key=...)
at ../mesa-23.0.3/src/gallium/drivers/r600/r600_shader.c:231
#10 0x00007ffff60b3ff0 in r600_shader_select (ctx=ctx@entry=0x1000696c0, sel=sel@entry=0x100905a70, dirty=dirty@entry=0x7fffffffcfd7,
precompile=precompile@entry=true) at ../mesa-23.0.3/src/gallium/drivers/r600/r600_state_common.c:967
#11 0x00007ffff60b43c0 in r600_create_shader_state (ctx=0x1000696c0, state=<optimized out>, pipe_shader_type=<optimized out>)
at ../mesa-23.0.3/src/gallium/drivers/r600/r600_state_common.c:1071
#12 0x00007ffff585f380 in st_create_nir_shader (st=st@entry=0x10023f550, state=<optimized out>, state@entry=0x7fffffffd168)
at ../mesa-23.0.3/src/mesa/state_tracker/st_program.c:566
#13 0x00007ffff58610c4 in st_create_fp_variant (st=st@entry=0x10023f550, fp=fp@entry=0x100922b90, key=key@entry=0x7fffffffd618)
at ../mesa-23.0.3/src/mesa/state_tracker/st_program.c:1063
#14 0x00007ffff58626e4 in st_get_fp_variant (st=st@entry=0x10023f550, fp=fp@entry=0x100922b90, key=key@entry=0x7fffffffd618)
at ../mesa-23.0.3/src/mesa/state_tracker/st_program.c:1108
#15 0x00007ffff5862f14 in st_precompile_shader_variant (prog=0x100922b90, st=0x10023f550) at ../mesa-23.0.3/src/mesa/state_tracker/st_program.c:1295
#16 st_finalize_program (st=0x10023f550, prog=0x100922b90) at ../mesa-23.0.3/src/mesa/state_tracker/st_program.c:1357
#17 0x00007ffff5bb2338 in st_link_nir (ctx=0x1001e9ca0, shader_program=<optimized out>) at ../mesa-23.0.3/src/mesa/state_tracker/st_glsl_to_nir.cpp:940
#18 0x00007ffff5baa09c in link_shader (prog=0x10016e130, ctx=0x1001e9ca0) at ../mesa-23.0.3/src/mesa/state_tracker/st_glsl_to_ir.cpp:97
#19 st_link_shader (ctx=0x1001e9ca0, prog=0x10016e130) at ../mesa-23.0.3/src/mesa/state_tracker/st_glsl_to_ir.cpp:112
#20 0x00007ffff5b7c508 in _mesa_glsl_link_shader (ctx=0x1001e9ca0, prog=0x10016e130) at ../mesa-23.0.3/src/mesa/program/link_program.cpp:91
#21 0x00007ffff5b4988c in link_program (no_error=<optimized out>, shProg=<optimized out>, ctx=<optimized out>)
at ../mesa-23.0.3/src/mesa/main/shaderapi.c:1332
#22 link_program_error (ctx=0x1001e9ca0, shProg=0x10016e130) at ../mesa-23.0.3/src/mesa/main/shaderapi.c:1443
#23 0x00007ffff6db97c8 in shared_dispatch_stub_509 (program=<optimized out>)
at /tmp/portage/media-libs/mesa-23.0.3-r1/work/mesa-23.0.3-.ppc64/src/mapi/shared-glapi/glapi_mapi_tmp.h:23834
#24 0x00007ffff7173808 in ?? () from /usr/lib64/libGLdispatch.so.0
#25 0x0000000100001400 in linkShaders () at mesabug2.cpp:94
#26 0x00000001000014c0 in main (argc=1, argv=0x7fffffffe168) at mesabug2.cpp:104
(gdb)
(gdb) print m_src
$1 = std::vector of length 4, capacity 4 = {0x100914b70, 0x1009cda10, 0x100914900, 0x1009cda10}
(gdb) print old_src
$3 = (r600::VirtualValue *) 0x100000000
(gdb)
You will notice that the value of old_src cannot be found in the m_src array that it is supposed to be loaded from on line 758:
757 for (int i = 0; i < alu_ops.at(m_opcode).nsrc; ++i) {
758 auto old_src = m_src[s * alu_ops.at(m_opcode).nsrc + i];
759 // Make it easy for the scheduler and pin the register to the
760 // channel, otherwise scheduler would have to check whether a
761 // channel switch is possible
762 auto r = old_src->as_register();
In fact, 0x100000000 is the base address of the main binary, containing its ELF header, and not a valid data pointer at all.
It might be caused by an out-of-bounds array access, but I was unable to examine the value of alu_ops.at(m_opcode).nsrc due to optimization. At the time of the crash, i has the value 0 and s has the value 2.
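To make the out-of-bounds suspicion concrete, here is a minimal sketch of the index arithmetic from line 758, applied to the values seen in gdb (m_src of length 4, s = 2, i = 0). Since nsrc could not be read, the nsrc values tried here are assumptions, and the helper names src_index, is_out_of_bounds and checked_src are hypothetical and not part of Mesa.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Index computed on line 758: slot s, operand i, nsrc operands per slot.
// (Hypothetical helper, not Mesa code.)
std::size_t src_index(std::size_t s, std::size_t nsrc, std::size_t i)
{
    assert(i < nsrc); // the loop on line 757 guarantees this
    return s * nsrc + i;
}

// True if that index lies outside a source vector holding `size` entries.
bool is_out_of_bounds(std::size_t size, std::size_t s, std::size_t nsrc,
                      std::size_t i)
{
    return src_index(s, nsrc, i) >= size;
}

// Checked variant of the lookup: std::vector::operator[] performs no bounds
// check, whereas at() throws std::out_of_range instead of reading past the end.
template <typename T>
T checked_src(const std::vector<T>& src, std::size_t s, std::size_t nsrc,
              std::size_t i)
{
    return src.at(src_index(s, nsrc, i));
}
```

With size 4, s = 2 and i = 0, the index stays in bounds only for nsrc = 1 (index 2, which would yield 0x100914900 from the vector printed above). Any nsrc of 2 or more gives an index of at least 4, past the end of the vector. That would be consistent with old_src holding a value that is not in m_src, although the variable values gdb reports in an optimized build may not be fully reliable.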