intel/compiler: Need a pass to merge CFG blocks
While working on something unrelated, I stumbled on a case where reconstructing the CFG right before scheduling produced a different schedule. One affected shader was kerbal-space-program/1017.shader_test. Below is a diff of the two shaders.
@@ -340,16 +340,15 @@
mul(8) g6<1>F g17<8,8,1>F g4.7<0,1,0>F { align1 1Q compacted };
mul(8) g7<1>F g18<8,8,1>F g4.7<0,1,0>F { align1 1Q compacted };
END B0 ->B1
- START B1 <-B0 <-B2 (140 cycles)
+ START B2 <-B1 <-B3 (260 cycles)
LABEL1:
mov(1) g63<1>D 1077936128D { align1 WE_all 1N };
-mov(1) g63.1<1>D 1073741824D { align1 WE_all 1N };
END B1 ->B2 ->B4
- START B2 <-B1 <-B3 (240 cycles)
cmp.ge.f0.0(8) null<1>D g3<8,8,1>D 60D { align1 1Q compacted };
+mov(1) g63.1<1>D 1073741824D { align1 WE_all 1N };
(+f0.0) break(8) JIP: LABEL0 UIP: LABEL0 { align1 1Q };
- END B2 ->B1 ->B4 ->B3 ->B3
+ END B2 ->B1 ->B4 ->B3
START B3 <-B2 (10180 cycles)
mul(8) g22<1>D g3<8,8,1>D 12W { align1 1Q };
add(8) g3<1>D g3<8,8,1>D 1D { align1 1Q compacted };
In the "before" shader, this segment
START B1 <-B0 <-B2 (140 cycles)
LABEL1:
mov(1) g63<1>D 1077936128D { align1 WE_all 1N };
mov(1) g63.1<1>D 1073741824D { align1 WE_all 1N };
END B1 ->B2 ->B4
START B2 <-B1 <-B3 (240 cycles)
cmp.ge.f0.0(8) null<1>D g3<8,8,1>D 60D { align1 1Q compacted };
(+f0.0) break(8) JIP: LABEL0 UIP: LABEL0 { align1 1Q };
END B2 ->B1 ->B4 ->B3 ->B3
had two blocks that are a single block in the reconstructed CFG. I believe dead_control_flow_eliminate should be enhanced to fuse these two blocks.
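To make the idea concrete, here is a minimal sketch of the textbook form of block fusion: block B is folded into block A when B is A's only successor and A is B's only predecessor. It is written in Python against a toy dictionary-based CFG, not against the compiler's real data structures. The function name `merge_blocks`, the `entry` parameter, and the example edge layout are all hypothetical. Note that in the dump above B1 lists an extra edge to B4 and B2 lists B3 as a predecessor, so this simple condition would not fire on that CFG as printed; a real pass would have to account for how the backend models loop edges.

```python
from collections import defaultdict


def merge_blocks(instrs, succs, entry):
    """Fuse straight-line block pairs in a toy CFG (illustrative only).

    instrs: dict mapping block id -> list of instruction strings
    succs:  dict mapping block id -> list of successor block ids
    entry:  id of the entry block, which is never folded into another block
    Returns new (instrs, succs) dicts; the inputs are left untouched.
    """
    instrs = {b: list(i) for b, i in instrs.items()}
    # Drop duplicate edges (such as a repeated "->B3") while keeping order.
    succs = {b: list(dict.fromkeys(s)) for b, s in succs.items()}

    changed = True
    while changed:
        changed = False
        # Rebuild the predecessor lists from the current successor lists.
        preds = defaultdict(list)
        for block, targets in succs.items():
            for target in targets:
                preds[target].append(block)

        for a in list(instrs):
            if len(succs[a]) != 1:
                continue  # A must fall through to exactly one block
            b = succs[a][0]
            if b == a or b == entry or preds[b] != [a]:
                continue  # B must be reachable only from A
            # Append B's instructions to A and let A inherit B's out-edges.
            instrs[a].extend(instrs.pop(b))
            succs[a] = succs.pop(b)
            changed = True
            break  # the graph changed, so recompute predecessors
    return instrs, succs


# Hypothetical, simplified loop: B1 falls through to B2, and the back edge
# from B3 targets B1 (this is NOT the exact edge set from the dump above).
example_instrs = {
    "B0": ["mul"],
    "B1": ["mov g63", "mov g63.1"],
    "B2": ["cmp.ge", "break"],
    "B3": ["mul", "add"],
    "B4": ["end"],
}
example_succs = {
    "B0": ["B1"],
    "B1": ["B2"],
    "B2": ["B4", "B3", "B3"],
    "B3": ["B1"],
    "B4": [],
}
merged_instrs, merged_succs = merge_blocks(example_instrs, example_succs, "B0")
```

In this toy layout B1 and B2 are fused into a single block, which then holds both mov instructions alongside the cmp and break, giving the scheduler the same freedom seen in the reconstructed CFG.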