CSE inverted comparisons
Add a new pass, or modify the existing CSE pass, to recognize cases where a comparison and its inverse are redundant. I have seen sequences like the following many, many times.
vec1 32 ssa_116 = fge32 abs(ssa_113), ssa_109
vec1 32 ssa_117 = b32csel ssa_116, ssa_97, ssa_115
vec1 32 ssa_118 = ffma -ssa_92, ssa_12, ssa_98
vec1 32 ssa_119 = b32csel ssa_116, ssa_98, ssa_118
vec1 32 ssa_120 = flt32 abs(ssa_113), ssa_109
vec1 32 ssa_121 = flt32 abs(ssa_114), ssa_109
vec1 32 ssa_122 = ior ssa_121, ssa_120
vec1 32 ssa_123 = ffma ssa_91, ssa_12, ssa_99
vec1 32 ssa_124 = fge32 abs(ssa_114), ssa_109
vec1 32 ssa_125 = b32csel ssa_124, ssa_99, ssa_123
vec1 32 ssa_126 = ffma ssa_92, ssa_12, ssa_100
vec1 32 ssa_127 = b32csel ssa_124, ssa_100, ssa_126
/* succs: block_5 block_12 */
if ssa_122 {
In this case both fge operations could be removed by swapping the order of the bcsel operands and using the result of the corresponding flt operation.
This case is a little unusual in that the comparison results are used with a logical operation. By far the more common case (based on my observations) is where all the comparisons are used only by bcsel instructions.
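As a rough illustration of the idea, here is a minimal Python sketch over a toy instruction list. It is not NIR code: the Instr tuple, the INVERSE table, and both function names are hypothetical. It rewrites an fge whose only uses are bcsel conditions into the inverse flt, swaps the bcsel arms, and then lets an ordinary CSE pass remove the duplicate flt. One caveat the sketch assumes away: fge and flt both return false when a source is NaN, so the rewrite is only exact when NaN cannot occur or inexact transforms are allowed.

```python
from collections import namedtuple

# Hypothetical toy instruction: destination name, opcode, tuple of source names.
Instr = namedtuple("Instr", "dest op srcs")

# Assumed inverse pair. Only exact when neither source can be NaN.
INVERSE = {"fge": "flt"}


def canonicalize_inverted_compares(instrs):
    """Turn fge into flt when every use is a bcsel condition, swapping the bcsel arms."""
    out = list(instrs)
    for i, ins in enumerate(out):
        if ins.op not in INVERSE:
            continue
        users = [u for u in out if ins.dest in u.srcs]
        only_conditions = all(
            u.op == "bcsel" and u.srcs[0] == ins.dest and ins.dest not in u.srcs[1:]
            for u in users
        )
        if not users or not only_conditions:
            continue
        out[i] = Instr(ins.dest, INVERSE[ins.op], ins.srcs)
        for j, u in enumerate(out):
            if u.op == "bcsel" and u.srcs[0] == ins.dest:
                cond, then_v, else_v = u.srcs
                out[j] = Instr(u.dest, "bcsel", (cond, else_v, then_v))
    return out


def cse(instrs):
    """Plain value-numbering CSE over the toy instruction list."""
    seen = {}    # (op, srcs) -> dest of the first occurrence
    rename = {}  # removed dest -> surviving dest
    out = []
    for ins in instrs:
        srcs = tuple(rename.get(s, s) for s in ins.srcs)
        key = (ins.op, srcs)
        if key in seen:
            rename[ins.dest] = seen[key]
        else:
            seen[key] = ins.dest
            out.append(Instr(ins.dest, ins.op, srcs))
    return out


# Simplified version of the pattern above: one fge feeding a bcsel,
# and the inverse flt feeding a logical operation.
prog = [
    Instr("c116", "fge", ("x", "lim")),
    Instr("r117", "bcsel", ("c116", "a", "b")),
    Instr("c120", "flt", ("x", "lim")),
    Instr("r122", "ior", ("c120", "other")),
]
result = cse(canonicalize_inverted_compares(prog))
```

Canonicalizing first and running the unmodified CSE afterwards means the later flt is removed even though it appears after the fge in program order. A real pass would also have to account for the abs source modifiers shown in the dump, which this sketch ignores.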
There is a similar potential for "more CSE" with negated operations. I have also seen many cases where (a × b) and (a × -b) are both used as ALU sources.
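The negated case could be approached the same way. The following Python sketch is again a toy, not NIR: instructions are plain (dest, op, srcs) tuples, a leading "-" on a source name stands in for a negate source modifier, and all function names are hypothetical. It keys each fmul on its sources with the signs stripped, so (a × -b) is recognized as the negation of an earlier (a × b) and later uses are rewritten to read the first result with a negate modifier. It assumes that negating a product is exact, which holds under round-to-nearest.

```python
def split_neg(src):
    """Return (is_negated, base_name) for a source written as 'b' or '-b'."""
    return (True, src[1:]) if src.startswith("-") else (False, src)


def apply_rename(src, rename):
    """Replace a removed value with its survivor, folding the two signs together."""
    neg, name = split_neg(src)
    if name in rename:
        r_neg, r_name = split_neg(rename[name])
        neg, name = (neg != r_neg), r_name
    return ("-" if neg else "") + name


def cse_negated_fmul(instrs):
    """Merge fmul instructions that differ only in the sign of their sources."""
    seen = {}    # unsigned, sorted fmul sources -> (dest, negated) of first occurrence
    rename = {}  # removed dest -> replacement source, for example '-m1'
    kept = []
    for dest, op, srcs in instrs:
        srcs = tuple(apply_rename(s, rename) for s in srcs)
        if op == "fmul":
            parts = [split_neg(s) for s in srcs]
            negated = parts[0][0] != parts[1][0]  # overall sign of the product
            key = tuple(sorted(name for _, name in parts))
            if key in seen:
                first_dest, first_neg = seen[key]
                rename[dest] = first_dest if negated == first_neg else "-" + first_dest
                continue
            seen[key] = (dest, negated)
        kept.append((dest, op, srcs))
    return kept, rename


muls = [
    ("m1", "fmul", ("a", "b")),
    ("m2", "fmul", ("a", "-b")),
    ("r", "fadd", ("m1", "m2")),
]
kept, rename = cse_negated_fmul(muls)
```

Whether this is a win on real hardware depends on negate source modifiers being free for the instructions that consume the product, which is an assumption here.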