nir: handle more cases of phi shrinking
This patch adds support for removing unused phi channels when there is a circular dependency between the phi nodes. For example, here
...
vec4 32 ssa_318 = vec4 ssa_57.w, ssa_57.w, ssa_57.w, ssa_303
/* succs: block_11 */
loop {
block block_11:
/* preds: block_10 block_20 */
....
vec4 32 ssa_323 = phi block_10: ssa_318, block_20: ssa_446
....
/* succs: block_15 block_19 */
if ssa_347 {
block block_15:
/* preds: block_14 */
...
vec3 32 ssa_442 = fadd! ssa_441, ssa_323.xyz
vec4 32 ssa_443 = vec4 ssa_442.x, ssa_442.y, ssa_442.z, ssa_541
....
/* succs: block_20 */
} else {
block block_19:
/* preds: block_14 */
/* succs: block_20 */
}
block block_20:
/* preds: block_18 block_19 */
...
vec4 32 ssa_446 = phi block_18: ssa_443, block_19: ssa_323
...
/* succs: block_11 */
}
block block_21:
/* preds: block_12 */
vec3 32 ssa_452 = fmul!.sat ssa_323.xyz, ssa_287
...
the pass will be able to shrink both phis (ssa_323 and ssa_446) to 3 channels.
This is just a basic version: it only works for channels where the two phis at the beginning and end of the loop point directly at each other. It won't work if there is even a simple mov with a trivial swizzle in between. The test highlights this by showing some further opportunities; however, the dependency tracking needed for them would be quite complex, and I haven't found any real-world examples where it would be needed.
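To illustrate the rule described here, this is a minimal sketch in plain C. It is a toy model, not the code in this patch: the struct, its fields, and both helper functions are hypothetical names made up for the illustration, and each phi is reduced to a few per-channel bitmasks (bit 0 = x, bit 3 = w). A channel is kept if anything other than the partner phi reads it, or if the phi-to-phi link for that channel is not direct.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only, not Mesa code: one loop-header phi and one loop-end phi
 * that feed each other, described by per-channel bitmasks (bit 0 = x). */
struct toy_phi_pair {
    uint8_t num_components;    /* vector width of both phis, 1..4 */
    uint8_t header_other_uses; /* header-phi channels read by non-phi users */
    uint8_t end_other_uses;    /* end-phi channels read by anything but the header phi */
    uint8_t direct_mask;       /* channels where the phis reference each other directly */
};

/* Channels that must be kept: those with a real reader, plus (conservatively)
 * those whose phi-to-phi link passes through another instruction. */
static uint8_t toy_live_channels(const struct toy_phi_pair *p)
{
    assert(p->num_components >= 1 && p->num_components <= 4);
    uint8_t all = (uint8_t)((1u << p->num_components) - 1u);
    uint8_t used = (uint8_t)(p->header_other_uses | p->end_other_uses);
    uint8_t indirect = (uint8_t)(~p->direct_mask & all);
    return (uint8_t)((used | indirect) & all);
}

/* Number of channels set in a mask. */
static unsigned toy_channel_count(uint8_t mask)
{
    unsigned n = 0;
    for (; mask; mask >>= 1)
        n += mask & 1u;
    return n;
}
```

For the shader above, ssa_323 is read as .xyz by the fadd and the fmul, ssa_446 is read only by ssa_323, and all four channels link directly, so the model keeps 3 channels. If the w channel went through a mov instead, the model would conservatively keep all 4, which mirrors the limitation of this basic version.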
Helps a few CXBX-r shaders with the r300 driver, where this allows some loops to unroll without going over the instruction limit afterwards. No other changes right now, because the backend currently also has a fairly complex DCE pass. But doing this in NIR would allow me to get rid of the backend DCE later.
CC @daniel-schuermann per previous request