aco,nir: disable/fix undef optimization for divergent merge phis
Divergence analysis currently considers merge phis with a single uniform source and an undef source to be uniform. We should only do this immediately before the backend, and ensure that the phis are always uniform.
Getting this wrong can lead to incorrect code generation: if such a phi is treated as uniform, an optimization relies on that information, and the phi is later treated as divergent, some invocations can end up reading undef:
   if (subgroup_invocation_id == 0) {
      a = uniform
   }
   b = phi a, undef // uniform. if any invocation takes the branch, "a" is used for undef
   c = read_invocation(b, 0)
   use(c)
->
   if (subgroup_invocation_id == 0) {
      a = uniform
   }
   b = phi a, undef // uniform. if any invocation takes the branch, "a" is used for undef
   // read_invocation can be eliminated in the case of uniform sources
   use(b)
->
   if (subgroup_invocation_id == 0) {
      a = uniform
   }
   b = phi a, undef // divergent. undef might not equal "a"
   use(b) // some invocations might use undef instead of "a"
This MR disables the optimization in most cases. It also has ACO insert a v_readfirstlane_b32 before the phi so that an SGPR phi can be used, which ensures that the phi is actually uniform, as divergence analysis expects.
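
For illustration only, the sequence above can be modelled with a small per-lane simulation. This is a rough sketch and not Mesa code: the subgroup size, the helper names (undef, divergent_phi, read_invocation, read_first_lane, sgpr_phi) and the garbage values chosen for undef are all assumptions made for the example. A "VGPR" is modelled as one value per lane and an "SGPR" as a single value shared by all lanes.

```python
SUBGROUP_SIZE = 4      # assumed size, chosen only to keep the example small
UNIFORM_VALUE = 42     # stands in for "a = uniform"


def undef(lane):
    # undef may be anything; model it as a different garbage value per lane.
    return 1000 + lane


def divergent_phi(taken, a):
    # Per-lane (VGPR) phi: lanes that skipped the branch see undef.
    return [a[lane] if taken[lane] else undef(lane)
            for lane in range(SUBGROUP_SIZE)]


def read_invocation(values, index):
    # Broadcast the value of one lane to all lanes.
    return [values[index]] * len(values)


def read_first_lane(values, active):
    # Model of reading the first active lane into a scalar.
    return values[active.index(True)]


def sgpr_phi(taken, a):
    # Scalar (SGPR) phi: the branch side is made scalar first, so every
    # lane observes the same value after the merge.
    scalar = read_first_lane(a, taken) if any(taken) else undef(0)
    return [scalar] * SUBGROUP_SIZE


# Only invocation 0 takes the branch.
taken = [lane == 0 for lane in range(SUBGROUP_SIZE)]
a = [UNIFORM_VALUE] * SUBGROUP_SIZE

original = read_invocation(divergent_phi(taken, a), 0)  # use(c)
miscompiled = divergent_phi(taken, a)                   # use(b), phi divergent
fixed = sgpr_phi(taken, a)                              # use(b), SGPR phi

print("original:   ", original)
print("miscompiled:", miscompiled)
print("fixed:      ", fixed)
```

In this model, the original program and the SGPR-phi version give every lane the value of "a", while the version with the read_invocation removed and a divergent phi leaves lanes 1 to 3 with undef.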