Optimize subgroup ops in the presence of uniformity
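There are a number of optimizations we can do if we find a subgroup op whose input is uniform, i.e. has the same value in every active invocation. For shuffles, we can drop the shuffle and just use the input. For min/max/and/or reduces and inclusive scans, we can drop the reduce/scan, since combining a value with itself gives the same value back. xor needs more care: x ^ x = 0, so a xor reduce of a uniform value depends on the parity of the active invocation count and can't simply be dropped. Exclusive scans aren't a plain copy either, because the first active invocation gets the identity. For FS derivatives, the result is zero. There are probably a bunch more. We should add a pass for this. It'll have to be somewhat specialized because it requires divergence analysis and therefore LCSSA form.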
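As a sanity check on which of these rewrites are safe, here is a minimal Python sketch that models a subgroup as a plain list of per-invocation values. It is only an illustration, not compiler code: `subgroup_reduce`, `inclusive_scan`, `shuffle`, and `check_uniform` are hypothetical helpers invented for this example, and every listed invocation is assumed to be active.

```python
from functools import reduce
import operator

# Toy model: a subgroup is a list with one value per active invocation.
# All helper names here are hypothetical, made up for this illustration.

def subgroup_reduce(op, values):
    """Combine every invocation's value with a binary op."""
    return reduce(op, values)

def inclusive_scan(op, values):
    """Invocation i gets op applied over values[0..i]."""
    out = []
    for i, v in enumerate(values):
        out.append(v if i == 0 else op(out[-1], v))
    return out

def shuffle(values, indices):
    """Invocation i reads the value held by invocation indices[i]."""
    return [values[i] for i in indices]

# Ops where op(x, x) == x, so reducing a uniform value returns it unchanged.
IDEMPOTENT_OPS = {
    "min": min,
    "max": max,
    "and": operator.and_,
    "or": operator.or_,
}

def check_uniform(x, lanes):
    """For each idempotent op, report whether reduce and inclusive scan
    over a uniform input are equivalent to just using the input."""
    values = [x] * lanes
    return {
        name: subgroup_reduce(op, values) == x
        and inclusive_scan(op, values) == values
        for name, op in IDEMPOTENT_OPS.items()
    }
```

Under this model, `check_uniform(0b1011, 8)` reports every idempotent op as droppable, and `shuffle` of a uniform list returns the same list for any index pattern. A xor reduce of a uniform value instead comes out as the value itself for an odd invocation count and zero for an even one, so a real pass would need the active invocation count to handle it.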