nir,intel: Improvements for lowered 8-bit arithmetic
There's one bug fix here for a bcsel-of-shuffle optimization; the rest is focused on reducing the pain from lowering 8-bit arithmetic to 16-bit. I was primarily looking at some of the 8-bit subgroup operation tests, where we were getting slammed with extra MOV instructions. The most important patch here for reducing them is probably the last one, because it gets rid of most 8-bit instruction destinations, which cause our back-end compiler no end of trouble.