ac/nir/ngg: Support scalarized and packed 16-bit IO when lowering mesh shaders.
This MR makes it possible for ac_nir_lower_ngg_ms to consume mesh shaders optimized by nir_opt_varyings, which have been lowered to scalar I/O and may pack two 16-bit outputs into a single 32-bit slot. It also includes some code cleanup.
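
For reviewers less familiar with packed 16-bit I/O, here is a minimal standalone sketch of what "two 16-bit outputs in a single 32-bit slot" means at the bit level. It is not code from this MR: the helper names `pack_16bit_pair` and `unpack_16bit_half` are hypothetical, and the layout (first output in the low half, second in the high half) is an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper, not part of Mesa: combine two 16-bit output
 * components into one 32-bit slot. Assumes the first value occupies
 * bits 0..15 and the second occupies bits 16..31.
 */
static inline uint32_t
pack_16bit_pair(uint16_t lo, uint16_t hi)
{
   return (uint32_t)lo | ((uint32_t)hi << 16);
}

/* Hypothetical helper, not part of Mesa: read back one half of a
 * packed slot. A non-zero `high` selects bits 16..31, zero selects
 * bits 0..15.
 */
static inline uint16_t
unpack_16bit_half(uint32_t slot, int high)
{
   return (uint16_t)(high ? (slot >> 16) : (slot & 0xffffu));
}
```

Under these assumptions, a lowering pass that handles packed outputs has to track which half of the slot each 16-bit store targets, rather than treating every store as a full 32-bit write.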