aco: improve do_pack_2x16() with zero constants

When one of the two halves is a zero constant, we can skip the
v_or_b32 or use an instruction smaller than v_alignbyte_b32.
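
As a rough illustration of why a zero half makes the combine step
unnecessary, here is a scalar model of the packing arithmetic. It is
only a sketch: the function names are hypothetical, it is not the ACO
code, and it says nothing about which instructions are actually
selected.

```cpp
#include <cstdint>

// Hypothetical reference model: pack two 16-bit halves into 32 bits.
constexpr uint32_t pack_2x16(uint32_t lo, uint32_t hi)
{
   return (lo & 0xffffu) | (hi << 16);
}

// Low half is constant zero: the OR contributes nothing, so a single
// shift of the high half gives the same result.
constexpr uint32_t pack_2x16_lo_zero(uint32_t hi)
{
   return hi << 16;
}

// High half is constant zero: only the masked low half remains.
constexpr uint32_t pack_2x16_hi_zero(uint32_t lo)
{
   return lo & 0xffffu;
}
```

In both special cases one operation replaces the general
two-operand combine.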

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>