ac/nir: Remove byte permute from prefix sum of the repack sequence.

The byte-permute instruction v_perm_b32 is only exposed by LLVM 13 and
later, so a new sequence is needed that also works with older LLVM
versions.

The prefix sum is replaced by two alternatives:

1. For GPUs that support v_dot, we shift 0x01 to the wanted byte
   positions and then use v_dot to sum the results.
2. For older GPUs (Navi 10), we simply shift out the unwanted bytes
   and use v_sad_u8 to produce the sum.

Signed-off-by: Timur Kristóf <email@example.com>
Acked-by: Marek Olšák <firstname.lastname@example.org>
Part-of: <!12786>
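
For illustration only, the two alternatives can be sketched in plain C on the CPU. This is not the NIR code from this change: emu_udot4 and emu_sad_u8 are hypothetical software stand-ins for the v_dot and v_sad_u8 instructions, prefix_sum_dot and prefix_sum_sad are made-up names, and the sketch assumes four byte-sized counts packed into one dword, with "lane" selecting how many of the low bytes to sum.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for v_dot: multiply the bytes of a and b pairwise,
 * then add the four products to c. */
static uint32_t emu_udot4(uint32_t a, uint32_t b, uint32_t c)
{
   uint32_t sum = c;
   for (unsigned i = 0; i < 4; i++)
      sum += ((a >> (8 * i)) & 0xff) * ((b >> (8 * i)) & 0xff);
   return sum;
}

/* Hypothetical stand-in for v_sad_u8: add the absolute differences of the
 * bytes of a and b to c. With b == 0 this is a plain byte sum. */
static uint32_t emu_sad_u8(uint32_t a, uint32_t b, uint32_t c)
{
   uint32_t sum = c;
   for (unsigned i = 0; i < 4; i++) {
      uint32_t x = (a >> (8 * i)) & 0xff;
      uint32_t y = (b >> (8 * i)) & 0xff;
      sum += x > y ? x - y : y - x;
   }
   return sum;
}

/* Alternative 1: shift a row of 0x01 bytes so that only the wanted byte
 * positions keep a 0x01, then let the dot product sum those bytes.
 * The shift is split in two halves because a single shift by 32 bits
 * (needed for lane 0) is not valid in C. */
static uint32_t prefix_sum_dot(uint32_t packed, unsigned lane)
{
   assert(lane <= 4);
   unsigned half = (4 - lane) * 4; /* half of the shift amount, in bits */
   uint32_t mask = (0x01010101u >> half) >> half;
   return emu_udot4(packed, mask, 0);
}

/* Alternative 2: shift the unwanted high bytes out of the packed value
 * (zeroes are shifted in), then sum the remaining bytes. */
static uint32_t prefix_sum_sad(uint32_t packed, unsigned lane)
{
   assert(lane <= 4);
   unsigned half = (4 - lane) * 4; /* half of the shift amount, in bits */
   uint32_t kept = (packed << half) << half;
   return emu_sad_u8(kept, 0, 0);
}
```

With packed = 0x04030201 (bytes 1, 2, 3, 4 from low to high), both functions return 0, 1, 3, 6 and 10 for lanes 0 through 4.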