ac/nir: Remove byte permute from prefix sum of the repack sequence.

The byte-permute instruction v_perm_b32 is not exposed by older LLVM
releases (only available on LLVM 13 and later), therefore a new
sequence is needed which we can use with these LLVM versions too.

The prefix sum is replaced by two alternatives:

1. For GPUs that support v_dot, we shift 0x01 to the wanted byte
   positions and then use v_dot to sum the results.
2. For older GPUs (Navi 10), we simply shift out the unwanted bytes
   and use v_sad_u8 to produce the sum.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <!12786>
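
Illustration only: the two alternatives described above can be modelled on the CPU roughly as follows. This is a sketch, not the code from this change. It assumes four 8-bit counts packed into one dword, with lane i summing the bytes below position i. The helper names (emu_dot4_u32_u8, emu_sad_u8, prefix_sum_dot, prefix_sum_sad) are hypothetical. The shift is split into two equal halves because a single shift by 32 bits is not valid on a 32-bit value.

```c
#include <stdint.h>

/* Hypothetical CPU model of a 4x8-bit unsigned dot product (v_dot style). */
static uint32_t emu_dot4_u32_u8(uint32_t a, uint32_t b)
{
   uint32_t sum = 0;
   for (unsigned i = 0; i < 4; i++)
      sum += ((a >> (i * 8)) & 0xff) * ((b >> (i * 8)) & 0xff);
   return sum;
}

/* Hypothetical CPU model of v_sad_u8: adds the absolute differences
 * of the four byte pairs to the accumulator.
 */
static uint32_t emu_sad_u8(uint32_t a, uint32_t b, uint32_t acc)
{
   for (unsigned i = 0; i < 4; i++) {
      uint32_t x = (a >> (i * 8)) & 0xff;
      uint32_t y = (b >> (i * 8)) & 0xff;
      acc += x > y ? x - y : y - x;
   }
   return acc;
}

/* Alternative 1: shift 0x01 bytes so that only the wanted byte positions
 * keep a 0x01, then let the dot product sum exactly those bytes.
 * Assumes lane is in the range 0..4.
 */
static uint32_t prefix_sum_dot(uint32_t packed, unsigned lane)
{
   unsigned shift = (4 - lane) * 4; /* half of the total shift, in bits */
   uint32_t mask = (0x01010101u >> shift) >> shift;
   return emu_dot4_u32_u8(packed, mask);
}

/* Alternative 2: shift the unwanted bytes out of the packed value
 * (zeroes are shifted in), then sum the remaining bytes against zero.
 * Assumes lane is in the range 0..4.
 */
static uint32_t prefix_sum_sad(uint32_t packed, unsigned lane)
{
   unsigned shift = (4 - lane) * 4; /* half of the total shift, in bits */
   uint32_t kept = (packed << shift) << shift;
   return emu_sad_u8(kept, 0, 0);
}
```

With packed = 0x04030201 (byte values 1, 2, 3, 4 from the lowest byte up), both helpers return 0, 1, 3, 6 and 10 for lanes 0 through 4.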