Commit 75dbb404 authored by Timur Kristóf

ac/nir: Remove byte permute from prefix sum of the repack sequence.


The byte-permute instruction v_perm_b32 is only exposed by LLVM 13
and later, so a new sequence is needed that also works with older
LLVM versions.

The prefix sum is replaced by two alternatives:

1. For GPUs that support v_dot, we shift 0x01 to the wanted byte
positions and then use v_dot to sum the results.

2. For older GPUs (Navi 10), we simply shift out the unwanted bytes
and use v_sad_u8 to produce the sum.
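
The two alternatives above can be sketched on the CPU as follows. This is
only an illustration, not the code from this commit: it assumes the
per-wave counts are packed one byte per wave into a single 32-bit value,
that the prefix sum for wave N is the sum of bytes 0..N-1, and that
wave_id is at most 4. The helpers dot4_u8 and sad_u8 are hypothetical
emulations of the byte-wise dot product and v_sad_u8, and the function
names are made up for this example.

```c
#include <stdint.h>

/* Hypothetical emulation of a 4 x 8-bit unsigned dot product with accumulate. */
static uint32_t dot4_u8(uint32_t a, uint32_t b, uint32_t acc)
{
    for (int i = 0; i < 4; i++)
        acc += ((a >> (8 * i)) & 0xff) * ((b >> (8 * i)) & 0xff);
    return acc;
}

/* Hypothetical emulation of v_sad_u8: sum of absolute byte differences plus acc. */
static uint32_t sad_u8(uint32_t a, uint32_t b, uint32_t acc)
{
    for (int i = 0; i < 4; i++) {
        int d = (int)((a >> (8 * i)) & 0xff) - (int)((b >> (8 * i)) & 0xff);
        acc += (uint32_t)(d < 0 ? -d : d);
    }
    return acc;
}

/* Alternative 1: place 0x01 in every byte below wave_id, then dot with the counts.
 * The shift is done in 64 bits so that wave_id == 0 (shift by 32) yields 0. */
static uint32_t prefix_sum_dot(uint32_t packed, unsigned wave_id)
{
    uint32_t mask = (uint32_t)(0x01010101ull >> (8 * (4 - wave_id)));
    return dot4_u8(packed, mask, 0);
}

/* Alternative 2: shift the unwanted (upper) bytes out, then sum the remaining
 * bytes by taking the sum of absolute differences against zero. */
static uint32_t prefix_sum_sad(uint32_t packed, unsigned wave_id)
{
    uint32_t kept = (uint32_t)((uint64_t)packed << (8 * (4 - wave_id)));
    return sad_u8(kept, 0, 0);
}
```

Both functions return the same value for the same input; the difference
is only which instruction does the summing on the GPU.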

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <!12786>
parent 966cff9c