aco: GFX6 and GFX7 subgroup shuffle
GFX6 and GFX7 don't have the ds_bpermute (or permute) instruction, so
this MR introduces a new pseudo instruction which is lowered to an
"unrolled loop" of v_readlane_b32 that implements the shuffle on this
older hardware as well.
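
To illustrate the idea, here is a minimal CPU-side sketch of how a shuffle can be built from per-lane reads. It is not the lowering code from this MR: the names `Vgpr`, `readlane` and `emulated_shuffle` are hypothetical, and the sketch assumes a 64-lane wave and that each lane's index is taken modulo the wave size.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumption: 64 lanes per wave, as on GFX6/GFX7.
constexpr unsigned kWaveSize = 64;

// One 32-bit vector register: one value per lane.
using Vgpr = std::array<uint32_t, kWaveSize>;

// Models v_readlane_b32: fetch the value held by a single lane as a scalar.
inline uint32_t readlane(const Vgpr& src, unsigned lane)
{
    assert(lane < kWaveSize);
    return src[lane];
}

// Hypothetical emulation of a subgroup shuffle: every lane receives
// data[index[lane]]. The outer loop stands in for the "unrolled loop",
// with one readlane per source lane.
inline Vgpr emulated_shuffle(const Vgpr& data, const Vgpr& index)
{
    Vgpr dst{};
    for (unsigned n = 0; n < kWaveSize; ++n) {
        const uint32_t scalar = readlane(data, n);
        // Models a vector compare plus conditional move executed by all
        // lanes: only lanes that asked for source lane n take the value.
        for (unsigned lane = 0; lane < kWaveSize; ++lane) {
            if (index[lane] % kWaveSize == n)
                dst[lane] = scalar;
        }
    }
    return dst;
}
```

For example, with `data[i] = i * 10` and `index[i] = 63 - i`, lane 0 ends up with 630 and lane 63 with 0. The sketch costs one scalar read per source lane regardless of the indices, which is the trade-off of not having a hardware permute.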
Tested with:
- Oland (GFX6)
- Bonaire (GFX7)
RADV_PERFTEST=aco RADV_DEBUG=nocache ./deqp-vk --deqp-vk-device-id=2 --deqp-case=dEQP-VK.subgroups.*shuffle*
Test run totals:
Passed: 480/1152 (41.7%)
Failed: 0/1152 (0.0%)
Not supported: 672/1152 (58.3%)
Warnings: 0/1152 (0.0%)
Edited by Timur Kristóf