nak: fix SHFL.UP clamp value

Found this while testing my SHFL.UP encoding on SM50 with
dEQP-VK.subgroups.shuffle.compute.*. Without advertising Vulkan 1.1 and
VK_SUBGROUP_FEATURE_SHUFFLE_BIT, the CTS skips all of these tests, which
is probably how this bug was missed on Turing previously.

I don't have a Turing card to test with, but with the SM50 changes I was
working on, dEQP-VK.subgroups.shuffle.compute.subgroupshuffleup* failed
before this change and passed after it.

I looked at ptxas output to confirm that the semantics of the c operand
are probably the same on SM50 and SM75: shfl.sync.up.b32 ..., 0x1f gets
compiled with the same immediate on both.