panfrost: use arithmetic shifts for swizzles
This patch is inspired by !23233. The memcpy there caused a bunch of
extra code to be generated when targeting ARMv8, which seemed suboptimal
to me. Armed with Godbolt, I've tried to create a better-tuned version
that works on the value directly. The generated code is now substantially
smaller, although GCC no longer wants to inline the function at -O2, for
reasons unknown to me.
Godbolt comparison: https://godbolt.org/z/4q3EsKGdW
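
For illustration only, here is a minimal sketch of the general idea of
working on the value directly with shifts instead of copying it out with
memcpy. This is not the code from this patch: the packed layout (four
8-bit selectors in a 32-bit word, component 0 in the low byte) and the
names swz_pack, swz_get_memcpy and swz_get_shift are all hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: four 8-bit swizzle selectors packed into one
 * 32-bit word, with component 0 in the least significant byte. */
static inline uint32_t
swz_pack(uint8_t x, uint8_t y, uint8_t z, uint8_t w)
{
   return (uint32_t)x | ((uint32_t)y << 8) |
          ((uint32_t)z << 16) | ((uint32_t)w << 24);
}

/* memcpy-style access: copy the word into a byte array, then index it.
 * The result depends on host byte order (it matches swz_get_shift only
 * on little-endian hosts), and the compiler may emit a stack round trip. */
static inline uint8_t
swz_get_memcpy(uint32_t packed, unsigned i)
{
   uint8_t bytes[4];
   memcpy(bytes, &packed, sizeof(bytes));
   return bytes[i & 3u];
}

/* Shift-style access: operate on the value directly in a register.
 * Independent of host byte order. */
static inline uint8_t
swz_get_shift(uint32_t packed, unsigned i)
{
   return (uint8_t)((packed >> (8u * (i & 3u))) & 0xffu);
}
```

With this assumed layout, swz_get_shift(swz_pack(0, 1, 2, 3), 2) yields 2
without the value ever being stored to memory.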