panfrost: use arithmetic shifts for swizzles

kiroma requested to merge kiroma/mesa:swizzle into main

This patch is inspired by !23233. The memcpy caused a lot of extra code to be generated when targeting armv8, which seemed suboptimal to me. Armed with godbolt, I tried to create a better-tuned version that works on the value directly. The generated code is now substantially smaller, although GCC no longer inlines the function at -O2, for reasons unknown to me. Godbolt comparison:
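
As a rough illustration of the idea only, and not the actual panfrost code: the sketch below assumes a hypothetical layout where four swizzle selectors sit in the four bytes of a `uint32_t` and get packed into 3 bits each. The function names, the 3-bit field width, and the byte layout are all assumptions made for this example. One variant goes through a temporary byte array with `memcpy`, the other extracts each field from the value with shifts and masks.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: byte i of `v` holds the selector for component i,
 * and the packed result stores each selector in 3 bits. */

/* Variant 1: copy the value into a byte array, then pack.
 * Assumes a little-endian host so that c[0] is the low byte of v. */
static inline uint16_t
pack_swizzle_memcpy(uint32_t v)
{
   uint8_t c[4];
   memcpy(c, &v, sizeof(c));

   return (uint16_t)((c[0] & 7u) |
                     ((c[1] & 7u) << 3) |
                     ((c[2] & 7u) << 6) |
                     ((c[3] & 7u) << 9));
}

/* Variant 2: work on the value directly with shifts and masks.
 * No temporary array and no dependence on host byte order. */
static inline uint16_t
pack_swizzle_shift(uint32_t v)
{
   return (uint16_t)(((v >> 0) & 7u) |
                     (((v >> 8) & 7u) << 3) |
                     (((v >> 16) & 7u) << 6) |
                     (((v >> 24) & 7u) << 9));
}
```

On a little-endian host both variants should return the same value for any input; which one compiles to less code depends on the compiler, its version, and the target, so treat this as a starting point for your own godbolt experiments rather than a measurement.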

