gallium/llvmpipe: add an optimised 32-bit memset

This might have other users beyond the gallium filling/clearing code.
This takes fullscreen 4k gears from 68 to 74 fps on my Ryzen. gears is really just a clear benchmark, and this speeds up clearing.
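
For readers unfamiliar with the idea, here is a rough sketch of what a 32-bit memset can look like. It is illustrative only and is not the code added by this commit: the name fill32 is hypothetical, and the SSE2 path is an assumption about one way to widen the stores, with a scalar loop as the fallback and for the tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: write `count` copies of a 32-bit `value` to `dst`.
 * Unlike memset(), which replicates a single byte, this replicates a
 * whole 32-bit word, e.g. one packed RGBA8 clear colour. */
static void fill32(uint32_t *dst, uint32_t value, size_t count)
{
   size_t i = 0;

   assert(dst != NULL || count == 0);

#if defined(__SSE2__)
   /* Assumed fast path: broadcast the value into a 128-bit register and
    * store four words per iteration. Unaligned stores keep this valid
    * for any dst alignment. */
   const __m128i v = _mm_set1_epi32((int)value);
   for (; i + 4 <= count; i += 4)
      _mm_storeu_si128((__m128i *)(dst + i), v);
#endif

   /* Scalar tail, and the whole fill when SSE2 is not available. */
   for (; i < count; i++)
      dst[i] = value;
}
```

A caller clearing one row of a 32-bit surface would pass the row pointer, the packed clear colour and the row width in pixels.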