aco/lower_to_hw: optimize split 64bit constant copies

Most of the stat benefits come from using s_bfm_b64 for creating
immutable samplers. But I think it's also just nice to have all of
these micro-optimizations in one place instead of spread around the
compiler.