aco: VGPR spilling fixes
It seems that buffer_load/store instructions cannot cross element_size boundaries. For simplicity, and because VGPR spilling rarely, if ever, happens, we do it the same way as LLVM and use single-dword loads/stores. A small change to the scheduler prevents accidental reordering of spill instructions. This also changes the scheduling of other memory instructions, but seems to have no negative impact on performance.
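As a rough illustration of the single-dword approach, the sketch below splits a spill of several consecutive dwords into one access per dword. It is not the ACO code: `DwordAccess` and `split_spill` are hypothetical names, and the contiguous 4-byte slot layout is an assumption made only for the example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one single-dword scratch access.
struct DwordAccess {
    uint32_t reg_index;   // which dword of the spilled VGPR tuple
    uint32_t byte_offset; // assumed byte offset into the scratch buffer
};

// Split a spill of `num_dwords` consecutive dwords starting at
// `base_offset` into one access per dword, so that no single access
// spans more than 4 bytes.
std::vector<DwordAccess> split_spill(uint32_t base_offset, uint32_t num_dwords)
{
    std::vector<DwordAccess> accesses;
    accesses.reserve(num_dwords);
    for (uint32_t i = 0; i < num_dwords; ++i)
        accesses.push_back({i, base_offset + i * 4u});
    return accesses;
}
```

Under these assumptions, spilling a 3-dword value at offset 16 would yield three accesses at offsets 16, 20 and 24 instead of one 12-byte access.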
Fixes: 86786999 "aco: implement VGPR spilling"