aco: improvements to VMEMtoScalarWriteHazard mitigation

Rhys Perry requested to merge pendingchaos/mesa:aco_vmem_swrite into master

Apparently s_waitcnt_depctr is potentially faster than v_nop: https://reviews.llvm.org/D83872
