intel: Only stall after sending all memory fence messages

In Gen11+, when emitting a fence for both L3 and SLM, instead of

SEND, MOV (for stall), SEND, MOV (for stall)

emit

SEND, SEND, MOV (for stall), MOV (for stall)

so that both fence messages are sent before the first stall.

