broadcom/compiler: avoid using ldvary sequence to hide latency of branching

Merged Iago Toral requested to merge itoral/mesa:firefox_bug into main

This can cause us to stomp the contents of r5 before we have a chance to read it, like this:

0x3d103186bb800000 nop                           ; nop                         ; ldvary.r0
0x3d105686bbf40000 nop                           ; mov rf26, r5                ; ldvary.r1
0x020000ef0000d000 bu.allna  232, r:unif (0x0000001c / 0.000000)
0x3d1096c6bbf40000 nop                           ; mov rf27, r5                ; ldvary.r2

Here, the MOV in the last instruction is supposed to read the r5 value produced by ldvary.r0, but because we have inserted the bu instruction in between, that read now happens at the same time that ldvary.r1 updates r5, stomping the value we were supposed to read.

Fix this by disallowing injection of a branch instruction in between an ldvary instruction and its write to the r5 register 2 instructions later.
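
The rule above can be illustrated with a minimal sketch. This is not the actual Mesa scheduler code: `struct sketch_inst`, `LDVARY_R5_LATENCY` and `can_insert_branch_at` are hypothetical names invented for illustration, and the check is a conservative reading of the rule that rejects any insertion point with an ldvary in either of the two preceding instructions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction model; not Mesa's real data structures. */
struct sketch_inst {
    bool ldvary; /* instruction carries the ldvary signal */
};

/* Taken from the description: ldvary writes r5 2 instructions later. */
#define LDVARY_R5_LATENCY 2

/*
 * Returns true if a branch may be inserted so that it becomes instruction
 * `pos` (instructions currently at `pos` and later shift down by one).
 *
 * Assumption: we conservatively refuse whenever any of the previous
 * LDVARY_R5_LATENCY instructions has an ldvary whose r5 write is pending,
 * since shifting the r5 reader would let a later write stomp the value.
 */
static bool
can_insert_branch_at(const struct sketch_inst *insts, size_t count, size_t pos)
{
    if (pos > count)
        return false;

    for (size_t back = 1; back <= LDVARY_R5_LATENCY && back <= pos; back++) {
        if (insts[pos - back].ldvary)
            return false;
    }

    return true;
}
```

With the sequence from the disassembly (three consecutive ldvary instructions), this sketch refuses to place the branch before the last instruction, which is exactly where the bu ended up in the broken output.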
