aco/spill: Insert p_start_linear_vgpr right after p_logical_end
If p_start_linear_vgpr allocates a VGPR that is already blocked, RA will try to move the blocking variable to another register. If p_start_linear_vgpr is inserted right before the branch, that move ends up after the instruction that overwrites exec, so it executes under the new exec mask and might be skipped for the threads that mask disables.
Example code showcasing the problem:
p_logical_end
s1: %210:s[14], s1: %211:scc, s1: %0:exec_lo = s_and_saveexec_b32 %209:s[15], %0:exec_lo
v1: %2299:v[5] = p_parallelcopy %204:v[255] /* this move is inserted by RA and might be skipped */
lv1: %2298:v[255] = p_start_linear_vgpr /* this should be before s_and_saveexec! */
s2: %212:s[16-17] = p_cbranch_z %0:exec_lo BB126, BB1
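
The intended placement can be illustrated with a toy model. This is only a sketch and not the actual ACO code: the `Op` enum, `linear_vgpr_insert_pos` and `insert_start_linear_vgpr` are hypothetical names made up for illustration, and falling back to the end of the block when no p_logical_end exists is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy opcodes; only the names mirror the pseudo-instructions in the example above.
enum class Op {
    other,
    p_logical_end,
    s_and_saveexec,
    p_start_linear_vgpr,
    p_cbranch_z,
};

// Hypothetical helper: returns the index just past the last p_logical_end.
// If the block has no p_logical_end, this sketch falls back to the end of
// the block (an assumption, not taken from the real pass).
std::size_t linear_vgpr_insert_pos(const std::vector<Op>& block)
{
    for (std::size_t i = block.size(); i > 0; --i) {
        if (block[i - 1] == Op::p_logical_end)
            return i;
    }
    return block.size();
}

// Returns a copy of the block with p_start_linear_vgpr inserted at that position.
std::vector<Op> insert_start_linear_vgpr(std::vector<Op> block)
{
    const std::size_t pos = linear_vgpr_insert_pos(block);
    assert(pos <= block.size());
    block.insert(block.begin() + static_cast<std::ptrdiff_t>(pos),
                 Op::p_start_linear_vgpr);
    return block;
}
```

With this placement, p_start_linear_vgpr and any p_parallelcopy that RA emits for it sit before s_and_saveexec, so the copy runs while exec still holds its old value.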