r300: optimize pass "source conflict resolve"
(The "source conflict resolve" pass is a workaround for the hardware's inability to use 2 const/input registers in one instruction.) Without this patch, the compiler often wastes temp registers.
For example, before this patch:
Vertex Program: after 'source conflict resolve'
Radeon Compiler Program
0: MOV temp[3], const[0];
1: MUL temp[0], const[4].xxxx, temp[3];
2: MOV temp[4], const[1];
3: MAD temp[1], const[4].yyyy, temp[4], temp[0];
4: MOV temp[5], const[2];
5: MAD temp[0], const[4].zzzz, temp[5], temp[1];
6: MOV temp[6], const[3];
7: MAD temp[2], const[4].wwww, temp[6], temp[0];
8: MOV output[0], temp[2];
9: MOV output[1], temp[2];
After this patch:
Vertex Program: after 'source conflict resolve'
Radeon Compiler Program
0: MOV temp[3], const[4];
1: MUL temp[0], temp[3].xxxx, const[0];
2: MAD temp[1], temp[3].yyyy, const[1], temp[0];
3: MAD temp[0], temp[3].zzzz, const[2], temp[1];
4: MAD temp[2], temp[3].wwww, const[3], temp[0];
5: MOV output[0], temp[2];
6: MOV output[1], temp[2];
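To make the idea easier to review, here is a minimal Python sketch of the reuse logic. It is not the code in this patch: the `Src`/`Inst` tuples, `resolve_source_conflicts`, and the "copy the first constant and remember the copy" heuristic are all hypothetical, and the input program is only inferred from the dumps above. It assumes straight-line code, fresh temps that are never overwritten, and it ignores input registers.

```python
from collections import namedtuple

# Hypothetical, simplified IR: a register reference is (file, index, swizzle).
Src = namedtuple("Src", "file index swizzle")
Inst = namedtuple("Inst", "op dst srcs")


def resolve_source_conflicts(insts, first_free_temp):
    """Rewrite insts so that no instruction reads two different constants."""
    out = []
    copies = {}  # const index -> temp index that already holds a copy
    next_temp = first_free_temp
    for inst in insts:
        # Reuse copies that were made for earlier instructions.
        srcs = [
            Src("temp", copies[s.index], s.swizzle)
            if s.file == "const" and s.index in copies else s
            for s in inst.srcs
        ]
        # While two different constants remain, copy the first one to a temp.
        while len({s.index for s in srcs if s.file == "const"}) > 1:
            victim = next(s.index for s in srcs if s.file == "const")
            copies[victim] = next_temp
            out.append(Inst("MOV", Src("temp", next_temp, ""),
                            [Src("const", victim, "")]))
            srcs = [
                Src("temp", next_temp, s.swizzle)
                if s.file == "const" and s.index == victim else s
                for s in srcs
            ]
            next_temp += 1
        out.append(Inst(inst.op, inst.dst, srcs))
    return out


def fmt(reg):
    """Format a register the way the compiler dump does, e.g. const[4].xxxx."""
    text = f"{reg.file}[{reg.index}]"
    return f"{text}.{reg.swizzle}" if reg.swizzle else text


def c(i, swz=""):
    return Src("const", i, swz)


def t(i):
    return Src("temp", i, "")


# Program before the pass, inferred from the dumps above (an assumption).
program = [
    Inst("MUL", t(0), [c(4, "xxxx"), c(0)]),
    Inst("MAD", t(1), [c(4, "yyyy"), c(1), t(0)]),
    Inst("MAD", t(0), [c(4, "zzzz"), c(2), t(1)]),
    Inst("MAD", t(2), [c(4, "wwww"), c(3), t(0)]),
    Inst("MOV", Src("output", 0, ""), [t(2)]),
    Inst("MOV", Src("output", 1, ""), [t(2)]),
]

resolved = resolve_source_conflicts(program, first_free_temp=3)
lines = [
    f"{i.op} {fmt(i.dst)}, {', '.join(fmt(s) for s in i.srcs)};"
    for i in resolved
]
print("\n".join(lines))
```

For this input the sketch emits the same seven instructions as the "After this patch" dump: const[4] is copied to temp[3] once, and every later instruction reuses that copy instead of allocating a new temp.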
Tagging @mareko @anholt: any advice on how to make this look nicer would be appreciated. :)
@anholt can you try running deqp_gles2 on your CI? (I have trouble running the full test suite; test no. 7658 hard-locks my PC.) Thanks.