ir3/a6xx: do not double threadsize when exceeding the branchstack limit, and allow specifying branchstack up to 64
On a6xx, the value of the BRANCHSTACK bitfield depends on the number of nested ifs. The following mapping was observed with the blob:
IFs   BRANCHSTACK
0     0
1     1
2     2
3     2
4     3
5     3
6     4
...
59    30
60    31
61    31
62    32
63    32
64    32
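The table above appears to follow a simple rule: zero stays zero, and any other count maps to `ifs / 2 + 1`, clamped to 32. The C sketch below only encodes that observation. The helper name `a6xx_branchstack_field` and the constant `A6XX_BRANCHSTACK_FIELD_MAX` are hypothetical, and the formula is inferred from the listed rows (the elided rows are assumed to follow the same pattern), so it is not necessarily what the driver or the blob computes.

```c
#include <stdint.h>

/* Hypothetical constant: the largest BRANCHSTACK value seen in the table. */
#define A6XX_BRANCHSTACK_FIELD_MAX 32u

/*
 * Hypothetical helper: map the number of nested ifs to the BRANCHSTACK
 * bitfield value. The formula is inferred from the observed table only.
 */
static inline uint32_t
a6xx_branchstack_field(uint32_t nested_ifs)
{
   /* A shader without branches needs no branch stack. */
   if (nested_ifs == 0)
      return 0;

   /* Each field unit appears to cover two nesting levels. */
   uint32_t value = nested_ifs / 2 + 1;

   /* Clamp to the largest value observed in the table. */
   return value < A6XX_BRANCHSTACK_FIELD_MAX ? value
                                             : A6XX_BRANCHSTACK_FIELD_MAX;
}
```

For example, `a6xx_branchstack_field(59)` gives 30 and `a6xx_branchstack_field(64)` gives 32, matching the last rows of the table.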
Tested by running a compute shader containing a switch statement while varying the number of cases. The shader started to produce wrong results at 65 cases.
The shader was:
layout(local_size_x=128, local_size_y=1, local_size_z=1) in;
layout(binding = 0) buffer block {
    uint values[];
};
void main()
{
    uint x = 0;
    switch (gl_GlobalInvocationID.x) {
    case 0: { x = NUM_CASES; break; }
    case 1: { x = NUM_CASES - 1; break; }
    ...
    case NUM_CASES: { x = 1; break; }
    }
    values[gl_GlobalInvocationID.x] = x;
}
When branchstack exceeds the limit, the blob does nothing about it. The solution here is to not double the wave size in that case (the blob keeps it doubled).
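A minimal sketch of that decision, assuming the limit is 64 nested ifs as suggested by the title and the test results. The names `A6XX_MAX_NESTED_IFS` and `a6xx_may_double_threadsize`, and the `other_limits_allow` parameter, are hypothetical and stand in for whatever checks the compiler already performs before doubling.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the hardware copes with up to 64 nested ifs. */
#define A6XX_MAX_NESTED_IFS 64u

/*
 * Hypothetical helper: decide whether the wave size may be doubled.
 * other_limits_allow stands for any pre-existing conditions that are
 * unrelated to the branch stack.
 */
static inline bool
a6xx_may_double_threadsize(bool other_limits_allow, uint32_t nested_ifs)
{
   /* Past the branchstack limit, keep the wave size undoubled. */
   return other_limits_allow && nested_ifs <= A6XX_MAX_NESTED_IFS;
}
```

With this check, a shader with 64 nested ifs may still use the doubled wave size, while one with 65 falls back to the undoubled size.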
CC: @cwabbott0