ir3: Allow reducing the register limit for RA when CS has barriers
If barriers are used, all waves in the workgroup must be able to execute concurrently. Thus we may have to reduce the register limit so that every wave of the workgroup fits at once.
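The reasoning above can be sketched roughly as follows. This is only an illustration under a simplified model, not the code from this change: `struct hw_limits`, its fields, and `barrier_reg_limit()` are hypothetical names, and the model assumes the register file is shared evenly between all resident waves.

```c
#include <assert.h>

/* Hypothetical hardware description. The field names are illustrative
 * assumptions and are not taken from ir3. */
struct hw_limits {
   unsigned regfile_vec4; /* per-thread vec4 registers shared by all resident waves */
   unsigned wave_size;    /* threads per wave */
   unsigned max_regs;     /* architectural per-thread vec4 register limit */
};

static unsigned
div_round_up(unsigned a, unsigned b)
{
   return (a + b - 1) / b;
}

/* Returns the largest per-thread vec4 register count that still lets every
 * wave of the workgroup be resident at the same time. A result of 0 means
 * the workgroup cannot fit at all under this model. */
static unsigned
barrier_reg_limit(const struct hw_limits *hw, unsigned workgroup_threads)
{
   assert(hw->wave_size > 0 && workgroup_threads > 0);

   /* Number of waves that must run concurrently for a barrier to complete. */
   unsigned waves = div_round_up(workgroup_threads, hw->wave_size);

   /* Split the register file evenly between those waves. */
   unsigned limit = hw->regfile_vec4 / waves;

   /* Never exceed the normal architectural limit. */
   return limit < hw->max_regs ? limit : hw->max_regs;
}
```

With made-up limits of 192 shared vec4 registers, 64 threads per wave, and a normal limit of 48, a 256-thread workgroup keeps the full limit, while a 1024-thread workgroup needs 16 concurrent waves and RA would have to stay within 12 registers.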
Fixes a hang in "Digital Combat Simulator".
With this change the trace of this game renders correctly, but only if UBWC is disabled.
While I'm at it, I added an assert for the case where a big branchstack makes it impossible to have enough concurrent waves.
Last time I checked, the blob exploded in such a case: !9859 (comment 871421)