radv: use specialized DGC shaders - part 1
This MR creates one DGC compute shader per indirect command layout, for two main reasons:
- it's cleaner than passing all params as push constants
- it's faster than a pile of conditional SALU instructions (I benchmarked it)
This will also allow further improvements, such as computing the layout stride automatically when building the NIR shader instead of computing it manually, which is currently very error-prone.
This first part inlines all parameters that are known at creation time.
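
To illustrate the idea outside of NIR, here is a minimal sketch in plain C. It is not the radv code: `struct layout`, `build_specialized()`, the op kinds and the per-op dword counts are all hypothetical and only show the shape of the change. The generic path re-tests the layout for every sequence, while the specialized path resolves those tests once at creation time and also derives the per-sequence stride as a by-product.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: everything in here is known at creation time. */
struct layout {
   bool binds_index_buffer;
   bool binds_vertex_buffers;
   uint32_t push_constant_dwords;
};

/* Hypothetical command kinds emitted for each sequence. */
enum op_kind { OP_INDEX_BUFFER, OP_VERTEX_BUFFERS, OP_PUSH_CONSTANTS, OP_DRAW };

#define MAX_OPS 4

struct emitted_op {
   enum op_kind kind;
   uint32_t dwords; /* made-up size, for illustration only */
};

struct specialized_program {
   struct emitted_op ops[MAX_OPS];
   size_t num_ops;
   uint32_t stride_dwords; /* accumulated while building, not by hand */
};

static void add_op(struct specialized_program *p, enum op_kind kind, uint32_t dwords)
{
   assert(p->num_ops < MAX_OPS);
   p->ops[p->num_ops].kind = kind;
   p->ops[p->num_ops].dwords = dwords;
   p->num_ops++;
   p->stride_dwords += dwords;
}

/* Creation time: the layout is tested once and only the needed ops are kept. */
static struct specialized_program build_specialized(const struct layout *l)
{
   struct specialized_program p = {0};

   if (l->binds_index_buffer)
      add_op(&p, OP_INDEX_BUFFER, 3);
   if (l->binds_vertex_buffers)
      add_op(&p, OP_VERTEX_BUFFERS, 4);
   if (l->push_constant_dwords)
      add_op(&p, OP_PUSH_CONSTANTS, l->push_constant_dwords);
   add_op(&p, OP_DRAW, 5);
   return p;
}

/* Generic path: the layout is re-tested for every sequence. */
static uint32_t run_generic(const struct layout *l, uint32_t sequences)
{
   uint32_t total = 0;

   for (uint32_t i = 0; i < sequences; i++) {
      if (l->binds_index_buffer)
         total += 3;
      if (l->binds_vertex_buffers)
         total += 4;
      if (l->push_constant_dwords)
         total += l->push_constant_dwords;
      total += 5;
   }
   return total;
}

/* Specialized path: no layout tests remain in the per-sequence loop. */
static uint32_t run_specialized(const struct specialized_program *p, uint32_t sequences)
{
   uint32_t total = 0;

   for (uint32_t i = 0; i < sequences; i++) {
      for (size_t j = 0; j < p->num_ops; j++)
         total += p->ops[j].dwords;
   }
   return total;
}
```

Both paths produce the same amount of output for a given layout. The difference is that the specialized one pays for the layout checks once, and `stride_dwords` falls out of the build step instead of being maintained separately.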