amd/llvm,radeonsi: fix streamout overflow, switch Gfx11 streamout to GDS_STRMOUT registers
For the overflow fix, RADV is only missing an implementation of nir_intrinsic_xfb_counter_sub_amd.
The new Gfx11 streamout code is disabled on RADV.
The switch to GDS_STRMOUT registers is required by register shadowing (which is itself required by the new PAIRS packets), preemption, and user queues. It also means we only have to wait for VS after streamout, not PS. This is how Gfx11 streamout should have been done.
Edited by Marek Olšák