radeonsi: fix tcs_out_lds_offsets arg alignment
tcs_out_lds_offsets is not guaranteed to be 16-byte aligned. It is calculated as:
num_patches * patch_vertices * lshs_vertex_stride
Neither num_patches nor patch_vertices is guaranteed to be a multiple of any particular value, and lshs_vertex_stride has one extra dword added, so the result is only 4-byte aligned.
This could cause problems even before the switch to the NIR tess output lowering, if a tess factor is written before the tail of the input has been read. It is more likely to cause problems after the switch, because the main body does not clear the low 4 bits of the offset while the epilog does, so the two use different offsets to read and write the tess factors.
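
The arithmetic can be illustrated with a small standalone sketch. It is not the driver code: the function names are hypothetical, the stride is assumed to be expressed in bytes, and the vertex is assumed to consist of 16-byte vec4 slots plus the one extra dword described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: byte stride of one LS-HS vertex, assumed to be a number of
 * 16-byte vec4 slots plus one extra dword (4 bytes). */
static unsigned lshs_vertex_stride_bytes(unsigned num_vec4_slots)
{
   return num_vec4_slots * 16 + 4;
}

/* The offset formula from the message above. */
static unsigned tcs_out_lds_offset_bytes(unsigned num_patches,
                                         unsigned patch_vertices,
                                         unsigned vertex_stride)
{
   /* The stride is dword aligned, so the product is too, but no more. */
   assert(vertex_stride % 4 == 0);
   return num_patches * patch_vertices * vertex_stride;
}

/* Models a consumer that clears the low 4 bits of the offset. */
static unsigned drop_low_4_bits(unsigned offset)
{
   return offset & ~15u;
}

/* True when a consumer using the raw offset and one that clears the low
 * 4 bits would address different locations. */
static bool offsets_disagree(unsigned offset)
{
   return offset != drop_low_4_bits(offset);
}
```

With 2 vec4 slots the stride is 36 bytes; 3 patches of 3 vertices give an offset of 324 bytes, which is 4 bytes past a 16-byte boundary. A reader that clears the low 4 bits sees 320 while a writer using the raw offset sees 324, which is the main body vs. epilog mismatch described above.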

Fixes: 7598bfd7 ("radeonsi: replace llvm tcs output with nir lower pass")
Closes: #7083
Signed-off-by: Qiang Yu <yuq825@gmail.com>