ac/nir,radv: add 1 dword to LS/HS and ES/GS vertex stride
radeonsi has this optimization but radv doesn't, which makes the 16-byte alignment assumption in ac_nir_lower_tess_io_to_mem.c and ac_nir_lower_esgs_io_to_mem.c invalid.
This is not a problem with LLVM because align_mul and align_offset are not used. But with ACO, this generates ds_read_b128 for unaligned data. So make radv use this optimization too, to unify the NIR code.
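
For illustration only, here is a minimal sketch of the arithmetic behind the alignment problem. The helper names and the vec4-slot stride layout are assumptions made for this example and are not the actual Mesa code. It shows that once 1 dword is added to a stride that was a multiple of 16 bytes, most per-vertex base offsets are no longer 16-byte aligned.

```c
#include <stdbool.h>

/* Hypothetical helper (not a Mesa function): per-vertex LDS stride in bytes
 * for a given number of vec4 slots, optionally padded by 1 dword (4 bytes). */
static unsigned vertex_stride_bytes(unsigned num_vec4_slots, bool add_one_dword)
{
   return num_vec4_slots * 16u + (add_one_dword ? 4u : 0u);
}

/* Hypothetical helper: byte offset of the first slot of a given vertex. */
static unsigned vertex_base_offset(unsigned vertex_index, unsigned stride_bytes)
{
   return vertex_index * stride_bytes;
}

/* True if a 128-bit access at this offset would be 16-byte aligned. */
static bool is_16_byte_aligned(unsigned offset_bytes)
{
   return (offset_bytes % 16u) == 0u;
}
```

Under these assumptions, with 4 vec4 slots the unpadded stride is 64 bytes and every vertex base is 16-byte aligned, while the padded stride is 68 bytes and vertex 1 starts at offset 68, which is only 4-byte aligned.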
Edited by Qiang Yu