ac/nir,radv: add 1 dword to LS/HS and ES/GS vertex stride

Qiang Yu requested to merge yuq825/mesa:topic/radv-lshs-stride into main

radeonsi has this optimization but radv doesn't, which invalidates the 16-byte alignment assumption in ac_nir_lower_tess_io_to_mem.c and ac_nir_lower_esgs_io_to_mem.c.

This is not a problem with LLVM, because align_mul and align_offset are not used there. With ACO, however, this generates ds_read_b128 for unaligned data. So make radv use this optimization as well to unify the NIR code.
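To illustrate the alignment issue, here is a minimal arithmetic sketch. It is not Mesa code: the helper names `padded_stride_bytes` and `vertex_base_is_16b_aligned` are hypothetical, and it assumes the unpadded per-vertex stride is a multiple of 16 bytes. It only shows that once 1 dword (4 bytes) is added to such a stride, the start of most vertices is no longer 16-byte aligned, so a 128-bit LDS access cannot rely on that alignment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not from Mesa): pad a per-vertex LDS stride
 * by one dword, i.e. 4 bytes. */
static uint32_t padded_stride_bytes(uint32_t stride_bytes)
{
   return stride_bytes + 4;
}

/* Hypothetical helper (not from Mesa): returns true if the first byte
 * of vertex `index` sits on a 16-byte boundary, assuming the LDS base
 * itself is 16-byte aligned. */
static bool vertex_base_is_16b_aligned(uint32_t stride_bytes, uint32_t index)
{
   return (stride_bytes * index) % 16 == 0;
}
```

With a 32-byte stride every vertex starts on a 16-byte boundary. With the padded 36-byte stride only every fourth vertex does, so the alignment information attached to the load has to reflect the padded stride.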

Edited by Qiang Yu
