d3d12: force 4-byte alignment for vertex-buffer stride
D3D12 requires 4-byte alignment when using half-float vertex formats. It's not clear to me why, but the validator complains otherwise.
Sadly, we don't have a smaller hammer, so let's just always force 4-byte alignment instead. It's not ideal, but it works. The downside is that we fall back to u_vbuf.c all the time, which has some performance overhead.
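
For illustration only, here is a minimal sketch of the alignment arithmetic involved. Both helper names are hypothetical and not taken from the driver or from u_vbuf.c; the sketch only shows why a tightly packed three-component half-float attribute (3 * 2 = 6 bytes) trips a 4-byte stride requirement, and what the padded stride would be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not from the driver): true if the stride is a
 * multiple of 4 bytes. */
static bool
stride_is_4byte_aligned(uint32_t stride)
{
   return (stride & 3u) == 0;
}

/* Hypothetical helper (not from the driver): round a stride up to the
 * next multiple of 4 bytes. */
static uint32_t
align_stride_to_4(uint32_t stride)
{
   /* Guard against wrap-around in the addition below. */
   assert(stride <= UINT32_MAX - 3u);
   return (stride + 3u) & ~3u;
}
```

With these, a 6-byte stride is reported as unaligned and would be padded to 8 bytes, while an 8-byte stride is left unchanged.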