gallium: bypass u_vbuf for draw calls that don't need it

Only drivers that support all vertex formats, as well as byte-aligned vertex buffer strides and offsets, take this path.
This improves performance for VBOs, glBegin/End, and display lists when the GL profile is not Core.
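
A minimal sketch of the per-draw decision described above. The struct and function names are hypothetical and are not Mesa's actual u_vbuf API. The sketch also assumes that user (client-side) vertex buffers are the only per-draw reason to still need u_vbuf once the driver capability check passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-driver capabilities (illustrative only, not the real
 * u_vbuf caps structure). */
struct vertex_caps {
   bool all_vertex_formats;   /* driver handles every vertex format */
   bool byte_aligned_stride;  /* driver accepts byte-aligned strides */
   bool byte_aligned_offset;  /* driver accepts byte-aligned offsets */
};

/* Hypothetical per-draw state. Assumption: user vertex buffers are the
 * only per-draw condition that requires the u_vbuf path. */
struct draw_info {
   bool has_user_vertex_buffers;
};

/* True only if the driver never needs a format or alignment fallback. */
static bool
driver_can_bypass(const struct vertex_caps *caps)
{
   assert(caps != NULL);
   return caps->all_vertex_formats &&
          caps->byte_aligned_stride &&
          caps->byte_aligned_offset;
}

/* True if this draw call must still go through the u_vbuf path. */
static bool
draw_needs_vbuf(const struct vertex_caps *caps, const struct draw_info *draw)
{
   assert(draw != NULL);
   if (!driver_can_bypass(caps))
      return true; /* driver may need fallbacks, so never bypass */
   return draw->has_user_vertex_buffers;
}
```

With these assumptions, a draw that sources all vertex data from VBOs on a fully capable driver skips u_vbuf, while any draw on a driver lacking one of the capabilities keeps using it.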