draw llvm: highly reduce the compilation times for draw llvm
our code resets pipe_vertex_buffer's with different offsets when rendering vbo, meaning that we kept creating insane number of shaders even for simple apps e.g. geartrain had 54 shaders and it was taking almost 27 seconds just to compile them. this patch passes pipe_vertex_buffer's to the jit function and lets it to the stride/buffer_offset computation at run time. the slowdown at runtime is largely unnoticable but the we go from 54 shaders to 3, and from 27 seconds to less than 1.
Showing with 56 additions and 67 deletions