virgl: Don't wait for pending similar buffer operations
The feature added in f9d12bf5 (use of a single buffer for both indices
and vertices) introduces a significant performance regression in virgl
for applications that use util/primconvert and BOs/display lists,
because we now wait for pending reads and writes no matter which
operation we want to perform. We can significantly reduce the wait time
by not waiting for pending operations of the same kind: we can still
read if only read operations are pending, and we can still write if only
write operations are pending.
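The rule above can be illustrated with a minimal sketch. All names here
(buf_op, buf_pending, buf_must_wait) are hypothetical and are not the
actual virgl structures or functions; the sketch only assumes that
pending reads and writes are tracked separately per buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operation kinds; not taken from the virgl sources. */
enum buf_op { BUF_OP_READ, BUF_OP_WRITE };

/* Hypothetical per-buffer bookkeeping of outstanding operations. */
struct buf_pending {
    unsigned reads;   /* read operations still in flight */
    unsigned writes;  /* write operations still in flight */
};

/*
 * Return true when a new operation of kind `op` has to wait.
 * Pending operations of the same kind never force a wait; only
 * pending operations of the other kind do.
 */
static bool buf_must_wait(const struct buf_pending *p, enum buf_op op)
{
    assert(p);
    if (op == BUF_OP_READ)
        return p->writes > 0;   /* reads wait only for pending writes */
    return p->reads > 0;        /* writes wait only for pending reads */
}
```

With this check, a read against a buffer that only has reads in flight
proceeds immediately, while the previous behaviour would be equivalent
to returning true whenever either counter is non-zero.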
glxgears renders at 300 fps in qemu/virgl before f9d12bf5 and drops to
10 fps after that commit, due to a ~10 ms wait on every
u_primconv->mmap() operation. A similar problem can be observed with
many 2D games that use quads and quad strips. These two patches fix the
performance regression.
Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>