v3dv: reimplement occlusion queries
The previous implementation was mostly CPU-based: query resets and result copying were handled on the CPU, as were some aspects of query availability tracking.
The new implementation handles all GPU-side query functions by dispatching compute shaders, so the work runs on the GPU instead of the CPU. This covers query availability, reset and result copying.
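As a rough illustration of the idea only, the sketch below models on the CPU what one shader invocation per query might do for reset and result copying. The pool layout, the function names and the boolean flags are assumptions made for this example and are not the driver's actual code. The copy path loosely follows the Vulkan rules for 64-bit, availability and partial results.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pool layout (an assumption, not the driver's real one):
 * one occlusion counter and one availability flag per query. */
struct query_pool {
   uint32_t *counters;
   uint8_t *available;
   uint32_t count;
};

/* Store a value as either a 32-bit or a 64-bit integer. */
static void
write_value(uint8_t *out, uint32_t value, int is_64bit)
{
   if (is_64bit) {
      uint64_t v = value;
      memcpy(out, &v, sizeof(v));
   } else {
      memcpy(out, &value, sizeof(value));
   }
}

/* Body of a hypothetical "reset" shader: invocation `id` clears one query. */
static void
reset_invocation(struct query_pool *pool, uint32_t first, uint32_t id)
{
   uint32_t q = first + id;
   assert(q < pool->count);
   pool->counters[q] = 0;
   pool->available[q] = 0;
}

/* Body of a hypothetical "copy results" shader: invocation `id` writes one
 * query's result at dst + id * stride. The result is only written if the
 * query is available or partial results were requested. */
static void
copy_invocation(const struct query_pool *pool, uint32_t first, uint32_t id,
                uint8_t *dst, uint32_t stride,
                int is_64bit, int with_availability, int partial)
{
   uint32_t q = first + id;
   assert(q < pool->count);
   uint8_t *out = dst + (size_t)id * stride;
   size_t slot = is_64bit ? sizeof(uint64_t) : sizeof(uint32_t);

   if (pool->available[q] || partial)
      write_value(out, pool->counters[q], is_64bit);
   if (with_availability)
      write_value(out + slot, pool->available[q], is_64bit);
}

/* Stand-ins for the compute dispatch: one loop iteration per invocation. */
static void
dispatch_reset(struct query_pool *pool, uint32_t first, uint32_t count)
{
   for (uint32_t id = 0; id < count; id++)
      reset_invocation(pool, first, id);
}

static void
dispatch_copy(const struct query_pool *pool, uint32_t first, uint32_t count,
              uint8_t *dst, uint32_t stride,
              int is_64bit, int with_availability, int partial)
{
   for (uint32_t id = 0; id < count; id++)
      copy_invocation(pool, first, id, dst, stride,
                      is_64bit, with_availability, partial);
}
```

Since each invocation only touches its own query and its own destination slot, the work maps naturally onto one invocation per query with no synchronization between them.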
For now, only occlusion queries are managed this way. Performance queries could be implemented in a similar fashion in the future with some additional work. For timestamp queries, however, our only option to improve this would be to take the actual timestamp in the kernel, since we can't take a timestamp from a shader.