Implementing #5104 (closed)
The motivation is not only "lots of queries" but also that the command stream is painfully incoherent with shader writes. Using a shader instead means we stay coherent with the rest of the transfer commands.