llvmpipe compute shader support
This series adds initial support for llvmpipe compute shaders. There is no pipelining of state changes or anything like that; each dispatch is executed until it has finished. Blocks are executed in threads, and each workgroup is executed in a coroutine execution environment, with each thread of the workgroup being a coroutine.
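As a rough illustration of the execution model described above (not the actual llvmpipe code, which generates LLVM IR), here is a minimal Python sketch. It assumes a barrier can be modelled as a coroutine suspend point; the names `run_workgroup`, `dispatch` and `reverse_kernel` are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Generator, List

# Hypothetical names throughout; this only models the idea, not llvmpipe itself.
Invocation = Generator[None, None, None]
Kernel = Callable[[int, List[int]], Invocation]


def run_workgroup(kernel: Kernel, local_size: int) -> List[int]:
    """Run one workgroup: every invocation is a coroutine, resumed in turn."""
    shared = [0] * local_size  # stand-in for workgroup shared memory
    live = [kernel(i, shared) for i in range(local_size)]
    while live:
        still_running = []
        for co in live:
            try:
                next(co)  # run this invocation up to its next barrier (yield)
                still_running.append(co)
            except StopIteration:
                pass  # this invocation has finished
        live = still_running
    return shared


def dispatch(kernel: Kernel, num_groups: int, local_size: int) -> List[List[int]]:
    """Run workgroups on worker threads and return only once all are done."""
    with ThreadPoolExecutor() as pool:
        # list() drains the iterator, so the dispatch is fully synchronous.
        return list(pool.map(lambda _g: run_workgroup(kernel, local_size),
                             range(num_groups)))


def reverse_kernel(local_id: int, shared: List[int]) -> Invocation:
    """Example kernel: reverse the shared array using two barriers."""
    shared[local_id] = local_id * 10
    yield  # barrier: all writes above are visible before anyone reads
    value = shared[len(shared) - 1 - local_id]
    yield  # barrier: all reads are done before anyone overwrites
    shared[local_id] = value


if __name__ == "__main__":
    print(dispatch(reverse_kernel, num_groups=2, local_size=4))
```

Resuming every coroutine once per round means no invocation gets past a barrier until all the others have reached it, which is why a single OS thread can run a whole workgroup.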
I'm still doing some cleanups on this, but it is pretty much all working.
llvmpipe will expose GLES 3.1 after this, but it will still fail because of multisample.