Use drm/scheduler
Since we refcount submits, and even replay them during GPU hang recovery, we could move writing a submit into its ringbuffer (rb) into a kthread or workqueue, so that userspace doesn't block waiting for rb space. This could also serialize writes into the rb. Some thought and care would be required to handle GPU hang recovery while we are still writing submits into the rb.
But at that point, we should probably also be using drm/scheduler.
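
To make the worker idea concrete, here is a rough userspace model of it using pthreads. It is only a sketch under stated assumptions: it is not kernel code, it does not use the drm/scheduler API, and it does not model hang recovery. Every name in it (`struct ring`, `ring_queue()`, `ring_worker()`, `ring_wait_written()`, `RB_DWORDS`, `MAX_PENDING`) is hypothetical. The submitter only enqueues and returns; a single worker thread is the one that sleeps waiting for rb space, and because it is the only writer, writes into the rb are serialized in queue order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RB_DWORDS   64	/* hypothetical rb size, in dwords */
#define MAX_PENDING 16	/* hypothetical depth of the pending-submit queue */

struct submit {
	uint32_t seqno;
	uint32_t ndwords;		/* must be <= RB_DWORDS */
};

struct ring {
	uint32_t buf[RB_DWORDS];
	uint32_t wptr, rptr;		/* free-running dword counters */
	struct submit pending[MAX_PENDING];
	unsigned int head, tail;	/* free-running queue indices */
	uint32_t last_written;		/* seqno of the last submit written */
	bool stop;
	pthread_mutex_t lock;
	pthread_cond_t cond;
	pthread_t worker;
};

static uint32_t ring_free(const struct ring *r)
{
	return RB_DWORDS - (r->wptr - r->rptr);
}

/* Sole writer of the rb: takes submits in queue order, sleeps for space. */
static void *ring_worker(void *arg)
{
	struct ring *r = arg;

	pthread_mutex_lock(&r->lock);
	while (!r->stop) {
		if (r->head == r->tail) {
			pthread_cond_wait(&r->cond, &r->lock);
			continue;
		}
		struct submit s = r->pending[r->head % MAX_PENDING];
		if (ring_free(r) < s.ndwords) {
			/* Only the worker blocks on rb space, not the submitter. */
			pthread_cond_wait(&r->cond, &r->lock);
			continue;
		}
		for (uint32_t i = 0; i < s.ndwords; i++)
			r->buf[r->wptr++ % RB_DWORDS] = s.seqno;
		r->last_written = s.seqno;
		r->head++;
		pthread_cond_broadcast(&r->cond);
	}
	pthread_mutex_unlock(&r->lock);
	return NULL;
}

static int ring_init(struct ring *r)
{
	memset(r, 0, sizeof(*r));
	pthread_mutex_init(&r->lock, NULL);
	pthread_cond_init(&r->cond, NULL);
	return pthread_create(&r->worker, NULL, ring_worker, r);
}

/* Submit path: never waits for rb space; fails if the queue is full. */
static bool ring_queue(struct ring *r, const struct submit *s)
{
	bool ok;

	assert(s->ndwords <= RB_DWORDS);
	pthread_mutex_lock(&r->lock);
	ok = (r->tail - r->head) < MAX_PENDING;
	if (ok) {
		r->pending[r->tail++ % MAX_PENDING] = *s;
		pthread_cond_broadcast(&r->cond);
	}
	pthread_mutex_unlock(&r->lock);
	return ok;
}

/* Stand-in for the GPU: consume everything written until seqno lands. */
static void ring_wait_written(struct ring *r, uint32_t seqno)
{
	pthread_mutex_lock(&r->lock);
	while (r->last_written != seqno) {
		r->rptr = r->wptr;
		pthread_cond_broadcast(&r->cond);
		pthread_cond_wait(&r->cond, &r->lock);
	}
	pthread_mutex_unlock(&r->lock);
}

/* Stops the worker; submits still pending are abandoned in this model. */
static void ring_stop(struct ring *r)
{
	pthread_mutex_lock(&r->lock);
	r->stop = true;
	pthread_cond_broadcast(&r->cond);
	pthread_mutex_unlock(&r->lock);
	pthread_join(r->worker, NULL);
}
```

In this model, queueing three 40-dword submits against the 64-dword rb returns immediately each time, even though only the first fits; the worker writes the rest as the simulated GPU frees space. What the model leaves out (fence dependencies, timeouts, resubmission after a hang) is roughly what drm/scheduler already provides, which is the point of the last sentence above.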