Problems related to the lack of GPU preemption

#92 (comment 1379786) pointed out a denial-of-service attack on compositors that use dma-bufs. However, I think there is actually a broader class of problems: without effective GPU preemption, a rogue client can hog the GPU for an unpredictable amount of time. This is a nasty problem for any GPU use with untrusted tenants, not just Wayland.

There are two solutions I can think of:

1. Partition the GPU (either in software or hardware) such that each tenant has access to a disjoint subset of GPU resources. That tenant can lock up the resources it has been assigned, but is not able to interfere with other tenants.
2. Ensure that on-GPU computations can be involuntarily preempted in a small and bounded amount of time.

In either case, enforcement must come from the kernel driver and/or the compositor, and it must hold even if the other parts of the system are malicious. While a local DoS from a userspace process is typically not considered very serious, the same problems arise with any form of GPU virtualization that works on ordinary hardware with FLOSS drivers (read: does not rely on hardware SR-IOV).
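
To make the difference between the two approaches concrete, here is a toy model of how long a well-behaved tenant waits behind a rogue one. It is only a sketch under simplifying assumptions: the `Job` type, the three functions, the time units, and the equal-share partitioning are all hypothetical and do not describe any real driver or compositor.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    tenant: str
    work: int  # abstract time units of GPU work (hypothetical unit)


def finish_times_no_preemption(jobs):
    """Status quo: each job runs to completion in submission order."""
    now, done = 0, {}
    for job in jobs:
        now += job.work
        done[job.tenant] = now
    return done


def finish_times_bounded_preemption(jobs, time_slice):
    """Option 2: a job is forcibly descheduled after at most time_slice units."""
    queue = deque((job.tenant, job.work) for job in jobs)
    now, done = 0, {}
    while queue:
        tenant, remaining = queue.popleft()
        ran = min(time_slice, remaining)
        now += ran
        if remaining > ran:
            queue.append((tenant, remaining - ran))  # back of the line
        else:
            done[tenant] = now
    return done


def finish_times_partitioned(jobs):
    """Option 1: each tenant owns an equal, disjoint share of the GPU.

    Assumption: a job on 1/N of the GPU takes N times as long, and tenants
    never wait on each other.
    """
    n = len(jobs)
    return {job.tenant: job.work * n for job in jobs}


jobs = [Job("rogue", 10_000), Job("compositor", 2)]
print(finish_times_no_preemption(jobs)["compositor"])          # 10002
print(finish_times_bounded_preemption(jobs, 1)["compositor"])  # 4
print(finish_times_partitioned(jobs)["compositor"])            # 4
```

In this model the compositor's latency without preemption grows with whatever the rogue job asks for, while under either option it depends only on the compositor's own work, the time slice, and the number of tenants.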
