[RFC][WiP] Magma: Cross-platform system call interface
Magma in Mesa RFC
We propose the introduction of a cross-platform GPU system call library in Mesa, provisionally named Magma. The library is similar to libdrm, but with a more modern and unified interface for the following backends:
- WDDM
- Fuchsia
- DRM
- possibly some well-known microkernel RTOSes
The library will be written in Rust, expose a C FFI boundary and be housed in the Mesa repository. The first use cases of Magma will be enabling paravirtualization for Fuchsia and Linux hosts.
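To make "Rust with a C FFI boundary" concrete, here is a minimal sketch of what such a boundary could look like. Every name in it (`MagmaDevice`, `magma_device_create`, the status constants) is hypothetical and chosen for illustration; it is not the actual Magma ABI.

```rust
use std::os::raw::c_int;

// Hypothetical status codes; the real Magma ABI may define these differently.
pub const MAGMA_STATUS_OK: c_int = 0;
pub const MAGMA_STATUS_INVALID_ARGS: c_int = -1;

/// Hypothetical device object. C callers only ever see a pointer to it.
pub struct MagmaDevice {
    vendor_id: u32,
}

// Note: a real export would also carry `#[no_mangle]` (spelled
// `#[unsafe(no_mangle)]` in the 2024 edition) so that C can link against
// the symbol by name. It is left out here to keep the sketch edition-neutral.

/// Allocates a device on the Rust heap and hands ownership to the caller.
pub extern "C" fn magma_device_create(vendor_id: u32) -> *mut MagmaDevice {
    Box::into_raw(Box::new(MagmaDevice { vendor_id }))
}

/// Writes the vendor id to `out`. Null pointers are rejected with an error
/// code instead of being dereferenced.
pub unsafe extern "C" fn magma_device_get_vendor_id(
    dev: *const MagmaDevice,
    out: *mut u32,
) -> c_int {
    if dev.is_null() || out.is_null() {
        return MAGMA_STATUS_INVALID_ARGS;
    }
    unsafe { *out = (*dev).vendor_id };
    MAGMA_STATUS_OK
}

/// Takes ownership back from the caller and frees the device.
pub unsafe extern "C" fn magma_device_destroy(dev: *mut MagmaDevice) {
    if !dev.is_null() {
        drop(unsafe { Box::from_raw(dev) });
    }
}
```

The pattern is the usual one for Rust libraries consumed from C: opaque pointers created with `Box::into_raw`, integer status codes instead of `Result`, and an explicit destroy function that reclaims the box.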
Motivation
Performance
For optimal GPU paravirtualization performance for both Linux guest + Fuchsia host and Android guest + Linux host embedded use cases, the host driver in the guest must be used. These solutions will use Vulkan-over-gfxstream for cloud testing + emulator use cases, but prefer low-level system call forwarding for the “non-emulator embedded use case”.
Security
For embedded GPU paravirtualization, our goal is a 100% Rust-based solution at the host userspace virtual machine manager (VMM) level. We've been working towards this via Rutabaga. Xen developers have also expressed interest in this goal.
Use of both virglrenderer and gfxstream inhibits this, since they both rely on many instances of global data. That prevents Rutabaga from being shared safely across different threads when those features are enabled.
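As a rough illustration of why per-instance state matters here, the sketch below keeps all renderer state in an ordinary struct instead of globals, which lets the compiler verify that it can be shared across threads behind a mutex. `RendererState` and `alloc_from_threads` are hypothetical names, not Rutabaga APIs.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical per-instance renderer state. Nothing lives in a global,
/// so each instance can be moved between or shared across threads.
#[derive(Default)]
pub struct RendererState {
    next_resource_id: u32,
}

impl RendererState {
    /// Hands out the next resource id for this instance.
    pub fn alloc_resource_id(&mut self) -> u32 {
        self.next_resource_id += 1;
        self.next_resource_id
    }
}

/// Spawns `threads` workers that each allocate `per_thread` ids from one
/// shared state, then returns the total number of ids handed out.
pub fn alloc_from_threads(threads: u32, per_thread: u32) -> u32 {
    let state = Arc::new(Mutex::new(RendererState::default()));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let state = Arc::clone(&state);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    state.lock().unwrap().alloc_resource_id();
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = state.lock().unwrap().next_resource_id;
    total
}
```

A C library that keeps equivalent state in process-wide globals cannot be wrapped this way without extra locking around every call, which is the obstacle described above.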
Cross-platform support
This will be an increasing problem as Mesa becomes more cross-platform. One only needs to take a look at branches for:
to see potential for de-duplication. Depending on the driver, Mesa separates out KMD specific code in terms of “winsys” or “KMD” backends. These can be categorized in the following ways:
- Real DRM backends (most important)
- Downstream backends (e.g., Turnip KGSL)
- virtio native context backends (nascent, freedreno/AMD in-tree, Intel WiP)
- WDDM backends (nascent, RADV WiP now)
- Fuchsia backends (nascent)
Magma’s use case would be the “nascent” backends, and not the crucial real DRM backends which have to work everywhere. For example, instead of having:
- a RADV Fuchsia backend
- a WDDM backend
- corresponding virtio-gpu contexts for both
we can, in theory, have just one Magma backend for those cases, but we won’t necessarily insist on it.
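The "one backend for several platforms" idea can be sketched as a trait that driver code targets once, with each platform supplying an implementation. The trait, its methods, and the mock implementation below are assumptions made for illustration; the real Magma interface may look quite different.

```rust
/// Hypothetical buffer handle type.
pub type BufferHandle = u64;

/// Hypothetical interface a user-mode driver would code against once.
pub trait KmdBackend {
    fn name(&self) -> &'static str;
    fn create_buffer(&mut self, size: u64) -> Result<BufferHandle, String>;
}

/// Stand-in for a platform implementation (WDDM, Fuchsia, virtio-gpu, ...).
/// A real backend would issue system calls here instead of counting.
pub struct MockBackend {
    name: &'static str,
    next_handle: BufferHandle,
}

impl MockBackend {
    pub fn new(name: &'static str) -> Self {
        MockBackend { name, next_handle: 1 }
    }
}

impl KmdBackend for MockBackend {
    fn name(&self) -> &'static str {
        self.name
    }

    fn create_buffer(&mut self, size: u64) -> Result<BufferHandle, String> {
        if size == 0 {
            return Err(format!("{}: zero-sized buffer", self.name));
        }
        let handle = self.next_handle;
        self.next_handle += 1;
        Ok(handle)
    }
}

/// Driver-side code written once against the trait, independent of platform.
pub fn create_pair(
    backend: &mut dyn KmdBackend,
    size: u64,
) -> Result<(BufferHandle, BufferHandle), String> {
    let first = backend.create_buffer(size)?;
    let second = backend.create_buffer(size)?;
    Ok((first, second))
}
```

With this shape, `create_pair` does not change when a new platform is added; only another `KmdBackend` implementation is needed.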
We also envisage Magma as a faster-moving interface for non-traditional OSes (Fuchsia, HarmonyOS?, uxrt?, Redox OS? etc.) that incorporates the many lessons learned over the years.
Design
Crates housed in Mesa
A few extra crates will be housed in Mesa:
- mesa3d_utils
- mesa3d_virtio_gpu
- mesa3d_magma
- mesa3d_magma_ffi
The Rutabaga Virtual Graphics Interface (VGI) will be the first non-Mesa consumer of these crates, and they will be uploaded to crates.io.
Rust dependencies
Given the proposed library is written in Rust, the most controversial part will be adding more optional Rust dependencies into Mesa 3D. The dependencies we will need are:
- nix 0.28 or higher (use: ioctls + Linux syscalls)
- zerocopy ^0.7 (zero copy for structs)
- cfg-if 1.0.0
- log 0.4
Given these dependencies, Magma will have to be behind a flag. Android will be the first distro to use it; the dependencies are already there. The benefit is that these are standard Rust dependencies, and the quality of Mesa Rust will improve due to this initiative.
Fuchsia dependencies do not need to be imported, since it will use the meson2hermetic design.
Current state
Very very rough, but you should be able to get the general idea???
Collaboration
We welcome any review feedback and opportunities for shared work-streams. All work will be done in open-source repositories. All resultant code will have its upstream in Mesa and will be MIT licensed.
Frequently asked questions
Q: Does the introduction of more Rust dependencies mean we can rewrite Mesa in Rust?
You said it, not me. Maybe if we ever need a new compiler this could be handy.
Q: Why not do this in C or C++?
Is this 1987?
Q: Why didn’t the DRM native context adopt this design?
Magma enthusiasts did recommend using Rust and Magma during the initial bring-up of the DRM native context, internally and externally.
Though, the thing is, the DRM native context for freedreno was done since there was a tight deadline to ship the ARCVM project on Chromebooks, and API virtualization performance wasn’t good enough. Were Magma enthusiasts ever going to make a big deal and potentially further aggravate deadlines due to our technical ideals? It’s just not our style.
Additionally, the “official policy” for ChromeOS was API virtualization + SR-IOV for dGPUs at that point. Some efforts did not pan out. So we weren’t sure if the native context stuff was ever the “official policy” for ChromeOS (it still isn’t IIUC).
Similarly, while we like Magma as a design, the resourcing/timing wasn’t there to pursue it. Many consumers start with gfxstream_vk since it meets many needs, and want features/fixes there. We weren’t going to waste anybody’s time suggesting designs without actual code to discuss, and we couldn’t get to this prototype until customer gfxstream_vk desires were fulfilled.
Native context enthusiasts were always open to rust-ification at a future point, so here we are.
Q: Are you suggesting we delete DRM native contexts?
No. It would be hard to do if anybody wanted to. For example, AMD has plans with ROCm and virgl video. Our design probably will use clvk + Magma and virtio-media. We know AMD probably has a “good business reason” to do what they are doing though, and generally are supportive of their efforts.
For us, we have good technical/business reasons for Magma. We imagine when the business and technical purity timelines align, we can make a deal about merging everything and making everything nice and clean.
Q: Can't we just use vDRM?
vDRM was designed with guest/host DRM + C in mind (not necessarily a bad thing!). It is useful to compare the diagram at the beginning of this MR to the diagram from the recent blogpost on the subject.
We think a wider scope for the problem would benefit from a clean-room design and implementation.
Q: A common system call interface will be difficult. DRM uAPI in particular isn’t really designed to be common across drivers. How will this work at all?
Magma will try to come up with a hardware-agnostic API. In general, the DRM UAPI is probably too device-specific, but full convergence across SoCs is also likely impractical. The D3DKMT API works by using private driver data, but the private driver data is publicly undocumented.
Magma can generally define device-specific functions in a public header. We’re aiming for a higher level of standardization compared to DRM and D3DKMT.
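One way to picture the split between a common core and publicly declared device-specific parts is sketched below. All type names, fields, and vendor extensions are invented for illustration and do not describe the real Magma interface or any real hardware parameters.

```rust
/// Hypothetical hardware-agnostic part of a submission.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CommonSubmit {
    pub context_id: u32,
    pub command_buffer: u64,
    pub size: u64,
}

/// Hypothetical device-specific additions. They are declared as public,
/// typed variants instead of an opaque "private driver data" byte blob.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeviceSpecific {
    None,
    /// Invented example: a vendor that selects a hardware ring.
    Amd { ring: u32 },
    /// Invented example: a vendor that takes a queue priority.
    Adreno { priority: u8 },
}

/// A submission is the common core plus an optional typed extension.
pub struct Submit {
    pub common: CommonSubmit,
    pub ext: DeviceSpecific,
}

/// Generic code can handle the common part and still inspect extensions,
/// because their layout is part of the public interface.
pub fn describe(submit: &Submit) -> String {
    let base = format!(
        "ctx {} submits {} bytes",
        submit.common.context_id, submit.common.size
    );
    match submit.ext {
        DeviceSpecific::None => base,
        DeviceSpecific::Amd { ring } => format!("{} on ring {}", base, ring),
        DeviceSpecific::Adreno { priority } => {
            format!("{} at priority {}", base, priority)
        }
    }
}
```

The contrast with the opaque-blob approach is that a VMM or validation layer can parse and check every field here, since nothing is hidden behind an undocumented payload.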
Q: When will Magma be merged?
Our intention is to merge Magma when:
- There is clear business-level adoption. Contracts are signed and consumers who don't know anything at all about graphics will be using a Magma interface in their product.
- The following use cases are working:
  - Starnix GPU (Linux on Fuchsia)
  - Android on Linux (RADV UMD) (fully working on target HW)
  - Linux/Android on Windows (RADV UMD) (probably offscreen only)
We just wanted to share our idea, to get early feedback and potentially de-duplicate work. If anyone would find use for Magma outside of this, we are happy to move up merge timelines. We'll keep this branch updated until then, and also land some code downstream.
Q: Another virtio-gpu context type? *BANGS HEAD AGAINST WALL*
There are a lot of current and proposed context types:
- virgl
- gfxstream_vk
- venus
- DRM native context
Proposed:
It’s symptomatic of the problem space: everyone is rightfully focused on their own project. For example, it’s completely rational for:
- Linux developers not to care about more exotic hosts (Windows, Fuchsia, QNX)
- experienced C/C++ developers not to care about Rust
- QEMU-based projects not to care about other projects based on other VMMs (crosvm, rust-vmm)
The key thing, in our humble opinion, is working in the same codebases and aligning over time. This implies some contexts will be deleted upstream and some will be merged.