Draft: Config file format for GPU offloading

This adds a config file format for the GPU offloading interface in !224.

With this, the app profile library will read config files in a couple of directories to find the app profile data for the current process. The app profile library itself provides the same interface, so the format (or even existence) of the config files is opaque to the rest of libglvnd. And since this all gets called from libglvnd, it will work regardless of which window system or desktop environment (or lack thereof) you happen to be using.

It'll also work correctly with things like Steam, where just having the desktop environment set an environment variable would not work.

The format I've defined here is TOML-based. I picked that because it's easy to parse, easy to extend, and is human- or machine-editable. The array-of-tables syntax also makes defining a list of profiles pretty clean.

Each config file can define a list of profiles, and the parser will search until it finds one that matches the current process. The profile can then have a list of device IDs that get handed back to EGL/GLX/Vulkan. The device IDs are strings in the format of vendor_name/device_uuid, just like you'd pass with the environment variable.
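A minimal sketch of splitting a device ID string into its two parts is shown below. It assumes the UUID half uses the standard textual UUID form accepted by Python's `uuid.UUID`. The source does not spell out the exact encoding, so treat that as an assumption. `DeviceId` and `parse_device_id` are hypothetical names, not part of libglvnd.

```python
import uuid
from typing import NamedTuple


class DeviceId(NamedTuple):
    vendor: str
    device_uuid: uuid.UUID


def parse_device_id(text: str) -> DeviceId:
    """Split 'vendor_name/device_uuid' and validate the UUID half."""
    vendor, sep, raw_uuid = text.partition("/")
    if not sep or not vendor:
        raise ValueError(f"expected vendor_name/device_uuid, got {text!r}")
    # uuid.UUID raises ValueError for malformed input.
    return DeviceId(vendor, uuid.UUID(raw_uuid))


device = parse_device_id("nvidia/12345678-1234-5678-1234-567812345678")
print(device.vendor, device.device_uuid)
```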

Currently, the parser can match a profile based on the API name and based on the value in /proc/self/exe and /proc/self/comm. The match rules are structured so that we can add new ones without breaking compatibility, too. Off the top of my head, we'll probably want something that could be used with programs running through Wine or an interpreter.
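The first-match search over these rules can be sketched roughly as follows. This is not the MR's C implementation: the rule keys (`api`, `exe`, `comm`), the `devices` key, and both function names are hypothetical, and the sketch assumes an absent rule matches anything.

```python
import os


def read_process_identity(proc_dir="/proc/self"):
    """Read the executable path and command name of a process (Linux only)."""
    exe = os.readlink(os.path.join(proc_dir, "exe"))
    with open(os.path.join(proc_dir, "comm"), "rb") as f:
        comm = f.read().rstrip(b"\n").decode("utf-8", "replace")
    return {"exe": exe, "comm": comm}


def profile_matches(profile, identity, api):
    """A rule that is absent from the profile is treated as 'match anything'."""
    checks = (("api", api), ("exe", identity["exe"]), ("comm", identity["comm"]))
    for key, actual in checks:
        wanted = profile.get(key)
        if wanted is not None and wanted != actual:
            return False
    return True


def find_devices(profiles, identity, api):
    """Return the device list of the first matching profile, else an empty list."""
    for profile in profiles:
        if profile_matches(profile, identity, api):
            return profile.get("devices", [])
    return []
```

Because unknown keys are simply ignored here, a new kind of match rule could be added later without invalidating existing files, which is the compatibility property the description aims for.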

In addition to using a specific device ID, a profile can also use a symbolic alias. In that case, the same config files can define a device ID for that alias. The reason for the alias feature is so that users (or distros or drivers or whoever) can define a profile for an application with a generic category like "performance", without having to know what the specific hardware configuration is going to be. Then, another config file can define which device is the "performance" device using whatever policy you want. A user could define an alias manually, or you could have a program generate a config file at startup or in response to hotplug events.
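The alias indirection can be illustrated with a small sketch. It assumes, purely for the example, that an entry containing a `/` is a literal device ID and anything else is an alias name. How the real parser tells the two apart is not stated here. `resolve_devices` is a hypothetical helper.

```python
def resolve_devices(entries, aliases):
    """Replace alias names with device IDs; keep literal device IDs as-is.

    Assumption for this sketch: a '/' marks a literal vendor_name/device_uuid
    string, and an alias with no definition is skipped.
    """
    resolved = []
    for entry in entries:
        if "/" in entry:
            resolved.append(entry)
        elif entry in aliases:
            resolved.append(aliases[entry])
    return resolved


# One file (from a user, distro, or driver) names a category...
profile_devices = ["performance"]
# ...and another file, possibly generated at startup or on hotplug, defines it.
aliases = {"performance": "vendor_a/00000000-0000-0000-0000-000000000001"}

print(resolve_devices(profile_devices, aliases))
```

The profile never mentions hardware, so the same profile file works unchanged on machines with different GPUs.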

Edited by Kyle Brenneman

Activity

  • Kyle Brenneman mentioned in merge request !224

  • Kyle Brenneman added 118 commits

    • e905215f...5024e579 - 96 commits from branch glvnd:master
    • a157b862 - Added a basic app profile interface.
    • e664cd0d - app_profile: Add functions to return arbitrary attribute values.
    • 8c60a27b - app_profile: Add attributes for GPU offloading
    • 6f576599 - app_profile: Get a device list from an environment variable
    • 8a7e54c1 - app_profile: Add an attribute for device filtering
    • f87b9388 - EGL: Define a GPU offload interface.
    • 6fe0c57b - EGL: Implement the GPU offload interface.
    • 229f47de - GLX: Define a GPU offload interface.
    • a7fdd97d - GLX: Implement the GPU offload interface.
    • b53ed707 - vulkanconfig: Add a stub Vulkan layer.
    • 68e686f0 - vulkanconfig: Move the instance map functions to their own header.
    • a65fa292 - vulkanconfig: Load an app profile.
    • 14ca1f25 - vulkanconfig: Implement device sorting based on profile.
    • 1e68dfb4 - app_profile: Check in a TOML parser.
    • 281f5e7f - app_profile: Move the GLVNDProfileRec definition into app_profile_internal.h.
    • 7f5d69f2 - app_profile: Add functions to look up process data.
    • bb298573 - app_profile: Add functions to scan the config directories.
    • e96cc3e1 - app_profile: Add a function for reading TOML files.
    • 415411d5 - app_profile: Implement app profile parsing.
    • 70681dbd - app_profile: Implement device aliases.
    • 082e7f3a - app_profile: Added a file describing the config file format.
    • f557db37 - Update profile_format.md.

  • Kyle Brenneman marked this merge request as draft

  • Kyle Brenneman changed title from WIP: Config file format for GPU offloading to Draft: Config file format for GPU offloading

  • Kyle Brenneman changed the description

  • Updated to match the new revision of !224. The config file parsing code itself hasn't actually changed, and the format is the same, except that the device ID string has to include a valid UUID instead of an arbitrary vendor-defined string.

  • Kyle Brenneman mentioned in issue #229

  • Just finished looking at the config file example:

    For Wine programs, being able to match by cwd and cmdline could also be very useful. comm is the name of the .exe file, but truncated to 15 characters. cmdline contains all arguments after the wine executable, i.e. it starts with an absolute path to the .exe file or a path relative to cwd (given, of course, that the cwd hasn't changed in the meantime), and at the very least it contains the full name of the .exe file.
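To make the truncation concrete: the Linux kernel stores comm in a 16-byte buffer (`TASK_COMM_LEN`), which leaves 15 bytes for the name. The sketch below shows what a comm-based rule would have to compare against for a long .exe name. `expected_comm` is a hypothetical helper, and the file names are made up.

```python
TASK_COMM_LEN = 16  # kernel buffer size, including the trailing NUL byte


def expected_comm(executable_name: str) -> str:
    """Return the name as it would appear in /proc/<pid>/comm."""
    raw = executable_name.encode("utf-8")[: TASK_COMM_LEN - 1]
    # Drop a multi-byte character if the cut landed in the middle of it.
    return raw.decode("utf-8", "ignore")


print(expected_comm("LongGameLauncher.exe"))  # LongGameLaunche
print(expected_comm("game.exe"))              # game.exe
```

Two different programs whose names share the first 15 bytes are indistinguishable through comm alone, which is why a cmdline-based rule is attractive here.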

  • Oh, I had never noticed that -- Wine adjusts things so that /proc/self/cmdline is just the Windows command line, so element zero is the .exe file. I had thought that it left the wine executable as element zero.

    In that case, for Wine, it would be pretty easy to add a rule to look at cmdline[0], or maybe generalize it to look at a specified index or list of indices.

  • Kyle Brenneman added 5 commits

    • c42b9f8f - app_profile: Fix the autotools build
    • 65d079b4 - app_profile: Update the tomlc99 version.
    • 88b5364e - app_profile: Read /proc/self/cmdline
    • a172590e - app_profile: Add a match rule for /proc/self/cmdline.
    • 23124614 - app_profile: Allow case-insensitive matching for cmdline

  • Okay, I added a match rule that looks at /proc/self/cmdline.

    I set it up so that you can specify the argument index to match against, which does unfortunately make it a bit more complicated. Rather than a simple array, it uses sub-tables. TOML's array-of-tables syntax still works pretty cleanly for nested arrays like that, though.
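A rough sketch of an index-based cmdline rule with optional case-insensitive comparison is shown below. It illustrates the idea only. Whether the real rule compares the whole argument, a basename, or something else is not stated here, so whole-string equality is an assumption, and both function names are hypothetical.

```python
def split_cmdline(raw: bytes) -> list[str]:
    """Split the NUL-separated contents of /proc/<pid>/cmdline into arguments."""
    parts = raw.split(b"\0")
    if parts and parts[-1] == b"":
        parts.pop()  # the file normally ends with a trailing NUL
    return [p.decode("utf-8", "replace") for p in parts]


def cmdline_matches(args, index, expected, ignore_case=False):
    """Compare one argument, chosen by index, against the expected value."""
    if index < 0 or index >= len(args):
        return False
    actual = args[index]
    if ignore_case:
        return actual.casefold() == expected.casefold()
    return actual == expected


# Under Wine, element zero is the Windows-style path of the .exe file.
args = split_cmdline(b"C:\\Games\\Foo.exe\0-fullscreen\0")
print(cmdline_matches(args, 0, "c:\\games\\foo.exe", ignore_case=True))
```

Case-insensitive matching matters mostly for Windows paths, where `Foo.exe` and `foo.exe` name the same file.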

  • I haven't tested cmdline yet, only the general behavior with glxinfo:

    • __NV_PRIME_RENDER_OFFLOAD=1 must still be set manually for NVIDIA GPUs (the code only seems to do the equivalent of __GLX_VENDOR_LIBRARY_NAME=nvidia)
    • vendor->glxvc->initOffloadVendor is NULL even though the device supports offloading (if I alter the function to ignore both the existence of vendor->glxvc->initOffloadVendor and its output, it works)

    As far as I can tell, it's currently not possible to set just the vendor name; the device UUID must always be provided as well?

    • !224 (which is included in this change) adds a driver interface for EGL and GLX. For offloading to work, we'd need to finalize that interface and check in the libglvnd changes, and then after that, the drivers would need to implement it.

      Before I check in the libglvnd changes, I'm hoping to get feedback from at least someone on the Mesa side of things to confirm whether or not the interface would work. The whole goal here is to have a vendor-agnostic configuration interface.

      As far as configuration, though: Yes, as currently defined, you do have to specify both a vendor name and a device UUID. In theory, we could adjust things to take only a vendor name, but then you'd likely get different behavior for GLX and Vulkan: For GLX, the driver would have to select a device internally, and for Vulkan, the new Vulkan layer library would have to select one.

      That sort of thing is what the alias stuff is for, though: You could define an "nvidia" alias, and then have some other program that runs on startup or login that selects a specific NVIDIA device to use. Then, you'd be able to just put "nvidia" into a profile, and all three APIs would get the same device.

    • > !224 (which is included in this change) adds a driver interface for EGL and GLX. For offloading to work, we'd need to finalize that interface and check in the libglvnd changes, and then after that, the drivers would need to implement it.

      I only just now realized that initOffloadVendor is something new introduced by this patchset.

      > Before I check in the libglvnd changes, I'm hoping to get feedback from at least someone on the Mesa side of things to confirm whether or not the interface would work. The whole goal here is to have a vendor-agnostic configuration interface.

      Do you know if a Mesa dev is already aware of this?

      > As far as configuration, though: Yes, as currently defined, you do have to specify both a vendor name and a device UUID. In theory, we could adjust things to take only a vendor name, but then you'd likely get different behavior for GLX and Vulkan: For GLX, the driver would have to select a device internally, and for Vulkan, the new Vulkan layer library would have to select one.
      >
      > That sort of thing is what the alias stuff is for, though: You could define an "nvidia" alias, and then have some other program that runs on startup or login that selects a specific NVIDIA device to use. Then, you'd be able to just put "nvidia" into a profile, and all three APIs would get the same device.

      Yes, I read that part in the documentation. It makes things more complicated because you first have to scan the system for GPUs, but it also avoids ambiguity, since AMD also supports offloading and Intel also has discrete GPUs.

      If you know any way I can help to speed things up, let me know. I'm tasked with creating a GUI for TUXEDO_OS for easy GPU offload handling, and this MR would be the perfect backend for it.

      For example, would having a prototype of the GUI ready help get the attention of the Mesa devs? Or is this irrelevant at this early stage?

    • Yeah, I spent more time than I'd care to think about trying to figure out a sane way of dealing with multi-vendor configurations beyond simple cases like a single integrated and a single discrete GPU. That ultimately led to the alias feature -- it lets profiles select a generic "performance" rule, but it still separates the GPU offloading mechanism from the device selection policy, and provides a way for a user or a distro to define whatever policy they see fit.

      Anyway, if you want something to test, the Vulkan side should work without any extra driver support -- it just relies on existing Vulkan interfaces and fudges the VkPhysicalDevice ordering to try to make an application choose the right device.

      I don't know if having a GUI (or even a mock-up) for a config file editor would be helpful or not -- the config file format is deliberately opaque to the rest of libglvnd (and by extension, the drivers). I actually expected that the vendor interface in !224 would get checked in, and then the config file format would get nailed down sometime later.
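The Vulkan-side reordering mentioned above can be sketched in a few lines. This only illustrates the sorting idea, not the layer's actual code: devices named in the profile move to the front in profile order, and everything else keeps its enumeration order. Plain strings stand in for VkPhysicalDevice handles, and `reorder_devices` is a hypothetical name.

```python
def reorder_devices(enumerated, preferred):
    """Move preferred devices to the front, keeping the rest in place.

    Python's sort is stable, so devices with the same rank (all the
    non-preferred ones) keep their original relative order.
    """
    rank = {dev: i for i, dev in enumerate(preferred)}
    return sorted(enumerated, key=lambda dev: rank.get(dev, len(preferred)))


print(reorder_devices(["igpu", "dgpu", "software"], ["dgpu"]))
# ['dgpu', 'igpu', 'software']
```

This relies on applications picking the first enumerated device. An application that applies its own selection logic would not be affected by the ordering.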

    • vulkaninfo does not reorder its list, but I'm not sure if that's the right place to look?

  • I would expect vulkaninfo to show the ordering. It's quite possible that I messed something up in the Vulkan layer, though. Most of the testing I've done so far was for the new libGlvndAppProfile.so library itself.

  • Just wanna bump this. Did you hear back from the Mesa folks?

  • No, it's been pretty quiet. I would like to get this moving again, though.

  • Something that came up when discussing another bug is that I think this could be used for something like the inverse of the usual GPU offloading arrangement.

    With X11 and Wayland, the client-side EGL driver can tell which device the display server is running on using DRI3Open or wl_drm. By default, the dGPU's driver would skip a native display if the server is running on the iGPU.

    But, if an application calls eglGetDisplay(NULL) or the eglGetPlatformDisplay equivalent with EGL_EXT_platform_device or EGL_MESA_platform_surfaceless, then there is no display server involved, so the driver can't make any such distinction. As a result, the dGPU driver would respond to that call, possibly waking up the dGPU to do so.

    To avoid that, something (possibly a startup script for a desktop environment) could generate a config file with a default profile that specifies whatever device the desktop is running on. Then, any application that calls eglGetDisplay(NULL) would end up with that device.

    To do that, we'd need to make sure that any application-specific configurations take priority over that, which would be tricky to do using only the directory search order. Also, you'd want to put that in a per-session (rather than per-user) directory, and I don't know of any standard place for such a thing.

    We'd also need to be able to limit the eglQueryDevicesEXT calls that libglvnd makes internally, to avoid unnecessarily waking up any GPUs. It would be pretty easy to add a name for each driver like we have with GLX, which would be enough to limit the eglQueryDevicesEXT call to that driver. For any finer granularity than that, though, we'd need a new query of some sort.
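One way to picture the priority requirement described above (application-specific profiles winning over a session-wide default, independent of file order) is a search that remembers a default but only uses it as a last resort. This is purely a sketch of one possible approach, not something the MR implements. The `default` key and `select_profile` are invented for the example.

```python
def select_profile(profiles, matches):
    """Prefer any application-specific match over a default profile.

    `profiles` is the list in directory search order; `matches` is a
    callable that decides whether a profile applies to this process.
    """
    default = None
    for profile in profiles:
        if profile.get("default", False):
            if default is None:
                default = profile  # remember the first default, keep looking
            continue
        if matches(profile):
            return profile
    return default


profiles = [
    {"name": "session-default", "default": True},  # e.g. generated at login
    {"name": "game"},
]
print(select_profile(profiles, lambda p: p["name"] == "game")["name"])  # game
```

With this shape, the default profile can live in whichever directory is convenient without shadowing application-specific entries found later in the search.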
