
anv/xe: try harder when the vm_bind ioctl fails

Merged Paulo Zanoni requested to merge pzanoni/mesa:sparse-enobufs into main
1 unresolved thread

Of the many possible errors returned by the vm_bind ioctl, some can
actually happen in the wild when the system is under memory pressure.
Thomas Hellström pointed out to us that, due to its asynchronous
nature, the vm_bind ioctl itself has to pin some memory, so if the
number of bind operations passed is too big, there is a chance it will
run out of memory.

Previously the kernel would return ENOMEM when this condition
happened. Since commit e8babb280b5e ("drm/xe: Convert multiple bind
ops into single job") the kernel returns ENOBUFS when it doesn't have
enough memory for the whole batch but thinks we'd succeed if we did
one bind operation at a time (instead of multiple operations in the
same ioctl), and ENOMEM in some other situations. The still-uncommitted
patch "drm/xe: Return -ENOBUFS if a kmalloc fails which is tied to an
array of binds" proposes converting a few more ENOMEM cases to
ENOBUFS.

Still, even ENOMEM situations could in theory be recoverable: if we
wait for some amount of time, resources that were consuming memory
could end up being freed by other threads or processes, allowing the
operations to succeed. So the main idea in this patch is to treat
ENOMEM and ENOBUFS the same way, so our implementation works with any
xe.ko driver regardless of whether it has the commits mentioned above.

So in this patch, when we detect the system is under memory pressure
(i.e., the vm_bind() function returns VK_ERROR_OUT_OF_HOST_MEMORY), we
throw away our performance expectations and go slow and steady. First
we wait on everything we're supposed to wait on (hoping that this
alone helps alleviate the memory pressure), and then we synchronously
bind one piece at a time (which ensures ENOBUFS can't be returned),
hoping this won't make the kernel try to reserve too much memory, and
that whatever is eating all the memory goes away in the meantime. If
even this fails, we give up and hope the upper layers will be able to
figure out what to do.

This fixes a bunch of LNL failures and flaky tests (LNL being our
first officially supported xe.ko platform). It can be seen in dEQP,
but only when multiple tests are run in parallel. It happens in
multiple tests, including:

  - dEQP-VK.sparse_resources.image_sparse_binding.2d_array.rgba8_snorm.1024_128_8
  - dEQP-VK.sparse_resources.image_sparse_binding.3d.rgba16_snorm.1024_128_8
  - dEQP-VK.sparse_resources.image_sparse_binding.3d.rgba16ui.512_256_6

I have never seen these errors when running Alchemist/DG2 with xe.ko.

Fixes: e9f63df2f2c0 ("intel/dev: Enable LNL PCI IDs without INTEL_FORCE_PROBE")

Relevant links:

Cc: @mbrost @thomash

Merge request reports


Activity

```c
VkResult result = device->kmd_backend->vm_bind(device, submit,
                                               ANV_VM_BIND_FLAG_NONE);

if (result == VK_ERROR_OUT_OF_HOST_MEMORY) {
   /* If we get this, the system is under memory pressure. First we
    * manually wait for all our dependency syncobjs, hoping that some
    * memory will be released while we wait, then we issue each bind
    * operation in its own ioctl, as that requires less kernel memory
    * and so we may be able to move things forward, although slowly,
    * while also waiting for each operation to complete before issuing
    * the next. Performance isn't a concern at this point: we're just
    * trying to make progress without crashing until whatever is eating
    * too much memory goes away.
    */

   result = vk_sync_wait_many(&device->vk, submit->wait_count,
```
  • You could pass the sets of waits in the first vm_bind operation and the sets of signals in the last one.

  • Author Developer

    I intentionally avoided doing that. My theory here is the following:

    We just got an ENOMEM/ENOBUFS from a vm_bind ioctl because the kernel didn't have enough memory. So instead of immediately trying another vm_bind ioctl, it seemed better to manually wait on the things we need to wait on, since memory might be freed in the meantime, giving the next vm_bind ioctl a higher chance of succeeding. We want to give each ioctl the least amount of work possible so it won't need as much memory.

    Here we're just trying to be as "conservative" as we can in terms of chances of success.

  • Reviewed-by: José Roberto de Souza <jose.souza@intel.com>

  • Paulo Zanoni added 189 commits


  • Paulo Zanoni assigned to @marge-bot and unassigned @pzanoni


  • Marge Bot added 23 commits


  • merged
