Vulkan: Custom VkAllocationCallbacks result in segmentation fault.
System information
- OS: Debian Testing
- GPU: Intel(R) UHD Graphics 610 (CFL GT1)
- Kernel version: 5.10
- Mesa version: 20.3.2 (the version in which the problem started)
- Desktop manager and compositor: GNOME
Describe the issue
I have been using a custom allocator (one that handles alignment properly) with Vulkan for over a year. Besides the Debian machine described above, it has also been tested on a Windows 10 machine with that machine's Nvidia driver. After I updated my Mesa drivers to 20.3.2, my application began crashing on every run, after a seemingly random number of seconds or minutes that typically doesn't exceed 2 minutes.
I have reviewed the allocator several times and tried modifying it in several ways (including reducing it to a simple stack that ignores free requests), and the segmentation faults still occur.
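For reference, the "simple stack that ignores free requests" variant can be sketched roughly as below. This is an illustrative reconstruction under assumptions, not the code from mem_utils.h: the class name BumpAllocator and its members are hypothetical, and the sketch is deliberately unsynchronized.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical bump ("stack") allocator: hands out aligned slices of a fixed
// buffer and never reuses memory, so free() does nothing.
// Not thread-safe: concurrent calls to alloc() would race on next_.
class BumpAllocator {
public:
    BumpAllocator(void* buffer, std::size_t capacity)
        : base_(reinterpret_cast<std::uintptr_t>(buffer)),
          end_(base_ + capacity),
          next_(base_) {}

    // Returns nullptr when the buffer cannot satisfy the request.
    void* alloc(std::size_t size, std::size_t alignment) {
        // The alignment is expected to be a non-zero power of two.
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
        const std::uintptr_t aligned = (next_ + mask) & ~mask;  // round up
        if (aligned < next_ || aligned > end_ || size > end_ - aligned) {
            return nullptr;  // address wrapped around or buffer exhausted
        }
        next_ = aligned + size;
        return reinterpret_cast<void*>(aligned);
    }

    void free(void* /*memLoc*/) {}  // free requests are intentionally ignored

private:
    std::uintptr_t base_;  // first byte of the buffer
    std::uintptr_t end_;   // one past the last byte of the buffer
    std::uintptr_t next_;  // next unallocated byte
};
```

Because this variant has no free list at all, a crash that persists with it is unlikely to come from free-list bookkeeping alone.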
Replacing the custom allocator with glibc malloc() makes the segmentation faults go away. However, it should be noted that the application's memory usage with glibc malloc() is the same as its memory usage without VkAllocationCallbacks, which suggests that the driver might be inadvertently depending on some behavior of glibc malloc().
Running backtrace in GDB shows the following whenever the program crashes. Around half the time the program does not crash but instead makes vkWaitSemaphores() wait forever, in which case no backtrace can be taken. I know that vkWaitSemaphores() is the call that waits forever because adding a timeout to vkWaitSemaphores() causes a warning (about the current command buffer already being in use) to be emitted once the timeout passed to vkWaitSemaphores() has elapsed:
Thread 18 "program" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffbd7fa700 (LWP 5519)]
0x00007fffef64d056 in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan_intel.so
(gdb) bt
#0 0x00007fffef64d056 in ?? ()
from /usr/lib/x86_64-linux-gnu/libvulkan_intel.so
#1 0x00007fffef67d5d6 in ?? ()
from /usr/lib/x86_64-linux-gnu/libvulkan_intel.so
#2 0x00007ffff7ca3ea7 in start_thread (arg=<optimized out>)
at pthread_create.c:477
#3 0x00007ffff7edbdef in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb)
UPDATE (using mesa-vulkan-drivers-dbgsym):
This is a backtrace that can occur when the window is not resized before the crash (without a resize, vkWaitSemaphores() usually waits forever and no crash occurs):
[New Thread 0x7fffbd7fa700 (LWP 4947)]
Thread 18 "program" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffbd7fa700 (LWP 4947)]
0x000000000040a241 in Allocator::addAddressToFreeList (
this=0x7fffffffdaa8, memLoc=0x7fffec28a2e0, size=3600)
at ./libs/mem_utils.h:38
38 currentAddress = *(uptr*)currentAddress;
(gdb) bt
#0 0x000000000040a241 in Allocator::addAddressToFreeList (
this=0x7fffffffdaa8, memLoc=0x7fffec28a2e0, size=3600)
at ./libs/mem_utils.h:38
#1 0x00000000004072a8 in Allocator::free (this=0x7fffffffdaa8,
memLoc=0x7fffec28a2e0) at ./libs/mem_utils.h:185
#2 0x0000000000406bdb in vkFreeFunction (pUserData=0x7fffffffc820,
pMemory=0x7fffec28a2f0) at main.cpp:905
#3 0x00007fffef64cd08 in vk_free (data=<optimized out>, alloc=<optimized out>)
at ../src/vulkan/util/vk_alloc.h:67
#4 anv_execbuf_finish (exec=0x7fffbd7f9d90)
at ../src/intel/vulkan/anv_batch_chain.c:1128
#5 anv_queue_execbuf_locked (queue=queue@entry=0x7fffec299f30,
submit=submit@entry=0x7fffec28a218)
at ../src/intel/vulkan/anv_batch_chain.c:1871
#6 0x00007fffef67d5d6 in anv_queue_task (_queue=0x7fffec299f30)
at ../src/intel/vulkan/anv_queue.c:410
#7 0x00007ffff7ca3ea7 in start_thread (arg=<optimized out>)
at pthread_create.c:477
#8 0x00007ffff7edbdef in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
These are three backtraces that can occur when the window is resized before crashing:
[New Thread 0x7fffbd7fa700 (LWP 4986)]
Thread 1 "program" received signal SIGSEGV, Segmentation fault.
0x000000000040a241 in Allocator::addAddressToFreeList (
this=0x7fffffffdaa8, memLoc=0x7fffec28ce98, size=528)
at ./libs/mem_utils.h:38
38 currentAddress = *(uptr*)currentAddress;
(gdb) bt
#0 0x000000000040a241 in Allocator::addAddressToFreeList (
this=0x7fffffffdaa8, memLoc=0x7fffec28ce98, size=528)
at ./libs/mem_utils.h:38
#1 0x0000000000406f16 in Allocator::alloc (this=0x7fffffffdaa8,
size=1336, alignment=8) at ./libs/mem_utils.h:102
#2 0x0000000000406afe in vkAllocationFunction (pUserData=0x7fffffffc820,
size=1336, alignment=8, allocationScope=VK_SYSTEM_ALLOCATION_SCOPE_OBJECT)
at main.cpp:880
#3 0x00007fffef65fd52 in vk_alloc (scope=VK_SYSTEM_ALLOCATION_SCOPE_OBJECT,
align=8, size=1336, alloc=<optimized out>)
at ../src/vulkan/util/vk_alloc.h:36
#4 vk_alloc2 (align=8, scope=VK_SYSTEM_ALLOCATION_SCOPE_OBJECT, size=1336,
alloc=<optimized out>, parent_alloc=<optimized out>)
at ../src/vulkan/util/vk_alloc.h:94
#5 vk_zalloc2 (align=8, scope=VK_SYSTEM_ALLOCATION_SCOPE_OBJECT, size=1336,
alloc=<optimized out>, parent_alloc=<optimized out>)
at ../src/vulkan/util/vk_alloc.h:105
#6 anv_image_create (_device=0x7fffec298718, create_info=0x7fffffffacc0,
alloc=0x7fffec28b848, pImage=0x7fffec28baa0)
at ../src/intel/vulkan/anv_image.c:721
#7 0x00007fffef660c46 in anv_CreateImage (device=0x7fffec298718,
pCreateInfo=0x7fffffffaf40, pAllocator=0x7fffec28b848,
pImage=<optimized out>) at ../src/intel/vulkan/anv_image.c:895
[New Thread 0x7fffbcff9700 (LWP 5044)]
Thread 18 "program" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffbcff9700 (LWP 5044)]
0x00007fffef64d056 in anv_queue_execbuf_locked (
queue=queue@entry=0x7fffec299f30, submit=submit@entry=0x7fffef543b58)
at ../src/intel/vulkan/anv_batch_chain.c:1862
1862 ../src/intel/vulkan/anv_batch_chain.c: No such file or directory.
(gdb) bt
#0 0x00007fffef64d056 in anv_queue_execbuf_locked (
queue=queue@entry=0x7fffec299f30, submit=submit@entry=0x7fffef543b58)
at ../src/intel/vulkan/anv_batch_chain.c:1862
#1 0x00007fffef67d5d6 in anv_queue_task (_queue=0x7fffec299f30)
at ../src/intel/vulkan/anv_queue.c:410
#2 0x00007ffff7ca3ea7 in start_thread (arg=<optimized out>)
at pthread_create.c:477
#3 0x00007ffff7edbdef in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
[New Thread 0x7fffbd7fa700 (LWP 5095)]
Thread 18 "program" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffbd7fa700 (LWP 5095)]
0x000000000040a241 in Allocator::addAddressToFreeList (
this=0x7fffffffdaa8, memLoc=0x7fffec2a05c0, size=302544)
at ./libs/mem_utils.h:38
38 currentAddress = *(uptr*)currentAddress;
(gdb) bt
#0 0x000000000040a241 in Allocator::addAddressToFreeList (
this=0x7fffffffdaa8, memLoc=0x7fffec2a05c0, size=302544)
at ./libs/mem_utils.h:38
#1 0x0000000000406f16 in Allocator::alloc (this=0x7fffffffdaa8,
size=3584, alignment=8) at ./libs/mem_utils.h:102
#2 0x0000000000406afe in vkAllocationFunction (pUserData=0x7fffffffc820,
size=3584, alignment=8, allocationScope=VK_SYSTEM_ALLOCATION_SCOPE_DEVICE)
at main.cpp:880
#3 0x00007fffef64a99c in vk_alloc (scope=<optimized out>, align=8, size=3584,
alloc=<optimized out>) at ../src/vulkan/util/vk_alloc.h:36
#4 anv_execbuf_add_bo (device=device@entry=0x7fffec298718,
exec=exec@entry=0x7fffbd7f9d90, bo=0x85d398, relocs=relocs@entry=0x0,
extra_flags=extra_flags@entry=0)
at ../src/intel/vulkan/anv_batch_chain.c:1179
#5 0x00007fffef64cc79 in anv_queue_execbuf_locked (
queue=queue@entry=0x7fffec299f30, submit=submit@entry=0x7fffec294c60)
at ../src/intel/vulkan/anv_batch_chain.c:1710
#6 0x00007fffef67d5d6 in anv_queue_task (_queue=0x7fffec299f30)
at ../src/intel/vulkan/anv_queue.c:410
#7 0x00007ffff7ca3ea7 in start_thread (arg=<optimized out>)
at pthread_create.c:477
#8 0x00007ffff7edbdef in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
If a timeout is specified and the program is allowed to continue running, it rarely crashes; this is a possible backtrace for when it does crash (along with validation-layer warnings):
[New Thread 0x7fffbd7fa700 (LWP 5172)]
Validation Error: [ VUID-vkQueueSubmit-pCommandBuffers-00071 ] Object 0: handle = 0x7fffec298718, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x2e2f4d65 | VkCommandBuffer 0x7fffec292da0[] is already in use and is not marked for simultaneous use. The Vulkan spec states: If any element of the pCommandBuffers member of any element of pSubmits was not recorded with the VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT, it must not be in the pending state (https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#VUID-vkQueueSubmit-pCommandBuffers-00071)
Thread 18 "program" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffbd7fa700 (LWP 5172)]
0x000000000040a241 in Allocator::addAddressToFreeList (
this=0x7fffffffdaa8, memLoc=0x7fffec295250, size=3600)
at ./libs/mem_utils.h:38
38 currentAddress = *(uptr*)currentAddress;
(gdb) bt
#0 0x000000000040a241 in Allocator::addAddressToFreeList (
this=0x7fffffffdaa8, memLoc=0x7fffec295250, size=3600)
at ./libs/mem_utils.h:38
#1 0x00000000004072a8 in Allocator::free (this=0x7fffffffdaa8,
memLoc=0x7fffec295250) at ./libs/mem_utils.h:185
#2 0x0000000000406bdb in vkFreeFunction (pUserData=0x7fffffffc820,
pMemory=0x7fffec295260) at main.cpp:905
#3 0x00007fffef64cd08 in vk_free (data=<optimized out>, alloc=<optimized out>)
at ../src/vulkan/util/vk_alloc.h:67
#4 anv_execbuf_finish (exec=0x7fffbd7f9d90)
at ../src/intel/vulkan/anv_batch_chain.c:1128
#5 anv_queue_execbuf_locked (queue=queue@entry=0x7fffec299f30,
submit=submit@entry=0x7fffef544658)
at ../src/intel/vulkan/anv_batch_chain.c:1871
#6 0x00007fffef67d5d6 in anv_queue_task (_queue=0x7fffec299f30)
at ../src/intel/vulkan/anv_queue.c:410
#7 0x00007ffff7ca3ea7 in start_thread (arg=<optimized out>)
at pthread_create.c:477
#8 0x00007ffff7edbdef in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb)
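The backtraces above show the allocation callbacks being entered both from Thread 1 (via anv_CreateImage) and from Thread 18 (via anv_queue_task). One way to check whether concurrent entry is involved is to wrap the callbacks in a probe that serializes them and records the calling threads. The sketch below is only a diagnostic idea under assumptions: CallbackProbe and its members are hypothetical names, the allocation-scope parameter is typed as int so the code compiles without the Vulkan headers, and std::aligned_alloc (C++17) stands in for the allocator under test.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <set>
#include <thread>

// Hypothetical diagnostic wrapper; not part of the application's code.
struct CallbackProbe {
    std::mutex mutex;                    // serializes every callback
    std::set<std::thread::id> callers;   // threads seen inside a callback

    // Parameter order mirrors PFN_vkAllocationFunction; the scope enum is
    // replaced by int (an assumption made to avoid needing vulkan.h).
    static void* allocate(void* pUserData, std::size_t size,
                          std::size_t alignment, int /*allocationScope*/) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        auto* self = static_cast<CallbackProbe*>(pUserData);
        std::lock_guard<std::mutex> lock(self->mutex);
        self->callers.insert(std::this_thread::get_id());
        // Stand-in backend: std::aligned_alloc wants a size that is a
        // multiple of the alignment, so round the request up.
        const std::size_t rounded = (size + alignment - 1) / alignment * alignment;
        return std::aligned_alloc(alignment, rounded);
    }

    // Parameter order mirrors PFN_vkFreeFunction.
    static void release(void* pUserData, void* pMemory) {
        auto* self = static_cast<CallbackProbe*>(pUserData);
        std::lock_guard<std::mutex> lock(self->mutex);
        self->callers.insert(std::this_thread::get_id());
        std::free(pMemory);
    }

    // Number of distinct threads that have entered either callback so far.
    std::size_t distinctCallers() {
        std::lock_guard<std::mutex> lock(mutex);
        return callers.size();
    }
};
```

If distinctCallers() reports more than one thread and the crashes stop once the mutex is in place, that would point toward unsynchronized concurrent calls into the allocator rather than a dependence on glibc malloc() behavior.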