zink: Workaround for a race condition in the Nvidia driver

Sidney Just requested to merge justsid/mesa:fix_zink_nvidia_crash into main

We ran into this issue in X-Plane with Zink enabled on Nvidia GPUs, where we'd see random crashes in the Nvidia driver. After we brought this up with Nvidia engineers, it was traced back to a race condition with vkResetCommandPool/vkResetCommandBuffer on command buffers that were still in the recording state. While this usage is technically spec legal, as far as I can tell only the Beta Nvidia driver currently has a fix for this, and the mainline drivers still have the race condition.

Triggering the crash isn't super easy: you gotta go fast for your calls to properly race the Nvidia driver. This made reproducing and triaging this issue quite hard, and I actually don't have good repro steps for how to get this outside of X-Plane. Here is our guidance from Nvidia:

We think you can work around the issue on current drivers by tweaking any vkBeginCommandBuffer->vkResetCommandBuffer sequence where there is no vkEndCommandBuffer before the vkResetCommandBuffer, to instead do vkBeginCommandBuffer->vkEndCommandBuffer->vkResetCommandBuffer. I think what you’re doing now is valid usage, it just hit a bug.

This MR implements exactly that for Zink.
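
For anyone who wants the shape of the sequence change without reading the diff, here is a rough sketch. It is not the actual Zink patch and does not call Vulkan at all: `mock_cmdbuf`, `mock_begin`, `mock_end`, `mock_reset` and `reset_with_workaround` are hypothetical stand-ins that only track command buffer state, with the mock functions standing in for vkBeginCommandBuffer, vkEndCommandBuffer and vkResetCommandBuffer. The `reset_while_recording` flag just marks that a reset happened on a buffer that was still recording, which is the pattern that can race on affected drivers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a command buffer; it only tracks lifecycle state. */
enum cmdbuf_state { CMDBUF_INITIAL, CMDBUF_RECORDING, CMDBUF_EXECUTABLE };

struct mock_cmdbuf {
    enum cmdbuf_state state;
    bool reset_while_recording; /* set when a reset hits a recording buffer */
};

/* Stand-in for vkBeginCommandBuffer: the buffer enters the recording state. */
void mock_begin(struct mock_cmdbuf *cb)
{
    cb->state = CMDBUF_RECORDING;
}

/* Stand-in for vkEndCommandBuffer: recording finishes. */
void mock_end(struct mock_cmdbuf *cb)
{
    cb->state = CMDBUF_EXECUTABLE;
}

/* Stand-in for vkResetCommandBuffer: notes whether the buffer was still
 * recording (the problematic pattern), then returns it to the initial state. */
void mock_reset(struct mock_cmdbuf *cb)
{
    if (cb->state == CMDBUF_RECORDING)
        cb->reset_while_recording = true;
    cb->state = CMDBUF_INITIAL;
}

/* The workaround: if the buffer is still recording, end it before resetting,
 * so the reset never sees a buffer in the recording state. */
void reset_with_workaround(struct mock_cmdbuf *cb)
{
    assert(cb != NULL);
    if (cb->state == CMDBUF_RECORDING)
        mock_end(cb);
    mock_reset(cb);
}
```

Calling `mock_begin` followed directly by `mock_reset` sets `reset_while_recording`, while `mock_begin` followed by `reset_with_workaround` leaves it clear. In a real driver path the caller would need to track whether the command buffer is currently recording, since Vulkan has no query for that.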
