Zink on Anv doing invalid Vulkan operations in piglit/arb_sparse_buffer-basic
On Anv we now have Vulkan Sparse Resources for both i915.ko and xe.ko. That means we can run piglit/arb_sparse_buffer-basic
.
From that I can understand, the Piglit test itself seems to be doing valid OpenGL things, but then Zink is doing things the Vulkan spec does not seem to allow.
In a "bind" operation, GL allows the size and offset for the bind operation to be either a multiple of SPARSE_BUFFER_PAGE_SIZE_ARB or a value that reaches the end of the buffer:
offset must be an integer multiple of the implementation dependent constant SPARSE_BUFFER_PAGE_SIZE_ARB, and size must either be a multiple of SPARSE_BUFFER_PAGE_SIZE_ARB, or extend to the end of the buffer's data store.
Now, for opaque binds (i.e., not bindings of an image region) Vulkan doesn't explicitly restrict the size of VkSparseMemoryBind::size
(other than saying it must be > 0
), but at some point it talks about the bind granularity:
The sparse block size in bytes for sparse buffers and fully-resident images is reported as VkMemoryRequirements::alignment. alignment represents both the memory alignment requirement and the binding granularity (in bytes) for sparse resources.
While the Validation layers don't seem to complain when VkSparseMemoryBind::size
is not a multiple of VkMemoryRequirements::alignment
, some implementation such as NVidia's proprietary driver simply return VK_ERROR_VALIDATION_FAILED_EXT
when that happens.
In other words: Vulkan is more strict than OpenGL here, so Zink can't just forward GL's operations to Vulkan. Also, there are two things with different sizes to be considered: the VkBuffer size and the VkDeviceMemory size.
So, back to piglit/arb_sparse_buffer-basic, we see it's doing the following:
buffer_size = sparse_buffer_page_size / 2;
pass = run_one(buffer_size, 0, buffer_size, 0, true) && pass;
Since sparse_buffer_page_size
is 64k, it's creating a buffer of size 32k and trying to bind 32k of it, which is valid in OpenGL.
The big problem is that for this VkBuffer of size 32k, Anv returns VkMemoryRequirements::size
and VkMemoryRequirements::alignment
to be 64k, so while it's fine for VkBuffer to be size 32k, the matching VkDeviceMemory has to be size 64k.
So, at zink_bo.c:buffer_bo_commit()
we have two problems.
First, we hit the following assertion:
assert(size % ZINK_SPARSE_BUFFER_PAGE_SIZE == 0 || offset + size == bo->base.size);
The variables involved here have the following values:
size:32768 ZINK_SPARSE_BUFFER_PAGE_SIZE:65536
offset:0 offset+size:32768 bo->base.size:65536
I believe that instead of bo->base.size
what Zink means here is actually res->obj->size
, which is 32k. I am planning to send a patch for this.
The second issue is: even if we get past this assertion, we'll actually try to do a bind operation of size 32k, which is invalid in Vulkan. While it's possible for us to make that valid using the TR-TT backend for Anv, the Kernel's vm_bind ioctl doesn't allow that. Anv could also try to work around this by just "aligning up" the allocation, but that would still mean Zink is producing Vulkan-invalid code, which would fail when running on top of other drivers.
If we adjust Anv to bring VkMemoryRequirements::size
down to 32k in this case, this specific test will pass, but Zink will still be producing Vulkan-invalid code. And lots of dEQP tests will return:
Assertion `(bufferMemRequirements.size % bufferMemRequirements.alignment) == 0' failed
So yeah, even if we fix this in Anv, I believe there's something Zink should do about the problem. Edit: well, perhaps the fix for Zink is just to round size
in buffer_bo_commit
up and make sure the range still ends before bo->base.size
? I'll write a patch proposing that.
Unless, of course, my assessment is incorrect. Please let me know if that's the case
Thanks,
Paulo