nir,spirv: Add support for SPV_KHR_ray_tracing
This merge request adds support to NIR and the SPIR-V parser for the provisional SPV_KHR_ray_tracing extension. Note that the SPV_KHR_ray_tracing extension, which was released in March, is currently provisional and therefore subject to change between now and when the final spec is released. This is part of why this MR is marked WIP: the spec isn't quite final yet and so neither is this implementation. For those with Khronos access, there is an identically named branch on the Khronos internal Mesa repo which I will keep updated with any SPIR-V spec changes.
This MR is also marked WIP because it provides NIR and SPIR-V parser bits only. Nothing is hooked up to it in any back-ends so it doesn't do anyone much good just yet; there's no vertical slice of functionality. I have tested it and can verify that it correctly parses SPIR-V ray-tracing shaders and that it generates competent NIR for them.
What follows is a quick break-down of what this MR contains and some key design points. If you're new to the Vulkan ray-tracing API, my XDC talk (shameless plug!) in 3 weeks will give what I hope to be a good overview of the Vulkan ray-tracing spec and roughly how things function. If you don't want to wait that long, there are many other talks and blog posts out there about Vulkan and DX12 ray-tracing which are functionally pretty similar.
1. New shader stages
There are 6 of them; yes, that doubles the number of shader stages currently in Vulkan.
While we're at it, we also add mesh and task stages. This isn't because we really have any intention of implementing them any time soon. It's more that while we're dealing with the pain of adding stage enums, we might as well add them all.
2. A new OpTypeAccelerationStructure opaque type
This is for the shader binding points for VkAccelerationStructureKHR objects. Instead of adding a new GLSL_BASE_TYPE_ACCELERATION_STRUCTURE or similar, I elected to keep the new type entirely internal to spirv_to_nir. At the NIR level, it treats acceleration structures as a uint64_t which is loaded using a load_vulkan_descriptor intrinsic. Why? Partly for simplicity (adding a new glsl_base_type is a bit of a pain), but mostly because it will make a few things easier later. Go read the Khronos branch for more details.
3. New system values / built-ins
The SPV_KHR_ray_tracing extension adds 14 new built-ins for all sorts of things, from getting your raygen launch ID, to ray information such as origin and direction, to hit information such as the min/max T value and the current world transform. Most of this is fairly straightforward. We add new SYSTEM_VALUE_* enums, new system-value NIR intrinsics, and the obvious mapping code.
There is a tricky bit in here, however, which is the matrix system values such as gl_WorldToObjectEXT. These are tricky because all NIR system values are currently either scalars, vectors, or arrays of scalars which we know a priori will only have one element (gl_SampleMaskIn). The matrix system values are different because you can't load the whole value with a single NIR intrinsic. I've solved this by using index-based system values (we already have a couple of these for clip-planes, etc.) where the index is a column number. If the shader happens to dynamically index the columns (this isn't likely but could happen), we emit intrinsics for all four columns and a bcsel tree to pick off the column you want. If this is ever a problem in the future, we can try to improve it then.
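To make the dynamic-index fallback concrete, here is a minimal standalone C sketch of the idea: load all four columns, then pick one with a tree of selects. It is not NIR builder code; column_t, load_column, bcsel, and select_column_dynamic are hypothetical names invented for illustration, and the exact tree shape the MR emits may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One column of a 4-column, 3-row matrix system value (illustrative type). */
typedef struct { float v[3]; } column_t;

/* Stand-in for one index-based system-value load; hypothetical, not a NIR API. */
static column_t load_column(const float matrix[4][3], unsigned col)
{
    assert(col < 4);
    column_t c;
    for (unsigned r = 0; r < 3; r++)
        c.v[r] = matrix[col][r];
    return c;
}

/* Select between two values, in the spirit of bcsel: cond ? a : b. */
static column_t bcsel(bool cond, column_t a, column_t b)
{
    return cond ? a : b;
}

/* Dynamic column index: load every column, then narrow down with selects. */
static column_t select_column_dynamic(const float matrix[4][3], uint32_t index)
{
    column_t c0 = load_column(matrix, 0);
    column_t c1 = load_column(matrix, 1);
    column_t c2 = load_column(matrix, 2);
    column_t c3 = load_column(matrix, 3);

    /* The low bit picks within each pair; the high bit picks the pair. */
    column_t lo = bcsel((index & 1) != 0, c1, c0);
    column_t hi = bcsel((index & 1) != 0, c3, c2);
    return bcsel((index & 2) != 0, hi, lo);
}
```

In this sketch only the low two bits of the index are examined, so an out-of-range index silently wraps; that is a property of the sketch, not a statement about what the MR does.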
4. New storage classes
Ray-tracing adds quite a few new storage classes: 6 of them, to be precise. In order to avoid blowing up nir_variable_mode, I made a few simplifying assumptions. Under these assumptions, several of them become degenerate so we can combine them:
CallableData and RayPayload data actually live on the stack somewhere, presumably in the caller's stack. We assume that these are no different from global variables and use nir_var_shader_temp for them. We still need a separate storage class for the incoming variants but that's only so we can figure out which variable is the incoming one and lower it to something useful. This does assume that you have a "real" stack.
There's no difference between incoming CallableData and incoming RayPayload data. We can use a single storage class for both.
ShaderRecordBuffer data is just a constant global memory access. This lets us avoid NIR variables entirely: we fetch the pointer via the shader_record_ptr system value and access the data through a 64-bit global memory pointer.
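The three assumptions above can be written down as a small lookup. This is a standalone C sketch, not code from the MR: the rt_storage_class and rt_mode enums and map_storage_class are hypothetical names, and the sketch covers only the five storage classes discussed in the list, since the text does not say how the sixth is handled.

```c
#include <assert.h>

/* Hypothetical stand-ins for the SPIR-V storage classes discussed above. */
typedef enum {
    SC_CALLABLE_DATA,
    SC_RAY_PAYLOAD,
    SC_INCOMING_CALLABLE_DATA,
    SC_INCOMING_RAY_PAYLOAD,
    SC_SHADER_RECORD_BUFFER,
    SC_COUNT
} rt_storage_class;

/* Hypothetical stand-ins for how each class is represented at the NIR level. */
typedef enum {
    MODE_SHADER_TEMP,   /* plays the role of nir_var_shader_temp */
    MODE_INCOMING_DATA, /* one shared mode for both incoming classes */
    MODE_NO_VARIABLE    /* no variable; 64-bit global pointer access instead */
} rt_mode;

static rt_mode map_storage_class(rt_storage_class sc)
{
    assert(sc >= 0 && sc < SC_COUNT);
    switch (sc) {
    case SC_CALLABLE_DATA:
    case SC_RAY_PAYLOAD:
        /* Treated like ordinary globals that live on the stack. */
        return MODE_SHADER_TEMP;
    case SC_INCOMING_CALLABLE_DATA:
    case SC_INCOMING_RAY_PAYLOAD:
        /* Kept separate only so the incoming variable can be identified. */
        return MODE_INCOMING_DATA;
    default:
        /* SC_SHADER_RECORD_BUFFER: fetched through the record pointer. */
        return MODE_NO_VARIABLE;
    }
}
```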
This results in the following mapping to
5. New intrinsics
Nothing much to say here, I don't think. These intrinsics map almost exactly to the new SPIR-V opcodes. The one bit of weirdness is around the way that OpTraceRayKHR and OpExecuteCallableKHR refer to the incoming payload. In the SPIR-V, they reference it by location number. Each RayPayloadKHR or CallableDataKHR variable has a location number and the last parameter to the SPIR-V opcode is the location number that is supposed to be referenced by the trace or call. This is really awkward for a NIR consumer to use so, instead, the NIR intrinsics take a deref for that parameter. All of the location number matching and lookup is handled by spirv_to_nir so drivers don't have to care.
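The location matching amounts to a search over the payload or callable-data variables for the one with the requested location. The following standalone C sketch shows that lookup in isolation; payload_var and find_payload_var are hypothetical names for illustration and do not correspond to actual spirv_to_nir data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a RayPayloadKHR or CallableDataKHR variable. */
typedef struct {
    const char *name;
    uint32_t location; /* value of the variable's Location decoration */
} payload_var;

/* Return the variable with the given location, or NULL if none matches. */
static const payload_var *find_payload_var(const payload_var *vars,
                                           size_t count, uint32_t location)
{
    assert(vars != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (vars[i].location == location)
            return &vars[i];
    }
    return NULL;
}
```

A parser following this approach would hand the matched variable to the intrinsic as a deref, so a driver never has to look at the location number itself.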