panfrost: implement noperspective varyings
Tested against dEQP-VK.*.renderpass.multiple_interpolation.* and dEQP-VK.spirv_assembly.*.no_perspective.
Mali only has hardware support for either flat or perspective-correct varying interpolation, so we need to emulate noperspective varyings with lowering in the VS and FS.
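The exact lowering is not spelled out in this description. As a rough sketch, assuming the usual identity (the VS pre-multiplies the varying by gl_Position.w, the hardware interpolates perspective-correctly, and the FS multiplies the result by gl_FragCoord.w), a CPU model of one triangle might look like the following. All function names are hypothetical and exist only for illustration; this is not the code in the patch.

```c
#include <stddef.h>

/* Model of perspective-correct interpolation for one triangle.
 * b[] are screen-space barycentrics, w[] are the clip-space w values. */
static float hw_perspective_interp(const float v[3], const float w[3],
                                   const float b[3])
{
   float num = 0.0f, den = 0.0f;
   for (size_t i = 0; i < 3; i++) {
      num += b[i] * v[i] / w[i];
      den += b[i] / w[i];
   }
   return num / den;
}

/* 1/w interpolated linearly in screen space, i.e. gl_FragCoord.w. */
static float frag_coord_w(const float w[3], const float b[3])
{
   float r = 0.0f;
   for (size_t i = 0; i < 3; i++)
      r += b[i] / w[i];
   return r;
}

/* Assumed VS-side lowering: scale the output by gl_Position.w. */
static float vs_lower_noperspective(float value, float pos_w)
{
   return value * pos_w;
}

/* Assumed FS-side lowering: scale the interpolated input by gl_FragCoord.w. */
static float fs_lower_noperspective(float interpolated, float fc_w)
{
   return interpolated * fc_w;
}

/* End to end: the result should match the screen-space linear
 * interpolation sum(b[i] * v[i]) up to rounding. */
static float emulate_noperspective(const float v[3], const float w[3],
                                   const float b[3])
{
   float scaled[3];
   for (size_t i = 0; i < 3; i++)
      scaled[i] = vs_lower_noperspective(v[i], w[i]);
   float interp = hw_perspective_interp(scaled, w, b);
   return fs_lower_noperspective(interp, frag_coord_w(w, b));
}
```

The two scale factors cancel the perspective division, which is why the VS side needs gl_Position.w and needs to know which varyings the FS treats as noperspective.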
Interpolation qualifiers are not required to match between stages, so we need to make the FS qualifiers available in the VS. There are three approaches I've considered for dealing with this:
Passing qualifiers statically
If we know the FS qualifiers when compiling the VS, we can pass them as a compile-time constant. This has the lowest runtime overhead. There is no compile-time overhead to doing this for only one variant, but if we compile multiple variants specialized on different qualifiers the compile-time overhead would be significant.
Passing qualifiers dynamically
If we do not know the FS qualifiers when compiling the VS, we can pass them dynamically, similar to other sysvalues. This has the highest runtime overhead because we need to branch on the dynamic value for every (non-integer) varying write.
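To illustrate where the per-write branch comes from, here is a minimal sketch of a dynamically-qualified varying write. It assumes the VS-side lowering is a multiply by gl_Position.w and that the qualifiers arrive as a bitmask with one bit per varying slot; the mask layout and the function name are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of one lowered VS varying write.
 * noperspective_mask would be loaded like any other sysvalue;
 * slot must be less than 32 in this sketch. */
static float write_varying_dynamic(float value, float pos_w,
                                   uint32_t noperspective_mask,
                                   unsigned slot)
{
   /* This branch is paid on every non-integer varying write. */
   if (noperspective_mask & (UINT32_C(1) << slot))
      return value * pos_w;
   return value;
}
```

With statically known qualifiers the same mask would be a compile-time constant, so the compiler could fold the branch away for each write.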
VS epilogs
Split VS shader compilation into two parts: the main VS body and an "epilog". The main body doesn't depend on the qualifiers, so it can be compiled once. It passes the varying values to the epilog through a defined ABI, and the epilog is specialized on the qualifiers. The two parts would be linked together by simple concatenation or with jumps. Because the epilog is very small, compiling it at link-time is fast.
We could do this with a single body variant, but this may be suboptimal for shaders that never use noperspective because it would require computing gl_Position.w in the varying shader, which may otherwise be unnecessary.
This is the approach used by honeykrisp and radv for a similar class of problems. Implementing this in panfrost would be complex. Compile-time performance would be worse than the other approaches for one variant (because compiling and linking the epilog has non-zero cost), but may reduce the number of variants we need to compile. Runtime performance would be somewhat worse than fully static qualifiers (because the epilog ABI would have non-zero cost and because optimizations cannot see across the link boundary) but better than fully dynamic qualifiers.
One variant for "noperspective is not used"
Compiling a variant for "at least one varying is noperspective" and a separate variant for "no varyings are noperspective" would allow us to avoid the runtime overhead of dynamic qualifiers and epilogs in the common case. The compile-time overhead would be fixed (compiling each varying shader exactly twice). We are already using one variant in panvk for the case where a shader writes gl_PointSize when drawing non-point primitives. The existing variant is only on the position shader, and this new variant would be only on the varying shader, so there would not be combinatorial explosion.
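The variant choice described above could be sketched as follows. The enum, the function, and the bitmask representation of the FS qualifiers are all hypothetical names for illustration, not existing panvk code.

```c
#include <stdint.h>

/* Hypothetical variants of the varying shader. */
enum varying_shader_variant {
   VARYING_VARIANT_DEFAULT,       /* no varyings are noperspective */
   VARYING_VARIANT_NOPERSPECTIVE, /* at least one varying is noperspective */
};

/* Pick the variant at draw or link time from the FS qualifier mask
 * (one bit per varying slot; the layout is an assumption). */
static enum varying_shader_variant
select_varying_variant(uint32_t fs_noperspective_mask)
{
   return fs_noperspective_mask ? VARYING_VARIANT_NOPERSPECTIVE
                                : VARYING_VARIANT_DEFAULT;
}
```

Only the noperspective variant would carry the extra lowering (dynamic qualifiers or an epilog), so shaders paired with an FS that has no noperspective inputs pay nothing at runtime.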
OpenGL
In OpenGL, we are already doing shader variants. If we implement epilogs it would probably make sense to use them here, but if not we can just specialize the VS on the FS qualifiers and it won't be significantly worse than the current state. There's one complication here, which is that we currently specialize the FS based on the set of desktop GL varyings written by the VS. Specializing the VS on the FS creates a dependency loop, but I think it's possible to resolve this with some refactoring.
Vulkan
There are three cases in Vulkan:
Monolithic pipelines
The FS is always available when compiling the VS. We can pass the FS qualifiers to the VS statically.
VK_EXT_shader_object
There is no way to know the FS qualifiers when compiling the VS, and the spec disallows compiling full VS variants at link-time. The only options are passing the qualifiers dynamically or using epilogs. We don't currently advertise this extension, but the interface between the mesa vulkan runtime and panvk is ~ESO. We will need to modify that interface in order to do anything special for monolithic pipelines or GPL.
VK_EXT_graphics_pipeline_library
We are currently advertising both graphicsPipelineLibraryFastLinking and graphicsPipelineLibraryIndependentInterpolationDecoration. If we dropped support for IndependentInterpolationDecoration we could assume that VS qualifiers match the FS and they would be available statically. If we dropped support for graphicsPipelineLibraryFastLinking we could possibly specialize the VS on the FS at link-time. The extension proposal says "If this property is not supported, linking should still be cheaper than a full pipeline compilation", but that text doesn't seem to have made it into the spec. Recompiling just the VS is technically cheaper than a full pipeline compilation, but I suspect this is still more overhead than GPL-using applications expect. With the currently-advertised features, we're in the same situation as VK_EXT_shader_object.
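As a compact summary of the three Vulkan cases, the question "are the FS qualifiers known when the VS is compiled?" could be modeled like this. The enum and function are hypothetical, and the sketch deliberately ignores the option of recompiling the VS at link-time when fast linking is not advertised.

```c
#include <stdbool.h>

/* Hypothetical classification of how a VS reaches the driver. */
enum pipeline_kind {
   PIPELINE_MONOLITHIC,
   PIPELINE_SHADER_OBJECT, /* VK_EXT_shader_object */
   PIPELINE_GPL,           /* VK_EXT_graphics_pipeline_library */
};

/* Returns true when the FS qualifiers can be passed to the VS statically. */
static bool fs_qualifiers_static(enum pipeline_kind kind,
                                 bool independent_interpolation_decoration)
{
   switch (kind) {
   case PIPELINE_MONOLITHIC:
      return true;  /* the FS is always available */
   case PIPELINE_SHADER_OBJECT:
      return false; /* dynamic qualifiers or epilogs only */
   case PIPELINE_GPL:
      /* Without the independent-decoration feature, VS qualifiers
       * can be assumed to match the FS. */
      return !independent_interpolation_decoration;
   }
   return false;
}
```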