anv: optimize descriptor buffer binding

Currently we regenerate the surface state of the descriptor buffer every time it changes: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/intel/vulkan/genX_cmd_buffer.c?ref_type=heads#L2637

This costs CPU time and shows up on profiles.

This can be optimized because in most cases the descriptor buffer is not accessed by the shader.

This surface state generation should be moved to emit_binding_table() so that it only happens when needed.
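To illustrate the idea, here is a minimal, self-contained sketch of deferring the work to binding-table emission. It is a toy model only: `toy_cmd_state`, `toy_bind_descriptor_buffer()` and `toy_emit_binding_table()` are hypothetical names made up for this example and do not correspond to the actual anv structures or functions. Binding only records the new address and marks the state dirty; the (expensive) regeneration is counted only when a shader that actually reads the descriptor buffer is about to be used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for command buffer state.
 * This is NOT the real anv state, just enough to show the idea. */
struct toy_cmd_state {
   uint64_t desc_buffer_addr;    /* currently bound descriptor buffer */
   bool     surface_state_dirty; /* binding changed since last generation */
   unsigned surface_state_gens;  /* how many times we paid the generation cost */
};

/* Binding is cheap: record the address and mark the surface state stale. */
static void
toy_bind_descriptor_buffer(struct toy_cmd_state *s, uint64_t addr)
{
   assert(s != NULL);
   if (s->desc_buffer_addr == addr)
      return;
   s->desc_buffer_addr = addr;
   s->surface_state_dirty = true;
}

/* The surface state is only regenerated here, and only if the shader
 * about to run reads the descriptor buffer at all. */
static void
toy_emit_binding_table(struct toy_cmd_state *s, bool shader_reads_desc_buffer)
{
   assert(s != NULL);
   if (!shader_reads_desc_buffer)
      return; /* common case: nothing to do, state stays dirty */

   if (s->surface_state_dirty) {
      s->surface_state_gens++; /* stand-in for the real generation work */
      s->surface_state_dirty = false;
   }
}
```

In this model, rebinding the descriptor buffer several times between draws costs a single regeneration at most, and none at all if no shader reads the buffer.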

We also need the pipeline layout to not force descriptor buffer usage: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/intel/vulkan/anv_nir_apply_pipeline_layout.c?ref_type=heads#L146

This might mean doing more in the get_used_bindings() prepass, because in some cases the descriptor buffer is read:

- inline uniforms
- to generate A64 addresses in order to avoid non-uniform peeling loops (only on buffers)

We can't avoid the first case, but we might be able to avoid the second by switching to peeling loops. This is probably an optimization trade-off. vkd3d-proton also relies on out-of-spec behavior that forces us to use peeling loops.
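For readers unfamiliar with the term, the following is a rough CPU-side simulation of what a non-uniform peeling loop does, as I understand the technique: each iteration picks the resource handle of the first still-active lane, services every lane that uses the same handle (so the access is uniform for that iteration), and then retires those lanes. Everything here is hypothetical and for illustration only: `toy_peel_loop()`, `TOY_SUBGROUP_SIZE` and the "access" (doubling the handle) are made up, and the real thing is generated shader code rather than C.

```c
#include <stdint.h>

#define TOY_SUBGROUP_SIZE 8 /* assumed lane count for this toy model */

/* Simulates a peeling loop over one subgroup.
 * handle[lane] is the (possibly non-uniform) resource handle per lane,
 * active_mask has one bit per lane that needs servicing.
 * Returns the number of iterations, which equals the number of distinct
 * handles among the active lanes. */
static unsigned
toy_peel_loop(const uint32_t handle[TOY_SUBGROUP_SIZE],
              uint32_t active_mask,
              uint32_t out[TOY_SUBGROUP_SIZE])
{
   unsigned iterations = 0;

   /* Ignore bits that do not map to a lane. */
   active_mask &= (1u << TOY_SUBGROUP_SIZE) - 1;

   while (active_mask != 0) {
      /* Find the first lane that is still active. */
      unsigned first = 0;
      while (!(active_mask & (1u << first)))
         first++;

      /* Its handle is treated as uniform for this iteration. */
      const uint32_t uniform_handle = handle[first];

      for (unsigned lane = 0; lane < TOY_SUBGROUP_SIZE; lane++) {
         if ((active_mask & (1u << lane)) && handle[lane] == uniform_handle) {
            out[lane] = uniform_handle * 2; /* stand-in for the real access */
            active_mask &= ~(1u << lane);   /* this lane is done */
         }
      }
      iterations++;
   }
   return iterations;
}
```

The cost is one iteration per distinct handle in the subgroup, which is the trade-off against reading the descriptor buffer to build A64 addresses.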

Edited Feb 29, 2024 by Lionel Landwerlin
