anv: optimize descriptor buffer binding
Currently we regenerate the surface state of the descriptor buffer every time it changes: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/intel/vulkan/genX_cmd_buffer.c?ref_type=heads#L2637
This costs CPU time that shows up on profiles.
This can be optimized because in most cases the descriptor buffer is not accessed by the shader.
This surface state generation should be moved to emit_binding_table()
so that it only happens when needed.
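
A minimal sketch of the lazy approach, assuming a dirty flag on the descriptor set. None of the names below exist in anv: the `sketch_*` types and functions are hypothetical stand-ins, and the "packing" is a placeholder for the real surface state generation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a descriptor set, not the real anv structure. */
struct sketch_desc_set {
   uint64_t desc_buffer_addr;     /* address of the descriptor buffer */
   uint64_t cached_surface_state; /* placeholder for the packed surface state */
   bool     surface_state_valid;  /* cleared on every (re)bind */
   unsigned surface_state_builds; /* counts how often the expensive path ran */
};

/* Binding stays cheap: record the address and invalidate the cached state. */
static void
sketch_bind_descriptor_buffer(struct sketch_desc_set *set, uint64_t addr)
{
   set->desc_buffer_addr = addr;
   set->surface_state_valid = false;
}

/* Placeholder for the expensive surface state packing. */
static uint64_t
sketch_pack_surface_state(uint64_t addr)
{
   return addr ^ 0xA5A5A5A5ull;
}

/* Called at binding table emission time. The surface state is only built
 * when the shader reads the descriptor buffer and the cache is stale.
 * Returns 0 when no surface state is needed.
 */
static uint64_t
sketch_emit_binding_table(struct sketch_desc_set *set,
                          bool shader_reads_desc_buffer)
{
   if (!shader_reads_desc_buffer)
      return 0;

   if (!set->surface_state_valid) {
      set->cached_surface_state =
         sketch_pack_surface_state(set->desc_buffer_addr);
      set->surface_state_valid = true;
      set->surface_state_builds++;
   }

   return set->cached_surface_state;
}
```

With this shape, rebinding the descriptor buffer several times before a draw costs nothing beyond the flag update, and a shader that never touches the descriptor buffer never triggers the packing.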
We also need the pipeline layout not to force descriptor buffer usage: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/intel/vulkan/anv_nir_apply_pipeline_layout.c?ref_type=heads#L146
This might mean doing more in the get_used_bindings()
prepass, because in some cases the descriptor buffer is read:
- inline uniforms
- to generate A64 addresses in order to avoid non-uniform peeling loops (only on buffers)
We can't avoid the first case, but we might be able to avoid the second by switching to peeling loops. This is probably an optimization trade-off. vkd3d-proton also relies on out-of-spec behavior, forcing us to use peeling loops.
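
The prepass decision described above could look roughly like the following sketch. Everything here is hypothetical: the `sketch_*` names are not anv or NIR APIs, and modeling the A64 case as "a buffer binding indexed non-uniformly" is a simplifying assumption made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical binding kinds, not the real Vulkan or anv enums. */
enum sketch_binding_kind {
   SKETCH_BINDING_SAMPLED_IMAGE,
   SKETCH_BINDING_BUFFER,
   SKETCH_BINDING_INLINE_UNIFORM,
};

/* One binding as seen by the shader during the prepass (assumed shape). */
struct sketch_binding_use {
   enum sketch_binding_kind kind;
   bool non_uniform_index; /* assumption: marks the A64 candidate case */
};

/* Returns true if the shader has to read the descriptor buffer.
 *
 * - Inline uniforms live in the descriptor buffer, so they always need it.
 * - Non-uniformly indexed buffers need it to build A64 addresses, unless
 *   we choose peeling loops instead.
 */
static bool
sketch_needs_descriptor_buffer(const struct sketch_binding_use *uses,
                               size_t count,
                               bool prefer_peeling_loops)
{
   for (size_t i = 0; i < count; i++) {
      if (uses[i].kind == SKETCH_BINDING_INLINE_UNIFORM)
         return true;

      if (uses[i].kind == SKETCH_BINDING_BUFFER &&
          uses[i].non_uniform_index &&
          !prefer_peeling_loops)
         return true;
   }

   return false;
}
```

The `prefer_peeling_loops` parameter makes the trade-off explicit: setting it drops the descriptor buffer dependency for the buffer case at the cost of the peeling loop in the shader, while inline uniforms keep the dependency either way.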