radv: Split off cmd_buffer variant of descriptor set template updates
Improves performance for descriptor set template updates.
Benchmarks:
I wrote a microbenchmark based on Bas' Vulkan microbenchmark suite:
bnieuwenhuizen/vulkan_microbench!1
In this benchmark, the change consistently improves vkUpdateDescriptorSetWithTemplate performance by 36%.
-------------------------------------------------------------------
Benchmark                         Time             CPU   Iterations
-------------------------------------------------------------------
DescriptorTemplateUpdate       81.2 ns         81.2 ns      8573169
->
-------------------------------------------------------------------
Benchmark                         Time             CPU   Iterations
-------------------------------------------------------------------
DescriptorTemplateUpdate       52.9 ns         52.9 ns     13306065