d3d12: Perf improvements
Significant perf improvements:
- Using a dynarray to track `pending_barrier_bos`
- Using a context-level array of `dxil_wrap_sampler_state` for filling shader keys, significantly reducing the size of the d3d12_shader_key. When a new variant is created, a new array for them is allocated, and the contents are copied over.
- Using a local array of `batch_bo_reference_state` to track up to 16 contexts worth of batches in `d3d12_bo`, effectively avoiding the use of a hash_table for the same purpose (while keeping the hash table around for >16 active context scenarios)
- Reducing calls to and cost of `validate_geometry_shader_variant`
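
For readers unfamiliar with the pattern, the local-array-plus-fallback tracking described above can be sketched roughly as follows. This is an illustration only, not the driver code: `ref_state`, `tracked_bo`, `overflow_entry`, and `lookup_ref_state` are hypothetical names, the assumption that a small integer context id indexes the array directly is mine, and a linked list stands in for the hash table that the real change keeps for the >16 context case.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_LOCAL_CONTEXTS 16 /* the "up to 16 contexts" figure from the list above */

/* Hypothetical stand-in for batch_bo_reference_state. */
struct ref_state {
   uint64_t batch_fence; /* illustrative payload only */
};

/* Hypothetical fallback node; a linked list replaces the hash table here. */
struct overflow_entry {
   unsigned context_id;
   struct ref_state state;
   struct overflow_entry *next;
};

/* Hypothetical stand-in for the per-buffer tracking data in d3d12_bo. */
struct tracked_bo {
   struct ref_state local[MAX_LOCAL_CONTEXTS]; /* fast path, no allocation */
   struct overflow_entry *overflow;            /* slow path, rarely used */
};

/* Return the state slot for a context, creating a fallback entry on demand. */
static struct ref_state *
lookup_ref_state(struct tracked_bo *bo, unsigned context_id)
{
   struct overflow_entry *e;

   assert(bo != NULL);

   /* Common case: a direct array index, no hashing and no allocation. */
   if (context_id < MAX_LOCAL_CONTEXTS)
      return &bo->local[context_id];

   /* Rare case: search the fallback container. */
   for (e = bo->overflow; e != NULL; e = e->next) {
      if (e->context_id == context_id)
         return &e->state;
   }

   /* Not found: add a zero-initialized entry at the head of the list. */
   e = calloc(1, sizeof(*e));
   if (e == NULL)
      return NULL;
   e->context_id = context_id;
   e->next = bo->overflow;
   bo->overflow = e;
   return &e->state;
}

/* Release any fallback entries; the local array needs no cleanup. */
static void
tracked_bo_finish(struct tracked_bo *bo)
{
   while (bo->overflow != NULL) {
      struct overflow_entry *next = bo->overflow->next;
      free(bo->overflow);
      bo->overflow = next;
   }
}
```

The point of the layout is that the common case (few contexts) costs one bounds check and one array index, while correctness for larger context counts is preserved by the slower fallback path.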
Less significant perf improvements:
Shader key compare and hash updates, not recomputing `manual_depth_range` in every shader variant selection, and unrolling the shader variant selection loop.
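
The shader key change from the first list (a key that refers to a context-level sampler array, with a private copy made only when a variant is created) might look something like the sketch below. All type and function names here (`sampler_wrap_state`, `shader_key`, `shader_variant`, `key_equals`, `create_variant`, `destroy_variant`) are hypothetical simplifications, and the field layout is an assumption chosen so that a plain `memcmp` is a valid comparison; the actual structures in the driver differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, padding-free stand-in for dxil_wrap_sampler_state. */
struct sampler_wrap_state {
   unsigned wrap[3];
   unsigned is_int_sampler;
};

/* Hypothetical slim key: a pointer and a count instead of an inline array. */
struct shader_key {
   unsigned num_samplers;
   const struct sampler_wrap_state *samplers; /* borrowed during lookup */
};

struct shader_variant {
   struct shader_key key;                     /* points at owned_samplers */
   struct sampler_wrap_state *owned_samplers; /* private copy, freed with the variant */
};

/* Compare two keys by sampler contents, not by pointer identity. */
static int
key_equals(const struct shader_key *a, const struct shader_key *b)
{
   assert(a != NULL && b != NULL);
   if (a->num_samplers != b->num_samplers)
      return 0;
   if (a->num_samplers == 0)
      return 1;
   return memcmp(a->samplers, b->samplers,
                 a->num_samplers * sizeof(*a->samplers)) == 0;
}

/* Only variant creation pays for an allocation and a copy. */
static struct shader_variant *
create_variant(const struct shader_key *lookup_key)
{
   struct shader_variant *v = calloc(1, sizeof(*v));
   if (v == NULL)
      return NULL;

   v->key = *lookup_key;
   if (lookup_key->num_samplers > 0) {
      size_t size = lookup_key->num_samplers * sizeof(*v->owned_samplers);
      v->owned_samplers = malloc(size);
      if (v->owned_samplers == NULL) {
         free(v);
         return NULL;
      }
      memcpy(v->owned_samplers, lookup_key->samplers, size);
   }
   /* The stored key must not alias the context array, which keeps changing. */
   v->key.samplers = v->owned_samplers;
   return v;
}

static void
destroy_variant(struct shader_variant *v)
{
   free(v->owned_samplers);
   free(v);
}
```

With this shape, filling a lookup key is a pointer assignment rather than a copy of every sampler state, and later changes to the context-level array cannot silently alter the key of an already created variant.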