anv: Allow compressed memtypes with default buffer types
What does this MR do and why?
anv: Allow compressed memtypes with default buffer types
Source 2 games segfault if certain buffers are not able to use the same
memory types as images. CS2 specifically expects this to be the case for
vertex and index buffers (VK_BUFFER_USAGE_2_INDEX_BUFFER_BIT,
VK_BUFFER_USAGE_2_VERTEX_BUFFER_BIT). I have not tested other Source 2
games to see whether the required usage flags differ (if at all).
Up until now, we've disabled CCS for the Source 2 engine with the
anv_disable_xe2_ccs driconf option. However, this option is not great
for performance. So, replace it with a new option that allows buffers to
use the same memory types we use for images: anv_enable_buffer_comp.
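As a rough illustration of the idea (a minimal sketch, not the actual anv implementation; the mask values and helper name below are hypothetical), enabling the option simply means buffers advertise the same CCS-capable memory types that images already use:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical memory-type masks, for illustration only. The real driver
 * derives these from the platform's heap and memory-type layout. */
#define EXAMPLE_PLAIN_MEMTYPES      0x3u /* types 0-1: uncompressed */
#define EXAMPLE_COMPRESSED_MEMTYPES 0xcu /* types 2-3: CCS-capable, used by images */

/* Hypothetical helper: memory types advertised for a buffer. With
 * anv_enable_buffer_comp set, buffers get the same (compressed) memory
 * types as images, which is what Source 2 expects for its vertex and
 * index buffers. */
static uint32_t
example_buffer_memory_types(bool enable_buffer_comp)
{
   uint32_t types = EXAMPLE_PLAIN_MEMTYPES;
   if (enable_buffer_comp)
      types |= EXAMPLE_COMPRESSED_MEMTYPES;
   return types;
}
```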
Compression of buffers is generally not good for performance. I
collected the results of unconditionally enabling the feature in the
performance CI on BMG. I used the default configuration to average the
results of two runs of each trace.
The CI reports that 4 game traces would regress by 0.44-1.01% FPS
with buffer compression. However, the CI actually shows it to be
beneficial in three of our game traces:
* Cyberpunk-trace-dx12-1080p-high 106.51%
* Hitman3-trace-dx12-1080p-med 101.59%
* Blackops3-trace-dx11-1080p-high 100.44%
So, enable the option for the two games we already have driconf entries
for, Cyberpunk and Hitman3.
Of course, also enable the option for Source 2 games. Casey Bowman
reports that on BMG, some frame times drop from ~15ms to ~7ms in CS2.
This is in large part due to the removal of HiZ resolves, which is a
consequence of the game now using HIZ_CCS_WT instead of plain HIZ.
This is basically the alternative solution to disabling CCS that was proposed by @pzanoni a while back in #11520 (comment 2502782), but hidden behind a driconf option.
V1: Saw a different performance impact on Blackops3.
I used the performance CI to compare various options:

1. Exclude compression for the usages recommended by internal docs, except for index and vertex, vs. upstream (runs/12174141474/job/33955632713)
2. Exclude compression for the usages recommended by internal docs (see !32474 (closed)) vs. option 1 (runs/12175658767/job/33959820710)
3. Allow compression on every usage which uses the default memory types (this MR) vs. option 1 (runs/12175676033/job/33959873114)
Unfortunately, the perf CI couldn't complete each run, so I only have partial data to work from. The result from option 1 is that we get a maximum regression of 1.11% on PubG, but a maximum improvement of 6.40% in Cyberpunk. If I'm calculating it correctly, the geomean is slightly positive. With option 2, we lose the gain in Cyberpunk, but we gain 1.6% in Blackops. Option 3 shows no difference from option 1, indicating that only index and vertex buffers really matter for the workloads we're tracking.
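For reference, the geomean here is just the geometric mean of the per-trace FPS ratios (new FPS divided by baseline FPS). A minimal sketch of that calculation, reusing the three ratios quoted above purely as placeholder inputs (the real CI aggregates every trace in the run):

```c
#include <math.h>
#include <stdio.h>

int main(void)
{
   /* Per-trace FPS ratios (new / baseline); placeholder inputs taken from
    * the three improvements listed earlier, not a full CI run. */
   const double ratios[] = { 1.0651, 1.0159, 1.0044 };
   const int n = sizeof(ratios) / sizeof(ratios[0]);

   /* Geometric mean: exp of the mean of the logs. */
   double log_sum = 0.0;
   for (int i = 0; i < n; i++)
      log_sum += log(ratios[i]);

   printf("geomean = %.4f\n", exp(log_sum / n)); /* > 1.0 means a net win */
   return 0;
}
```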