radv: enable TC-compat HTILE for D32_SFLOAT+MSAA on GFX10+

This was disabled due to some depth/stencil resolve CTS failures
which are now fixed.

I found that disabling TC-compat HTILE for D32_SFLOAT+MSAA reduced
performance in Control by 11% on Vega10. In fact, the game only uses
D32_SFLOAT for depth rendering.

This gives a huge boost in Control on Navi10 (e.g. +17% with MSAA 4x).
Note that the game is still slower than PRO without MSAA on Navi10,
but as fast (or even a bit faster) on Vega10.

I think TC-compat HTILE could also be enabled for D32_SFLOAT_S8_UINT
but it needs more testing first.
@@ -94,12 +94,9 @@ radv_use_tc_compat_htile_for_image(struct radv_device *device,
 		return false;

 	/* FIXME: for some reason TC compat with 2/4/8 samples breaks some cts
-	 * tests - disable for now. On GFX10 D32_SFLOAT is affected as well.
+	 * tests - disable for now.
 	 */
-	if (pCreateInfo->samples >= 2 &&
-	    (format == VK_FORMAT_D32_SFLOAT_S8_UINT ||
-	     (format == VK_FORMAT_D32_SFLOAT &&
-	      device->physical_device->rad_info.chip_class >= GFX10)))
+	if (pCreateInfo->samples >= 2 && format == VK_FORMAT_D32_SFLOAT_S8_UINT)
 		return false;

 	/* GFX9 supports both 32-bit and 16-bit depth surfaces, while GFX8 only
