d3d12: fix stencil replicate
This is the flip side of !300 (merged). Together, the two fix these CTS cases on NVIDIA:
- KHR-GL33.packed_depth_stencil.blit.depth24_stencil8
- KHR-GL33.packed_depth_stencil.blit.depth32f_stencil8
So, what craziness is going on here, you ask? It turns out that the only way of writing MSAA stencil data in D3D12 is through the graphics pipeline. But for GPUs that don't support PIPE_CAP_SHADER_STENCIL_EXPORT, this means the fastest way of blitting to an MSAA stencil buffer is to write one bit at a time, using the stencil write-mask support and fragment-shader discards.
Yes, this is complicated, but it seems to work.
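
To make the bit-at-a-time idea concrete, here is a rough CPU model of it in plain C. This is only an illustrative sketch, not the code in this MR: the function name `stencil_replicate_model`, the flat byte arrays standing in for stencil samples, the initial clear to zero, and the choice of a REPLACE stencil op with a reference value of 0xff are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical CPU model of a per-bit stencil blit. Each iteration of
 * the outer loop stands in for one draw call; each iteration of the
 * inner loop stands in for one fragment/sample.
 */
static void
stencil_replicate_model(const uint8_t *src, uint8_t *dst, size_t count)
{
   /* Assumed first step: clear the destination stencil to zero. */
   for (size_t i = 0; i < count; i++)
      dst[i] = 0;

   /* One pass per stencil bit. */
   for (unsigned bit = 0; bit < 8; bit++) {
      const uint8_t write_mask = (uint8_t)(1u << bit); /* stencil write-mask */
      const uint8_t ref = 0xff;                        /* assumed stencil ref */

      for (size_t i = 0; i < count; i++) {
         /* "Fragment shader": discard if this bit is clear in the source. */
         if (!(src[i] & write_mask))
            continue;

         /* "Stencil op REPLACE": only bits in write_mask are touched. */
         dst[i] = (uint8_t)((dst[i] & (uint8_t)~write_mask) |
                            (ref & write_mask));
      }
   }
}
```

After all eight passes, every destination value in this model matches its source value, which is the point of the trick: the shader never has to export a stencil value, it only decides whether to discard.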
Edited by Erik Faye-Lund