freedreno/a6xx: avoid sw transfer path
Some apps (like firefox and chromium) seem to like to create textures with internal format GL_RGBA8 but then upload GL_BGRA to them. If the app were a bit more clever it could hit the memcpy path. But the fallback sw conversion (convert_ubytes()) is slow, so if we are going to have to make this extra copy anyway, do it on the gpu.
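To illustrate the cost difference described above, here is a minimal sketch of the two CPU-side cases. It is not the actual convert_ubytes() code; the helper names upload_same_format() and upload_bgra_to_rgba() are hypothetical and exist only for this example. When the upload format matches the internal format a single memcpy is enough, while a BGRA upload into an RGBA8 texture needs a per-pixel swizzle on the CPU:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: source and destination layouts match, so the
 * whole span of 4-byte pixels can be copied in one memcpy. */
static void
upload_same_format(uint8_t *dst, const uint8_t *src, size_t num_pixels)
{
   memcpy(dst, src, num_pixels * 4);
}

/* Hypothetical helper: source is B,G,R,A and destination is R,G,B,A,
 * so every pixel has its first and third bytes swapped. This is the
 * kind of per-pixel work a software fallback has to do. */
static void
upload_bgra_to_rgba(uint8_t *dst, const uint8_t *src, size_t num_pixels)
{
   for (size_t i = 0; i < num_pixels; i++) {
      dst[4 * i + 0] = src[4 * i + 2]; /* R taken from byte 2 */
      dst[4 * i + 1] = src[4 * i + 1]; /* G unchanged */
      dst[4 * i + 2] = src[4 * i + 0]; /* B taken from byte 0 */
      dst[4 * i + 3] = src[4 * i + 3]; /* A unchanged */
   }
}
```

The swizzle loop touches every byte individually, which is the sort of work that is cheap to express as a blit on the gpu.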
edit: since this has the side effect of enabling accelerated pbo transfers, it turned up a surprisingly high number of other issues.
As far as non-driver changes:
- Fix for nir image_deref_store in pbo download shader
- Remove the clear_depth_stencil() from util_blitter_stencil_fallback() to avoid recursively entering u_blitter if it is used to implement clear_depth_stencil(). Zink and crocus were already doing their own clears, only d3d12 was depending on the clear in util_blitter_stencil_fallback().