gallium,radeonsi: simplify VRAM uploads by adding PIPE_RESOURCE_FLAG_DONT_MAP_DIRECTLY
When this flag is set on a resource, u_threaded_context tries not to map that resource directly, which allows better buffer placement. Drivers set it when visible VRAM is too small.
This makes viewperf (creo & snx) faster.