tu: Read some input attachments directly
The user may read an input attachment as the first use of that attachment in the renderpass. In that case no subpass dependency is required at all, because a pipeline barrier before the renderpass can be used instead, and in any case we assume that dependencies with the first subpass as a destination can be executed just once, outside the renderpass. As a result we only do a CACHE_INVALIDATE once before the entire renderpass. However, one is actually required after each GMEM load, because input attachments read GMEM through UCHE, and the writes to GMEM done by the load leave stale data in UCHE.
While we could add the missing CACHE_INVALIDATE "by hand" somehow, it turns out to be just as easy to implement an optimization the blob does: it simply doesn't patch those input attachments and reads them directly instead. This also means that in some circumstances we can skip allocating memory in GMEM for them entirely.
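As a rough illustration of the idea (not the code in this change), the decision could be sketched as below. The struct and function names are hypothetical and heavily simplified, and the exact conditions are an assumption: an input attachment is treated as directly readable in a subpass if no subpass up to and including that one writes it, and an attachment is assumed to need GMEM only if some subpass writes it.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ATTACHMENT_UNUSED UINT32_MAX

/* Hypothetical, simplified description of a subpass. This is NOT the
 * actual turnip data structure, only enough state for the sketch. */
struct sketch_subpass {
   uint32_t input_count;
   const uint32_t *inputs;  /* attachments read as input attachments */
   uint32_t color_count;
   const uint32_t *colors;  /* attachments written as color attachments */
   uint32_t depth_stencil;  /* attachment index or SKETCH_ATTACHMENT_UNUSED */
};

/* Does this subpass write attachment `a` as color or depth/stencil? */
static bool
sketch_subpass_writes(const struct sketch_subpass *sp, uint32_t a)
{
   if (sp->depth_stencil == a)
      return true;
   for (uint32_t i = 0; i < sp->color_count; i++) {
      if (sp->colors[i] == a)
         return true;
   }
   return false;
}

/* Assumed condition: an input attachment can be read directly (left
 * unpatched) in subpass `subpass_idx` if no subpass up to and including
 * that one has written it, so its contents never lived in GMEM. */
static bool
sketch_input_read_directly(const struct sketch_subpass *subpasses,
                           uint32_t subpass_idx, uint32_t a)
{
   for (uint32_t i = 0; i <= subpass_idx; i++) {
      if (sketch_subpass_writes(&subpasses[i], a))
         return false;
   }
   return true;
}

/* Assumed condition: an attachment only needs a GMEM allocation if some
 * subpass in the renderpass writes it. */
static bool
sketch_attachment_needs_gmem(const struct sketch_subpass *subpasses,
                             uint32_t subpass_count, uint32_t a)
{
   for (uint32_t i = 0; i < subpass_count; i++) {
      if (sketch_subpass_writes(&subpasses[i], a))
         return true;
   }
   return false;
}
```

With a two-subpass renderpass where subpass 0 reads attachment 0 and writes attachment 1, and subpass 1 reads attachment 1, this sketch would read attachment 0 directly and give it no GMEM, while attachment 1 would still go through GMEM.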
This fixes e.g. dEQP-VK.api.copy_and_blit.core.resolve_image.whole_array_image.4_bit with TU_DEBUG=forcebin.