gallium/u_threaded, radeonsi: track busy buffers in TC, merge draws in execute callbacks
This enables busy buffer tracking in u_threaded_context, which allows promoting buffer mappings to UNSYNCHRONIZED for idle buffers. This improves scalability, especially for glBufferSubData. RadeonSI is currently the only driver that enables it.
TC tracks lists of the buffers referenced by unflushed TC batches and unflushed driver command buffers. For lower overhead, the lists hash unique buffer IDs instead of pipe_resource pointers and do not use atomics. TC also tracks all currently bound buffers by buffer ID, which is required to reconstruct the buffer list when starting a new TC batch, because each new batch starts with an empty buffer list.
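As a rough illustration of the tracking described above, here is a minimal sketch in C. It is not the actual u_threaded_context code: every name and size in it (`buffer_list`, `tracker`, `BUF_ID_BITS`, `MAX_BOUND`) is hypothetical. It models only a single buffer list and treats the previous batch as already completed when a new one starts, whereas a real implementation would have to check every unflushed batch and driver command buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes; not the values used by u_threaded_context. */
#define BUF_ID_BITS 12
#define BUF_ID_MASK ((1u << BUF_ID_BITS) - 1)
#define MAX_BOUND   8

/* One bit per hashed buffer ID. Plain stores, no atomics. */
struct buffer_list {
   uint32_t bits[(BUF_ID_MASK + 1) / 32];
};

static void buffer_list_clear(struct buffer_list *l)
{
   memset(l->bits, 0, sizeof(l->bits));
}

static void buffer_list_add(struct buffer_list *l, uint32_t id)
{
   assert(id != 0); /* assumption: ID 0 means "no buffer" */
   uint32_t h = id & BUF_ID_MASK;
   l->bits[h / 32] |= 1u << (h % 32);
}

/* May return true for an unreferenced buffer whose ID collides with a
 * referenced one. That is conservative: it only costs a synchronization. */
static bool buffer_list_maybe_contains(const struct buffer_list *l, uint32_t id)
{
   uint32_t h = id & BUF_ID_MASK;
   return (l->bits[h / 32] >> (h % 32)) & 1;
}

struct tracker {
   struct buffer_list batch;  /* buffers referenced by the current batch */
   uint32_t bound[MAX_BOUND]; /* IDs of currently bound buffers, 0 = empty */
};

static void tracker_bind(struct tracker *t, unsigned slot, uint32_t id)
{
   t->bound[slot] = id;
   if (id)
      buffer_list_add(&t->batch, id);
}

/* A new batch starts with an empty list, so the list is rebuilt from the
 * buffers that are still bound. */
static void tracker_start_new_batch(struct tracker *t)
{
   buffer_list_clear(&t->batch);
   for (unsigned i = 0; i < MAX_BOUND; i++) {
      if (t->bound[i])
         buffer_list_add(&t->batch, t->bound[i]);
   }
}

/* A mapping can be promoted to unsynchronized only if the buffer is
 * definitely not referenced. */
static bool tracker_can_map_unsynchronized(const struct tracker *t, uint32_t id)
{
   return !buffer_list_maybe_contains(&t->batch, id);
}
```

In this sketch a hash collision can only make an idle buffer look busy, never the reverse, which is why a lossy bitset is acceptable for this purpose.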
TC now has all the synchronization-avoidance optimizations that drivers should have (possibly more than drivers have), so it can safely be enabled by default.
Draw merging is also moved to tc_call_draw_single to facilitate more complex merging in the future. The first 2 commits do it.
- pipe->flush for the buffer list ring might have to be handled the same as an internal flush in drivers (for maximum efficiency). radeonsi might need changes here.
- Invalidated buffers are not re-bound in other instances of u_threaded_context. This affects synchronization, but I don't know yet how.