freedreno: more draw overhead opts
The first five patches push multidraw down to the backend. This helps some of the drawoverhead benchmarks a bit, but not as much as you would think, since fd6_draw_vbo is already pretty low overhead for the !dirty case (but implementing draw_vertex_state() might be useful for legacy GL things).

The remaining patches chip away at the various atomic bottlenecks. Overall this is a 7% improvement at drawoverhead, but up to 40% improvement in individual tests that were bottlenecked more on locks and atomic refcnts (13, 18-20 (TBOs seem less affected than tex by mesa/st's atomics), 32-39).