freedreno/a6xx: CP overhead reductions
For workloads that have a large number of small draws, in particular draws that only affect a subset of tiles, moving more register writes to stateobjs and reducing the register writes we emit directly to the draw ring helps drop CP overhead. In particular, for webgl fishtank (at 1000 fish, with sharks+lasers, which is a bit of a pedantic case) on a small a6xx (less gmem, more tiles, and more tiles per VSC pipe), this MR takes us from:
metric | master | MR |
---|---|---|
fps | 49 | 60 |
CP_BUSY_GFX_IDLE | 26-28% | 18.5% |
CP_BUSY_CYCLES | 87% | 73.5% |
Leaving this as WIP for now; there are probably a few other groups of registers we could move to stateobjs.
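
To make the reasoning above concrete, here is a rough back-of-the-envelope cost model, not Mesa code. It assumes the draw ring is replayed once per tile, that register writes emitted directly into the ring are parsed by the CP in every tile, and that a stateobj's contents are only processed in tiles where the draw is actually visible (the reference to it is still parsed everywhere). The struct, its fields, and both function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model of CP parsing cost; none of these names exist in Mesa. */
struct frame_model {
    uint32_t num_tiles;        /* tiles the draw ring is replayed for */
    uint32_t num_draws;        /* small draws in the frame */
    uint32_t tiles_per_draw;   /* tiles each draw is actually visible in */
    uint32_t state_dwords;     /* dwords of register writes per draw */
    uint32_t group_ref_dwords; /* dwords needed to reference a stateobj */
};

/* Register writes emitted directly to the draw ring:
 * assumed to be parsed for every draw in every tile. */
uint64_t cp_dwords_inline(const struct frame_model *m)
{
    return (uint64_t)m->num_tiles * m->num_draws * m->state_dwords;
}

/* Register writes moved to a stateobj: the small reference is parsed
 * in every tile, but the contents are assumed to be processed only
 * in the tiles where the draw is visible. */
uint64_t cp_dwords_stateobj(const struct frame_model *m)
{
    assert(m->tiles_per_draw <= m->num_tiles);

    uint64_t refs = (uint64_t)m->num_tiles * m->num_draws * m->group_ref_dwords;
    uint64_t bodies = (uint64_t)m->num_draws * m->tiles_per_draw * m->state_dwords;
    return refs + bodies;
}
```

With made-up inputs of 32 tiles, 1000 draws each visible in 2 tiles, 20 dwords of state per draw, and a 4-dword reference, the model gives 640000 dwords for the inline case and 168000 for the stateobj case. These figures are illustrative only and are not measurements from this MR; the point is that the gap widens as the tile count grows, which is why a small a6xx with less gmem benefits most.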
Edited by Rob Clark