radv: Implement binning for gfx10
This also ports some gfx9 cleanups from radeonsi.
There are some differences between AMDVLK and radeonsi, though, regarding a multiplier of 4 for the depth bytes per pixel. I followed radeonsi for now and can benchmark this as a follow-up.