freedreno/pps: reduce cpu overhead
This moves one of the commits from !19426 (closed) here.
With sampling at 1 ms, the perf profile goes from
- 34.44% pps::FreedrenoDriver::collect_countables
     25.36% pps::FreedrenoDriver::Countable::collect
     3.92% cfree
   + 2.28% operator new
to
- 29.60% pps::FreedrenoDriver::collect_countables
     20.70% pps::FreedrenoDriver::Countable::collect
     4.01% cfree
   + 2.35% operator new
     1.09% memcpy
and then to
- 22.75% pps::FreedrenoDriver::collect_countables
     22.59% pps::FreedrenoDriver::Countable::collect
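
For a rough sense of scale, the share attributed to collect_countables in the three profiles can be compared with a few lines of Python. This is only a sketch: the helper name `relative_reduction` is made up for illustration, and because perf percentages are fractions of each run's own samples, the comparison across runs is approximate.

```python
# Share of samples attributed to pps::FreedrenoDriver::collect_countables,
# copied from the three perf reports above (percent of each run's samples).
SHARES = [34.44, 29.60, 22.75]


def relative_reduction(before: float, after: float) -> float:
    """Hypothetical helper: fraction by which `after` is smaller than `before`."""
    return (before - after) / before


baseline = SHARES[0]
for step, share in enumerate(SHARES[1:], start=1):
    # Report each later profile relative to the first one.
    drop = relative_reduction(baseline, share)
    print(f"step {step}: {baseline:.2f}% -> {share:.2f}% ({drop:.1%} lower)")
```

By this measure the first step is roughly 14% lower than the baseline and the final profile roughly 34% lower.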