gallium,winsys/amdgpu: refactor pb_buffer/cache/slab, radically optimize winsys/amdgpu
Summary of the common code changes:
- pb_buffer_lean is added, which is just pb_buffer without vtbl. pb_buffer becomes pb_buffer_lean + vtbl.
- pb_cache_entry and pb_slab_entry are refactored to decrease their size, touching a bunch of drivers.
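
To illustrate the pb_buffer_lean / pb_buffer relationship described above, here is a minimal sketch. All names (sketch_buffer_lean, sketch_buffer, sketch_vtbl) and all fields are hypothetical stand-ins, not the actual Mesa definitions. It only shows the layout idea: the lean struct carries the data, and the full struct is the lean struct plus a vtbl pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the lean buffer: plain data, no function pointers.
 * The fields are illustrative, not the real pb_buffer_lean members. */
struct sketch_buffer_lean {
   int32_t refcount;
   uint8_t alignment_log2;
   uint8_t usage;
   uint64_t size;
};

struct sketch_buffer;

/* Hypothetical virtual function table for buffers that need dispatch. */
struct sketch_vtbl {
   void (*destroy)(struct sketch_buffer *buf);
};

/* The full buffer is the lean buffer plus the vtbl pointer. */
struct sketch_buffer {
   struct sketch_buffer_lean base;
   const struct sketch_vtbl *vtbl;
};

/* The lean part sits at offset 0, so a pointer to the full buffer can be
 * reinterpreted as a pointer to the lean buffer. */
static_assert(offsetof(struct sketch_buffer, base) == 0,
              "lean part must come first");
static_assert(sizeof(struct sketch_buffer) >=
              sizeof(struct sketch_buffer_lean) + sizeof(void *),
              "full buffer pays for at least one extra pointer");

/* Code that only needs the data fields can take the lean type. */
static inline uint64_t
sketch_buffer_size(const struct sketch_buffer_lean *buf)
{
   return buf->size;
}

/* Upcast helper: pass a full buffer wherever a lean buffer is expected. */
static inline struct sketch_buffer_lean *
sketch_to_lean(struct sketch_buffer *buf)
{
   return &buf->base;
}
```

In this sketch, a structure that never needs virtual dispatch can embed the lean variant and save one pointer per buffer, while code that still needs vtbl keeps using the full struct.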
Summary of amdgpu changes:
- Complete rewrite of BO fence tracking. It introduces a new queue fence system that decreases the CS thread overhead by 46% and massively decreases the CPU cache footprint of BO fences and their processing. The best FPS improvement seen in one CPU-bound benchmark is 12%.
- The slab allocator with 3 levels is replaced by a slab allocator with only 1 level. While I can't explain why this improves performance so much, one CPU-bound benchmark gets 10-18% (random/noisy) higher FPS.
- Lots of refactoring to allow some of the size decreases.
r300 and r600 also have a lot of changes to accommodate the winsys changes.
This depends on !26547 (merged), whose commits are included here, separated by an empty commit.
Edited by Marek Olšák