ac,radeonsi: clear rework, compute/cpdma flags rework, copy shader optimizations, etc. (BIG MR) (!9795)

This MR continues in !10003 (merged).

Below is the first half.

Explicit DCC/CMASK clears are parallelized.
HTILE is enabled for all levels where it's possible (not just level 0).
Sync flags for CP DMA and internal compute are reworked. Now all callers can specify when they want to sync (e.g. before/after).
The maximum variable compute shader workgroup size is decreased from 1024 to 512 threads so that the size can be packed in 10 bits per channel, which reduces user SGPR usage in internal shaders.
Some internal compute shaders are optimized.
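
To illustrate the 10-bits-per-channel packing mentioned in the workgroup size change, here is a minimal sketch in C. It is not taken from the Mesa source: the names `BLOCK_DIM_BITS`, `pack_block_size` and `unpack_block_dim` are hypothetical, and the bit layout (x in the low bits, then y, then z, all in one 32-bit word) is an assumption. What it shows is only the arithmetic: a 10-bit field holds values up to 1023, so a dimension of 1024 would not fit, while 512 does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and layout, chosen for illustration only. */
#define BLOCK_DIM_BITS 10u
#define BLOCK_DIM_MASK ((1u << BLOCK_DIM_BITS) - 1u) /* 0x3ff, i.e. max 1023 */

/* Pack a 3D workgroup size into one 32-bit word:
 * x in bits 0..9, y in bits 10..19, z in bits 20..29 (assumed layout). */
static inline uint32_t pack_block_size(uint32_t x, uint32_t y, uint32_t z)
{
   /* Each dimension must fit in 10 bits; 1024 would need 11 bits. */
   assert(x <= BLOCK_DIM_MASK && y <= BLOCK_DIM_MASK && z <= BLOCK_DIM_MASK);
   return x | (y << BLOCK_DIM_BITS) | (z << (2u * BLOCK_DIM_BITS));
}

/* Extract one dimension; channel is 0 for x, 1 for y, 2 for z. */
static inline uint32_t unpack_block_dim(uint32_t packed, unsigned channel)
{
   return (packed >> (channel * BLOCK_DIM_BITS)) & BLOCK_DIM_MASK;
}
```

With this layout, a 512x1x1 workgroup occupies a single 32-bit value instead of three, which is the kind of saving in user SGPRs the change describes.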

Tested piglit/glcts/deqp:

Edited Apr 02, 2021 by Marek Olšák
