ac,radeonsi: add optimized clear/copy_buffer compute shader into AMD common code, supporting unaligned copies

Marek Olšák requested to merge mareko/mesa:ac-clear-copy-buffer-cs into main

This substantially improves the clear/copy_buffer compute shader in radeonsi and moves it to src/amd/common.

This adds support for unaligned buffer clears and copies while maintaining the same performance as aligned clears and copies. The optimal alignment for buffer offsets is 256, not 4.
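To illustrate what alignment means for a copy, here is a minimal CPU-side sketch of how a byte range can be split around an alignment boundary. It is not the shader from this merge request: the `copy_split` struct and the `split_for_alignment` helper are hypothetical names invented for this example, and treating 256 as a byte alignment of the destination offset is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; not part of Mesa. */
struct copy_split {
   uint64_t head_bytes; /* bytes before the first aligned offset */
   uint64_t body_bytes; /* aligned middle part, a multiple of the alignment */
   uint64_t tail_bytes; /* leftover bytes after the last full aligned block */
};

/* Split [dst_offset, dst_offset + size) into an unaligned head, an aligned
 * body, and an unaligned tail. The alignment must be a power of two.
 */
static struct copy_split
split_for_alignment(uint64_t dst_offset, uint64_t size, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

   struct copy_split s;
   uint64_t misalign = dst_offset & (alignment - 1);
   uint64_t head = misalign ? alignment - misalign : 0;

   /* A very small range may end before the first aligned offset. */
   if (head > size)
      head = size;

   uint64_t rest = size - head;

   s.head_bytes = head;
   s.body_bytes = rest & ~(alignment - 1);
   s.tail_bytes = rest - s.body_bytes;
   return s;
}
```

For example, under these assumptions a 1000-byte range starting at offset 250 with an alignment of 256 splits into a 6-byte head, a 768-byte body, and a 226-byte tail.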

More chip-specific tuning will follow, but this is already optimal for Navi31.