-
Samuel Pitoiset authored
We already track if the DMA engine is busy/idle with a flag, and we emit a packet that waits for all CP DMA operations to be complete. This is done at end of command buffer because the kernel doesn't wait for them, and also when emitting barriers, so it should be safe. This improves small copies for both aligned and unaligned sizes. Aligned sizes: BEFORE: 1 KB: 59.840000 ms 2 KB: 71.200000 ms AFTER: 1 KB: 31.200000 ms 2 KB: 31.040000 ms Unaligned sizes: BEFORE: 2 KB: 68.3200 ms 3 KB: 79.3600 ms 5 KB: 76.6400 ms 9 KB: 90.8800 ms 17 KB: 116.0000 ms AFTER: 2 KB: 31.0400 ms 3 KB: 32.0000 ms 5 KB: 30.8800 ms 9 KB: 30.5600 ms 17 KB: 29.6000 ms Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
3fb4adae