WIP: tu: Enable concurrent resolves
What we know about concurrent resolves:
- `RB_CCU_CNTL::CONCURRENT_RESOLVE` controls the resolve mode. I named the modes:
  - `DISABLED`
  - `AUTO` - I have no evidence that it works: it doesn't do anything on a630 or a660, though the blob doesn't enable concurrent resolve on a630 and uses `MANUAL` mode on a660. Presumably the HW itself determines when to synchronize blits.
  - `MANUAL` - we have to manually mark the last blit in the sequence via `RB_BLIT_INFO::LAST`. My guess is that blit execution is deferred until `LAST` is seen or something else forces synchronization.
- `RB_CCU_CNTL::UNK1` - slows down the resolves.
- `RB_BLIT_INFO::BUFFER_ID` - if two blits have the same buffer id, they cannot be executed in parallel. The blob uses it in both GL and VK, contrary to the old comment.
- `RB_BLIT_INFO::LAST` - see `CONCURRENT_RESOLVE::MANUAL`.
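
A minimal sketch of how `MANUAL` mode could be driven, based only on the observations above: every blit in a batch gets a distinct buffer id so that none are serialized against each other, and only the final blit carries `LAST`. The `resolve_mode` enum, the `blit_info` struct, and `plan_resolves()` are hypothetical illustrations, not the actual turnip code or register layout, and the real width of the `BUFFER_ID` field is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the modes named above; the values are NOT the
 * register encoding of RB_CCU_CNTL::CONCURRENT_RESOLVE. */
enum resolve_mode {
   RESOLVE_DISABLED,
   RESOLVE_AUTO,
   RESOLVE_MANUAL,
};

/* Hypothetical, simplified stand-in for the RB_BLIT_INFO fields discussed. */
struct blit_info {
   uint32_t buffer_id; /* blits sharing an id cannot run in parallel */
   bool last;          /* marks the end of a MANUAL-mode blit sequence */
};

/* Fill in blit info for one batch of gmem stores.
 * Assumption: giving each blit its own buffer id is enough to let them
 * overlap, and LAST is only meaningful in MANUAL mode. */
static void
plan_resolves(enum resolve_mode mode, struct blit_info *blits, size_t count)
{
   assert(blits != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      blits[i].buffer_id = (uint32_t)i;
      blits[i].last = (mode == RESOLVE_MANUAL) && (i + 1 == count);
   }
}
```

With 9 attachments in `MANUAL` mode, only the ninth blit ends up with `last` set; in the other modes no blit does.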
Testing on a660 with the `MANUAL` concurrent resolve mode enabled, I got a 10-20% perf improvement for gmem stores. The test used 8 color attachments and 1 D/S attachment, each 2048x2048, and a single triangle draw call.
Tested with v631 blob driver:
GPU | Resolve Mode | RB_CCU_CNTL::UNK1 |
---|---|---|
616/618/619 | disabled | true |
620 | auto | false |
630 | disabled | true |
640 | disabled | true |
650 | auto | false |
660 | manual | false |
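
For reference, the table above can be written as a small lookup. This is only a sketch that records what the v631 blob was observed to do. The type and function names are hypothetical, and it does not claim that turnip should pick the same settings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; the values are not the register encoding. */
enum blob_resolve_mode {
   BLOB_RESOLVE_DISABLED,
   BLOB_RESOLVE_AUTO,
   BLOB_RESOLVE_MANUAL,
};

struct blob_ccu_config {
   int gpu_id;                  /* e.g. 630 for a630 */
   enum blob_resolve_mode mode; /* observed CONCURRENT_RESOLVE mode */
   bool unk1;                   /* observed RB_CCU_CNTL::UNK1 value */
};

/* Settings observed with the v631 blob driver, copied from the table. */
static const struct blob_ccu_config blob_v631_configs[] = {
   { 616, BLOB_RESOLVE_DISABLED, true  },
   { 618, BLOB_RESOLVE_DISABLED, true  },
   { 619, BLOB_RESOLVE_DISABLED, true  },
   { 620, BLOB_RESOLVE_AUTO,     false },
   { 630, BLOB_RESOLVE_DISABLED, true  },
   { 640, BLOB_RESOLVE_DISABLED, true  },
   { 650, BLOB_RESOLVE_AUTO,     false },
   { 660, BLOB_RESOLVE_MANUAL,   false },
};

/* Return the observed settings for a GPU, or NULL if it was not tested. */
static const struct blob_ccu_config *
blob_v631_lookup(int gpu_id)
{
   assert(gpu_id > 0);

   size_t n = sizeof(blob_v631_configs) / sizeof(blob_v631_configs[0]);
   for (size_t i = 0; i < n; i++) {
      if (blob_v631_configs[i].gpu_id == gpu_id)
         return &blob_v631_configs[i];
   }
   return NULL;
}
```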