Optimization: discontinuous VS/PLBU command buffer
Currently we build the VS/PLBU command buffer in a dynamic array, then copy it to a GPU buffer before submit. But VS/PLBU has a continue command, which can be used to chain separate command buffers as needed, and that saves the copy.
Here are the steps:
- create a GPU bo and write the VS/PLBU commands into it directly from the beginning
- when it's full, create a new bo and point the previous bo to it with a continue command
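
The steps above can be sketched roughly as follows. This is only a CPU-side simulation, not driver code: `cmd_chunk`, `cmd_stream`, the chunk size, the bump-allocated va and especially `CMD_CONTINUE_OP` and its word layout are all made-up placeholders, since the real continue command encoding is not described here. The last slot of each chunk is reserved so there is always room for the continue command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Everything here is hypothetical and only illustrates the chaining idea. */
#define CHUNK_SIZE      4096u                  /* bytes per simulated bo */
#define CMD_SIZE        8u                     /* one command = two 32-bit words */
#define CMDS_PER_CHUNK  (CHUNK_SIZE / CMD_SIZE)
#define CMD_CONTINUE_OP 0xF0000000u            /* placeholder, NOT the real opcode */

struct cmd_chunk {
   uint32_t va;                                /* simulated GPU virtual address */
   uint32_t count;                             /* commands written so far */
   uint32_t words[CMDS_PER_CHUNK][2];
   struct cmd_chunk *next;
};

struct cmd_stream {
   struct cmd_chunk *first, *cur;
   uint32_t next_va;                           /* bump allocator: va only grows */
};

static struct cmd_chunk *cmd_chunk_alloc(struct cmd_stream *s)
{
   struct cmd_chunk *c = calloc(1, sizeof(*c));
   if (!c)
      return NULL;
   c->va = s->next_va;
   s->next_va += CHUNK_SIZE;
   return c;
}

static bool cmd_stream_init(struct cmd_stream *s, uint32_t base_va)
{
   s->next_va = base_va;
   s->first = s->cur = cmd_chunk_alloc(s);
   return s->first != NULL;
}

/* Append one command; chain a new chunk when only the reserved slot is left. */
static bool cmd_stream_emit(struct cmd_stream *s, uint32_t w0, uint32_t w1)
{
   struct cmd_chunk *c = s->cur;

   if (c->count == CMDS_PER_CHUNK - 1) {
      struct cmd_chunk *n = cmd_chunk_alloc(s);
      if (!n)
         return false;
      assert(n->va > c->va);                   /* continue may only jump forward */
      c->words[c->count][0] = n->va;           /* assumed layout: target va ... */
      c->words[c->count][1] = CMD_CONTINUE_OP; /* ... plus placeholder opcode */
      c->count++;
      c->next = n;
      s->cur = c = n;
   }

   c->words[c->count][0] = w0;
   c->words[c->count][1] = w1;
   c->count++;
   return true;
}

static void cmd_stream_free(struct cmd_stream *s)
{
   struct cmd_chunk *c = s->first;
   while (c) {
      struct cmd_chunk *n = c->next;
      free(c);
      c = n;
   }
   s->first = s->cur = NULL;
}
```

In a real implementation each chunk would be a mapped bo and the va would come from the kernel, so the forward-only ordering would have to be guaranteed by whatever allocates the va rather than by a bump counter.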
From some experiments, I found:
- the next bo's va must be higher than the current one's, so the continue command can only jump forward, not backward
- vs/plbu_cmd_start/end are set to the first and last command by va; since the bos' va is increasing, this is also the range of the whole VS/PLBU command buffer, and there are holes in this range between the bos
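
To make the second finding concrete, here is a small sketch of how the start/end range could be derived from a list of chained chunks. `cmd_span`, `cmd_range` and `cmd_range_from_spans` are hypothetical names, and treating `end` as the address just past the last written command is an assumption, not something confirmed by the experiments above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one chained chunk (one bo). */
struct cmd_span {
   uint32_t va;      /* start address of the chunk */
   uint32_t used;    /* bytes of commands actually written into it */
};

struct cmd_range {
   uint32_t start;       /* va of the first command */
   uint32_t end;         /* assumed: va just past the last command */
   uint32_t hole_bytes;  /* bytes inside [start, end) holding no commands */
};

/* spans must be in chain order, which is also increasing va order. */
static struct cmd_range cmd_range_from_spans(const struct cmd_span *spans, size_t n)
{
   struct cmd_range r = {0, 0, 0};
   uint32_t used_total = 0;

   assert(n > 0);
   for (size_t i = 0; i < n; i++) {
      /* each chunk must start after the previous one ends (forward jumps only) */
      if (i > 0)
         assert(spans[i].va >= spans[i - 1].va + spans[i - 1].used);
      used_total += spans[i].used;
   }

   r.start = spans[0].va;
   r.end = spans[n - 1].va + spans[n - 1].used;
   r.hole_bytes = (r.end - r.start) - used_total;
   return r;
}
```

For example, two chunks at va 0x1000 (0x100 bytes used) and 0x3000 (0x80 bytes used) give a range of 0x1000 to 0x3080, of which 0x1f00 bytes are holes.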
Recording this here in case someone wants to continue the work before I have time.