The dGPU prime feature allows the integrated GPU (iGPU) to display output rendered by the discrete GPU (dGPU). We are working on implementing dGPU prime for a XEN guest VM that has a VirtIO iGPU and a passthrough dGPU. The goal is to render on the passthrough dGPU and display through the iGPU, so that the XEN guest VM can handle use cases that require higher performance. The original virgl does not support this, because the virtio-gpu driver does not support DMA operations, so the iGPU cannot import data from the passthrough dGPU directly in the XEN guest VM. We therefore came up with a solution in mesa that lets the dGPU blit data into the display buffer of the iGPU, so that data rendered by the passthrough dGPU can be displayed in the XEN guest VM.
This idea includes three main steps:
- Allow driScreenDisplayGPU to be created from fd_display_gpu if fd_display_gpu is not the same as fd_render_gpu. In our case, the display GPU is the iGPU. A linear_buffer is created from this driScreenDisplayGPU in dri3_alloc_render_buffer() and used as the display buffer. With this linear buffer, every front/back buffer swap in dri3_flush_present_events() leads to a blit from the render buffer of the dGPU to the linear buffer of the iGPU.
- Round the stride of the linear buffer up to a multiple of 256. When the dGPU imports a buffer created by the iGPU, it requires the buffer to be 256-aligned, because the stride of the dGPU is different from that of the virtio iGPU. So in this case we round the stride up to a multiple of 256 and then create the linear buffer.
- Send a flush command from guest to host. Every time the dGPU blits data from the render buffer to the linear buffer, virgl sends a transfer command with the stride to flush the host data. Here we need to remove the stride check in virtio-gpu so that virgl can send the transfer command to the host. Please see the related kernel virtio-gpu implementation, "virtio-gpu: Remove stride and layer_stride check for dGPU prime on VM".
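
The stride rounding in the second step can be sketched as follows. This is only a minimal illustration, not the code from this change: the helper names `align_up_pot` and `linear_buffer_stride` and the macro `LINEAR_STRIDE_ALIGN` are hypothetical, and the sketch assumes the stride is measured in bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed alignment for the shared linear buffer stride (256, as described above). */
#define LINEAR_STRIDE_ALIGN 256u

/* Round value up to the next multiple of align; align must be a power of two. */
static inline uint32_t align_up_pot(uint32_t value, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Hypothetical helper: stride of a linear buffer that is `width` pixels wide
 * with `cpp` bytes per pixel, rounded up so the dGPU can import it. */
static inline uint32_t linear_buffer_stride(uint32_t width, uint32_t cpp)
{
    return align_up_pot(width * cpp, LINEAR_STRIDE_ALIGN);
}
```

For example, a 1366-pixel-wide buffer at 4 bytes per pixel has a natural stride of 5464, which this helper rounds up to 5632, while a 1920-pixel-wide buffer is already aligned at 7680.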
Please refer to page 12 of the following slides for more details of our platform and the dGPU prime feature: https://static.sched.com/hosted_files/xen2023/41/VirtIO_Passthrough_GPU_on_Xen_Summit_2023.pdf
- Create blob memory for the linear buffer, which is the display buffer from step 1 above. With blob memory, the linear buffer can be mapped into the guest so that guest mesa can blit data to the linear buffer directly.
- Remove step 3 above. Since blob memory can be read and written directly, we no longer have to send a command every time guest mesa swaps buffers. With this change, the dGPU prime feature achieves higher benchmark scores in XEN VMs.
- Implement dGPU prime on venus.
- Implement resource_query_layout to get the correct stride of the linear resource instead of hard-coding it.
- Please see the related virglrenderer implementation, "virglrenderer: enable gbm_bo_create for linear resource".
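
To illustrate why the stride of the linear buffer matters once it is mapped into the guest through blob memory, here is a rough CPU-side sketch of a row-by-row copy between two buffers with different strides. In the actual feature the blit is performed by the dGPU, not by memcpy; the function name `copy_rows_to_linear` and its parameters are hypothetical and only show the addressing involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical CPU stand-in for the blit: copy `height` rows of `row_bytes`
 * bytes each from a source buffer into a mapped linear buffer whose stride
 * may be larger (for example, rounded up to a multiple of 256). */
static void copy_rows_to_linear(uint8_t *dst, uint32_t dst_stride,
                                const uint8_t *src, uint32_t src_stride,
                                uint32_t row_bytes, uint32_t height)
{
    /* A row must fit inside one stride of both buffers. */
    assert(row_bytes <= dst_stride && row_bytes <= src_stride);

    for (uint32_t y = 0; y < height; y++) {
        memcpy(dst + (size_t)y * dst_stride,
               src + (size_t)y * src_stride,
               row_bytes);
    }
}
```

Because each row is addressed through its own stride, the padding bytes at the end of every destination row are left untouched, which is why both sides have to agree on the stride of the linear resource.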