vulkancolorconvert performs poorly on embedded platforms because of uniform variables in the fragment shader
When we use vulkancolorconvert for color conversion on our embedded platform (imx8mqevk), the performance of the pipeline becomes poor.
The following test case converts BGRA to BGRA:
- Generate a video for testing:
gst-launch-1.0 videotestsrc num-buffers=10 ! video/x-raw,format=BGRA,width=1920,height=1080 ! filesink location=./input_BGRA_1080p_10frames.rgb
- vulkan color convert:
gst-launch-1.0 multifilesrc location=./input_BGRA_1080p_10frames.rgb num-buffers=50 ! videoparse format=12 width=1920 height=1080 framerate=60/1 ! queue ! vulkanupload ! vulkancolorconvert ! "video/x-raw(memory:VulkanImage),format=BGRA" ! vulkansink sync=false
With the pipeline above, the vulkan plugins reach only 10 fps. After debugging, we found that the cause is the uniform block in the vulkancolorconvert fragment shader, shown below:
uniform reorder {
ivec4 in_reorder_idx;
ivec4 out_reorder_idx;
};
The reorder block is stored in a Vulkan buffer, and its value is set by GStreamer using gst_vulkan_full_screen_quad_set_uniform_buffer(). Reading this uniform block leads to frequent DDR reads/writes, which makes performance poor. When we changed the uniform variables to non-uniform variables in the fragment shader, the pipeline improved from 10 fps to 35 fps.
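For illustration only, here is a minimal sketch of one way the uniform block could be replaced with non-uniform values, namely compile-time constants. This report does not spell out how the shader was changed, so the use of constants, the identity index values, and the names inTexCoord, inTexture, outColor and swizzle() are all assumptions, not the actual vulkancolorconvert shader.

```glsl
#version 450 core

// Hypothetical interface names; the real shader's bindings may differ.
layout(location = 0) in vec2 inTexCoord;
layout(set = 0, binding = 0) uniform sampler2D inTexture;
layout(location = 0) out vec4 outColor;

// Assumption: the reorder indices are baked in as constants instead of
// being read from a uniform buffer. Identity order is assumed here for
// the BGRA->BGRA case.
const ivec4 in_reorder_idx = ivec4(0, 1, 2, 3);
const ivec4 out_reorder_idx = ivec4(0, 1, 2, 3);

// Reorder the components of a texel according to idx.
vec4 swizzle(in vec4 texel, in ivec4 idx) {
  return vec4(texel[idx[0]], texel[idx[1]], texel[idx[2]], texel[idx[3]]);
}

void main() {
  vec4 rgba = swizzle(texture(inTexture, inTexCoord), in_reorder_idx);
  outColor = swizzle(rgba, out_reorder_idx);
}
```

Hard-coded indices only cover a single fixed format pair, so a change like this is useful for confirming the bottleneck rather than as a general fix.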
Have you seen issues like this when testing the vulkan plugins?
In addition, have you tested vulkancolorconvert on embedded platforms? If so, can you please tell us which platform you used for testing?