cuda: Conversion performance improvement and more features

Seungha Yang requested to merge seungha.yang/gstreamer:cuda-convertscale into main

Summary of this MR

  • rewrite conversion object
  • add "add-borders" property to scale elements (enabled by default, as in videoscale)
  • add cudaconvertscale element to do colorspace conversion and rescaling at once
  • add support for planar 8-bit RGB formats (RGBP, BGRP, GBR, and GBRA)

    cudaconvertscale, cudascale: Add "add-borders" property and support 8bits RGB planar formats

    Add the "add-borders" property, which behaves identically to that of
    videoscale and is enabled by default.
    Also add support for the RGBP/BGRP/GBR/GBRA formats.
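
As an illustration of what "add-borders" implies for the output geometry, the following is a minimal CPU-side sketch, not the element's actual code. It assumes square pixels on both sides, and the `Rect` type and `fit_with_borders` helper are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rectangle type for this sketch; not a GStreamer struct. */
typedef struct
{
  int x, y, w, h;
} Rect;

/* Fit a src_w x src_h picture into a dst_w x dst_h frame while keeping the
 * aspect ratio (square pixels assumed). The part of the destination frame
 * outside the returned rectangle is the border. */
static Rect
fit_with_borders (int src_w, int src_h, int dst_w, int dst_h)
{
  Rect r;

  assert (src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);

  /* Compare src_w/src_h against dst_w/dst_h via 64-bit cross products */
  if ((int64_t) src_w * dst_h >= (int64_t) dst_w * src_h) {
    /* Source is relatively wider: fill the width, bars on top and bottom */
    r.w = dst_w;
    r.h = (int) (((int64_t) dst_w * src_h) / src_w);
  } else {
    /* Source is relatively taller: fill the height, bars on left and right */
    r.h = dst_h;
    r.w = (int) (((int64_t) dst_h * src_w) / src_h);
  }

  /* Center the picture inside the destination frame */
  r.x = (dst_w - r.w) / 2;
  r.y = (dst_h - r.h) / 2;

  return r;
}
```

For example, fitting a 1920x1080 source into a 1024x1024 output gives a 1024x576 picture with bars above and below it. With borders disabled, the picture would instead be stretched to the full output size.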

    cuda: Rewrite colorspace/rescale object

    Rewrite the GstCudaConverter object, since the old implementation was
    not well organized and made it hard to add new features.
    Moreover, the conversion operations were not well optimized.

    Major changes in this implementation:
    * Remove redundant intermediate conversion operations, such as
      any RGB -> ARGB(64) conversion or any YUV -> Y444 (or 16-bit Y444)
      conversion. These are not required in most cases. The only
      required case is converting a 24-bit packed format (such as
      RGB/BGR) to a 32-bit format, because CUDA texture objects do not
      support sampling 24-bit formats.
    * Use normalized sample fetching (i.e., float values in the [0, 1]
      range) and a normalized coordinate system for CUDA textures.
      This is consistent with other graphics APIs such as Direct3D
      and OpenGL, and makes sampling operations much easier.
    * Support a kind of viewport, and adopt the colorspace conversion
      math from the GstD3D11 implementation.
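
The first two points above can be illustrated with a small CPU-side sketch. This is not the converter's code, which runs in CUDA kernels on texture objects. It only shows the 24-bit to 32-bit expansion and the arithmetic behind normalized samples and coordinates. All function names are hypothetical, and the 0xff padding byte is an arbitrary choice for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand packed 24-bit RGB into 32-bit RGBx so that every pixel occupies
 * four bytes, a size that texture sampling can handle. */
static void
rgb24_to_rgbx32 (const uint8_t * src, uint8_t * dst, size_t n_pixels)
{
  assert (src != NULL && dst != NULL);

  for (size_t i = 0; i < n_pixels; i++) {
    dst[4 * i + 0] = src[3 * i + 0];
    dst[4 * i + 1] = src[3 * i + 1];
    dst[4 * i + 2] = src[3 * i + 2];
    dst[4 * i + 3] = 0xff;      /* padding byte, value chosen for this sketch */
  }
}

/* Normalized fetch: map an 8-bit sample to a float in the [0, 1] range. */
static float
normalize_u8 (uint8_t v)
{
  return (float) v / 255.0f;
}

/* Normalized coordinate of the center of texel `index` along an axis that
 * is `size` texels long. The result is independent of the resolution. */
static float
texel_center (int index, int size)
{
  assert (size > 0 && index >= 0 && index < size);

  return ((float) index + 0.5f) / (float) size;
}
```

With normalized values, the same conversion math applies regardless of bit depth, and with normalized coordinates a scaling kernel can address the source texture without knowing its pixel dimensions. For instance, `texel_center (0, 4)` is 0.125 and `texel_center (3, 4)` is 0.875.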

    cudaupload,cudadownload: Add support for planar 8bits RGB formats

    Define the RGBP, BGRP and GBR formats, which have the same memory
    layout as the already supported Y444 format.
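
To make the "same memory layout as Y444" remark concrete, here is a minimal sketch of how a sample could be located in these formats: three full-resolution 8-bit planes that differ only in which component each plane holds. It assumes tightly packed planes (stride equal to width, no padding between planes), which real buffers do not guarantee. The enum and helper names are hypothetical and are not GstVideoFormat values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical format tags for this sketch; not GstVideoFormat values. */
typedef enum
{
  FMT_RGBP,                     /* planes stored in R, G, B order */
  FMT_BGRP,                     /* planes stored in B, G, R order */
  FMT_GBR                       /* planes stored in G, B, R order */
} PlanarFormat;

/* Return the plane index (0..2) holding `component` (0 = R, 1 = G, 2 = B). */
static int
plane_for_component (PlanarFormat fmt, int component)
{
  static const int table[3][3] = {
    {0, 1, 2},                  /* RGBP: R in plane 0, G in 1, B in 2 */
    {2, 1, 0},                  /* BGRP: R in plane 2, G in 1, B in 0 */
    {2, 0, 1}                   /* GBR:  R in plane 2, G in 0, B in 1 */
  };

  assert (fmt >= FMT_RGBP && fmt <= FMT_GBR);
  assert (component >= 0 && component <= 2);

  return table[fmt][component];
}

/* Byte offset of one 8-bit sample, assuming tightly packed planes. */
static size_t
sample_offset (PlanarFormat fmt, int component, int x, int y,
    int width, int height)
{
  assert (width > 0 && height > 0);
  assert (x >= 0 && x < width && y >= 0 && y < height);

  size_t plane_size = (size_t) width * (size_t) height;
  size_t plane = (size_t) plane_for_component (fmt, component);

  return plane * plane_size + (size_t) y * (size_t) width + (size_t) x;
}
```

Since only the plane-to-component mapping changes between these formats, code that already copies three full-size planes for Y444 can handle them without a new memory layout.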

    cudacontext: Store texture alignment

    It was missed in the previous refactoring.

    cudaconvert, cudascale: Port to GstCudaBaseConvert baseclass

    There is no need to keep duplicated code in the source tree.

    cuda: Add convertscale element

    The GstCudaConverter object can do colorspace conversion and scaling
    at once. Add a new "cudaconvertscale" element to do that; this saves
    unnecessary GPU operations when both colorspace conversion and
    rescaling are required for the given input stream format.

    Most of the code is taken from the d3d11convert element.
Edited by Seungha Yang