codecs: h264decoder: Add support for output delay

Merged Seungha Yang requested to merge seungha.yang/gst-plugins-bad:h264-render-delay into master

Some decoding APIs support delayed output: the command that decodes a frame does not have to be followed immediately by the command that retrieves the decoded frame. For instance, a subclass might request decoding of multiple frames and only then retrieve the oldest decoded frame. If the decoding API supports this, delayed output can improve throughput.
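The idea can be illustrated with a minimal sketch. This is a toy model in Python, not the GStreamer implementation: the class `DelayedOutputDecoder`, its `output_delay` parameter, and the `submit`/`drain` methods are hypothetical names chosen for illustration. It only shows that frames are retrieved oldest-first once more than `output_delay` of them are in flight.

```python
from collections import deque


class DelayedOutputDecoder:
    """Toy model of delayed output (hypothetical, not the GStreamer API)."""

    def __init__(self, output_delay):
        # Number of frames allowed to stay in flight before output starts.
        self.output_delay = output_delay
        self.in_flight = deque()

    def submit(self, frame):
        """Stand-in for the decode command; returns frames ready for output."""
        self.in_flight.append(frame)
        ready = []
        # Retrieve the oldest frames only once the delay is exceeded.
        while len(self.in_flight) > self.output_delay:
            ready.append(self.in_flight.popleft())
        return ready

    def drain(self):
        """Output everything still in flight, oldest first (e.g. at end of stream)."""
        ready = list(self.in_flight)
        self.in_flight.clear()
        return ready
```

With `output_delay=0` every `submit` returns its own frame at once, which corresponds to the fully sequential case.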

  • Nicolas Dufresne mentioned in merge request !1881 (merged)

  • Seungha Yang resolved all threads

  • Nicolas Dufresne resolved all threads

  • added 4 commits

    • a6d36984...a417a761 - 2 commits from branch gstreamer:master
    • 86e312c1 - codecs: h264decoder: Add support for output delay
    • fba807be - nvh264sldec: Add support for output-delay to improve throughput performance

  • Does this mean that output_picture is a sync point where some HW decoders, such as V4L2, have to complete the frame currently being decoded? You need to wait for the HW to finish the current buffer before pushing it, so the output delay can help performance?

  • I can speak for v4l. In that case we have a request queue, and we set a maximum number of requests, so if that max is reached we sync after queuing the current request. Otherwise, we'll sync on job completion (wait for the request to complete) before calling finish frame.

    The max is needed for two reasons: we want to limit the bitstream memory overhead that allowing reorder depth + delay would cost, and also limit the memory overhead for per-slice decoders.

    Requests are processed in order by the driver; if they were processed in parallel, perhaps some extra work would be needed.
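One possible reading of the bounded request queue described above, as a rough sketch. The names `RequestQueue`, `max_requests`, `queue`, and `flush` are hypothetical and do not come from the v4l code; waiting for the hardware is modelled as simply popping the oldest pending request, which assumes in-order processing as stated in the comment.

```python
from collections import deque


class RequestQueue:
    """Toy model of a bounded, in-order request queue (hypothetical names)."""

    def __init__(self, max_requests):
        self.max_requests = max_requests
        self.pending = deque()   # requests queued but not yet synced
        self.finished = []       # requests whose frame has been finished

    def _sync_oldest(self):
        # Stand-in for waiting on the oldest request, then finishing its frame.
        self.finished.append(self.pending.popleft())

    def queue(self, request):
        self.pending.append(request)
        # Once the cap is reached, sync right after queuing the current request.
        if len(self.pending) >= self.max_requests:
            self._sync_oldest()

    def flush(self):
        # Wait for every remaining request, oldest first.
        while self.pending:
            self._sync_oldest()
```

The cap bounds how many requests, and therefore how much bitstream memory, can be outstanding at once, at the cost of blocking when the queue is full.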
