
Draft: h264parse: add insert-cc=a53 to mux DTVCC streams

This change to h264parse allows it to insert a DTVCC stream into an H.264 stream as ATSC A/53 Part 4 SEI NALs. This allows muxing {EIA,CEA,CTA}-{608,708} closed captions into H.264 streams without needing to transcode.
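As a rough illustration of what such an SEI NAL looks like on the wire, here is a minimal Python sketch. It is not the code in this patch: the function names are hypothetical, and the layout follows the commonly described A/53 Part 4 `cc_data()` structure wrapped in an H.264 `user_data_registered_itu_t_t35` SEI (payload type 4). The top flag bit of the `cc_count` byte is an assumption; some muxers clear it.

```python
def build_a53_payload(cc_triplets):
    """Build the ITU-T T.35 payload that carries A/53 cc_data.

    cc_triplets: iterable of (cc_valid, cc_type, byte1, byte2) tuples.
    """
    cc_triplets = list(cc_triplets)
    if len(cc_triplets) > 31:
        raise ValueError("cc_count is a 5-bit field")
    out = bytearray()
    out.append(0xB5)        # itu_t_t35_country_code (United States)
    out += b"\x00\x31"      # itu_t_t35_provider_code (ATSC)
    out += b"GA94"          # user_identifier
    out.append(0x03)        # user_data_type_code: cc_data()
    # Assumption: top bit set, process_cc_data_flag=1, additional_data_flag=0.
    out.append(0xC0 | len(cc_triplets))
    out.append(0xFF)        # em_data / reserved byte
    for cc_valid, cc_type, byte1, byte2 in cc_triplets:
        # Five marker bits, then cc_valid (1 bit) and cc_type (2 bits).
        out.append(0xF8 | (int(bool(cc_valid)) << 2) | (cc_type & 0x03))
        out += bytes([byte1, byte2])
    out.append(0xFF)        # marker_bits
    return bytes(out)


def sei_message(payload_type, payload):
    """Prefix a payload with its SEI type and size, using 0xFF escapes."""
    out = bytearray()
    for value in (payload_type, len(payload)):
        while value >= 255:
            out.append(0xFF)
            value -= 255
        out.append(value)
    return bytes(out) + payload


def add_emulation_prevention(rbsp):
    """Insert 0x03 after any 00 00 that is followed by a byte <= 0x03."""
    out = bytearray()
    zeros = 0
    for b in rbsp:
        if zeros >= 2 and b <= 3:
            out.append(3)
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def build_a53_sei_nal(cc_triplets):
    """Return one SEI NAL unit (no start code or length prefix)."""
    rbsp = sei_message(4, build_a53_payload(cc_triplets)) + b"\x80"  # stop bit
    return b"\x06" + add_emulation_prevention(rbsp)  # nal_unit_type 6 = SEI
```

The result still needs framing before it can go into a stream: a start code for byte-stream output, or a length prefix for AVC output such as the MP4 case described here.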

Without this patch, you'd need to transcode the input video to mux captions:

gst-launch-1.0 -v \
  cccombiner name=ccc schedule=false ! x264enc pass=quant ! \
    mp4mux name=muxer ! filesink location=output.mp4 \
  filesrc location=input.mp4 ! decodebin ! queue ! ccc. \
  filesrc location=input.mcc ! mccparse ! queue ! ccconverter ! \
    closedcaption/x-cea-708,format=cc_data ! ccc.caption

With this patch, you can cut out the H.264 decode and encode:

gst-launch-1.0 -v \
  cccombiner name=ccc schedule=false ! h264parse insert-cc=a53 ! \
    mp4mux name=muxer ! filesink location=output.mp4 \
  filesrc location=input.mp4 ! qtdemux ! queue ! ccc. \
  filesrc location=input.mcc ! mccparse ! queue ! ccconverter ! \
    closedcaption/x-cea-708,format=cc_data ! ccc.caption

(Note: the above examples assume an ISO-MP4 file with 1 video track; you'll need extra glue to copy across any audio tracks.)

Adding captions to a simple, 20 second 720p30 video:

  • Full transcode (using x264enc): 6.4 seconds
  • This patch, with insert-cc=a53: 0.05 seconds (112x speed-up)

There's still some work to be done, which I need help with, because I don't fully understand the gory details of H.264:

  • Rec. ITU-T H.264 §7.4.1.2.3 says that the SEI must come after the AU Delimiter, SPS and PPS if they are present, and before the first VCL NAL of the primary picture. This patch puts it before the first picture NAL. Not every frame already has an SEI or an AUD (nor is one consistently added).

    It looks like the existing timing SEI code assumes that at least one timing SEI already exists, which doesn't always hold true.

    The existing version can't have an SEI on the first frame, and Comcast's Caption Inspector reports a garbage timecode for the first caption SEI (probably because it's uninitialised memory).

  • It may be possible to simplify the SEI insertion. I tried using gst_h264_parser_insert_sei and gst_h264_parser_insert_sei_avc, but I couldn't get valid outputs.

  • Is the NAL alignment code correct, or even needed? This was copied from gst_h264_parse_create_pic_timing_sei().

    The video I've used (generated with videotestsrc ! x264enc ! mp4mux) appears to be AU-aligned:

    0.00.00.019058500-gst-launch.PAUSED_PLAYING.dot
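To make the ordering question from the first item above concrete, here is a small Python sketch of one way to pick the insertion point inside a single access unit. It is only an illustration of the §7.4.1.2.3 constraint as summarised above, not what the patch does: `sei_insert_index` is a hypothetical helper, it works on a plain list of `nal_unit_type` values, and it ignores less common NAL types (SPS extension, prefix NALs and so on).

```python
# nal_unit_type values from Rec. ITU-T H.264 Table 7-1
VCL_TYPES = {1, 2, 3, 4, 5}  # coded slice NALs
SEI, SPS, PPS, AUD = 6, 7, 8, 9


def sei_insert_index(nal_types):
    """Return the index at which a new SEI NAL could be inserted.

    The new SEI goes after any AUD, SPS, PPS and existing SEI NALs,
    and before the first VCL NAL of the access unit.
    """
    index = 0
    for position, nal_type in enumerate(nal_types):
        if nal_type in VCL_TYPES:
            return index
        if nal_type in (AUD, SPS, PPS, SEI):
            index = position + 1
    raise ValueError("access unit contains no VCL NAL")
```

For an access unit laid out as AUD, SPS, PPS, IDR slice this returns 3, and for one that starts directly with a slice it returns 0. Placing the new SEI after any existing SEI NALs is a design choice in this sketch, so that an SEI that must come first in the access unit is left in place.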

Edited by Michael Farrell
