GStreamer gst-plugins-good, issue #912
Created Aug 06, 2021 by Aleksandr Slobodeniuk (@aslobodeniuk)

matroskademux: subrip subtitles can be rendered with XML tags

About the issue:

We have a file muxed with ffmpeg that contains SubRip subtitles (taken from an srt file): SampleVideo_640x480_1mb.mkv

VLC plays this file without issues (screenshot: mkv-vlc).

But if we play it with gst-play-1.0, the XML tags are rendered together with the subtitle text (screenshot: mkv-gst).

The subtitles are muxed correctly and have the codec id of the SubRip format: S_TEXT/UTF8

When the matroskademux element opens this file, it exposes pango-markup caps for the subtitle stream:

...
  if (!strcmp (codec_id, GST_MATROSKA_CODEC_ID_SUBTITLE_UTF8)) {
    /* well, plain text simply does not have a lot of markup ... */
    caps = gst_caps_new_simple ("text/x-raw", "format", G_TYPE_STRING,
        "pango-markup", NULL);
    context->postprocess_frame = gst_matroska_demux_check_subtitle_buffer;
    subtitlecontext->check_markup = TRUE;
...

This particular step (advertising SubRip as pango-markup) is wrong: SubRip has its own markup, which is not always compatible with pango, and the attached file is such a case.

About a fix:

If we open an srt file with the same subtitles in GStreamer, the issue does not occur, because the input is handled by the subparse element, which converts different subtitle formats to pango-markup and also discards unknown markup.
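
To illustrate what "throwing away unknown markup" means, here is a minimal standalone sketch. It is not the subparse implementation: the function names and the kept tag set (i, b, u) are assumptions made for this example, and it ignores upper-case tags and entity escaping.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the tag body (the text between '<' and '>') is one of
 * the simple tags this sketch keeps: i, b, u and their closing forms.
 * The kept set is an assumption for illustration only. */
int
is_kept_tag (const char *tag, size_t len)
{
  static const char *kept[] = { "i", "b", "u", "/i", "/b", "/u" };
  size_t k;

  for (k = 0; k < sizeof (kept) / sizeof (kept[0]); k++) {
    if (strlen (kept[k]) == len && strncmp (tag, kept[k], len) == 0)
      return 1;
  }
  return 0;
}

/* Copies src to dst, dropping every <...> tag that is not in the kept
 * set. dst must have room for at least strlen (src) + 1 bytes.
 * A '<' without a closing '>' is copied through unchanged. */
void
drop_unknown_tags (const char *src, char *dst)
{
  assert (src != NULL && dst != NULL);

  while (*src != '\0') {
    if (*src == '<') {
      const char *end = strchr (src, '>');

      if (end != NULL) {
        /* Length of the tag body, without the angle brackets. */
        size_t len = (size_t) (end - src - 1);

        if (is_kept_tag (src + 1, len)) {
          memcpy (dst, src, len + 2);
          dst += len + 2;
        }
        src = end + 1;
        continue;
      }
    }
    *dst++ = *src++;
  }
  *dst = '\0';
}
```

With this sketch, an input such as `<font color="red">Hi</font> <i>x</i>` becomes `Hi <i>x</i>`: the `<font>` tag, which pango markup does not define, is dropped, while `<i>` is kept.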

The idea of the fix that will be proposed in the MRs is to have the subparse element autoplugged after matroskademux, so that it performs the SubRip --> pango-markup conversion. To do that we do 2 things:

  1. Instead of "pango-markup", expose some new format from matroskademux; let's call it "text/x-subrip-muxed" (maybe there is already an existing format for this?). NOTE: We can't use "application/x-subtitle", which is used for srt files, because it is slightly different: data in that format is expected to carry the cue number and timestamp inside the text. This is the MR to the matroskademux element (gst-plugins-good).

  2. Make the subparse element handle "text/x-subrip-muxed". This is the MR to the subparse element (gst-plugins-base).
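
To make the difference mentioned in the NOTE concrete, here is a small sketch under stated assumptions: the sample cue is invented for this example, and `srt_cue_text` is a hypothetical helper, not a GStreamer function. A cue in a standalone srt file starts with a number line and a timing line, while the data of an S_TEXT/UTF8 stream in Matroska holds only the text part, with timing carried by the container.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A cue as it would appear in a standalone .srt file: number line,
 * timing line, then the text (hypothetical sample data). */
static const char srt_file_cue[] =
    "1\n"
    "00:00:01,000 --> 00:00:04,000\n"
    "<font color=\"#ffff00\">Hello</font>\n";

/* Returns a pointer to the text part of a standalone SRT cue by
 * skipping its first two lines (number and timing). Returns NULL if
 * the cue contains fewer than two line breaks. The returned text has
 * the shape of what a muxed S_TEXT/UTF8 buffer would carry. */
const char *
srt_cue_text (const char *cue)
{
  int i;

  assert (cue != NULL);

  for (i = 0; i < 2; i++) {
    cue = strchr (cue, '\n');
    if (cue == NULL)
      return NULL;
    cue++;                      /* step past the line break */
  }
  return cue;
}
```

A parser written for "application/x-subtitle" input would look for the first two lines, which is why the muxed data needs either its own caps or a different code path in subparse.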

Edited Aug 06, 2021 by Aleksandr Slobodeniuk