rtp: Add PCMU/PCMA RTP payloader / depayloader elements

Sebastian Dröge requested to merge slomo/gst-plugins-rs:rtppcmau2 into main

Just like the other MRs (rtspsrc2, rtpbin2), this is a starting point for discussion, and the design can still be changed if needed. Several of the discussion threads I opened are actually design-related open questions.

The goal here is to provide a clean slate for our RTP payloader/depayloader elements, so that we can fix design mistakes from the past and try different designs. This concerns both the base classes themselves and the actual implementations, as the two can't really be separated.

Complete compatibility with the existing elements was a non-goal for me, but we can decide on a case-by-case basis where breaking compatibility makes sense and where it would better be kept. That said, the new elements in their current state work fine in playbin (depayloaders) and gst-rtsp-server (payloaders). They don't yet work flawlessly in e.g. webrtcbin (see the discussion threads I created), but getting them to work there is a goal, whether that means implementing more features here or explicitly adding support there.

As was brought up in the other MR, performance is definitely one of the goals here, and we now have a base on top of which we can easily experiment with new approaches without having to deal with legacy code or being bound by backwards compatibility. This intentionally does not implement anything clever for performance yet, so that we don't lock ourselves into a specific direction.

Now some words about the design of the base classes:

  • As you can see, not all of the API is used in this MR yet. PCMA/PCMU are not the most interesting formats to rewrite, but they are the most minimal ones to get the discussion started and to get a sense of how the API works. We also have a few other payloaders / depayloaders in the works on top of these new base classes. Those aren't quite ready yet and will be submitted one by one once the initial discussions about the API are over and they're ready for general review.
  • The data processing model in the base classes follows what we do everywhere else nowadays: you receive input, process it, queue/finish output with reference to the input(s) it is made from
    • This allows the base class to figure out metadata for the output automatically and in a reliable way (this is not the case currently)
    • Most importantly, it allows knowing all inputs that belong to an output for handling RTP header extensions. This is kind of added to the existing base classes now but the API is awkward and difficult to use correctly due to backwards-compatibility concerns
    • A single processing model for both base classes, and no confusing, backwards-compatibility-caused API like in the old ones (GstRTPBaseDepayload specifically)
  • Buffer list output is decided automatically and the subclasses do not have to worry about that everywhere, unlike now
  • RTP packet handling is not based on libgstrtp but on a very lightweight new implementation (which is also not set in stone yet!), allowing us to experiment with new approaches for better performance. It is not yet doing anything special, but there are ideas and plans for the future.
  • Properties that do not make sense for all payloaders (ptime, etc) are moved to subclasses instead of having sometimes-unused properties on the base class
  • SSRC and PT are selected by property only, instead of by both property and caps negotiation. This is an open topic, but the current approach of negotiating these is rather brittle and also only works in special cases. The general idea here was that if downstream actually wants a different SSRC or PT, then downstream can easily change that in the packets (related: SSRC collision). I've opened a discussion thread about that so we can figure out what we want here.
  • There is no specific handling for packet-loss events yet (see also related discussion thread). The current approach does not seem great and is not used much, other than the conversion to a gap event that technically is not needed in live streams anyway. There's also the whole discussion with FEC related here, where we have workarounds for ULPFEC deficiencies everywhere. For handling this, let's see if we can come up with something more useful (and in the case of ULPFEC there are also some ideas in the context of rtpbin2).
  • The raw(ish) audio payloader base class properly detects discontinuities (and, like the old one, handles ptime etc.). There will be a similar base class for encoded audio formats at a later time, which will handle ptime etc. where that is possible for the codec in question.
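
To illustrate the SSRC / PT point above: changing these fields downstream only requires touching the fixed 12-byte RTP header from RFC 3550. The following is a minimal standalone sketch of that idea, not code from this MR and not the API of the new packet implementation. The names `rewrite_pt_and_ssrc`, `read_ssrc` and `RewriteError` are made up for this example.

```rust
/// Size of the fixed RTP header (RFC 3550, section 5.1) in bytes.
const RTP_FIXED_HEADER_LEN: usize = 12;

/// Error type for this sketch only (hypothetical, not part of any real API).
#[derive(Debug, PartialEq, Eq)]
pub enum RewriteError {
    /// The buffer is shorter than the fixed RTP header.
    TooShort,
    /// The version field is not 2.
    BadVersion,
    /// The payload type does not fit into 7 bits.
    PayloadTypeOutOfRange,
}

/// Overwrites the payload type and SSRC of an RTP packet in place.
///
/// The marker bit, sequence number, timestamp, CSRCs, header extensions
/// and payload are left untouched.
pub fn rewrite_pt_and_ssrc(packet: &mut [u8], pt: u8, ssrc: u32) -> Result<(), RewriteError> {
    if packet.len() < RTP_FIXED_HEADER_LEN {
        return Err(RewriteError::TooShort);
    }
    // The version is stored in the top two bits of the first byte.
    if packet[0] >> 6 != 2 {
        return Err(RewriteError::BadVersion);
    }
    if pt > 0x7f {
        return Err(RewriteError::PayloadTypeOutOfRange);
    }
    // Byte 1: marker bit (top bit) followed by the 7-bit payload type.
    packet[1] = (packet[1] & 0x80) | pt;
    // Bytes 8..12: SSRC in network byte order.
    packet[8..12].copy_from_slice(&ssrc.to_be_bytes());
    Ok(())
}

/// Reads the SSRC from an RTP packet, or `None` if the buffer is too short.
pub fn read_ssrc(packet: &[u8]) -> Option<u32> {
    if packet.len() < RTP_FIXED_HEADER_LEN {
        return None;
    }
    Some(u32::from_be_bytes([packet[8], packet[9], packet[10], packet[11]]))
}
```

For example, calling `rewrite_pt_and_ssrc(&mut packet, 8, new_ssrc)` on a PCMU packet (PT 0) would relabel it as PT 8 while keeping the marker bit. Note that this only covers RTP packets. The SSRC also appears in RTCP, so anything rewriting it downstream would have to handle RTCP consistently as well.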

This is definitely not intended to be the perfect design and implementation yet, but a start of how we can get there without having to worry too much about backwards compatibility or API stability. I also do not intend for this MR to be merged only once we have reached that point, but rather once we have a reasonable baseline on top of which we can iterate for further improvements.

Edited by Sebastian Dröge