Draft: webrtcsink: Add video simulcast support to LiveKit sink

This change adds video simulcast support to webrtcsink in the form that LiveKit supports. The approach shouldn't differ much for other servers that support simulcast.

The overall idea is that the sender application supplies multiple video streams at different qualities that contribute to one logical video track. The LiveKit server (SFU) will decide which quality to send to each receiving client based on a combination of the client's preferences and network conditions.

This implementation currently only creates simulcasts in configurations where the sink creates the SDP offer.

There is also a more advanced form of WebRTC simulcast that uses scalable video codecs, where the higher-quality components of a track may depend on its lower-quality components. This implementation doesn't support it.

The following are needed in this case:

  • A mechanism to group video streams together as one track, which is the MID header extension.
  • A mechanism to tag each stream within a track, which is the RID header extension.
  • A mechanism to assign a role/priority to the streams within a track, which is the a=simulcast media attribute.
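
To make the three mechanisms concrete, here is a minimal sketch of the SDP attribute lines that describe one simulcast track. The helper name `simulcast_attributes` and the MID value `video0` are assumptions made for illustration and are not taken from this change; the attribute syntax follows the standard `a=mid`, `a=rid` and `a=simulcast` forms.

```python
def simulcast_attributes(mid, rids):
    """Build the SDP attribute lines for one simulcast track.

    `mid` groups the streams into one track, and each entry of `rids`
    tags one stream within that track. This is an illustrative helper,
    not part of webrtcsink.
    """
    lines = [f"a=mid:{mid}"]
    # One a=rid line per stream, all in the send direction.
    for rid in rids:
        lines.append(f"a=rid:{rid} send")
    # The a=simulcast line lists the RIDs that make up the track.
    lines.append("a=simulcast:send " + ";".join(rids))
    return lines


if __name__ == "__main__":
    for line in simulcast_attributes("video0", ["q", "h", "f"]):
        print(line)
```

The RID strings are opaque labels at this level; any server-specific meaning is layered on top by convention.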

Based on what I've tested, webrtcbin has enough support for simulcasts to make this work as long as each component of a simulcast within a single media section has its own SSRC.

The changes necessary to webrtcsink are:

  • Allow MID/RID attributes to be set on each sinkpad
  • Create a tree of MID=>RIDs based on the element's sinkpads
  • For each simulcast in the tree, the contributing video streams will share a single webrtcbin sinkpad through an rtpfunnel
  • Assign MID and RID header extensions to the payloader of each sinkpad that's part of a simulcast
  • For each simulcast, assign a list of caps to each rtpfunnel with the appropriate a-mid, rid-*, and a-simulcast attributes by combining the caps of its members
  • For each video stream that's a member of a simulcast, also add the standard max-width and max-height video dimension restrictions. This is not strictly necessary in general, but some servers might prefer that this be transmitted in the SDP. More importantly for LiveKit purposes, it allows a kind of informal interface between the sink element and the signalling client.
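
The MID=>RIDs grouping step can be sketched as follows. This is a simplified model that uses plain tuples in place of real sinkpads; the function name `build_simulcast_tree` and the tuple layout are hypothetical, and treating pads without both values as ordinary streams is an assumption, not something the change states.

```python
def build_simulcast_tree(pads):
    """Group pads into a {mid: {rid: pad_name}} tree.

    `pads` is a list of (pad_name, mid, rid) tuples standing in for the
    element's sinkpads. Pads that lack a MID or a RID are skipped here
    (assumed to be ordinary, non-simulcast streams).
    """
    tree = {}
    for pad_name, mid, rid in pads:
        if mid is None or rid is None:
            continue
        # All pads sharing a MID contribute to the same logical track.
        tree.setdefault(mid, {})[rid] = pad_name
    return tree


if __name__ == "__main__":
    pads = [
        ("video_0", "video0", "q"),
        ("video_1", "video0", "h"),
        ("video_2", "video0", "f"),
        ("video_3", None, None),  # ordinary stream, not part of a simulcast
    ]
    print(build_simulcast_tree(pads))
```

Each top-level entry of the resulting tree corresponds to one group of streams that would share a funnel and a single webrtcbin sinkpad.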

The changes necessary for the LiveKit signalling client are:

  • Add a basic parser for SDP media restrictions and build LiveKit VideoLayers using the video dimension restrictions specified for each RID.
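
As a rough illustration of what such a parser has to handle, the sketch below splits a single `a=rid:` line into its RID, direction, and restriction map. The function name `parse_rid_line` is hypothetical and this is not the parser from this change; it only handles the simple `key=value;key=value` form.

```python
def parse_rid_line(line):
    """Parse an SDP `a=rid:` line into (rid, direction, restrictions).

    Restrictions are returned as a dict of strings, e.g.
    {"max-width": "640", "max-height": "360"}.
    """
    prefix = "a=rid:"
    line = line.strip()
    if not line.startswith(prefix):
        raise ValueError(f"not an a=rid line: {line!r}")
    parts = line[len(prefix):].split(" ", 2)
    if len(parts) < 2:
        raise ValueError(f"missing direction in: {line!r}")
    rid, direction = parts[0], parts[1]
    restrictions = {}
    if len(parts) == 3:
        # Restrictions are separated by ';' and written as key=value.
        for item in parts[2].split(";"):
            key, sep, value = item.partition("=")
            if key:
                restrictions[key] = value if sep else None
    return rid, direction, restrictions


if __name__ == "__main__":
    print(parse_rid_line("a=rid:h send max-width=640;max-height=360"))
```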

Also, LiveKit requires that users follow the convention of using the following RID values within a simulcast:

  • "q" for the low quality ("quarter") stream
  • "h" for the medium quality ("half") stream
  • "f" for the high quality ("full") stream

The sink element doesn't enforce this convention, but the signalling client will only build video layers from those RIDs.
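
The effect of this convention on layer building can be sketched like this. The `VideoLayer` dataclass below is a hypothetical stand-in, not LiveKit's actual type, and the lowercase quality labels simply mirror the wording of the list above. Skipping RIDs with missing or non-numeric dimensions is also an assumption.

```python
from dataclasses import dataclass

# RID convention described above: quarter, half and full quality.
RID_TO_QUALITY = {"q": "low", "h": "medium", "f": "high"}


@dataclass
class VideoLayer:
    """Hypothetical stand-in for a LiveKit video layer."""
    quality: str
    width: int
    height: int


def build_video_layers(rid_restrictions):
    """Build layers from a {rid: {restriction: value}} mapping.

    RIDs outside the q/h/f convention are ignored, as are RIDs whose
    max-width or max-height is missing or not an integer.
    """
    layers = []
    for rid, restrictions in rid_restrictions.items():
        quality = RID_TO_QUALITY.get(rid)
        if quality is None:
            continue
        try:
            width = int(restrictions["max-width"])
            height = int(restrictions["max-height"])
        except (KeyError, TypeError, ValueError):
            continue
        layers.append(VideoLayer(quality, width, height))
    return layers


if __name__ == "__main__":
    print(build_video_layers({
        "q": {"max-width": "320", "max-height": "180"},
        "f": {"max-width": "1280", "max-height": "720"},
        "x": {"max-width": "640", "max-height": "360"},  # ignored
    }))
```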

TODO:

  • Consider modifying the design so that these additions are only applied to the livekitwebrtcsink subclass, to avoid adding unnecessary complexity to the other sinks.
  • Properly incorporate the RTP header extension registration that was added recently.
  • Add some documentation on how to create a simulcast with LiveKit.
Edited by Jordan Yelloz