bluetooth: Improve Gst encoder/decoder pipeline efficiency
This issue started out as stuttering and delays observed with GStreamer encoder/decoder pipelines, and I think we all agree setting the threads to RT is just a band-aid. Even if the overhead of ping-ponging between the encoder and IO thread is negligible - including buffer copies to guarantee data lifetime - it is unnecessary and complicates the code where SBC and the previous codec implementation would simply perform en/decoding in-line, within the IO thread.
The goal of this issue is to remove all semaphores and buffer copies, including the queue in the form of a GstAdapter.
The initial discussion around this started in !440 (comment 745891) and led to two possible solutions. Both aim to spawn at most one thread and perform all processing within it synchronously: either rendering PA sink audio -> encoding -> writing to the transport socket, or reading from the transport socket -> decoding -> posting to the PA source. This concept is called chaining, which is how it is referred to below.
1. Run the pipeline synchronously within the IO thread
Apparently it should be possible to use a pipeline (which is just an extension of a GstBin - a group/chain of connected elements) without the threads and other synchronization primitives that come with it. Effectively this means there is a function to call with an input buffer of audio data (a pointer to some memory, like a PA memblock, wrapped in a GstBuffer) that updates encoder state and eventually returns a number of readily encoded/decoded bytes in some other buffer, most likely provided by the caller, all without spawning or synchronizing with any threads (within GStreamer).
If possible this should be rather straightforward to implement: encode_buffer/decode_buffer are already set up for this use-case, with no code changes required outside of them.
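The call shape described above can be sketched in plain C. This is only a model, not GStreamer or PulseAudio code: `toy_encoder`, `toy_encode_buffer` and the pair-averaging "codec" are hypothetical stand-ins. The only thing taken from the text is the idea of one synchronous function that consumes an input buffer, updates encoder state and writes into a caller-provided output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoder state; a real codec would keep its bitstream state here. */
typedef struct {
    size_t frames_encoded;
} toy_encoder;

/*
 * Encode in-line on the caller's thread: no queue, no semaphore and no copy
 * of the input. The toy "codec" averages each pair of input bytes into one
 * output byte. Returns the number of bytes written to `out`; `*processed`
 * receives the number of input bytes consumed.
 */
static size_t toy_encode_buffer(toy_encoder *enc,
                                const uint8_t *in, size_t in_size,
                                uint8_t *out, size_t out_size,
                                size_t *processed) {
    size_t written = 0;
    size_t consumed = 0;

    assert(enc && in && out && processed);

    while (consumed + 2 <= in_size && written < out_size) {
        out[written++] = (uint8_t) ((in[consumed] + in[consumed + 1]) / 2);
        consumed += 2;
        enc->frames_encoded++;
    }

    *processed = consumed;
    return written;
}
```

The caller (the IO thread in this approach) owns both buffers for the duration of the call, which is what makes the lifetime-guaranteeing copies unnecessary.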
2. Handle data entering/exiting the pipeline within the GStreamer thread
As far as the A2DP transport socket (file descriptor) and PA sink rendering and source posting are concerned, they can theoretically run anywhere [1]. In this case that would be the GStreamer pipeline thread, again forming one synchronous chain and completely circumventing PA's IO thread and logic in module-bluez5-device. This approach might even be implementable with "direct" interactions with the appsink/source as listed in 1. (rewrapped in an element) or by creating (reusable) GStreamer elements in PA.
We don't even need to implement the Bluetooth end: GStreamer already provides a2dpsink/avdtpsink/avdtpsrc which, when given a transport endpoint, open it, read MTUs and perform all the data reading/writing, circumventing lots of complex code. A hacky form of this (for sinks only) has been implemented in MarijnS95/pulseaudio@6b25a5c2, but without the input part (rendering PA audio and shoving it into the pipeline), which is why all thread logic had to be damaged like this. That can all be reverted once the sink-rendering/source-posting functionality is implemented in a GStreamer element: then the IO thread can simply not be started and it should all work fine.
We will however need an element that renders PA data and pushes it into the pipeline, or more precisely, something that drives the pipeline. GStreamer should support all of this, but I have not yet had the time to experiment here.
In short, either:
- The sink (end of the pipeline) should drive the pipeline in a pull-based manner: then the input element might be a simple appsrc with a callback when it needs data: this callback calls pa_sink_render_full and provides the data back to appsrc (synchronously);
- The source element should drive the pipeline (push-based): here also I'm expecting GStreamer to invoke a callback X times a second based on some clock, where we can wait for and render audio data in PA and provide it synchronously to the appsrc.
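The pull-based variant can be modelled in plain C as below. Everything here is a hypothetical stand-in: `render_cb` plays the role of the appsrc "needs data" callback, `toy_render` stands where a call to pa_sink_render_full would go, and `toy_pull_loop` stands for the downstream end of the pipeline. No GStreamer or PA API is used, so this only shows the direction of control.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback type: fill buf with up to len bytes, return bytes provided. */
typedef size_t (*render_cb)(void *userdata, uint8_t *buf, size_t len);

/* Stand-in for the PA side: hands out bytes from a fixed block of "rendered" audio. */
typedef struct {
    const uint8_t *data;
    size_t size;
    size_t pos;
} toy_pa_sink;

static size_t toy_render(void *userdata, uint8_t *buf, size_t len) {
    toy_pa_sink *s = userdata;
    size_t left = s->size - s->pos;
    size_t n = len < left ? len : left;

    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return n;
}

/*
 * Pull-based: the downstream end decides when it wants data and how much
 * (one MTU-sized block per request) and synchronously asks the source.
 * Returns the total number of bytes placed in `transport`.
 */
static size_t toy_pull_loop(render_cb need_data, void *userdata, size_t mtu,
                            uint8_t *transport, size_t transport_size) {
    size_t total = 0;

    assert(need_data && transport && mtu > 0);

    while (total + mtu <= transport_size) {
        size_t n = need_data(userdata, transport + total, mtu);
        if (n == 0)
            break; /* source has nothing more to render */
        total += n;
    }
    return total;
}
```

In the push-based variant the roles flip: a clock-driven loop on the source side would call the render function itself and hand each block downstream, rather than waiting to be asked.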
(Of course the inverse applies for A2DP sources)
Downsides of this method:
- PA already includes lots of (admittedly complicated) logic to get data in and out of a Bluetooth transport. This has to live on and be maintained for the non-GStreamer case like SBC;
- bluez5-device might need quite some modifications to deal with codecs that circumvent the IO thread (though this might also be as simple as two callback hooks in start_thread/stop_thread) - has to be investigated;
- Probably need two pipelines/threads for bidirectional audio.
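As a rough illustration of the "two callback hooks" idea from the list above, the sketch below lets a codec optionally take over thread start/stop. All names (`toy_codec`, `toy_device`, `toy_start_thread`, `toy_stop_thread`) are hypothetical and only loosely modelled on the start_thread/stop_thread functions mentioned; this is not how bluez5-device is actually structured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct toy_codec {
    /* Optional hooks: non-NULL means the codec drives its own thread. */
    bool (*start)(struct toy_codec *c);
    void (*stop)(struct toy_codec *c);
    bool running;
} toy_codec;

typedef struct {
    toy_codec *codec;
    bool io_thread_running;
} toy_device;

static bool toy_start_thread(toy_device *u) {
    assert(u && u->codec);

    if (u->codec->start)
        return u->codec->start(u->codec); /* codec takes over; no IO thread */

    u->io_thread_running = true;          /* default path, e.g. SBC */
    return true;
}

static void toy_stop_thread(toy_device *u) {
    assert(u && u->codec);

    if (u->codec->stop) {
        u->codec->stop(u->codec);
        return;
    }
    u->io_thread_running = false;
}

/* Example hooks for a codec that runs its own (GStreamer-like) pipeline thread. */
static bool toy_gst_start(toy_codec *c) { c->running = true; return true; }
static void toy_gst_stop(toy_codec *c) { c->running = false; }
```

A codec that leaves both hooks unset keeps the existing IO-thread path, so the non-GStreamer case would not need to change.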
One advantage is that we can reuse such PA <-> GStreamer elements elsewhere, i.e. in RTP, which also uses appsrc and an IO thread pushing data into GStreamer.
Can we get bluetooth assigned here and in all other relevant issues/PRs?
[1] PA sink/source functions must run in their definition of an "IO thread": we can simply hook into GStreamer's thread-spawning code and provide such an IO thread instead.