aggregator: Races between sinkpad chain functions, srcpad task and determining timeouts
What I can see in the tests of gst-plugins-rs!159 (merged) is the following. Specifically in
test_no_drops_but_no_fallback_frames_not_live it is relatively easy to reproduce in a loop.
- A buffer arrives on the normal sinkpad and is signalled
- Aggregate function is checking if pads are ready. No: because the fallback sinkpad has no buffers yet
get_next_time()is called, and to determine the timeout we look at which pads have buffers.
peek_buffer()on the normal sinkpad (see 1!) returns
- We wait forever to be woken up again once a buffer arrives so we can calculate a new timeout based on that
The reason for 3) is that the queue for the sinkpad contains first a segment and caps event and only then a buffer. The events were supposed to be handled before
gst_aggregator_aggregate_func(), however they arrived after the handling of the events.
I believe the whole loop inside
gst_aggregator_aggregate_func() should be protected by the
SRC_LOCK so that between the top of the loop and the waiting no new things can be queued up in the pads, as otherwise we end up with inconsistent state like the above.
gst_aggregator_check_pads_ready() should also check for events to be handled and in that case don't cause waiting but first handling of the events.
However holding the
SRC_LOCK will cause deadlocks so I'm not sure what to do there yet.