aggregator: Races between sinkpad chain functions, srcpad task and determining timeouts
CC @meh
What I can see in the tests of gst-plugins-rs!159 (merged) is the following. Specifically in test_no_drops_but_no_fallback_frames_not_live
it is relatively easy to reproduce in a loop.
- A buffer arrives on the normal sinkpad and is signalled
- Aggregate function is checking if pads are ready. No: because the fallback sinkpad has no buffers yet
-
get_next_time()
is called, and to determine the timeout we look at which pads have buffers.peek_buffer()
on the normal sinkpad (see 1!) returnsNULL
- We wait forever to be woken up again once a buffer arrives so we can calculate a new timeout based on that
The reason for 3) is that the queue for the sinkpad contains first a segment and caps event and only then a buffer. The events were supposed to be handled before gst_aggregator_wait_and_check()
in gst_aggregator_aggregate_func()
, however they arrived after the handling of the events.
I believe the whole loop inside gst_aggregator_aggregate_func()
should be protected by the SRC_LOCK
so that between the top of the loop and the waiting no new things can be queued up in the pads, as otherwise we end up with inconsistent state like the above.
Alternatively gst_aggregator_check_pads_ready()
should also check for events to be handled and in that case don't cause waiting but first handling of the events.
However holding the SRC_LOCK
will cause deadlocks so I'm not sure what to do there yet.