webrtcsink: panic when using `Homegrown` congestion control
Hi there,
we are seeing the following crash from time to time (I don't have the full trace right now, but error is helpful):
thread '<unnamed>' panicked at 'failed to upgrade `element` (if you don't want to panic, use @default-return)', net/webrtc/src/webrtcsink/imp.rs:1658:33
It is related to line 13
in the below piece of code (current line: https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/-/blob/main/net/webrtc/src/webrtcsink/imp.rs#L2155) available if the homegrown congestion control is enabled. Our line numbers are different because we haven't updated the plugin to its newest version yet, however, those lines are the same still.
1 if session.congestion_controller.is_some() {
2 let session_id_str = session_id.to_string();
3 if session.stats_sigid.is_none() {
4 session.stats_sigid = Some(rtpbin.connect_closure("on-new-ssrc", true,
5 glib::closure!(@weak-allow-none element, @weak-allow-none webrtcbin
6 => move |rtpbin: gst::Object, session_id: u32, _src: u32| {
7 let rtp_session = rtpbin.emit_by_name::<gst::Element>("get-session", &[&session_id]);
8 let element = element.expect("on-new-ssrc emited when webrtcsink has been disposed?");
9 let webrtcbin = webrtcbin.unwrap();
10 let mut state = element.imp().state.lock().unwrap();
11 if let Some(mut session) = state.sessions.get_mut(&session_id_str) {
12 session.stats_sigid = Some(rtp_session.connect_notify(Some("twcc-stats"),
13 glib::clone!(@strong session_id_str, @weak webrtcbin, @weak element => @default-panic, move |sess, pspec| {
14 // Run the Loss-based control algorithm on new peer TWCC feedbacks
15 element.imp().process_loss_stats(&element, &session_id_str, &sess.property::<gst::Structure>(pspec.name()));
16 })
17 ));
18 }
19 })
20 ));
21 }
22 }
It looks to me as a race condition between the weak reference in the closure and the actual elements getting destroyed about the same time (that's the reason why it's a bit difficult to reproduce). It seems to me that @default-return
that signals an error and propagating it upstream to avoid that the corresponding session to start could be the solution. Also, why not passing strong references here?
Also, in line 5
, None
is allowed, but in the body, they are expect
'd and unwrap
'd, which would result in the process going down as well if for some reason None
is found. Wouldn't these situations be better handled if the corresponding session simply does not start on any such kind of errors?
Thanks in advance!