local-prflx candidate pair is pruned prematurely
I am consistently encountering an issue with the way libnice retries previously failed candidate pairs that involve a peer-reflexive candidate.
I am using GStreamer's webrtcbin element for this; however, this appears to be a libnice issue specifically. See the attached log, libnice_prunes_peer_reflexive_candidate.txt; the line numbers cited below refer to it.
The setup for the issue is the following:
- A remote peer-reflexive candidate is discovered (L36)
- Local candidate gathering finishes, which prompts the NiceAgent to create a new candidate pair from the local UDP host candidate and the remote prflx candidate, and to add that pair to the conncheck list. (L61-L72)
- The conncheck then unfreezes this pair soon after and finds that the remote credentials have not yet been set. (L76-L85) The pair is then marked FAILED (L86).
Once the answer SDP arrives, the remote credentials are set (L94). At this point, the conncheck recognizes that this pair needs to be re-tested and the pair is added to the triggered check queue (L100).
For this case, RFC 5245 (section 7.2.1.4) states:
If the state of the pair is Failed, it is changed to Waiting and the agent MUST create a new connectivity check for that pair (representing a new STUN Binding request transaction), by enqueueing the pair in the triggered check queue.
libnice adds this pair to the queue; however, it leaves the pair in the FAILED state. If a new remote candidate is added before the pair is re-checked, the pair is pruned because it is still in the FAILED state (L114).
This remote peer reflexive candidate is the only viable remote candidate in this networking scenario, so the component ends up failing.
It seems like libnice should change the pair's state to NICE_CHECK_WAITING here to be more resilient to this situation and prevent premature pruning.
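
To make the difference concrete, here is a minimal, self-contained model of the two behaviours. It is only a sketch and not libnice code: the `Pair` struct, the `PairState` enum, and all three functions are hypothetical names invented for illustration, and the pruning rule is simplified to "drop any pair that is still FAILED", which is how the log appears to behave at L114.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the check states; not libnice's real enum. */
typedef enum {
    PAIR_FROZEN,
    PAIR_WAITING,
    PAIR_IN_PROGRESS,
    PAIR_SUCCEEDED,
    PAIR_FAILED
} PairState;

/* Hypothetical, stripped-down candidate pair. */
typedef struct {
    PairState state;
    bool in_triggered_queue;
    bool pruned;
} Pair;

/* Behaviour described in this report: the pair is queued for a
 * triggered check but its state is left untouched (still FAILED). */
void queue_triggered_check_keep_state(Pair *p) {
    assert(p != NULL);
    p->in_triggered_queue = true;
}

/* Behaviour described by RFC 5245 section 7.2.1.4: a FAILED pair is
 * moved to WAITING when it is enqueued for a triggered check. */
void queue_triggered_check_reset_state(Pair *p) {
    assert(p != NULL);
    if (p->state == PAIR_FAILED)
        p->state = PAIR_WAITING;
    p->in_triggered_queue = true;
}

/* Simplified model of the pruning that runs when a new remote
 * candidate is added: every pair still in FAILED is dropped. */
void prune_failed_pairs(Pair *pairs, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (pairs[i].state == PAIR_FAILED) {
            pairs[i].pruned = true;
            pairs[i].in_triggered_queue = false;
        }
    }
}
```

In this model, a pair queued with `queue_triggered_check_keep_state` is lost as soon as `prune_failed_pairs` runs, while a pair queued with `queue_triggered_check_reset_state` survives in the WAITING state and stays in the triggered check queue.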