Commit 010ecd50, committed by Olivier Crête
stun: update timer timeout and retransmissions
This patch updates the STUN timing constants and gives the rationale for the choice of the new values, in the context of the ICE connection check algorithm.

One important value during the discovery state is the combination of the initial timeout and the number of retransmissions, because this state may complete only after the last STUN discovery binding request has timed out. With the combination of 500ms and 3 retransmissions, the discovery state is bounded to 2000ms (500ms + 1000ms + 500ms) to discover server-reflexive and relay candidates. The retransmission delay doubles at each retransmission except for the last one. Generally, this state will complete sooner, when all discovery requests get a reply before the timeout.

Another mechanism is used during the connection check, where a STUN request is sent with an initial timeout defined by:

  RTO = MAX(500ms, Ta * (number of in-progress + waiting pairs))
  with Ta = 20ms

The initial timeout is bounded by a minimum value, 500ms, and scales linearly with the number of pairs about to be emitted. The same number of retransmissions as in the discovery state is used during the connection check. The total time to wait for a pair to fail is then RTO + 2*RTO + RTO = 4*RTO with 3 retransmissions.

On a typical laptop setup, with a wired and a wifi interface with IPv4/IPv6 dual stack, a link-local and a link-global IPv6 address, a couple of virtual addresses, a server-reflexive address and a TURN relay one, we end up with a total of 90 local candidates for 2 streams and 2 components each. The connection check list includes up to 200 pairs when TCP pairs are discarded, with:

  <33 in-progress and waiting pairs in 50% of cases (RTO = 660ms),
  <55 in-progress and waiting pairs in 90% of cases (RTO = 1100ms),
  and up to 86 in-progress and waiting pairs (RTO = 1720ms).

A retransmission count of 3 seems quite robust against sporadic packet loss, if we consider for example a typical packet loss frequency of 1% of the overall packets transmitted.
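The connection check timeout described above can be sketched as follows. This is only an illustration of the arithmetic in this message, not the code touched by the patch: the constant and function names are hypothetical, and only the values 20ms, 500ms and the RTO + 2*RTO + RTO sum come from the text.

```python
# Values quoted in the commit message; the names are illustrative only.
TA_MS = 20          # Ta, pacing interval
MIN_RTO_MS = 500    # lower bound of the initial timeout


def initial_rto_ms(in_progress: int, waiting: int) -> int:
    """RTO = MAX(500ms, Ta * (number of in-progress + waiting pairs))."""
    return max(MIN_RTO_MS, TA_MS * (in_progress + waiting))


def time_to_fail_ms(rto_ms: int) -> int:
    """Total wait before a pair fails: RTO + 2*RTO + RTO = 4*RTO."""
    return rto_ms + 2 * rto_ms + rto_ms


# Pair counts from the laptop example (plus a small list that hits the floor).
for pairs in (10, 33, 55, 86):
    rto = initial_rto_ms(pairs, 0)
    print(f"{pairs:3d} pairs: RTO = {rto}ms, pair fails after {time_to_fail_ms(rto)}ms")
```

With fewer than 25 in-progress and waiting pairs the 500ms floor applies, so the time to fail a pair never drops below 2000ms.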
A relatively large initial timeout is also attractive because it reduces the overall network overhead caused by the STUN requests and replies, measured at around 3KB/s during a connection check with 4 components.

Finally, the total time to wait until all retransmissions have completed and timed out (2000ms with an initial timeout of 500ms and 3 retransmissions) gives a bound on the worst network latency we can accept when no packet is lost on the wire.
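As a rough check of the 2000ms bound and of the robustness argument, the sketch below builds the sequence of waits and estimates how often every transmission of a request would be lost. It rests on assumptions that are not in the patch: the function names are hypothetical, the final wait is taken to fall back to the initial timeout (which matches RTO + 2*RTO + RTO for 3 transmissions, but is a guess for other counts), and packet losses are treated as independent, ignoring lost replies.

```python
def timeout_schedule_ms(initial_timeout_ms: int, transmissions: int) -> list:
    """Waits after each transmission: doubling, except for the last one.

    Assumption: the final wait equals the initial timeout.
    """
    if transmissions < 1:
        raise ValueError("at least one transmission is required")
    waits = [initial_timeout_ms * 2 ** i for i in range(transmissions - 1)]
    waits.append(initial_timeout_ms)
    return waits


def all_lost_probability(loss_rate: float, transmissions: int) -> float:
    """Chance that every transmission is lost, assuming independent losses."""
    return loss_rate ** transmissions


schedule = timeout_schedule_ms(500, 3)
print(schedule, "total:", sum(schedule), "ms")
print("all 3 requests lost at 1% loss:", all_lost_probability(0.01, 3))
```

Under these assumptions, a request fails purely because of packet loss about once in a million attempts, while a reply that arrives later than the 2000ms total is indistinguishable from a loss.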
Showing with 24 additions and 17 deletions