dns: move ratelimiting and restart from NMDnsManager to NMDnsDnsmasq
Note that the only DNS plugin that actually emits the FAILED signal is NMDnsDnsmasq. Let's handle restart, retry and rate-limiting not in NMDnsManager but in NMDnsDnsmasq itself.

There are three goals here:

(1) When dnsmasq (infrequently) crashes, we always keep retrying. A random crash should be resolved automatically, and eventually dnsmasq should be working again. Note that we cannot fully detect whether something is wrong anyway. OK, we detect crashes, but if dnsmasq just gets catatonic, it is just as broken. Point being: our ability to detect a non-working dnsmasq is limited.

(2) When dnsmasq keeps crashing all the time, rate-limit the retry. Of course, at this point something is already seriously wrong, but we shouldn't kill the system by respawning the process without rate limiting.

(3) Previously, when NMDnsManager noticed that the plugin was broken (and rate-limiting kicked in), it would temporarily disable the plugin. Basically, that meant writing the real name servers to /etc/resolv.conf directly, instead of setting localhost. This partly conflicts with (1), because we want to retry and recover automatically. So what good is it to notice a problem, resort to plain /etc/resolv.conf for a short time, and then run into the same issues again? If something is really broken, there is no way around involving the user to investigate and fix the issue. Hence, we don't need to concern NMDnsManager with this either.

The only thing the manager still notices is when the dnsmasq binary is not available. In that case, update() fails right away, and the manager falls back to configuring the name servers in /etc/resolv.conf directly.

Also, change the backoff time from 5 minutes to 1 minute (twice the burst interval). There is no particularly strong reason for either choice. I think that if the rate limit kicks in, then something is already so wrong that it doesn't matter either way. Anyway, 60 seconds is also long enough to not kill the machine.
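
For illustration only, the respawn policy described above could be sketched roughly as follows. This is not the code from this commit. The type and function names are hypothetical, and BURST_COUNT is an assumed value that the message does not state. The 30 second burst interval is inferred from "1 minute (twice the burst interval)".

```c
#include <assert.h>
#include <stdint.h>

/* Inferred from the message: the backoff (60 s) is twice the burst interval. */
#define BURST_INTERVAL_MSEC 30000
#define BACKOFF_MSEC        (2 * BURST_INTERVAL_MSEC)
/* ASSUMPTION: the number of immediate respawns allowed per interval is not
 * given in the message; 5 is only a placeholder. */
#define BURST_COUNT         5

/* Hypothetical state kept by the plugin for its child process. */
typedef struct {
    int64_t  window_start_msec; /* start of the current burst window */
    unsigned n_restarts;        /* respawns counted in this window */
} RespawnLimiter;

void
respawn_limiter_init(RespawnLimiter *rl, int64_t now_msec)
{
    assert(rl);
    rl->window_start_msec = now_msec;
    rl->n_restarts        = 0;
}

/* Call when the child exits. Returns how long to wait before respawning:
 * 0 while the burst budget lasts, BACKOFF_MSEC once it is exhausted.
 * It never returns "give up", matching goal (1): always keep retrying. */
int64_t
respawn_limiter_next_delay(RespawnLimiter *rl, int64_t now_msec)
{
    assert(rl);

    /* The window has elapsed: start a fresh one with a full budget. */
    if (now_msec - rl->window_start_msec >= BURST_INTERVAL_MSEC) {
        rl->window_start_msec = now_msec;
        rl->n_restarts        = 0;
    }

    if (rl->n_restarts < BURST_COUNT) {
        rl->n_restarts++;
        return 0;
    }

    /* Budget exhausted, goal (2): delay the respawn. The next window begins
     * when the delayed respawn happens, and that respawn counts towards it. */
    rl->window_start_msec = now_msec + BACKOFF_MSEC;
    rl->n_restarts        = 1;
    return BACKOFF_MSEC;
}
```

In such a sketch the caller would schedule the respawn with a timer for the returned delay. The manager is never told about the crash, which corresponds to goal (3).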
Showing with 90 additions and 138 deletions
Result: UNSTABLE: Some tests failing
Passed: 954, Failed: 1