Race condition in Wireguard connection when resolving DNS name of dual-stack server (IPv4/IPv6)
Setup
I'm using the wireguard support in NetworkManager to connect to a wireguard server that has full dual-stack support:
- the DNS name of the wireguard server resolves to both A and AAAA addresses
- I obtain a default route for both IPv4 and IPv6 over the VPN
- however, I only have IPv4 connectivity outside of the VPN
Bug description
There seems to be a race condition when connecting: in many cases, NetworkManager tells wireguard to use the IPv6 address of the server as the remote endpoint. As a result, the VPN doesn't connect because I have no IPv6 connectivity. Sometimes NetworkManager gets it right and tells wireguard to use the IPv4 address of the server, and in that case the VPN can connect.
Investigation
Here is what I think happens:
- NetworkManager starts resolving the DNS name of the server
- in the meantime, NetworkManager sets up a default route through the wireguard interface (using the policy routing & fwmark stuff), for both IPv4 and IPv6
- when the DNS name has finished resolving, it returns A and AAAA records. Since an IPv6 default route is now available, NetworkManager selects the AAAA record (IPv6 address). This is the bug: the IPv6 default route goes over the VPN that is being set up.
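To confirm which address family was picked on a given connection attempt, one can look at the endpoint WireGuard was actually given. The following is only a diagnostic sketch: it assumes the `wg` tool from wireguard-tools is installed and that IPv6 endpoints are printed in brackets, as `wg show` normally does. The helper names `endpoint_family` and `show_endpoint_families` are made up for illustration.

```shell
#!/bin/sh
# Classify a WireGuard endpoint string by address family.
# usage: endpoint_family ENDPOINT   (e.g. "192.0.2.1:443" or "[2001:db8::1]:443")
endpoint_family() {
    case "$1" in
        \[*\]:*)   echo ipv6 ;;     # bracketed address followed by a port
        *.*.*.*:*) echo ipv4 ;;     # dotted quad followed by a port
        *)         echo unknown ;;  # e.g. "(none)" when no endpoint is set
    esac
}

# Print the address family of every peer endpoint on an interface.
# Needs the same privileges as `wg show` itself.
# usage: show_endpoint_families INTERFACE
show_endpoint_families() {
    wg show "$1" endpoints | while read -r peer endpoint; do
        printf '%s %s\n' "$peer" "$(endpoint_family "$endpoint")"
    done
}
```

Running `show_endpoint_families wg0` right after activating the connection should print `ipv6` on the failing attempts and `ipv4` on the working ones, if the analysis above is right.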
How the bug could be fixed
Solution 1
As a simple solution, NetworkManager should wait for DNS resolution to complete before setting up routing.
Solution 2
Another solution would be to detect, for each address family, whether the route to the server goes through the VPN or is reachable outside of it, with something similar to:
$ ip route get $VPN_SERVER_IPV4_ADDRESS mark 0xcaf6
$VPN_SERVER_IPV4_ADDRESS via 192.168.43.1 dev wlp0s20f3 src 192.168.43.69 mark 0xcaf6 uid 1000
$ ip -6 route get $VPN_SERVER_IPV6_ADDRESS mark 0xcaf6
RTNETLINK answers: Network is unreachable
This tells us that IPv6 is not available outside of the VPN, but IPv4 is. We could imagine IPv6-only situations where the reverse is true.
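The check above could be turned into a selection rule along these lines. This is a rough sketch of the idea, not how NetworkManager works internally: `pick_endpoint` is a hypothetical helper, and `IP_CMD` is an override introduced here only so the logic can be exercised without touching real routing tables. It relies on `ip route get` exiting with a non-zero status when no route exists for the given fwmark.

```shell
#!/bin/sh
# IP_CMD is a hypothetical override; it defaults to the real `ip` binary.
IP_CMD="${IP_CMD:-ip}"

# Print the first candidate address that is routable for packets carrying
# the given fwmark, i.e. reachable outside of the VPN.
# usage: pick_endpoint FWMARK ADDR...
pick_endpoint() {
    mark="$1"
    shift
    for addr in "$@"; do
        case "$addr" in
            *:*) family="-6" ;;   # contains a colon: treat as IPv6
            *)   family="-4" ;;   # otherwise treat as IPv4
        esac
        # A failed lookup (e.g. "Network is unreachable") means this
        # address family has no connectivity outside the VPN.
        if "$IP_CMD" "$family" route get "$addr" mark "$mark" >/dev/null 2>&1; then
            printf '%s\n' "$addr"
            return 0
        fi
    done
    return 1   # no candidate is reachable
}
```

For example, `pick_endpoint 0xcaf6 "$VPN_SERVER_IPV6_ADDRESS" "$VPN_SERVER_IPV4_ADDRESS"` would prefer IPv6 when it is usable and fall back to IPv4 otherwise, which also covers the reverse, IPv6-only situation.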
Solution 3
Finally, perhaps more elegantly, NetworkManager could set up the routes only once WireGuard has successfully connected, but that might be too hard to detect reliably.
System information
I'm using Debian bullseye, which has a pretty recent version of NetworkManager:
$ lsb_release -c
Codename: bullseye
$ apt show network-manager
1.26.2-1
$ NetworkManager --version
1.26.2
Below is a dump of the wireguard connection profile:
[connection]
id=wg-test
uuid=5706cba3-315e-48b0-95b5-610c5e986c84
type=wireguard
interface-name=wg0
permissions=
timestamp=1600374956
[wireguard]
mtu=1400
private-key=REDACTED
[wireguard-peer.SERVER_PUBLIC_KEY]
endpoint=vpn-wg.illyse.org:443
persistent-keepalive=120
allowed-ips=0.0.0.0/0;::/0;
[ipv4]
address1=XX.YY.ZZ.TT/32
dns-search=
method=manual
[ipv6]
addr-gen-mode=stable-privacy
address1=2001:db8::1/128
dns-search=
ip6-privacy=0
method=manual
[proxy]