    device: generate unique default route-metrics per interface · 6a32c64d
    Thomas Haller authored
    In the past we had NMDefaultRouteManager, which would coordinate adding
    default-routes with identical metrics. That happened especially when
    activating two devices of the same type without explicitly specifying
    ipv4.route-metric. For example, with ethernet devices, the routes on
    both interfaces would get a metric of 100.
    
    Coordinating routes was especially necessary, because we added
    routes with NLM_F_EXCL flag, akin to `ip route replace`. We not
    only had to avoid that activating two devices in NetworkManager would
    result in a fight over the default-route, but more importantly
    we had to preserve externally added default-routes on unmanaged interfaces.
    
    NMDefaultRouteManager would ensure that, in case of duplicate
    metrics, the device that activated first would keep the
    best default-route. It would do so by bumping the metric
    of the second device to find an unused metric. The bumping itself
    was not very important: NMDefaultRouteManager could also just not
    configure any default-routes that show up as second, the result
    would be quite similar. More important was to keep the best
    default-route on the first activating device until the device
    deactivates or a device activates that really has a better
    default-route.
    
    Likewise, NMRouteManager would globally manage non-default-routes.
    It would not do any bumping of metrics, but it would also ensure that the routes
    of the device that activates first are not overwritten by a device activating
    later.
    
    However, the `ip route replace` approach has downsides, especially
    that it messes with routes on other interfaces, interfaces that are
    possibly not managed by NetworkManager. Another downside is that
    binding a socket to an interface might not result in correct
    routes, because the route might just not be there (in case of
    NMRouteManager, which would skip duplicate routes instead of
    configuring them with a bumped metric).
    
    Since commit 77ec3027 we no longer
    use NLM_F_EXCL, but add routes akin to `ip route append`. When
    activating for example two ethernet devices with no explicit route
    metric configuration, there are two routes like
    
       default via 10.16.122.254 dev eth0 proto dhcp metric 100
       default via 192.168.100.1 dev eth1 proto dhcp metric 100
    
    This does not only affect default routes. In case of a multi-homing
    setup you'd get
    
      192.168.100.0/24 dev eth0 proto kernel scope link src 192.168.100.1 metric 100
      192.168.100.0/24 dev eth1 proto kernel scope link src 192.168.100.1 metric 100
    
    but it's visible the most for default-routes.
    
    Note that we would append the routes that are activated later, as the order
    of `ip route show` confirms. One might hence expect that the kernel selects
    a route based on the order in the routing tables. However, that isn't
    the case, and activating the second interface will non-deterministically
    re-route traffic via the new interface. That will interfere badly
    with NAT, stateful firewalls, and existing connections (like TCP).
    
    The solution is to have NMManager keep a global index of the default route-metrics
    currently in use. So, instead of determining the default-route metric based solely
    on the device-type, we now in addition generate default metrics that do not
    overlap. For example, if you activate eth0 first, it gets route-metric 100,
    and if you then activate eth1, it gets 101. Note that if you deactivate
    and re-activate eth0, then it will get route-metric 102, because the
    best route should stick on eth1 (which reserves the range 100 to 101).
    
    Note that when a connection explicitly selects a particular metric, then that
    choice is honored (contrary to NMDefaultRouteManager, which was more concerned
    with avoiding conflicts than with keeping the exact metric).
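    
    The allocation described in the two paragraphs above can be sketched as
    follows. This is only an illustration in Python, not the NetworkManager
    code: the class RouteMetricIndex and its methods are hypothetical names,
    and it assumes that each device reserves the inclusive range from its
    type-based default metric up to the metric it was actually given, and
    that explicitly configured metrics are returned unchanged without being
    tracked.

```python
class RouteMetricIndex:
    """Hypothetical stand-in for the global index of default route-metrics."""

    def __init__(self):
        # device name -> (base metric, effective metric); the device is
        # assumed to reserve the whole inclusive range between the two.
        self._reserved = {}

    def activate(self, device, base_metric, explicit_metric=None):
        """Return the route-metric to use for a device that activates."""
        if explicit_metric is not None:
            # An explicitly configured metric is honored as-is
            # (simplification: it is not recorded in the index).
            return explicit_metric

        candidate = base_metric
        bumped = True
        while bumped:
            bumped = False
            for low, high in self._reserved.values():
                if low <= candidate <= high:
                    # Overlaps a reserved range: skip past its end and
                    # re-check against all ranges.
                    candidate = high + 1
                    bumped = True

        self._reserved[device] = (base_metric, candidate)
        return candidate

    def deactivate(self, device):
        """Release the range reserved by a device."""
        self._reserved.pop(device, None)


index = RouteMetricIndex()
first = index.activate("eth0", 100)    # eth0 activates first
second = index.activate("eth1", 100)   # eth1 is bumped past eth0's range
index.deactivate("eth0")
again = index.activate("eth0", 100)    # eth1 still reserves 100 to 101
print(first, second, again)
```

    With these assumptions the sketch reproduces the example from the text:
    eth0 gets 100, eth1 gets 101, and a re-activated eth0 gets 102.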
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1505893