Skip to content
Snippets Groups Projects
  1. Jun 10, 2022
  2. Feb 25, 2022
    • Vladimir Oltean's avatar
      net: switchdev: remove lag_mod_cb from switchdev_handle_fdb_event_to_device · ec638740
      Vladimir Oltean authored
      When the switchdev_handle_fdb_event_to_device() event replication helper
      was created, my original thought was that FDB events on LAG interfaces
      should most likely be special-cased, not just replicated towards all
      switchdev ports beneath that LAG. So this replication helper currently
      does not recurse through switchdev lower interfaces of LAG bridge ports,
      but rather calls the lag_mod_cb() if that was provided.
      
      No switchdev driver uses this helper for FDB events on LAG interfaces
      yet, so that was an assumption which was yet to be tested. It is
      certainly usable for that purpose, as my RFC series shows:
      
      https://patchwork.kernel.org/project/netdevbpf/cover/20220210125201.2859463-1-vladimir.oltean@nxp.com/
      
      
      
      however this approach is slightly convoluted because:
      
      - the switchdev driver gets a "dev" that isn't its own net device, but
        rather the LAG net device. It must call switchdev_lower_dev_find(dev)
        in order to get a handle of any of its own net devices (the ones that
        pass check_cb).
      
      - in order for FDB entries on LAG ports to be correctly refcounted per
        the number of switchdev ports beneath that LAG, we haven't escaped the
        need to iterate through the LAG's lower interfaces. Except that is now
        the responsibility of the switchdev driver, because the replication
        helper just stopped half-way.
      
      So, even though yes, FDB events on LAG bridge ports must be
      special-cased, in the end it's simpler to let switchdev_handle_fdb_*
      just iterate through the LAG port's switchdev lowers, and let the
      switchdev driver figure out that those physical ports are under a LAG.
      
      The switchdev_handle_fdb_event_to_device() helper takes a
      "foreign_dev_check" callback so it can figure out whether @dev can
      autonomously forward to @foreign_dev. DSA fills this method properly:
      if the LAG is offloaded by another port in the same tree as @dev, then
      it isn't foreign. If it is a software LAG, it is foreign - forwarding
      happens in software.
      
      Whether an interface is foreign or not decides whether the replication
      helper will go through the LAG's switchdev lowers or not. Since the
      lan966x doesn't properly fill this out, FDB events on software LAG
      uppers will get called. By changing lan966x_foreign_dev_check(), we can
      suppress them.
      
      Whereas DSA will now start receiving FDB events for its offloaded LAG
      uppers, so we need to return -EOPNOTSUPP, since we currently don't do
      the right thing for them.
      
      Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      ec638740
  3. Feb 23, 2022
    • Vladimir Oltean's avatar
      net: switchdev: avoid infinite recursion from LAG to bridge with port object handler · acd8df58
      Vladimir Oltean authored
      
      The logic from switchdev_handle_port_obj_add_foreign() is directly
      adapted from switchdev_handle_fdb_event_to_device(), which already
      detects events on foreign interfaces and reoffloads them towards the
      switchdev neighbors.
      
      However, when we have a simple br0 <-> bond0 <-> swp0 topology and the
      switchdev_handle_port_obj_add_foreign() gets called on bond0, we get
      stuck into an infinite recursion:
      
      1. bond0 does not pass check_cb(), so we attempt to find switchdev
         neighbor interfaces. For that, we recursively call
         __switchdev_handle_port_obj_add() for bond0's bridge, br0.
      
      2. __switchdev_handle_port_obj_add() recurses through br0's lowers,
         essentially calling __switchdev_handle_port_obj_add() for bond0
      
      3. Go to step 1.
      
      This happens because switchdev_handle_fdb_event_to_device() and
      switchdev_handle_port_obj_add_foreign() are not exactly the same.
      The FDB event helper special-cases LAG interfaces with its lag_mod_cb(),
      so this is why we don't end up in an infinite loop - because it doesn't
      attempt to treat LAG interfaces as potentially foreign bridge ports.
      
      The problem is solved by looking ahead through the bridge's lowers to
      see whether there is any switchdev interface that is foreign to the @dev
      we are currently processing. This stops the recursion described above at
      step 1: __switchdev_handle_port_obj_add(bond0) will not create another
      call to __switchdev_handle_port_obj_add(br0). Going one step upper
      should only happen when we're starting from a bridge port that has been
      determined to be "foreign" to the switchdev driver that passes the
      foreign_dev_check_cb().
      
      Fixes: c4076cdd ("net: switchdev: introduce switchdev_handle_port_obj_{add,del} for foreign interfaces")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      acd8df58
  4. Feb 16, 2022
    • Vladimir Oltean's avatar
      net: switchdev: introduce switchdev_handle_port_obj_{add,del} for foreign interfaces · c4076cdd
      Vladimir Oltean authored
      
      The switchdev_handle_port_obj_add() helper is good for replicating a
      port object on the lower interfaces of @dev, if that object was emitted
      on a bridge, or on a bridge port that is a LAG.
      
      However, drivers that use this helper limit themselves to a box from
      which they can no longer intercept port objects notified on neighbor
      ports ("foreign interfaces").
      
      One such driver is DSA, where software bridging with foreign interfaces
      such as standalone NICs or Wi-Fi APs is an important use case. There, a
      VLAN installed on a neighbor bridge port roughly corresponds to a
      forwarding VLAN installed on the DSA switch's CPU port.
      
      To support this use case while also making use of the benefits of the
      switchdev_handle_* replication helper for port objects, introduce a new
      variant of these functions that crawls through the neighbor ports of
      @dev, in search of potentially compatible switchdev ports that are
      interested in the event.
      
      The strategy is identical to switchdev_handle_fdb_event_to_device():
      if @dev wasn't a switchdev interface, then go one step upper, and
      recursively call this function on the bridge that this port belongs to.
      At the next recursion step, __switchdev_handle_port_obj_add() will
      iterate through the bridge's lower interfaces. Among those, some will be
      switchdev interfaces, and one will be the original @dev that we came
      from. To prevent infinite recursion, we must suppress reentry into the
      original @dev, and just call the @add_cb for the switchdev_interfaces.
      
      It looks like this:
      
                      br0
                     / | \
                    /  |  \
                   /   |   \
                 swp0 swp1 eth0
      
      1. __switchdev_handle_port_obj_add(eth0)
         -> check_cb(eth0) returns false
         -> eth0 has no lower interfaces
         -> eth0's bridge is br0
         -> switchdev_lower_dev_find(br0, check_cb, foreign_dev_check_cb))
            finds br0
      
      2. __switchdev_handle_port_obj_add(br0)
         -> check_cb(br0) returns false
         -> netdev_for_each_lower_dev
            -> check_cb(swp0) returns true, so we don't skip this interface
      
      3. __switchdev_handle_port_obj_add(swp0)
         -> check_cb(swp0) returns true, so we call add_cb(swp0)
      
      (back to netdev_for_each_lower_dev from 2)
            -> check_cb(swp1) returns true, so we don't skip this interface
      
      4. __switchdev_handle_port_obj_add(swp1)
         -> check_cb(swp1) returns true, so we call add_cb(swp1)
      
      (back to netdev_for_each_lower_dev from 2)
            -> check_cb(eth0) returns false, so we skip this interface to
               avoid infinite recursion
      
      Note: eth0 could have been a LAG, and we don't want to suppress the
      recursion through its lowers if those exist, so when check_cb() returns
      false, we still call switchdev_lower_dev_find() to estimate whether
      there's anything worth a recursion beneath that LAG. Using check_cb()
      and foreign_dev_check_cb(), switchdev_lower_dev_find() not only figures
      out whether the lowers of the LAG are switchdev, but also whether they
      actively offload the LAG or not (whether the LAG is "foreign" to the
      switchdev interface or not).
      
      The port_obj_info->orig_dev is preserved across recursive calls, so
      switchdev drivers still know on which device was this notification
      originally emitted.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c4076cdd
    • Vladimir Oltean's avatar
      net: switchdev: rename switchdev_lower_dev_find to switchdev_lower_dev_find_rcu · 7b465f4c
      Vladimir Oltean authored
      
      switchdev_lower_dev_find() assumes RCU read-side critical section
      calling context, since it uses netdev_walk_all_lower_dev_rcu().
      
      Rename it appropriately, in preparation of adding a similar iterator
      that assumes writer-side rtnl_mutex protection.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7b465f4c
  5. Feb 10, 2022
  6. Dec 08, 2021
  7. Oct 27, 2021
  8. Aug 04, 2021
    • Vladimir Oltean's avatar
      net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge · 957e2235
      Vladimir Oltean authored
      With the introduction of explicit offloading API in switchdev in commit
      2f5dc00f ("net: bridge: switchdev: let drivers inform which bridge
      ports are offloaded"), we started having Ethernet switch drivers calling
      directly into a function exported by net/bridge/br_switchdev.c, which is
      a function exported by the bridge driver.
      
      This means that drivers that did not have an explicit dependency on the
      bridge before, like cpsw and am65-cpsw, now do - otherwise it is not
      possible to call a symbol exported by a driver that can be built as
      module unless you are a module too.
      
      There was an attempt to solve the dependency issue in the form of commit
      b0e81817 ("net: build all switchdev drivers as modules when the
      bridge is a module"). Grygorii Strashko, however, says about it:
      
      | In my opinion, the problem is a bit bigger here than just fixing the
      | build :(
      |
      | In case, of ^cpsw the switchdev mode is kinda optional and in many
      | cases (especially for testing purposes, NFS) the multi-mac mode is
      | still preferable mode.
      |
      | There were no such tight dependency between switchdev drivers and
      | bridge core before and switchdev serviced as independent, notification
      | based layer between them, so ^cpsw still can be "Y" and bridge can be
      | "M". Now for mostly every kernel build configuration the CONFIG_BRIDGE
      | will need to be set as "Y", or we will have to update drivers to
      | support build with BRIDGE=n and maintain separate builds for
      | networking vs non-networking testing.  But is this enough?  Wouldn't
      | it cause 'chain reaction' required to add more and more "Y" options
      | (like CONFIG_VLAN_8021Q)?
      |
      | PS. Just to be sure we on the same page - ARM builds will be forced
      | (with this patch) to have CONFIG_TI_CPSW_SWITCHDEV=m and so all our
      | automation testing will just fail with omap2plus_defconfig.
      
      In the light of this, it would be desirable for some configurations to
      avoid dependencies between switchdev drivers and the bridge, and have
      the switchdev mode as completely optional within the driver.
      
      Arnd Bergmann also tried to write a patch which better expressed the
      build time dependency for Ethernet switch drivers where the switchdev
      support is optional, like cpsw/am65-cpsw, and this made the drivers
      follow the bridge (compile as module if the bridge is a module) only if
      the optional switchdev support in the driver was enabled in the first
      place:
      https://patchwork.kernel.org/project/netdevbpf/patch/20210802144813.1152762-1-arnd@kernel.org/
      
      
      
      but this still did not solve the fact that cpsw and am65-cpsw now must
      be built as modules when the bridge is a module - it just expressed
      correctly that optional dependency. But the new behavior is an apparent
      regression from Grygorii's perspective.
      
      So to support the use case where the Ethernet driver is built-in,
      NET_SWITCHDEV (a bool option) is enabled, and the bridge is a module, we
      need a framework that can handle the possible absence of the bridge from
      the running system, i.e. runtime bloatware as opposed to build-time
      bloatware.
      
      Luckily we already have this framework, since switchdev has been using
      it extensively. Events from the bridge side are transmitted to the
      driver side using notifier chains - this was originally done so that
      unrelated drivers could snoop for events emitted by the bridge towards
      ports that are implemented by other drivers (think of a switch driver
      with LAG offload that listens for switchdev events on a bonding/team
      interface that it offloads).
      
      There are also events which are transmitted from the driver side to the
      bridge side, which again are modeled using notifiers.
      SWITCHDEV_FDB_ADD_TO_BRIDGE is an example of this, and deals with
      notifying the bridge that a MAC address has been dynamically learned.
      So there is a precedent we can use for modeling the new framework.
      
      The difference compared to SWITCHDEV_FDB_ADD_TO_BRIDGE is that the work
      that the bridge needs to do when a port becomes offloaded is blocking in
      its nature: replay VLANs, MDBs etc. The calling context is indeed
      blocking (we are under rtnl_mutex), but the existing switchdev
      notification chain that the bridge is subscribed to is only the atomic
      one. So we need to subscribe the bridge to the blocking switchdev
      notification chain too.
      
      This patch:
      - keeps the driver-side perception of the switchdev_bridge_port_{,un}offload
        unchanged
      - moves the implementation of switchdev_bridge_port_{,un}offload from
        the bridge module into the switchdev module.
      - makes everybody that is subscribed to the switchdev blocking notifier
        chain "hear" offload & unoffload events
      - makes the bridge driver subscribe and handle those events
      - moves the bridge driver's handling of those events into 2 new
        functions called br_switchdev_port_{,un}offload. These functions
        contain in fact the core of the logic that was previously in
        switchdev_bridge_port_{,un}offload, just that now we go through an
        extra indirection layer to reach them.
      
      Unlike all the other switchdev notification structures, the structure
      used to carry the bridge port information, struct
      switchdev_notifier_brport_info, does not contain a "bool handled".
      This is because in the current usage pattern, we always know that a
      switchdev bridge port offloading event will be handled by the bridge,
      because the switchdev_bridge_port_offload() call was initiated by a
      NETDEV_CHANGEUPPER event in the first place, where info->upper_dev is a
      bridge. So if the bridge wasn't loaded, then the CHANGEUPPER event
      couldn't have happened.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Tested-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      957e2235
  9. Jul 22, 2021
    • Vladimir Oltean's avatar
      net: switchdev: fix FDB entries towards foreign ports not getting propagated to us · 2b0a5688
      Vladimir Oltean authored
      
      The newly introduced switchdev_handle_fdb_{add,del}_to_device helpers
      solved a problem but introduced another one. They have a severe design
      bug: they do not propagate FDB events on foreign interfaces to us, i.e.
      this use case:
      
               br0
              /   \
             /     \
            /       \
           /         \
         swp0       eno0
      (switchdev)  (foreign)
      
      when an address is learned on eno0, what is supposed to happen is that
      this event should also be propagated towards swp0. Somehow I managed to
      convince myself that this did work correctly, but obviously it does not.
      
      The trouble with foreign interfaces is that we must reach a switchdev
      net_device pointer through a foreign net_device that has no direct
      upper/lower relationship with it. So we need to do exploratory searching
      through the lower interfaces of the foreign net_device's bridge upper
      (to reach swp0 from eno0, we must check its upper, br0, for lower
      interfaces that pass the check_cb and foreign_dev_check_cb). This is
      something that the previous code did not do, it just assumed that "dev"
      will become a switchdev interface at some point, somehow, probably by
      magic.
      
      With this patch, assisted address learning on the CPU port works again
      in DSA:
      
      ip link add br0 type bridge
      ip link set swp0 master br0
      ip link set eno0 master br0
      ip link set br0 up
      
      [   46.708929] mscc_felix 0000:00:00.5 swp0: Adding FDB entry towards eno0, addr 00:04:9f:05:f4:ab vid 0 as host address
      
      Fixes: 8ca07176 ("net: switchdev: introduce a fanout helper for SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE")
      Reported-by: default avatarEric Woudstra <ericwouds@gmail.com>
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2b0a5688
  10. Jul 21, 2021
    • Vladimir Oltean's avatar
      net: switchdev: recurse into __switchdev_handle_fdb_del_to_device · 71f4f89a
      Vladimir Oltean authored
      
      The difference between __switchdev_handle_fdb_del_to_device and
      switchdev_handle_del_to_device is that the former takes an extra
      orig_dev argument, while the latter starts with dev == orig_dev.
      
      We should recurse into the variant that does not lose the orig_dev along
      the way. This is relevant when deleting FDB entries pointing towards a
      bridge (dev changes to the lower interfaces, but orig_dev shouldn't).
      
      The addition helper already recurses properly, just the deletion one
      doesn't.
      
      Fixes: 8ca07176 ("net: switchdev: introduce a fanout helper for SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      71f4f89a
  11. Jul 20, 2021
    • Vladimir Oltean's avatar
      net: switchdev: introduce a fanout helper for SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE · 8ca07176
      Vladimir Oltean authored
      
      Currently DSA has an issue with FDB entries pointing towards the bridge
      in the presence of br_fdb_replay() being called at port join and leave
      time.
      
      In particular, each bridge port will ask for a replay for the FDB
      entries pointing towards the bridge when it joins, and for another
      replay when it leaves.
      
      This means that for example, a bridge with 4 switch ports will notify
      DSA 4 times of the bridge MAC address.
      
      But if the MAC address of the bridge changes during the normal runtime
      of the system, the bridge notifies switchdev [ once ] of the deletion of
      the old MAC address as a local FDB towards the bridge, and of the
      insertion [ again once ] of the new MAC address as a local FDB.
      
      This is a problem, because DSA keeps the old MAC address as a host FDB
      entry with refcount 4 (4 ports asked for it using br_fdb_replay). So the
      old MAC address will not be deleted. Additionally, the new MAC address
      will only be installed with refcount 1, and when the first switch port
      leaves the bridge (leaving 3 others as still members), it will delete
      with it the new MAC address of the bridge from the local FDB entries
      kept by DSA (because the br_fdb_replay call on deletion will bring the
      entry's refcount from 1 to 0).
      
      So the problem, really, is that the number of br_fdb_replay() calls is
      not matched with the refcount that a host FDB is offloaded to DSA during
      normal runtime.
      
      An elegant way to solve the problem would be to make the switchdev
      notification emitted by br_fdb_change_mac_address() result in a host FDB
      kept by DSA which has a refcount exactly equal to the number of ports
      under that bridge. Then, no matter how many DSA ports join or leave that
      bridge, the host FDB entry will always be deleted when there are exactly
      zero remaining DSA switch ports members of the bridge.
      
      To implement the proposed solution, we remember that the switchdev
      objects and port attributes have some helpers provided by switchdev,
      which can be optionally called by drivers:
      switchdev_handle_port_obj_{add,del} and switchdev_handle_port_attr_set.
      These helpers:
      - fan out a switchdev object/attribute emitted for the bridge towards
        all the lower interfaces that pass the check_cb().
      - fan out a switchdev object/attribute emitted for a bridge port that is
        a LAG towards all the lower interfaces that pass the check_cb().
      
      In other words, this is the model we need for the FDB events too:
      something that will keep an FDB entry emitted towards a physical port as
      it is, but translate an FDB entry emitted towards the bridge into N FDB
      entries, one per physical port.
      
      Of course, there are many differences between fanning out a switchdev
      object (VLAN) on 3 lower interfaces of a LAG and fanning out an FDB
      entry on 3 lower interfaces of a LAG. Intuitively, an FDB entry towards
      a LAG should be treated specially, because FDB entries are unicast, we
      can't just install the same address towards 3 destinations. It is
      imaginable that drivers might want to treat this case specifically, so
      create some methods for this case and do not recurse into the LAG lower
      ports, just the bridge ports.
      
      DSA also listens for FDB entries on "foreign" interfaces, aka interfaces
      bridged with us which are not part of our hardware domain: think an
      Ethernet switch bridged with a Wi-Fi AP. For those addresses, DSA
      installs host FDB entries. However, there we have the same problem
      (those host FDB entries are installed with a refcount of only 1) and an
      even bigger one which we did not have with FDB entries towards the
      bridge:
      
      br_fdb_replay() is currently not called for FDB entries on foreign
      interfaces, just for the physical port and for the bridge itself.
      
      So when DSA sniffs an address learned by the software bridge towards a
      foreign interface like an e1000 port, and then that e1000 leaves the
      bridge, DSA remains with the dangling host FDB address. That will be
      fixed separately by replaying all FDB entries and not just the ones
      towards the port and the bridge.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8ca07176
  12. Jun 28, 2021
    • Vladimir Oltean's avatar
      net: switchdev: add a context void pointer to struct switchdev_notifier_info · 69bfac96
      Vladimir Oltean authored
      
      In the case where the driver asks for a replay of a certain type of
      event (port object or attribute) for a bridge port that is a LAG, it may
      do so because this port has just joined the LAG.
      
      But there might already be other switchdev ports in that LAG, and it is
      preferable that those preexisting switchdev ports do not act upon the
      replayed event.
      
      The solution is to add a context to switchdev events, which is NULL most
      of the time (when the bridge layer initiates the call) but which can be
      set to a value controlled by the switchdev driver when a replay is
      requested. The driver can then check the context to figure out if all
      ports within the LAG should act upon the switchdev event, or just the
      ones that match the context.
      
      We have to modify all switchdev_handle_* helper functions as well as the
      prototypes in the drivers that use these helpers too, because these
      helpers hide the underlying struct switchdev_notifier_info from us and
      there is no way to retrieve the context otherwise.
      
      The context structure will be populated and used in later patches.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      69bfac96
  13. Feb 15, 2021
  14. Feb 13, 2021
  15. Jan 28, 2021
    • Masahiro Yamada's avatar
      net: switchdev: use obj-$(CONFIG_NET_SWITCHDEV) form in net/Makefile · 0cfd99b4
      Masahiro Yamada authored
      
      CONFIG_NET_SWITCHDEV is a bool option. Change the ifeq conditional to
      the standard obj-$(CONFIG_NET_SWITCHDEV) form.
      
      Use obj-y in net/switchdev/Makefile because Kbuild visits this Makefile
      only when CONFIG_NET_SWITCHDEV=y.
      
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      Link: https://lore.kernel.org/r/20210125231659.106201-3-masahiroy@kernel.org
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      0cfd99b4
    • Rasmus Villemoes's avatar
      net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP · 20776b46
      Rasmus Villemoes authored
      
      It's not true that switchdev_port_obj_notify() only inspects the
      ->handled field of "struct switchdev_notifier_port_obj_info" if
      call_switchdev_blocking_notifiers() returns 0 - there's a WARN_ON()
      triggering for a non-zero return combined with ->handled not being
      true. But the real problem here is that -EOPNOTSUPP is not being
      properly handled.
      
      The wrapper functions switchdev_handle_port_obj_add() et al change a
      return value of -EOPNOTSUPP to 0, and the treatment of ->handled in
      switchdev_port_obj_notify() seems to be designed to change that back
      to -EOPNOTSUPP in case nobody actually acted on the notifier (i.e.,
      everybody returned -EOPNOTSUPP).
      
      Currently, as soon as some device down the stack passes the check_cb()
      check, ->handled gets set to true, which means that
      switchdev_port_obj_notify() cannot actually ever return -EOPNOTSUPP.
      
      This, for example, means that the detection of hardware offload
      support in the MRP code is broken: switchdev_port_obj_add() used by
      br_mrp_switchdev_send_ring_test() always returns 0, so since the MRP
      code thinks the generation of MRP test frames has been offloaded, no
      such frames are actually put on the wire. Similarly,
      br_mrp_switchdev_set_ring_role() also always returns 0, causing
      mrp->ring_role_offloaded to be set to 1.
      
      To fix this, continue to set ->handled true if any callback returns
      success or any error distinct from -EOPNOTSUPP. But if all the
      callbacks return -EOPNOTSUPP, make sure that ->handled stays false, so
      the logic in switchdev_port_obj_notify() can propagate that
      information.
      
      Fixes: 9a9f26e8 ("bridge: mrp: Connect MRP API with the switchdev API")
      Fixes: f30f0601 ("switchdev: Add helpers to aid traversal through lower devices")
      Reviewed-by: default avatarPetr Machata <petrm@nvidia.com>
      Signed-off-by: default avatarRasmus Villemoes <rasmus.villemoes@prevas.dk>
      Link: https://lore.kernel.org/r/20210125124116.102928-1-rasmus.villemoes@prevas.dk
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      20776b46
  16. Jan 12, 2021
    • Vladimir Oltean's avatar
      net: switchdev: remove the transaction structure from port attributes · bae33f2b
      Vladimir Oltean authored
      
      Since the introduction of the switchdev API, port attributes were
      transmitted to drivers for offloading using a two-step transactional
      model, with a prepare phase that was supposed to catch all errors, and a
      commit phase that was supposed to never fail.
      
      Some classes of failures can never be avoided, like hardware access, or
      memory allocation. In the latter case, merely attempting to move the
      memory allocation to the preparation phase makes it impossible to avoid
      memory leaks, since commit 91cf8ece ("switchdev: Remove unused
      transaction item queue") which has removed the unused mechanism of
      passing on the allocated memory between one phase and another.
      
      It is time we admit that separating the preparation from the commit
      phase is something that is best left for the driver to decide, and not
      something that should be baked into the API, especially since there are
      no switchdev callers that depend on this.
      
      This patch removes the struct switchdev_trans member from switchdev port
      attribute notifier structures, and converts drivers to not look at this
      member.
      
      In part, this patch contains a revert of my previous commit 2e554a7a
      ("net: dsa: propagate switchdev vlan_filtering prepare phase to
      drivers").
      
      For the most part, the conversion was trivial except for:
      - Rocker's world implementation based on Broadcom OF-DPA had an odd
        implementation of ofdpa_port_attr_bridge_flags_set. The conversion was
        done mechanically, by pasting the implementation twice, then only
        keeping the code that would get executed during prepare phase on top,
        then only keeping the code that gets executed during the commit phase
        on bottom, then simplifying the resulting code until this was obtained.
      - DSA's offloading of STP state, bridge flags, VLAN filtering and
        multicast router could be converted right away. But the ageing time
        could not, so a shim was introduced and this was left for a further
        commit.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Acked-by: default avatarJiri Pirko <jiri@nvidia.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org> # RTL8366RB
      Reviewed-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      bae33f2b
    • Vladimir Oltean's avatar
      net: switchdev: delete switchdev_port_obj_add_now · cf6def51
      Vladimir Oltean authored
      
      After the removal of the transactional model inside
      switchdev_port_obj_add_now, it has no added value and we can just call
      switchdev_port_obj_notify directly, bypassing this function. Let's
      delete it.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Acked-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Acked-by: default avatarJiri Pirko <jiri@nvidia.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      cf6def51
    • Vladimir Oltean's avatar
      net: switchdev: remove the transaction structure from port object notifiers · ffb68fc5
      Vladimir Oltean authored
      
      Since the introduction of the switchdev API, port objects were
      transmitted to drivers for offloading using a two-step transactional
      model, with a prepare phase that was supposed to catch all errors, and a
      commit phase that was supposed to never fail.
      
      Some classes of failures can never be avoided, like hardware access, or
      memory allocation. In the latter case, merely attempting to move the
      memory allocation to the preparation phase makes it impossible to avoid
      memory leaks, since commit 91cf8ece ("switchdev: Remove unused
      transaction item queue") which has removed the unused mechanism of
      passing on the allocated memory between one phase and another.
      
      It is time we admit that separating the preparation from the commit
      phase is something that is best left for the driver to decide, and not
      something that should be baked into the API, especially since there are
      no switchdev callers that depend on this.
      
      This patch removes the struct switchdev_trans member from switchdev port
      object notifier structures, and converts drivers to not look at this
      member.
      
      Where driver conversion is trivial (like in the case of the Marvell
      Prestera driver, NXP DPAA2 switch, TI CPSW, and Rocker drivers), it is
      done in this patch.
      
      Where driver conversion needs more attention (DSA, Mellanox Spectrum),
      the conversion is left for subsequent patches and here we only fake the
      prepare/commit phases at a lower level, just not in the switchdev
      notifier itself.
      
      Where the code has a natural structure that is best left alone as a
      preparation and a commit phase (as in the case of the Ocelot switch),
      that structure is left in place, just made to not depend upon the
      switchdev transactional model.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Acked-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Acked-by: default avatarJiri Pirko <jiri@nvidia.com>
      Reviewed-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      ffb68fc5
  17. Sep 24, 2020
  18. Jul 14, 2020
  19. Jun 13, 2020
    • Masahiro Yamada's avatar
      treewide: replace '---help---' in Kconfig files with 'help' · a7f7f624
      Masahiro Yamada authored
      
      Since commit 84af7a61 ("checkpatch: kconfig: prefer 'help' over
      '---help---'"), the number of '---help---' has been gradually
      decreasing, but there are still more than 2400 instances.
      
      This commit finishes the conversion. While I touched the lines,
      I also fixed the indentation.
      
      There are a variety of indentation styles found.
      
        a) 4 spaces + '---help---'
        b) 7 spaces + '---help---'
        c) 8 spaces + '---help---'
        d) 1 space + 1 tab + '---help---'
        e) 1 tab + '---help---'    (correct indentation)
        f) 1 tab + 1 space + '---help---'
        g) 1 tab + 2 spaces + '---help---'
      
      In order to convert all of them to 1 tab + 'help', I ran the
      following commend:
      
        $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'
      
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      a7f7f624
  20. Feb 27, 2020
  21. Feb 18, 2020
  22. May 30, 2019
  23. May 21, 2019
  24. Mar 02, 2019
  25. Feb 27, 2019
  26. Feb 25, 2019
  27. Feb 06, 2019
  28. Jan 17, 2019
  29. Dec 13, 2018
  30. Nov 24, 2018
    • Petr Machata's avatar
      switchdev: Replace port obj add/del SDO with a notification · d17d9f5e
      Petr Machata authored
      
      Drop switchdev_ops.switchdev_port_obj_add and _del. Drop the uses of
      this field from all clients, which were migrated to use switchdev
      notification in the previous patches.
      
      Add a new function switchdev_port_obj_notify() that sends the switchdev
      notifications SWITCHDEV_PORT_OBJ_ADD and _DEL.
      
      Update switchdev_port_obj_del_now() to dispatch to this new function.
      Drop __switchdev_port_obj_add() and update switchdev_port_obj_add()
      likewise.
      
      Signed-off-by: default avatarPetr Machata <petrm@mellanox.com>
      Reviewed-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d17d9f5e
    • Petr Machata's avatar
      switchdev: Add helpers to aid traversal through lower devices · f30f0601
      Petr Machata authored
      
      After the transition from switchdev operations to notifier chain (which
      will take place in following patches), the onus is on the driver to find
      its own devices below possible layer of LAG or other uppers.
      
      The logic to do so is fairly repetitive: each driver is looking for its
      own devices among the lowers of the notified device. For those that it
      finds, it calls a handler. To indicate that the event was handled,
      struct switchdev_notifier_port_obj_info.handled is set. The differences
      lie only in what constitutes an "own" device and what handler to call.
      
      Therefore abstract this logic into two helpers,
      switchdev_handle_port_obj_add() and switchdev_handle_port_obj_del(). If
      a driver only supports physical ports under a bridge device, it will
      simply avoid this layer of indirection.
      
      One area where this helper diverges from the current switchdev behavior
      is the case of mixed lowers, some of which are switchdev ports and some
      of which are not. Previously, such scenario would fail with -EOPNOTSUPP.
      The helper could do that for lowers for which the passed-in predicate
      doesn't hold. That would however break the case that switchdev ports
      from several different drivers are stashed under one master, a scenario
      that switchdev currently happily supports. Therefore tolerate any and
      all unknown netdevices, whether they are backed by a switchdev driver
      or not.
      
      Signed-off-by: default avatarPetr Machata <petrm@mellanox.com>
      Reviewed-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f30f0601
    • Petr Machata's avatar
      switchdev: Add a blocking notifier chain · a93e3b17
      Petr Machata authored
      
      In general one can't assume that a switchdev notifier is called in a
      non-atomic context, and correspondingly, the switchdev notifier chain is
      an atomic one.
      
      However, port object addition and deletion messages are delivered from a
      process context. Even the MDB addition messages, whose delivery is
      scheduled from atomic context, are queued and the delivery itself takes
      place in blocking context. For VLAN messages in particular, keeping the
      blocking nature is important for error reporting.
      
      Therefore introduce a blocking notifier chain and related service
      functions to distribute the notifications for which a blocking context
      can be assumed.
      
      Signed-off-by: default avatarPetr Machata <petrm@mellanox.com>
      Reviewed-by: default avatarJiri Pirko <jiri@mellanox.com>
      Reviewed-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a93e3b17
  31. Nov 10, 2017
Loading