
ansible/networkd: Reliability improvements

Martin Roukala requested to merge mupuf/ci-tron:networkd_reliability into main

This is the second time in 24 hours that network-setup.service has crashed on my gateway, taking down hostapd and, with it, all the PDUs connected through WiFi.

Here is an example backtrace:

Nov 15 10:13:11 mupuf-gateway network_setup[110]:   File "/usr/local/bin/networkd", line 167, in run
Nov 15 10:13:11 mupuf-gateway network_setup[110]:     nics = [n for n in NIC.list_ifs() if n.is_ethernet or n.is_wifi]
Nov 15 10:13:11 mupuf-gateway network_setup[110]:                        ^^^^^^^^^^^^^^
Nov 15 10:13:11 mupuf-gateway network_setup[110]:   File "/usr/local/bin/networkd", line 153, in list_ifs
Nov 15 10:13:11 mupuf-gateway network_setup[110]:     nics.append(NIC(name))
Nov 15 10:13:11 mupuf-gateway network_setup[110]:                 ^^^^^^^^^
Nov 15 10:13:11 mupuf-gateway network_setup[110]:   File "/usr/local/bin/networkd", line 64, in __init__
Nov 15 10:13:11 mupuf-gateway network_setup[110]:     raise ValueError('The NIC name does not exist')
Nov 15 10:13:11 mupuf-gateway network_setup[110]: ValueError: The NIC name does not exist
Nov 15 10:13:11 mupuf-gateway systemd[1]: network-setup.service: Main process exited, code=exited, status=1/FAILURE

This MR fixes this exact issue and adds multiple layers of protection to prevent similar situations from happening.
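
The backtrace suggests a race: an interface shows up in the listing but is gone by the time `NIC(name)` is constructed, so the `ValueError` propagates out of `list_ifs()` and kills the service. The snippet below is only a minimal sketch of one way to tolerate that, not the actual change in this MR. The `NIC` class here is a hypothetical stand-in for the one in `/usr/local/bin/networkd`, and the `sysfs_root` parameter and the `/sys/class/net` lookup are assumptions made for illustration.

```python
import os


class NIC:
    """Hypothetical stand-in for the NIC class in /usr/local/bin/networkd."""

    # Assumption: interfaces are discovered through sysfs.
    SYSFS_ROOT = "/sys/class/net"

    def __init__(self, name, sysfs_root=SYSFS_ROOT):
        self.name = name
        self.path = os.path.join(sysfs_root, name)
        # Same failure mode as in the backtrace: the interface is not there.
        if not os.path.isdir(self.path):
            raise ValueError('The NIC name does not exist')

    @classmethod
    def list_ifs(cls, sysfs_root=SYSFS_ROOT):
        nics = []
        try:
            names = sorted(os.listdir(sysfs_root))
        except FileNotFoundError:
            # Nothing to enumerate; report no interfaces instead of crashing.
            return nics

        for name in names:
            try:
                nics.append(cls(name, sysfs_root))
            except ValueError:
                # The interface vanished between the listing and the lookup
                # (for example a WiFi or USB NIC being removed). Skip it so
                # that one missing NIC does not take the whole service down.
                continue
        return nics
```

With this shape, the list comprehension from the backtrace (`[n for n in NIC.list_ifs() if n.is_ethernet or n.is_wifi]`) would only ever see interfaces that still existed when they were constructed.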
