[th/platform-netlink-alloc] use pre-allocated receive buffer for rtnetlink socket
UPDATE 2022.2.2:
MOTIVATION:
Trying to improve performance when receiving netlink messages (in particular, in scenarios with a huge number of netlink messages, like on a BGP router with 800k routes, where we easily get 50 messages/sec). The approach here is to save one heap allocation in the receive code path (the receive path for such objects is often short, because we throw away the routes relatively early, or because we already have them in the cache from the previous dump).
This is also inspired by n-dhcp4, which pre-allocates a 64k receive buffer for the DHCP lease. In that case there is actually a problem, because we might have thousands of n-dhcp4 instances (with many interfaces), wasting memory (see https://github.com/nettools/n-dhcp4/pull/27). But in the platform case there is only a single instance, so this branch only requires one 32k buffer.
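To illustrate the idea, here is a minimal sketch, not the actual NetworkManager code: the names RecvBuffer and recv_one_msg() are hypothetical, made up for illustration. Instead of malloc()ing a fresh buffer for every received message, the socket wrapper keeps one buffer for the lifetime of the instance and receives into that.

```c
/* Minimal sketch; RecvBuffer and recv_one_msg() are hypothetical
 * names, not NetworkManager's actual API. */
#include <sys/types.h>
#include <sys/socket.h>

#define RECV_BUF_LEN (32 * 1024)

typedef struct {
    int           fd;                /* the rtnetlink socket */
    unsigned char buf[RECV_BUF_LEN]; /* allocated once, with the instance */
} RecvBuffer;

/* Receive one netlink datagram into the pre-allocated buffer.
 * No heap allocation happens on this path; parsing the contained
 * messages (e.g. into NMPObject instances) still allocates. */
static ssize_t
recv_one_msg(RecvBuffer *rb)
{
    return recv(rb->fd, rb->buf, sizeof(rb->buf), 0);
}
```

Note that a netlink datagram larger than the buffer gets truncated by the kernel; detecting and handling that (e.g. via MSG_TRUNC) is out of scope of this sketch.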
I was testing performance with many routes.
With commit 3416b009, performance still seems to be good (after we dropped the BGP filter in commit c37a21ea).
Still, I figure we don't need a malloc for each message we receive... of course, while parsing the message we malloc an NMPObject instance (and there are possibly many objects in one message). So this probably does not significantly reduce the allocations during processing of netlink events. Still...
It's similar to n-dhcp4, which has/had a receive buffer per instance. The difference is that we potentially have many n-dhcp4 instances (where it was a problem to keep around 64k buffers for each), while in practice we have fewer NMPlatform instances (one per netns, and today there is exactly one netns/NMPlatform instance).