libnm-core/nm-setting-ip-config.c · 2d1ae8dd976c3de0c50f1be7a07686a1c05880ca · NetworkManager / NetworkManager

libnm: use nm_utils_escaped_tokens_*() for parsing NMIPRoutingRule · 2d1ae8dd
Thomas Haller authored Apr 12, 2019
Replace nm_utils_str_simpletokens_extract_next() by
nm_utils_escaped_tokens_split().

nm_utils_escaped_tokens_split() should become our first choice for
parsing and tokenizing.

Note that both nm_utils_str_simpletokens_extract_next() and
nm_utils_escaped_tokens_split() need to strdup the string once,
and tokenizing takes O(n). So, they are roughtly the same performance
wise. The only difference is, that as we iterate through the tokens,
we might abort early on error with nm_utils_str_simpletokens_extract_next()
and not parse the entire string. But that is a small benefit, since we
anyway always strdup() the string (being O(n) already).

Note that to-string will no longer escape ',' and ';'. This is a change
in behavior, of unreleased API. Also note, that escaping these is no
longer necessary, because nmcli soon will also use nm_utils_escaped_tokens_*().

Another change in behavior is that nm_utils_str_simpletokens_extract_next()
treated invalid escape sequences (backslashes followed by an arbitrary
character), buy stripping the backslash. nm_utils_escaped_tokens_*()
leaves such backslashes as is, and only honors them if they are followed
by a whitespace (the delimiter) or another backslash. The disadvantage
of the new approach is that backslashes are treated differently
depending on the following character. The benefit is, that most
backslashes can now be written verbatim, not requiring them to escape
them with a double-backslash.

Yes, there is a problem with these nested escape schemes:

  - the caller may already need to escape backslash in shell.

  - then nmcli will use backslash escaping to split the rules at ','.

  - then nm_ip_routing_rule_from_string() will honor backslash escaping
    for spaces.

  - then iifname and oifname use backslash escaping for nm_utils_buf_utf8safe_escape()
    to express non-UTF-8 characters (because interface names are not
    necessarily UTF-8).

This is only redeamed because escaping is really only necessary for very
unusual cases, if you want to embed a backslash, a space, a comma, or a
non-UTF-8 character. But if you have to, now you will be able to express
that.

The other upside of these layers of escaping is that they become all
indendent from each other:

  - shell can accept quoted/escaped arguments and will unescape them.

  - nmcli can do the tokenizing for ',' (and escape the content
    unconditionally when converting to string).

  - nm_ip_routing_rule_from_string() can do its tokenizing without
    special consideration of utf8safe escaping.

  - NMIPRoutingRule takes iifname/oifname as-is and is not concerned
    about nm_utils_buf_utf8safe_escape(). However, before configuring
    the rule in kernel, this utf8safe escape will be unescaped to get
    the interface name (which is non-UTF8 binary).

(cherry picked from commit b6d0be2d)
2d1ae8dd