    libnm: use nm_utils_escaped_tokens_*() for parsing NMIPRoutingRule · 2d1ae8dd
    Thomas Haller authored
    Replace nm_utils_str_simpletokens_extract_next() by
    nm_utils_escaped_tokens_split().
    
    nm_utils_escaped_tokens_split() should become our first choice for
    parsing and tokenizing.
    
    Note that both nm_utils_str_simpletokens_extract_next() and
    nm_utils_escaped_tokens_split() need to strdup the string once,
    and tokenizing takes O(n). So, they perform roughly the same.
    The only difference is that, as we iterate through the tokens, we
    might abort early on error with nm_utils_str_simpletokens_extract_next()
    and not parse the entire string. But that is a small benefit, since we
    always strdup() the string anyway (which is already O(n)).
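    
    To illustrate the cost argument, here is a minimal sketch of the
    "copy once, then tokenize in one pass" pattern. It is not the libnm
    implementation: sketch_split_copy() is a hypothetical helper, it
    splits at plain spaces only, and it ignores escaping entirely. A
    malloc()/memcpy() pair stands in for strdup().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch, not libnm API: split a copy of `str` at spaces.
 * Returns a NULL-terminated array of pointers into one duplicated
 * buffer. The caller frees both the array and *out_buf. */
static char **
sketch_split_copy(const char *str, char **out_buf)
{
    size_t len, n = 0;
    char *buf, *p;
    char **tokens;

    assert(str != NULL && out_buf != NULL);

    len = strlen(str);
    buf = malloc(len + 1); /* the single O(n) copy */
    /* At most len/2 + 1 tokens, plus the NULL terminator. */
    tokens = malloc((len / 2 + 2) * sizeof(*tokens));
    if (buf == NULL || tokens == NULL) {
        free(buf);
        free(tokens);
        return NULL;
    }
    memcpy(buf, str, len + 1);

    /* One O(n) pass: record token starts, overwrite delimiters with NUL. */
    p = buf;
    for (;;) {
        while (*p == ' ')
            p++; /* skip delimiters */
        if (*p == '\0')
            break;
        tokens[n++] = p; /* a token starts here */
        while (*p != '\0' && *p != ' ')
            p++;
        if (*p == '\0')
            break;
        *p++ = '\0'; /* terminate the token in place */
    }
    tokens[n] = NULL;
    *out_buf = buf;
    return tokens;
}
```

    Either way the whole input is copied before the first token is
    looked at, so aborting early during iteration only saves part of
    the second O(n) pass.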
    
    Note that to-string will no longer escape ',' and ';'. This is a change
    in behavior of unreleased API. Also note that escaping these is no
    longer necessary, because nmcli will soon also use nm_utils_escaped_tokens_*().
    
    Another change in behavior is that nm_utils_str_simpletokens_extract_next()
    treated invalid escape sequences (backslashes followed by an arbitrary
    character) by stripping the backslash. nm_utils_escaped_tokens_*()
    leaves such backslashes as is, and only honors them if they are followed
    by a whitespace (the delimiter) or another backslash. The disadvantage
    of the new approach is that backslashes are treated differently
    depending on the following character. The benefit is that most
    backslashes can now be written verbatim, without having to escape
    them with a double backslash.
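    
    The two unescaping behaviors described above can be contrasted with
    a small sketch. Both functions are hypothetical helpers written for
    illustration, not the libnm code. The set of characters counted as
    whitespace and the handling of a trailing backslash are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: which characters count as whitespace delimiters. */
static int
sketch_is_space(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '\f';
}

/* Old behavior as described: every backslash is stripped and the
 * character after it is kept. A trailing backslash is kept as is
 * (an assumption). Works in place. */
static void
sketch_unescape_strip_all(char *s)
{
    char *out = s;

    assert(s != NULL);
    while (*s != '\0') {
        if (*s == '\\' && s[1] != '\0')
            s++; /* drop the backslash, keep what follows */
        *out++ = *s++;
    }
    *out = '\0';
}

/* New behavior as described: a backslash is only honored when it is
 * followed by whitespace or another backslash. Any other backslash
 * stays verbatim. Works in place. */
static void
sketch_unescape_selective(char *s)
{
    char *out = s;

    assert(s != NULL);
    while (*s != '\0') {
        if (*s == '\\' && (s[1] == '\\' || sketch_is_space(s[1])))
            s++; /* drop the backslash, keep what follows */
        *out++ = *s++;
    }
    *out = '\0';
}
```

    With the input a\-b, the first function yields a-b while the second
    leaves a\-b untouched. For "a\ b" (escaped space) and "a\\b"
    (escaped backslash) both give the same result.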
    
    Yes, there is a problem with these nested escape schemes:
    
      - the caller may already need to escape backslash in shell.
    
      - then nmcli will use backslash escaping to split the rules at ','.
    
      - then nm_ip_routing_rule_from_string() will honor backslash escaping
        for spaces.
    
      - then iifname and oifname use backslash escaping for nm_utils_buf_utf8safe_escape()
        to express non-UTF-8 characters (because interface names are not
        necessarily UTF-8).
    
    This is only redeemed because escaping is really only necessary for very
    unusual cases: if you want to embed a backslash, a space, a comma, or a
    non-UTF-8 character. But if you have to, now you will be able to express
    that.
    
    The other upside of these layers of escaping is that they all become
    independent from each other:
    
      - shell can accept quoted/escaped arguments and will unescape them.
    
      - nmcli can do the tokenizing for ',' (and escape the content
        unconditionally when converting to string).
    
      - nm_ip_routing_rule_from_string() can do its tokenizing without
        special consideration of utf8safe escaping.
    
      - NMIPRoutingRule takes iifname/oifname as-is and is not concerned
        about nm_utils_buf_utf8safe_escape(). However, before configuring
        the rule in kernel, this utf8safe escape will be unescaped to get
        the interface name (which is non-UTF8 binary).
    
    (cherry picked from commit b6d0be2d)