  1. Dec 09, 2022
    • bpf: Rework process_dynptr_func · 27060531
      Kumar Kartikeya Dwivedi authored
      
      Recently, user ringbuf support introduced a PTR_TO_DYNPTR register type
      for use in callback state, because in the case of user ringbuf helpers,
      there is no dynptr on the stack that is passed into the callback. To
      reflect such a state, a special register type was created.
      
      However, some checks were bypassed incorrectly when this feature was
      added. First, helpers whose arg_type carries the MEM_UNINIT flag,
      i.e. those which initialize a dynptr, must reject such a register type.
      Secondly, in the future, there are plans to add dynptr helpers that
      operate on the dynptr itself and may change its offset and other
      properties.
      
      In all of these cases, PTR_TO_DYNPTR shouldn't be allowed to be passed
      to such helpers; however, the current code simply returns 0.
      
      The rejection for helpers that release the dynptr is already handled.
      
      To fix this, we take a step back and rework the existing code in a way
      that allows fitting in all classes of helpers and provides a coherent
      model for dealing with the variety of use cases in which dynptrs are used.
      
      First, for ARG_PTR_TO_DYNPTR, it can either be set alone or together
      with a DYNPTR_TYPE_* constant that denotes the only type it accepts.
      
      Next, helpers which initialize a dynptr use MEM_UNINIT to indicate this
      fact. To make the distinction clear, the MEM_RDONLY flag is used to
      indicate that the helper only operates on the memory pointed to by the
      dynptr, not the dynptr itself. In C parlance, this is equivalent to
      taking the dynptr as a pointer-to-const argument.
      
      When neither of these flags is present, the helper is allowed to
      mutate both the dynptr itself and the memory it points to.
      Currently, the read-only status of the memory is not tracked in the
      dynptr, but it would be trivial to add this support inside the dynptr
      state of the register.
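
      As an illustration, the three classes would show up in helper protos
      roughly as follows (a sketch; the exact flag combination for any given
      helper lives in its bpf_func_proto definition):

          /* initializes a dynptr: uninitialized arg, must live on the stack */
          .arg1_type = ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL | MEM_UNINIT,

          /* only reads the memory the dynptr points to */
          .arg1_type = ARG_PTR_TO_DYNPTR | MEM_RDONLY,

          /* may mutate the dynptr itself (offset etc.) and its memory */
          .arg1_type = ARG_PTR_TO_DYNPTR,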
      
      With these changes, and with PTR_TO_DYNPTR renamed to
      CONST_PTR_TO_DYNPTR to better reflect its usage, it can no longer be
      passed to helpers that initialize a dynptr, i.e. bpf_dynptr_from_mem,
      bpf_ringbuf_reserve_dynptr.
      
      A note to reviewers is that in code that does mark_stack_slots_dynptr
      and unmark_stack_slots_dynptr, we implicitly rely on the fact that a
      PTR_TO_STACK reg is the only case that can reach that code path, as one
      cannot pass CONST_PTR_TO_DYNPTR to helpers that initialize or release a
      dynptr; in both cases, such helpers won't be setting the MEM_RDONLY flag.
      
      The next patch will add a couple of selftest cases to make sure this
      doesn't break.
      
      Fixes: 20571567 ("bpf: Add bpf_user_ringbuf_drain() helper")
      Acked-by: Joanne Koong <joannelkoong@gmail.com>
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221207204141.308952-4-memxor@gmail.com
      
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  2. Oct 26, 2022
    • bpf: Implement cgroup storage available to non-cgroup-attached bpf progs · c4bcfb38
      Yonghong Song authored
      
      Similar to sk/inode/task storage, implement cgroup local storage.
      
      There already exists a local storage implementation for cgroup-attached
      bpf programs; see map type BPF_MAP_TYPE_CGROUP_STORAGE and helper
      bpf_get_local_storage(). But there are use cases where non-cgroup-attached
      bpf progs want to access cgroup local storage data. For example, a tc
      egress prog has access to the sk and the cgroup. It is possible to use
      sk local storage to emulate cgroup local storage by storing the data in
      each socket, but this is wasteful, as there could be many sockets
      belonging to a particular cgroup. Alternatively, a separate map can be
      created with the cgroup id as the key, but this introduces additional
      overhead to manipulate the new map. A cgroup local storage, similar to
      the existing sk/inode/task storage, helps this use case.
      
      The life-cycle of the storage is managed with the life-cycle of the
      cgroup struct, i.e. the storage is destroyed along with the owning
      cgroup, with a call to bpf_cgrp_storage_free() when the cgroup itself
      is deleted.
      
      The userspace map operations can be done by using a cgroup fd as a key
      passed to the lookup, update and delete operations.
      
      Typically, the following code is used to get the current cgroup:
          struct task_struct *task = bpf_get_current_task_btf();
          ... task->cgroups->dfl_cgrp ...
      and in structure task_struct definition:
          struct task_struct {
              ....
              struct css_set __rcu            *cgroups;
              ....
          }
      In a sleepable program, accessing task->cgroups is not protected by
      rcu_read_lock(), so the current implementation only supports
      non-sleepable programs. Supporting sleepable programs will be the next
      step, together with adding rcu_read_lock() protection for rcu-tagged
      structures.
      
      Since the map name BPF_MAP_TYPE_CGROUP_STORAGE has been used for the old
      cgroup local storage support, the new map name BPF_MAP_TYPE_CGRP_STORAGE
      is used for cgroup storage available to non-cgroup-attached bpf
      programs. The old cgroup storage supports the bpf_get_local_storage()
      helper to get the cgroup data; the new cgroup storage helper
      bpf_cgrp_storage_get() provides similar functionality. While the old
      cgroup storage pre-allocates storage memory, the new mechanism can also
      pre-allocate, with a user space bpf_map_update_elem() call, to avoid
      potential run-time memory allocation failures. Therefore, the new
      cgroup storage provides all the functionality of the old one. So in
      uapi bpf.h, the old BPF_MAP_TYPE_CGROUP_STORAGE is aliased to
      BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED to indicate that the old cgroup
      storage can be deprecated, since the new one provides the same
      functionality.
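
      A minimal sketch of the new map type in use (the map name, section, and
      counter semantics here are illustrative, not part of the patch):

          struct {
              __uint(type, BPF_MAP_TYPE_CGRP_STORAGE);
              __uint(map_flags, BPF_F_NO_PREALLOC);
              __type(key, int);
              __type(value, long);
          } cgrp_map SEC(".maps");

          SEC("tp_btf/sys_enter")
          int BPF_PROG(count_entry, struct pt_regs *regs, long id)
          {
              struct task_struct *task = bpf_get_current_task_btf();
              long *ptr;

              /* non-sleepable only: task->cgroups deref relies on RCU */
              ptr = bpf_cgrp_storage_get(&cgrp_map, task->cgroups->dfl_cgrp, 0,
                                         BPF_LOCAL_STORAGE_GET_F_CREATE);
              if (ptr)
                  __sync_fetch_and_add(ptr, 1);
              return 0;
          }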
      
      Acked-by: David Vernet <void@manifault.com>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/r/20221026042850.673791-1-yhs@fb.com
      
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  3. Oct 06, 2022
    • scripts/bpf_doc.py: update logic to not assume sequential enum values · ce3e44a0
      Andrii Nakryiko authored
      
      Relax bpf_doc.py's expectation that all BPF_FUNC_xxx enumerators have
      sequential values increasing by one. Instead, only make sure that the
      relative order of BPF helper descriptions in comments matches the order
      of the enumerator definitions.
      
      Additionally, make sure that helper IDs are not duplicated.
      
      Also make sure that when there are multiple descriptions for the same
      BPF helper (e.g., for bpf_get_socket_cookie()), all such descriptions
      are grouped together.
      
      Such checks should capture the same issues (and more) in upstream UAPI
      headers, while also handling backported kernels correctly.
      
      Reported-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Reviewed-by: Quentin Monnet <quentin@isovalent.com>
      Link: https://lore.kernel.org/r/20221006042452.2089843-2-andrii@kernel.org
      
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: explicitly define BPF_FUNC_xxx integer values · 8a76145a
      Andrii Nakryiko authored
      
      Historically, enum bpf_func_id's BPF_FUNC_xxx enumerators relied on
      implicit sequential values being assigned by the compiler. This is
      convenient, as new BPF helpers are always added at the very end, but it
      also has its downsides, some of them being:
      
        - with over 200 helpers now, it's very hard to know what each helper's
          ID is, which is often important to know when working with BPF
          assembly (e.g., when dumping raw bpf assembly instructions with the
          llvm-objdump -d command). It's possible to work around this by
          looking into vmlinux.h, dumping /sys/kernel/btf/vmlinux, looking at
          the libbpf-provided bpf_helper_defs.h, etc. But it always feels like
          an unnecessary step, and one should be able to quickly figure this
          out from the UAPI header.
      
        - when backporting and cherry-picking only some BPF helpers onto older
          kernels it's important to be able to skip some enum values for helpers
          that weren't backported, but preserve absolute integer IDs to keep BPF
          helper IDs stable so that BPF programs stay portable across upstream
          and backported kernels.
      
      While neither problem is insurmountable, they come up frequently enough
      and are annoying enough to warrant improving the situation. And the
      backporting problem can easily go unnoticed for a while, especially if
      the backport is done by people not very familiar with the BPF subsystem
      overall.
      
      Anyway, it's easy to fix this by making sure that the __BPF_FUNC_MAPPER
      macro provides explicit helper IDs. Unfortunately, that would
      potentially break existing users that use the UAPI-exposed
      __BPF_FUNC_MAPPER and are expected to pass a macro that accepts only a
      symbolic helper identifier (e.g., map_lookup_elem for the
      bpf_map_lookup_elem() helper).
      
      As such, we need to introduce a new macro (___BPF_FUNC_MAPPER) which
      specifies both the identifier and the integer ID, but in such a way as
      to allow the existing __BPF_FUNC_MAPPER to be expressed in terms of the
      new ___BPF_FUNC_MAPPER macro. And that's what this patch does. To avoid
      duplication and allow __BPF_FUNC_MAPPER to stay *exactly* the same,
      ___BPF_FUNC_MAPPER accepts arbitrary "context" arguments, which can be
      used to pass any extra macros, arguments, and whatnot. In our case we
      use this to pass the original user-provided macro that expects a single
      argument, and __BPF_FUNC_MAPPER uses its own three-argument
      __BPF_FUNC_MAPPER_APPLY intermediate macro to impedance-match the new
      and old "callback" macros.
      
      Once this is resolved, we use the new ___BPF_FUNC_MAPPER to define enum
      bpf_func_id with explicit values. The other users of __BPF_FUNC_MAPPER
      in the kernel (namely in kernel/bpf/disasm.c) are kept exactly the
      same, both as a demonstration that backwards compat works and to avoid
      unnecessary code churn.
      
      Note that the new ___BPF_FUNC_MAPPER() doesn't forcefully insert a
      comma between values, as that might not be appropriate in all possible
      cases where ___BPF_FUNC_MAPPER might be used. This doesn't reduce
      usability, as it's trivial to insert the comma inside the "callback"
      macro.
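
      Abridged, the new macro structure is roughly (only the first few helper
      entries shown):

          #define ___BPF_FUNC_MAPPER(FN, ctx...)          \
                  FN(unspec, 0, ##ctx)                    \
                  FN(map_lookup_elem, 1, ##ctx)           \
                  FN(map_update_elem, 2, ##ctx)           \
                  /* ... */

          /* impedance-match old single-argument "callback" macros */
          #define __BPF_FUNC_MAPPER_APPLY(name, value, FN) FN(name),
          #define __BPF_FUNC_MAPPER(FN) ___BPF_FUNC_MAPPER(__BPF_FUNC_MAPPER_APPLY, FN)

          /* enum bpf_func_id now carries explicit values */
          #define __BPF_ENUM_FN(x, y) BPF_FUNC_ ## x = y,
          enum bpf_func_id {
                  ___BPF_FUNC_MAPPER(__BPF_ENUM_FN)
                  __BPF_FUNC_MAX_ID,
          };
          #undef __BPF_ENUM_FN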
      
      To validate that all the manually specified IDs are exactly right, we
      used BTF to compare the before and after values:
      
        $ bpftool btf dump file ~/linux-build/default/vmlinux | rg bpf_func_id -A 211 > after.txt
        $ git stash # stash UAPI changes
        $ make -j90
        ... re-building kernel without UAPI changes ...
        $ bpftool btf dump file ~/linux-build/default/vmlinux | rg bpf_func_id -A 211 > before.txt
        $ diff -u before.txt after.txt
        --- before.txt  2022-10-05 10:48:18.119195916 -0700
        +++ after.txt   2022-10-05 10:46:49.446615025 -0700
        @@ -1,4 +1,4 @@
        -[14576] ENUM 'bpf_func_id' encoding=UNSIGNED size=4 vlen=211
        +[9560] ENUM 'bpf_func_id' encoding=UNSIGNED size=4 vlen=211
                'BPF_FUNC_unspec' val=0
                'BPF_FUNC_map_lookup_elem' val=1
                'BPF_FUNC_map_update_elem' val=2
      
      As can be seen from the diff above, the only thing that changed is the
      resulting BTF type ID of ENUM bpf_func_id, not any of the enumerators,
      their names, or their integer values.
      
      The only other place that needed fixing was scripts/bpf_doc.py, which
      is used to generate man pages and the bpf_helper_defs.h header for
      libbpf and selftests. That script is tightly coupled to the exact shape
      of the ___BPF_FUNC_MAPPER macro definition, so it had to be trivially
      adapted.
      
      Cc: Quentin Monnet <quentin@isovalent.com>
      Reported-by: Andrea Terzolo <andrea.terzolo@polito.it>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Reviewed-by: Quentin Monnet <quentin@isovalent.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/20221006042452.2089843-1-andrii@kernel.org
      
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  4. May 23, 2022
    • bpf: Add verifier support for dynptrs · 97e03f52
      Joanne Koong authored
      
      This patch adds the bulk of the verifier work for supporting dynamic
      pointers (dynptrs) in bpf.
      
      A bpf_dynptr is opaque to the bpf program. It is a 16-byte structure
      defined internally as:
      
      struct bpf_dynptr_kern {
          void *data;
          u32 size;
          u32 offset;
      } __aligned(8);
      
      The upper 8 bits of *size* are reserved (they contain extra metadata
      about the read-only status and the dynptr type). Consequently, a dynptr
      only supports memory of less than 16 MB (the size must fit in the
      remaining 24 bits).
      
      There are different types of dynptrs (e.g. malloc, ringbuf, ...). In
      this patchset, the most basic one, dynptrs to a bpf program's local
      memory, is added. For now, only local memory that is of reg type
      PTR_TO_MAP_VALUE is supported.
      
      In the verifier, dynptr state information is tracked in stack slots.
      When the program passes in an uninitialized dynptr
      (ARG_PTR_TO_DYNPTR | MEM_UNINIT), the stack slots corresponding to the
      frame pointer where the dynptr resides are marked STACK_DYNPTR. For
      helper functions that take in initialized dynptrs (e.g. bpf_dynptr_read
      and bpf_dynptr_write, which are added later in this patchset), the
      verifier enforces that the dynptr has been initialized properly by
      checking that the corresponding stack slots have been marked
      STACK_DYNPTR.
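
      For example (a sketch using the helper signatures from the final uapi;
      the backing memory is a map value, per the restriction above):

          struct {
              __uint(type, BPF_MAP_TYPE_ARRAY);
              __uint(max_entries, 1);
              __type(key, __u32);
              __type(value, __u64);
          } array_map SEC(".maps");

          SEC("tp/syscalls/sys_enter_nanosleep")
          int dynptr_example(void *ctx)
          {
              struct bpf_dynptr ptr; /* lives in verifier-tracked stack slots */
              __u64 *val, out = 0;
              __u32 key = 0;

              val = bpf_map_lookup_elem(&array_map, &key);
              if (!val)
                  return 0;

              /* arg is ARG_PTR_TO_DYNPTR | MEM_UNINIT: slots become STACK_DYNPTR */
              if (bpf_dynptr_from_mem(val, sizeof(*val), 0, &ptr))
                  return 0;

              /* allowed only because the slots are marked STACK_DYNPTR */
              bpf_dynptr_read(&out, sizeof(out), &ptr, 0, 0);
              return 0;
          }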
      
      The 6th patch in this patchset adds test cases that the verifier should
      successfully reject, for example attempting to use a dynptr after doing
      a direct write into it inside the bpf program.
      
      Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: David Vernet <void@manifault.com>
      Link: https://lore.kernel.org/bpf/20220523210712.3641569-2-joannelkoong@gmail.com
  5. Jul 15, 2021
    • bpf: Introduce bpf timers. · b00628b1
      Alexei Starovoitov authored
      
      Introduce 'struct bpf_timer { __u64 :64; __u64 :64; };' that can be embedded
      in hash/array/lru maps as a regular field and helpers to operate on it:
      
      // Initialize the timer.
      // First 4 bits of 'flags' specify clockid.
      // Only CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTTIME are allowed.
      long bpf_timer_init(struct bpf_timer *timer, struct bpf_map *map, int flags);
      
      // Configure the timer to call 'callback_fn' static function.
      long bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn);
      
      // Arm the timer to expire 'nsec' nanoseconds from the current time.
      long bpf_timer_start(struct bpf_timer *timer, u64 nsec, u64 flags);
      
      // Cancel the timer and wait for callback_fn to finish if it was running.
      long bpf_timer_cancel(struct bpf_timer *timer);
      
      Here is what a BPF program might look like:
      struct map_elem {
          int counter;
          struct bpf_timer timer;
      };
      
      struct {
          __uint(type, BPF_MAP_TYPE_HASH);
          __uint(max_entries, 1000);
          __type(key, int);
          __type(value, struct map_elem);
      } hmap SEC(".maps");
      
      static int timer_cb(void *map, int *key, struct map_elem *val);
      /* val points to particular map element that contains bpf_timer. */
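
      /* A possible callback body (illustrative; the original example leaves
       * timer_cb declared only): count invocations, re-arm from the callback.
       */
      static int timer_cb(void *map, int *key, struct map_elem *val)
      {
          val->counter++;
          /* a timer may be restarted from its own callback */
          bpf_timer_start(&val->timer, 1000000 /* re-fire in 1 msec */, 0);
          return 0;
      }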
      
      SEC("fentry/bpf_fentry_test1")
      int BPF_PROG(test1, int a)
      {
          struct map_elem *val;
          int key = 0;
      
          val = bpf_map_lookup_elem(&hmap, &key);
          if (val) {
              bpf_timer_init(&val->timer, &hmap, CLOCK_REALTIME);
              bpf_timer_set_callback(&val->timer, timer_cb);
              bpf_timer_start(&val->timer, 1000 /* call timer_cb in 1 usec */, 0);
          }
          return 0;
      }
      
      This patch adds helper implementations that rely on hrtimers
      to call bpf functions as timers expire.
      The following patches add necessary safety checks.
      
      Only programs with CAP_BPF are allowed to use bpf_timer.
      
      The number of timers used by the program is constrained by
      the memcg recorded at map creation time.
      
      The bpf_timer_init() helper needs an explicit 'map' argument because
      inner maps are dynamic and not known at load time, while
      bpf_timer_set_callback() receives a hidden 'aux->prog' argument
      supplied by the verifier.
      
      The prog pointer is needed to refcount the bpf program, to make sure
      the program doesn't get freed while the timer is armed. This approach
      relies on the "user refcnt" scheme used in prog_array, which stores bpf
      programs for bpf_tail_call. bpf_timer_set_callback() increments the
      prog refcnt, which is paired with bpf_timer_cancel() dropping it. The
      ops->map_release_uref is responsible for cancelling the timers and
      dropping the prog refcnt when the user space reference to a map reaches
      zero. This uref approach makes sure that Ctrl-C of the user space
      process will not leave timers running forever, unless user space
      explicitly pinned a map containing timers in bpffs.
      
      bpf_timer_init() and bpf_timer_set_callback() will return -EPERM if the
      map doesn't have user references (i.e. is not held by an open file
      descriptor from user space and not pinned in bpffs).
      
      The bpf_map_delete_elem() and bpf_map_update_elem() operations cancel
      and free the timer if the given map element had one allocated. The
      "bpftool map update" command can thus be used to cancel timers.
      
      The 'struct bpf_timer' is explicitly __attribute__((aligned(8)))
      because '__u64 :64' is an unnamed bitfield: it contributes 8 bytes of
      padding but only has 1-byte alignment.
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/bpf/20210715005417.78572-4-alexei.starovoitov@gmail.com
  6. Sep 29, 2020
    • bpf: Add bpf_snprintf_btf helper · c4d0bfb4
      Alan Maguire authored
      
      A helper is added to support tracing kernel type information in BPF
      using the BPF Type Format (BTF).  Its signature is
      
      long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr,
      		      u32 btf_ptr_size, u64 flags);
      
      struct btf_ptr * specifies:

      - a pointer to the data to be traced;
      - the BTF id of the type of the data pointed to;
      - a flags field, provided for future use. These flags are not to be
        confused with the BTF_F_* flags below, which control how the btf_ptr
        is displayed. The flags member of struct btf_ptr may be used to
        disambiguate types in kernel versus module BTF, etc.; the main
        distinction is that these flags relate to the type and the
        information needed to identify it, not to how it is displayed.
      
      For example, a BPF program with a struct sk_buff *skb could do the
      following:
      
      	char str[256];
      	static struct btf_ptr b = { };

      	b.ptr = skb;
      	b.type_id = __builtin_btf_type_id(struct sk_buff, 1);
      	bpf_snprintf_btf(str, sizeof(str), &b, sizeof(b), 0);
      
      Default output looks like this:
      
      (struct sk_buff){
       .transport_header = (__u16)65535,
       .mac_header = (__u16)65535,
       .end = (sk_buff_data_t)192,
       .head = (unsigned char *)0x000000007524fd8b,
       .data = (unsigned char *)0x000000007524fd8b,
       .truesize = (unsigned int)768,
       .users = (refcount_t){
        .refs = (atomic_t){
         .counter = (int)1,
        },
       },
      }
      
      Flags modifying display are as follows:
      
      - BTF_F_COMPACT:	no formatting around type information
      - BTF_F_NONAME:		no struct/union member names/types
      - BTF_F_PTR_RAW:	show raw (unobfuscated) pointer values;
      			equivalent to %px.
      - BTF_F_ZERO:		show zero-valued struct/union members;
      			they are not displayed by default
      
      Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1601292670-1616-4-git-send-email-alan.maguire@oracle.com
  7. Jul 18, 2020
    • bpf: Introduce SK_LOOKUP program type with a dedicated attach point · e9ddbb77
      Jakub Sitnicki authored
      
      Add a new program type BPF_PROG_TYPE_SK_LOOKUP with a dedicated attach type
      BPF_SK_LOOKUP. The new program kind is to be invoked by the transport layer
      when looking up a listening socket for a new connection request for
      connection-oriented protocols, or when looking up an unconnected socket
      for a packet for connection-less protocols.
      
      When called, the SK_LOOKUP BPF program can select a socket that will
      receive the packet. This serves as a mechanism to overcome the limits
      of what the bind() API can express. Two use-cases driving this work are:
      
       (1) steer packets destined to an IP range, on fixed port to a socket
      
           192.0.2.0/24, port 80 -> NGINX socket
      
       (2) steer packets destined to an IP address, on any port to a socket
      
           198.51.100.1, any port -> L7 proxy socket
      
      In its run-time context, the program receives information about the
      packet that triggered the socket lookup: namely, the IP version, the L4
      protocol identifier, and the address 4-tuple. The context can be
      further extended to include the ingress interface identifier.
      
      To select a socket, the BPF program fetches it from a map holding
      socket references, such as SOCKMAP or SOCKHASH, and calls the
      bpf_sk_assign(ctx, sk, ...) helper to record the selection. The
      transport layer then uses the selected socket as the result of the
      socket lookup.
      
      In its basic form, SK_LOOKUP acts as a filter and hence must return
      either SK_PASS or SK_DROP. If the program returns SK_PASS, the
      transport should look for a socket to receive the packet, or use the
      one selected by the program if available, while SK_DROP informs the
      transport layer that the lookup should fail.
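
      A sketch of such a program (the map name and port check are
      illustrative; the helpers and return codes are those described above):

          struct {
              __uint(type, BPF_MAP_TYPE_SOCKMAP);
              __uint(max_entries, 1);
              __type(key, __u32);
              __type(value, __u64);
          } redir_map SEC(".maps");

          SEC("sk_lookup")
          int select_sock(struct bpf_sk_lookup *ctx)
          {
              const __u32 key = 0;
              struct bpf_sock *sk;
              long err;

              /* steer any TCP lookup for port 80 to the stashed socket */
              if (ctx->protocol != IPPROTO_TCP || ctx->local_port != 80)
                  return SK_PASS;

              sk = bpf_map_lookup_elem(&redir_map, &key);
              if (!sk)
                  return SK_DROP;

              err = bpf_sk_assign(ctx, sk, 0);
              bpf_sk_release(sk);
              return err ? SK_DROP : SK_PASS;
          }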
      
      This patch only enables the user to attach an SK_LOOKUP program to a
      network namespace. Subsequent patches hook it up to run on the local
      delivery path in the ipv4 and ipv6 stacks.
      
      Suggested-by: Marek Majkowski <marek@cloudflare.com>
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200717103536.397595-3-jakub@cloudflare.com
  8. May 11, 2020
    • bpf: Minor fixes to BPF helpers documentation · ab8d7809
      Quentin Monnet authored
      
      Minor improvements to the documentation for BPF helpers:
      
      * Fix formatting for the description of "bpf_socket" for
        bpf_getsockopt() and bpf_setsockopt(), thus suppressing two warnings
        from rst2man about "Unexpected indentation".
      * Fix formatting for return values for bpf_sk_assign() and seq_file
        helpers.
      * Fix and harmonise formatting, in particular for function/struct names.
      * Remove blank lines before "Return:" sections.
      * Replace tabs found in the middle of text lines.
      * Fix typos.
      * Add a note to the footer (in Python script) about "bpftool feature
        probe", including for listing features available to unprivileged
        users, and add a reference to bpftool man page.
      
      Thanks to Florian for reporting two typos (duplicated words).
      
      Signed-off-by: Quentin Monnet <quentin@isovalent.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200511161536.29853-4-quentin@isovalent.com
  9. May 10, 2020
    • bpf: Add bpf_seq_printf and bpf_seq_write helpers · 492e639f
      Yonghong Song authored
      
      Two helpers, bpf_seq_printf and bpf_seq_write, are added for
      writing data to the seq_file buffer.
      
      bpf_seq_printf supports common format string flag/width/type
      fields, so at least I can get identical results for the
      netlink and ipv6_route targets.
      
      For bpf_seq_printf and bpf_seq_write, the return value -EOVERFLOW
      specifically indicates a write failure due to overflow, which
      means the object will be repeated in the next bpf invocation
      if the object collection stays the same. Note that if the object
      collection changes, then depending on how collection traversal is
      done, the object may not be visited even if it is still in the
      collection.
      
      For bpf_seq_printf, the formats %s and %p{i,I}{4,6} need to
      read kernel memory. Reading kernel memory may fail in
      the following two cases:
        - invalid kernel address, or
        - valid kernel address, but requiring a major fault
      If reading kernel memory fails, the %s string will be
      an empty string and %p{i,I}{4,6} will be all 0.
      Not returning an error to the bpf program is consistent with
      what bpf_trace_printk() does for now.
      
      bpf_seq_printf may return -EBUSY, meaning that the internal percpu
      buffer for memory copies of strings or other pointees is not
      available. The bpf program can return 1 to indicate that it wants
      the same object to be repeated. Right now, this should not happen
      on non-RT kernels, since migrate_disable(), which guards the bpf
      prog call, calls preempt_disable().
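
      For reference, a sketch of bpf_seq_printf in a task iterator program
      (the iter/task target and its context struct come from the bpf_iter
      patches in the same series):

          SEC("iter/task")
          int dump_task(struct bpf_iter__task *ctx)
          {
              struct seq_file *seq = ctx->meta->seq;
              struct task_struct *task = ctx->task;
              static const char fmt[] = "pid=%d comm=%s\n";
              unsigned long long args[2];

              if (!task)
                  return 0;

              args[0] = task->pid;
              args[1] = (unsigned long long)task->comm;
              /* -EOVERFLOW here means the object is repeated next invocation */
              bpf_seq_printf(seq, fmt, sizeof(fmt), args, sizeof(args));
              return 0;
          }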
      
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200509175914.2476661-1-yhs@fb.com