Commits · 48a516363e294a4098622dd77a5ecd4ee924121f · drm / misc / kernel

Feb 09, 2024

work around gcc bugs with 'asm goto' with outputs · 4356e9f8

Linus Torvalds authored 1 year ago

We've had issues with gcc and 'asm goto' before, and we created a
'asm_volatile_goto()' macro for that in the past: see commits
3f0116c3 ("compiler/gcc4: Add quirk for 'asm goto' miscompilation
bug") and a9f18034 ("compiler/gcc4: Make quirk for
asm_volatile_goto() unconditional").

Then, much later, we ended up removing the workaround in commit
43c249ea ("compiler-gcc.h: remove ancient workaround for gcc PR
58670") because we no longer supported building the kernel with the
affected gcc versions, but we left the macro uses around.

Now, Sean Christopherson reports a new version of a very similar
problem, which is fixed by re-applying that ancient workaround.  But the
problem in question is limited to only the 'asm goto with outputs'
cases, so instead of re-introducing the old workaround as-is, let's
rename and limit the workaround to just that much less common case.

It looks like there are at least two separate issues that all hit in
this area:

 (a) some versions of gcc don't mark the asm goto as 'volatile' when it
     has outputs:

        https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619
        https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420

     which is easy to work around by just adding the 'volatile' by hand.

 (b) Internal compiler errors:

        https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422



     which are worked around by adding the extra empty 'asm' as a
     barrier, as in the original workaround.

but the problem Sean sees may be a third thing since it involves bad
code generation (not an ICE) even with the manually added 'volatile'.

but the same old workaround works for this case, even if this feels a
bit like voodoo programming and may only be hiding the issue.

Reported-and-tested-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/


Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Andrew Pinski <quic_apinski@quicinc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4356e9f8

Jan 24, 2024

samples/cgroup: add .gitignore file for generated samples · 443b3490

Linus Torvalds authored 1 year ago

Make 'git status' quietly happy again after a full allmodconfig build.

Fixes: 60433a9d ("samples: introduce new samples subdir for cgroup")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

443b3490

Jan 18, 2024

samples: ftrace: Add RISC-V support for SAMPLE_FTRACE_DIRECT[_MULTI] · 629291dd

Song Shuai authored 1 year ago


Add RISC-V variants of the ftrace-direct* samples.

Tested-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Signed-off-by: Song Shuai <suagrfillet@gmail.com>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@kernel.org>
Acked-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20231130121531.1178502-5-bjorn@kernel.org


Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

629291dd

Dec 20, 2023

samples/bpf: Use %lu format specifier for unsigned long values · 32f24938

Colin Ian King authored 1 year ago

Currently %ld format specifiers are being used for unsigned long
values. Fix this by using %lu instead. Cleans up cppcheck warnings:

warning: %ld in format string (no. 1) requires 'long' but the argument
type is 'unsigned long'. [invalidPrintfArgType_sint]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/bpf/20231219152307.368921-1-colin.i.king@gmail.com

32f24938

Dec 19, 2023

tracing: Allow creating instances with specified system events · d2356997

Steven Rostedt (Google) authored 1 year ago

A trace instance may only need to enable specific events. As the eventfs
directory of an instance currently creates all events which adds overhead,
allow internal instances to be created with just the events in systems
that they care about. This currently only deals with systems and not
individual events, but this should bring down the overhead of creating
instances for specific use cases quite bit.

The trace_array_get_by_name() now has another parameter "systems". This
parameter is a const string pointer of a comma/space separated list of
event systems that should be created by the trace_array. (Note if the
trace_array already exists, this parameter is ignored).

The list of systems is saved and if a module is loaded, its events will
not be added unless the system for those events also match the systems
string.

Link: https://lore.kernel.org/linux-trace-kernel/20231213093701.03fddec0@gandalf.local.home



Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Arun Easi   <aeasi@marvell.com>
Cc: Daniel Wagner <dwagner@suse.de>
Tested-by: Dmytro Maluka <dmaluka@chromium.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

d2356997

Dec 13, 2023

media: videobuf2: core: Rename min_buffers_needed field in vb2_queue · 80c2b40a

Benjamin Gaignard authored 1 year ago


Rename min_buffers_needed into min_queued_buffers and update
the documentation about it.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
[hverkuil: Drop the change where min_queued_buffers + 1 buffers would be]
[hverkuil: allocated. Now this patch only renames this field instead of making]
[hverkuil: a functional change as well.]
[hverkuil: Renamed 3 remaining min_buffers_needed occurrences.]

80c2b40a

Dec 11, 2023

samples/cgroup: introduce memcg memory.events listener · becf6529

Dmitry Rokosov authored 1 year ago

This is a simple listener for memory events that handles counter changes
in runtime.  It can be set up for a specific memory cgroup v2.

The output example:
=====
$ /tmp/memcg_event_listener test
Initialized MEMCG events with counters:
MEMCG events:
	low: 0
	high: 0
	max: 0
	oom: 0
	oom_kill: 0
	oom_group_kill: 0
Started monitoring memory events from '/sys/fs/cgroup/test/memory.events'...
Received event in /sys/fs/cgroup/test/memory.events:
*** 1 MEMCG oom_kill event, change counter 0 => 1
Received event in /sys/fs/cgroup/test/memory.events:
*** 1 MEMCG oom_kill event, change counter 1 => 2
Received event in /sys/fs/cgroup/test/memory.events:
*** 1 MEMCG oom_kill event, change counter 2 => 3
Received event in /sys/fs/cgroup/test/memory.events:
*** 1 MEMCG oom_kill event, change counter 3 => 4
Received event in /sys/fs/cgroup/test/memory.events:
*** 2 MEMCG max events, change counter 0 => 2
Received event in /sys/fs/cgroup/test/memory.events:
*** 8 MEMCG max events, change counter 2 => 10
*** 1 MEMCG oom event, change counter 0 => 1
Received event in /sys/fs/cgroup/test/memory.events:
*** 1 MEMCG oom_kill event, change counter 4 => 5
^CExiting memcg event listener...
=====

Link: https://lkml.kernel.org/r/20231123071945.25811-3-ddrokosov@salutedevices.com


Signed-off-by: Dmitry Rokosov <ddrokosov@salutedevices.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

becf6529

samples: introduce new samples subdir for cgroup · 60433a9d

Dmitry Rokosov authored 1 year ago

Patch series "samples: introduce cgroup events listeners", v3.

To begin with, this patch series relocates the cgroup example code to the
samples/cgroup directory, which is the appropriate location for such code
snippets.

Furthermore, a new memcg events listener is introduced.  This listener is
a simple yet effective tool for monitoring memory events and managing
counter changes during runtime.

Additionally, as per Andrew Morton's suggestion, a helpful reminder
comment is included in the memcontrol implementation.  This comment serves
to ensure that the samples code is updated whenever new events are added.


This patch (of 3):

Move the cgroup_event_listener for cgroup v1 to the samples directory. 
This suggestion was proposed by Andrew Morton during the discussion [1].

Link: https://lore.kernel.org/all/20231106140934.3f5d4960141562fe8da53906@linux-foundation.org/ [1]
Link: https://lkml.kernel.org/r/20231123071945.25811-1-ddrokosov@salutedevices.com
Link: https://lkml.kernel.org/r/20231123071945.25811-2-ddrokosov@salutedevices.com


Signed-off-by: Dmitry Rokosov <ddrokosov@salutedevices.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

60433a9d

Nov 30, 2023

samples: Replace strlcpy() with strscpy() · 40b2519d

Kees Cook authored 1 year ago

strlcpy() reads the entire source buffer first. This read may exceed
the destination size limit. This is both inefficient and can lead
to linear read overflows if a source string is not NUL-terminated[1].
Additionally, it returns the size of the source string, not the
resulting size of the destination string. In an effort to remove strlcpy()
completely[2], replace strlcpy() here with strscpy().

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [1]
Link: https://github.com/KSPP/linux/issues/89

 [2]
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: "Steven Rostedt (Google)" <rostedt@goodmis.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Geliang Tang <geliang.tang@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: "Steven Rostedt (Google)" <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20231116191510.work.550-kees@kernel.org


Signed-off-by: Kees Cook <keescook@chromium.org>

40b2519d

Nov 28, 2023

eventfd: simplify eventfd_signal() · 3652117f

Christian Brauner authored 1 year ago

Ever since the eventfd type was introduced back in 2007 in commit
e1ad7468 ("signal/timer/event: eventfd core") the eventfd_signal()
function only ever passed 1 as a value for @n. There's no point in
keeping that additional argument.

Link: https://lore.kernel.org/r/20231122-vfs-eventfd-signal-v2-2-bd549b14ce0c@kernel.org


Acked-by: Xu Yilun <yilun.xu@intel.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl
Acked-by: Eric Farman <farman@linux.ibm.com>  # s390
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>

3652117f

Nov 23, 2023

media: sample: v4l: Stop direct calls to queue num_buffers field · 88d9ce34

Benjamin Gaignard authored 1 year ago and

Mauro Carvalho Chehab committed 1 year ago


Use vb2_get_num_buffers() to avoid using queue num_buffers field directly.
This allows us to change how the number of buffers is computed in the
future.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
Reviewed-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>

88d9ce34

Oct 26, 2023

samples/landlock: Support TCP restrictions · 5e990dce

Konstantin Meskhidze authored 1 year ago


Add TCP restrictions to the sandboxer demo. It's possible to allow a
sandboxer to bind/connect to a list of specified ports restricting
network actions to the rest of them. This is controlled with the new
LL_TCP_BIND and LL_TCP_CONNECT environment variables.

Rename ENV_PATH_TOKEN to ENV_DELIMITER.

Signed-off-by: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Link: https://lore.kernel.org/r/20231026014751.414649-12-konstantin.meskhidze@huawei.com


[mic: Extend commit message]
Signed-off-by: Mickaël Salaün <mic@digikod.net>

5e990dce

samples/bpf: Allow building with custom bpftool · 37db10bc

Viktor Malik authored 1 year ago

samples/bpf build its own bpftool boostrap to generate vmlinux.h as well
as some BPF objects. This is a redundant step if bpftool has been
already built, so update samples/bpf/Makefile such that it accepts a
path to bpftool passed via the BPFTOOL variable. The approach is
practically the same as tools/testing/selftests/bpf/Makefile uses.

Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/bd746954ac271b02468d8d951ff9f11e655d485b.1698213811.git.vmalik@redhat.com

37db10bc

samples/bpf: Fix passing LDFLAGS to libbpf · f56bcfad

Viktor Malik authored 1 year ago


samples/bpf/Makefile passes LDFLAGS=$(TPROGS_LDFLAGS) to libbpf build
without surrounding quotes, which may cause compilation errors when
passing custom TPROGS_USER_LDFLAGS.

For example:

    $ make -C samples/bpf/ TPROGS_USER_LDFLAGS="-Wl,--as-needed -specs=/usr/lib/gcc/x86_64-redhat-linux/13/libsanitizer.spec"
    make: Entering directory './samples/bpf'
    make -C ../../ M=./samples/bpf BPF_SAMPLES_PATH=./samples/bpf
    make[1]: Entering directory '.'
    make -C ./samples/bpf/../../tools/lib/bpf RM='rm -rf' EXTRA_CFLAGS="-Wall -O2 -Wmissing-prototypes -Wstrict-prototypes  -I./usr/include -I./tools/testing/selftests/bpf/ -I./samples/bpf/libbpf/include -I./tools/include -I./tools/perf -I./tools/lib -DHAVE_ATTR_TEST=0" \
            LDFLAGS=-Wl,--as-needed -specs=/usr/lib/gcc/x86_64-redhat-linux/13/libsanitizer.spec srctree=./samples/bpf/../../ \
            O= OUTPUT=./samples/bpf/libbpf/ DESTDIR=./samples/bpf/libbpf prefix= \
            ./samples/bpf/libbpf/libbpf.a install_headers
    make: invalid option -- 'c'
    make: invalid option -- '='
    make: invalid option -- '/'
    make: invalid option -- 'u'
    make: invalid option -- '/'
    [...]

Fix the error by properly quoting $(TPROGS_LDFLAGS).

Suggested-by: Donald Zickus <dzickus@redhat.com>
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/c690de6671cc6c983d32a566d33fd7eabd18b526.1698213811.git.vmalik@redhat.com

f56bcfad

samples/bpf: Allow building with custom CFLAGS/LDFLAGS · 870f09f1

Viktor Malik authored 1 year ago


Currently, it is not possible to specify custom flags when building
samples/bpf. The flags are defined in TPROGS_CFLAGS/TPROGS_LDFLAGS
variables, however, when trying to override those from the make command,
compilation fails.

For example, when trying to build with PIE:

    $ make -C samples/bpf TPROGS_CFLAGS="-fpie" TPROGS_LDFLAGS="-pie"

This is because samples/bpf/Makefile updates these variables, especially
appends include paths to TPROGS_CFLAGS and these updates are overridden
by setting the variables from the make command.

This patch introduces variables TPROGS_USER_CFLAGS/TPROGS_USER_LDFLAGS
for this purpose, which can be set from the make command and their
values are propagated to TPROGS_CFLAGS/TPROGS_LDFLAGS.

Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/2d81100b830a71f0e72329cc7781edaefab75f62.1698213811.git.vmalik@redhat.com

870f09f1

Oct 24, 2023

vfio/mtty: Enable migration support · 2b88119e

Alex Williamson authored 1 year ago

The mtty driver exposes a PCI serial device to userspace and therefore
makes an easy target for a sample device supporting migration. The device
does not make use of DMA, therefore we can easily claim support for the
migration P2P states, as well as dirty logging. This implementation also
makes use of PRE_COPY support in order to provide migration stream
compatibility testing, which should generally be considered good practice.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20231016224736.2575718-3-alex.williamson@redhat.com

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

2b88119e

vfio/mtty: Overhaul mtty interrupt handling · 293fbc28

Alex Williamson authored 1 year ago

The mtty driver does not currently conform to the vfio SET_IRQS uAPI.
For example, it claims to support mask and unmask of INTx, but actually
does nothing. It claims to support AUTOMASK for INTx, but doesn't. It
fails to teardown eventfds under the full semantics specified by the
SET_IRQS ioctl. It also fails to teardown eventfds when the device is
closed, leading to memory leaks. It claims to support the request IRQ,
but doesn't.

Fix all these.

A side effect of this is that QEMU will now report a warning:

vfio <uuid>: Failed to set up UNMASK eventfd signaling for interrupt \
INTX-0: VFIO_DEVICE_SET_IRQS failure: Inappropriate ioctl for device

The fact is that the unmask eventfd was never supported but quietly
failed. mtty never honored the AUTOMASK behavior, therefore there
was nothing to unmask. QEMU is verbose about the failure, but
properly falls back to userspace unmasking.

Fixes: 9d1a546c ("docs: Sample driver to demonstrate how to use Mediated device framework.")
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20231016224736.2575718-2-alex.williamson@redhat.com

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

293fbc28

Oct 23, 2023

samples: bpf: Fix syscall_tp openat argument · 69a19170

Denys Zagorui authored 1 year ago


This modification doesn't change behaviour of the syscall_tp
But such code is often used as a reference so it should be
correct anyway

Signed-off-by: Denys Zagorui <dzagorui@cisco.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231019113521.4103825-1-dzagorui@cisco.com

69a19170

Oct 09, 2023

samples: kprobes: Fixes a typo · a110d172

Atul Kumar Pant authored 1 year ago

Fixes typo in a function name.

Link: https://lore.kernel.org/all/20230817170819.77857-1-atulpant.linux@gmail.com/



Signed-off-by: Atul Kumar Pant <atulpant.linux@gmail.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

a110d172

Sep 28, 2023

vfio: use __aligned_u64 in struct vfio_device_gfx_plane_info · a7bea9f4

Stefan Hajnoczi authored 1 year ago


The memory layout of struct vfio_device_gfx_plane_info is
architecture-dependent due to a u64 field and a struct size that is not
a multiple of 8 bytes:
- On x86_64 the struct size is padded to a multiple of 8 bytes.
- On x32 the struct size is only a multiple of 4 bytes, not 8.
- Other architectures may vary.

Use __aligned_u64 to make memory layout consistent. This reduces the
chance of 32-bit userspace on a 64-bit kernel breakage.

This patch increases the struct size on x32 but this is safe because of
the struct's argsz field. The kernel may grow the struct as long as it
still supports smaller argsz values from userspace (e.g. applications
compiled against older kernel headers).

Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20230918205617.1478722-3-stefanha@redhat.com


Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

a7bea9f4

samples/bpf: Add -fsanitize=bounds to userspace programs · 9e09b750

Ruowen Qin authored 1 year ago


The sanitizer flag, which is supported by both clang and gcc, would make
it easier to debug array index out-of-bounds problems in these programs.

Make the Makfile smarter to detect ubsan support from the compiler and
add the '-fsanitize=bounds' accordingly.

Suggested-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Jinghao Jia <jinghao@linux.ibm.com>
Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>
Signed-off-by: Ruowen Qin <ruowenq2@illinois.edu>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230927045030.224548-2-ruowenq2@illinois.edu

9e09b750

Sep 21, 2023

samples/bpf: syscall_tp_user: Fix array out-of-bound access · 9220c3ef

Jinghao Jia authored 1 year ago


Commit 06744f24 ("samples/bpf: Add openat2() enter/exit tracepoint
to syscall_tp sample") added two more eBPF programs to support the
openat2() syscall. However, it did not increase the size of the array
that holds the corresponding bpf_links. This leads to an out-of-bound
access on that array in the bpf_object__for_each_program loop and could
corrupt other variables on the stack. On our testing QEMU, it corrupts
the map1_fds array and causes the sample to fail:

  # ./syscall_tp
  prog #0: map ids 4 5
  verify map:4 val: 5
  map_lookup failed: Bad file descriptor

Dynamically allocate the array based on the number of programs reported
by libbpf to prevent similar inconsistencies in the future

Fixes: 06744f24 ("samples/bpf: Add openat2() enter/exit tracepoint to syscall_tp sample")
Signed-off-by: Jinghao Jia <jinghao@linux.ibm.com>
Signed-off-by: Ruowen Qin <ruowenq2@illinois.edu>
Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>
Link: https://lore.kernel.org/r/20230917214220.637721-4-jinghao7@illinois.edu


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

9220c3ef

samples/bpf: syscall_tp_user: Rename num_progs into nr_tests · 0ee352fe

Jinghao Jia authored 1 year ago


The variable name num_progs causes confusion because that variable
really controls the number of rounds the test should be executed.

Rename num_progs into nr_tests for the sake of clarity.

Signed-off-by: Jinghao Jia <jinghao@linux.ibm.com>
Signed-off-by: Ruowen Qin <ruowenq2@illinois.edu>
Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>
Link: https://lore.kernel.org/r/20230917214220.637721-3-jinghao7@illinois.edu


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

0ee352fe

Sep 08, 2023

selftests/bpf: trace_helpers.c: Optimize kallsyms cache · c698eaeb

Rong Tao authored 1 year ago


Static ksyms often have problems because the number of symbols exceeds the
MAX_SYMS limit. Like changing the MAX_SYMS from 300000 to 400000 in
commit e76a0143("selftests/bpf: Bump and validate MAX_SYMS") solves
the problem somewhat, but it's not the perfect way.

This commit uses dynamic memory allocation, which completely solves the
problem caused by the limitation of the number of kallsyms. At the same
time, add APIs:

    load_kallsyms_local()
    ksym_search_local()
    ksym_get_addr_local()
    free_kallsyms_local()

There are used to solve the problem of selftests/bpf updating kallsyms
after attach new symbols during testmod testing.

Signed-off-by: Rong Tao <rongtao@cestc.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/tencent_C9BDA68F9221F21BE4081566A55D66A9700A@qq.com

c698eaeb

Aug 24, 2023

samples/bpf: Add note to README about the XDP utilities moved to xdp-tools · 5a9fd0f7

Toke Høiland-Jørgensen authored 1 year ago

To help users find the XDP utilities, add a note to the README about the
new location and the conversion documentation in the commit messages.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-8-toke@redhat.com

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

5a9fd0f7

samples/bpf: Cleanup .gitignore · 91b96513

Toke Høiland-Jørgensen authored 1 year ago


Remove no longer present XDP utilities from .gitignore. Apart from the
recently removed XDP utilities this also includes the previously removed
xdpsock and xsk utilities.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-7-toke@redhat.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

91b96513

samples/bpf: Remove the xdp_sample_pkts utility · cced0699

Toke Høiland-Jørgensen authored 1 year ago


The functionality of this utility is covered by the xdpdump utility in
xdp-tools.

There's a slight difference in usage as the xdpdump utility's main focus is
to dump packets before or after they are processed by an existing XDP
program. However, xdpdump also has the --load-xdp-program switch, which
will make it attach its own program if no existing program is loaded. With
this, xdp_sample_pkts usage can be converted as:

xdp_sample_pkts eth0
  --> xdpdump --load-xdp-program eth0

To get roughly equivalent behaviour.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-6-toke@redhat.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

cced0699

samples/bpf: Remove the xdp1 and xdp2 utilities · eaca21d6

Toke Høiland-Jørgensen authored 1 year ago


The functionality of these utilities have been incorporated into the
xdp-bench utility in xdp-tools.

Equivalent functionality is:

xdp1 eth0
  --> xdp-bench drop -p parse-ip -l load-bytes eth0

xdp2 eth0
  --> xdp-bench drop -p swap-macs eth0

Note that there's a slight difference in behaviour of those examples: the
swap-macs operation of xdp-bench doesn't use the bpf_xdp_load_bytes()
helper to load the packet data, whereas the xdp2 utility did so
unconditionally. For the parse-ip action the use of bpf_xdp_load_bytes()
can be selected by the '-l load-bytes' switch, with the difference that the
xdp-bench utility will perform two separate calls to the helper, one to
load the ethernet header and another to load the IP header; where the xdp1
utility only performed one call always loading 60 bytes of data.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-5-toke@redhat.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

eaca21d6

samples/bpf: Remove the xdp_rxq_info utility · 0e445e11

Toke Høiland-Jørgensen authored 1 year ago


The functionality of this utility has been incorporated into the xdp-bench
utility in xdp-tools, by way of the --rxq-stats argument to the 'drop',
'pass' and 'tx' commands of xdp-bench.

Some examples of how to convert xdp_rxq_info invocations into equivalent
xdp-bench commands:

xdp_rxq_info -d eth0
  --> xdp-bench pass --rxq-stats eth0

xdp_rxq_info -d eth0 -a XDP_DROP -m
  --> xdp-bench drop --rxq-stats -p swap-macs eth0

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-4-toke@redhat.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

0e445e11

samples/bpf: Remove the xdp_redirect* utilities · 91dda69b

Toke Høiland-Jørgensen authored 1 year ago


These utilities have all been ported to xdp-tools as functions of the
xdp-bench utility. The four different utilities in samples are incorporated
as separate subcommands to xdp-bench, with most of the command line
parameters left intact, except that mandatory arguments are always
positional in xdp-bench. For full usage details see the --help output of
each command, or the xdp-bench man page.

Some examples of how to convert usage to xdp-bench are:

xdp_redirect eth0 eth1
  --> xdp-bench redirect eth0 eth1

xdp_redirect_map eth0 eth1
  --> xdp-bench redirect-map eth0 eth1

xdp_redirect_map_multi eth0 eth1 eth2 eth3
  --> xdp-bench redirect-multi eth0 eth1 eth2 eth3

xdp_redirect_cpu -d eth0 -c 0 -c 1
  --> xdp-bench redirect-cpu -c 0 -c 1 eth0

xdp_redirect_cpu -d eth0 -c 0 -c 1 -r eth1
  --> xdp-bench redirect-cpu -c 0 -c 1 eth0 -r redirect -D eth1

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-3-toke@redhat.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

91dda69b

samples/bpf: Remove the xdp_monitor utility · e7c9e73d

Toke Høiland-Jørgensen authored 1 year ago


This utility has been ported as-is to xdp-tools as 'xdp-monitor'. The only
difference in usage between the samples and xdp-tools versions is that the
'-v' command line parameter has been changed to '-e' in the xdp-tools
version for consistency with the other utilities.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-2-toke@redhat.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

e7c9e73d

Aug 22, 2023

samples: ftrace: Replace bti assembly with hint for older compiler · e332938e

GONG, Ruiqi authored 1 year ago

When cross-building the arm64 kernel with allmodconfig using GCC 9.4,
the following error occurs on multiple files under samples/ftrace/:

/tmp/ccPC1ODs.s: Assembler messages:
/tmp/ccPC1ODs.s:8: Error: selected processor does not support `bti c'

Fix this issue by replacing `bti c` with `hint 34`, which is compatible
for the older compiler.

Link: https://lore.kernel.org/linux-trace-kernel/20230820111509.1470826-1-gongruiqi@huaweicloud.com



Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Florent Revest <revest@chromium.org>
Fixes: 8c3526fb ("arm64: ftrace: Add direct call trampoline samples support")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

e332938e

Aug 21, 2023

samples/bpf: simplify spintest with kprobe.multi · 456d5355

Daniel T. Lee authored 1 year ago


With the introduction of kprobe.multi, it is now possible to attach
multiple kprobes to a single BPF program without the need for multiple
definitions. Additionally, this method supports wildcard-based
matching, allowing for further simplification of BPF programs. In here,
an asterisk (*) wildcard is used to map to all symbols relevant to
spin_{lock|unlock}.

Furthermore, since kprobe.multi handles symbol matching, this commit
eliminates the need for the previous logic of reading the ksym table to
verify the existence of symbols.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-10-danieltimlee@gmail.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

456d5355

samples/bpf: refactor syscall tracing programs using BPF_KSYSCALL macro · 8dc80551

Daniel T. Lee authored 1 year ago


This commit refactors the syscall tracing programs by adopting the
BPF_KSYSCALL macro. This change aims to enhance the clarity and
simplicity of the BPF programs by reducing the complexity of argument
parsing from pt_regs.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-9-danieltimlee@gmail.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

8dc80551

samples/bpf: fix broken map lookup probe · d93a7cf6

Daniel T. Lee authored 1 year ago


In the commit 7c4cd051 ("bpf: Fix syscall's stackmap lookup
potential deadlock"), a potential deadlock issue was addressed, which
resulted in *_map_lookup_elem not triggering BPF programs.
(prior to lookup, bpf_disable_instrumentation() is used)

To resolve the broken map lookup probe using "htab_map_lookup_elem",
this commit introduces an alternative approach. Instead, it utilize
"bpf_map_copy_value" and apply a filter specifically for the hash table
with map_type.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Fixes: 7c4cd051 ("bpf: Fix syscall's stackmap lookup potential deadlock")
Link: https://lore.kernel.org/r/20230818090119.477441-8-danieltimlee@gmail.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

d93a7cf6

samples/bpf: fix bio latency check with tracepoint · 92632115

Daniel T. Lee authored 1 year ago

Recently, a new tracepoint for the block layer, specifically the
block_io_start/done tracepoints, was introduced in commit 5a80bd07
("block: introduce block_io_start/block_io_done tracepoints").

Previously, the kprobe entry used for this purpose was quite unstable
and inherently broke relevant probes [1]. Now that a stable tracepoint
is available, this commit replaces the bio latency check with it.

One of the changes made during this replacement is the key used for the
hash table. Since 'struct request' cannot be used as a hash key, the
approach taken follows that which was implemented in bcc/biolatency [2].
(uses dev:sector for the key)

[1]: https://github.com/iovisor/bcc/issues/4261
[2]: https://github.com/iovisor/bcc/pull/4691



Fixes: 450b7879 ("block: move blk_account_io_{start,done} to blk-mq.c")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-7-danieltimlee@gmail.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

92632115

samples/bpf: make tracing programs to be more CO-RE centric · 11430421

Daniel T. Lee authored 1 year ago

The existing tracing programs have been developed for a considerable
period of time and, as a result, do not properly incorporate the
features of the current libbpf, such as CO-RE. This is evident in
frequent usage of functions like PT_REGS* and the persistence of "hack"
methods using underscore-style bpf_probe_read_kernel from the past.

These programs are far behind the current level of libbpf and can
potentially confuse users. Therefore, this commit aims to convert the
outdated BPF programs to be more CO-RE centric.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-6-danieltimlee@gmail.com

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

11430421

samples/bpf: fix symbol mismatch by compiler optimization · 02dabc24

Daniel T. Lee authored 1 year ago


Currently, multiple kprobe programs are suffering from symbol mismatch
due to compiler optimization. These optimizations might induce
additional suffix to the symbol name such as '.isra' or '.constprop'.

    # egrep ' finish_task_switch| __netif_receive_skb_core' /proc/kallsyms
    ffffffff81135e50 t finish_task_switch.isra.0
    ffffffff81dd36d0 t __netif_receive_skb_core.constprop.0
    ffffffff8205cc0e t finish_task_switch.isra.0.cold
    ffffffff820b1aba t __netif_receive_skb_core.constprop.0.cold

To avoid this, this commit replaces the original kprobe section to
kprobe.multi in order to match symbol with wildcard characters. Here,
asterisk is used for avoiding symbol mismatch.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-5-danieltimlee@gmail.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

02dabc24

samples/bpf: unify bpf program suffix to .bpf with tracing programs · 4a0ee788

Daniel T. Lee authored 1 year ago


Currently, BPF programs typically have a suffix of .bpf.c. However,
some programs still utilize a mixture of _kern.c suffix alongside the
naming convention. In order to achieve consistency in the naming of
these programs, this commit unifies the inconsistency in the naming
convention of BPF kernel programs.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-4-danieltimlee@gmail.com


Signed-off-by: Alexei Starovoitov <ast@kernel.org>

4a0ee788

samples/bpf: convert to vmlinux.h with tracing programs · e7e6c774

Daniel T. Lee authored 1 year ago

This commit replaces separate headers with a single vmlinux.h to
tracing programs. Thanks to that, we no longer need to define the
argument structure for tracing programs directly. For example, argument
for the sched_switch tracpepoint (sched_switch_args) can be replaced
with the vmlinux.h provided trace_event_raw_sched_switch.

Additional defines have been added to the BPF program either directly
or through the inclusion of net_shared.h. Defined values are
PERF_MAX_STACK_DEPTH, IFNAMSIZ constants and __stringify() macro. This
change enables the BPF program to access internal structures with BTF
generated "vmlinux.h" header.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-3-danieltimlee@gmail.com

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

e7e6c774

Admin message

Admin message