Commits · 1a40bb31fcf1 · Dmitry Baryshkov / msm

Oct 02, 2024

move asm/unaligned.h to linux/unaligned.h · 5f60d5f6

Al Viro authored 5 months ago

asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.

auto-generated by the following:

for i in `git grep -l -w asm/unaligned.h`; do
	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h

5f60d5f6

Sep 20, 2024

tools: Add riscv barrier implementation · 6d74d178

Charlie Jenkins authored 7 months ago


Many of the other architectures use their custom barrier implementations.
Use the barrier code from the kernel sources to optimize barriers in
tools.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20240806-optimize_ring_buffer_read_riscv-v2-1-ca7e193ae198@rivosinc.com


Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

6d74d178

Sep 13, 2024

s390/vdso: Wire up getrandom() vdso implementation · b920aa77

Heiko Carstens authored 6 months ago and

Jason A. Donenfeld committed 6 months ago


Provide the s390 specific vdso getrandom() architecture backend.

_vdso_rng_data required data is placed within the _vdso_data vvar page,
by using a hardcoded offset larger than vdso_data.

As required the chacha20 implementation does not write to the stack.

The implementation follows more or less the arm64 implementations and
makes use of vector instructions. It has a fallback to the getrandom()
system call for machines where the vector facility is not installed.

The check if the vector facility is installed, as well as an
optimization for machines with the vector-enhancements facility 2, is
implemented with alternatives, avoiding runtime checks.

Note that __kernel_getrandom() is implemented without the vdso user
wrapper which would setup a stack frame for odd cases (aka very old
glibc variants) where the caller has not done that. All callers of
__kernel_getrandom() are required to setup a stack frame, like the C ABI
requires it.

The vdso testcases vdso_test_getrandom and vdso_test_chacha pass.

Benchmark on a z16:

    $ ./vdso_test_getrandom bench-single
       vdso: 25000000 times in 0.493703559 seconds
    syscall: 25000000 times in 6.584025337 seconds

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

b920aa77

powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32 · 53cee505

Christophe Leroy authored 6 months ago and

Jason A. Donenfeld committed 6 months ago


To be consistent with other VDSO functions, the function is called
__kernel_getrandom()

__arch_chacha20_blocks_nostack() fonction is implemented basically
with 32 bits operations. It performs 4 QUARTERROUND operations in
parallele. There are enough registers to avoid using the stack:

On input:
	r3: output bytes
	r4: 32-byte key input
	r5: 8-byte counter input/output
	r6: number of 64-byte blocks to write to output

During operation:
	stack: pointer to counter (r5) and non-volatile registers (r14-131)
	r0: counter of blocks (initialised with r6)
	r4: Value '4' after key has been read, used for indexing
	r5-r12: key
	r14-r15: block counter
	r16-r31: chacha state

At the end:
	r0, r6-r12: Zeroised
	r5, r14-r31: Restored

Performance on powerpc 885 (using kernel selftest):
	~# ./vdso_test_getrandom bench-single
	   vdso: 25000000 times in 62.938002291 seconds
	   libc: 25000000 times in 535.581916866 seconds
	syscall: 25000000 times in 531.525042806 seconds

Performance on powerpc 8321 (using kernel selftest):
	~# ./vdso_test_getrandom bench-single
	   vdso: 25000000 times in 16.899318858 seconds
	   libc: 25000000 times in 131.050596522 seconds
	syscall: 25000000 times in 129.794790389 seconds

This first patch adds support for VDSO32. As selftests cannot easily
be generated only for VDSO32, and because the following patch brings
support for VDSO64 anyway, this patch opts out all code in
__arch_chacha20_blocks_nostack() so that vdso_test_chacha will not
fail to compile and will not crash on PPC64/PPC64LE, allthough the
selftest itself will fail.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

53cee505

arm64: vDSO: Wire up getrandom() vDSO implementation · 712676ea

Adhemerval Zanella authored 6 months ago and

Jason A. Donenfeld committed 6 months ago


Hook up the generic vDSO implementation to the aarch64 vDSO data page.
The _vdso_rng_data required data is placed within the _vdso_data vvar
page, by using a offset larger than the vdso_data.

The vDSO function requires a ChaCha20 implementation that does not write
to the stack, and that can do an entire ChaCha20 permutation.  The one
provided uses NEON on the permute operation, with a fallback to the
syscall for chips that do not support AdvSIMD.

This also passes the vdso_test_chacha test along with
vdso_test_getrandom. The vdso_test_getrandom bench-single result on
Neoverse-N1 shows:

   vdso: 25000000 times in 0.783884250 seconds
   libc: 25000000 times in 8.780275399 seconds
syscall: 25000000 times in 8.786581518 seconds

A small fixup to arch/arm64/include/asm/mman.h was required to avoid
pulling kernel code into the vDSO, similar to what's already done in
arch/arm64/include/asm/rwonce.h.

Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

712676ea

LoongArch: vDSO: Wire up getrandom() vDSO implementation · 18efd0b1

Xi Ruoyao authored 6 months ago and

Jason A. Donenfeld committed 6 months ago


Hook up the generic vDSO implementation to the LoongArch vDSO data page
by providing the required __arch_chacha20_blocks_nostack,
__arch_get_k_vdso_rng_data, and getrandom_syscall implementations. Also
wire up the selftests.

Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Acked-by: Huacai Chen <chenhuacai@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

18efd0b1

Aug 30, 2024

selftests: vDSO: don't hard-code location of vDSO sources · 20a9af05

Christophe Leroy authored 7 months ago and

Jason A. Donenfeld committed 6 months ago


Architectures use different location for vDSO sources:

	arch/mips/vdso
	arch/sparc/vdso
	arch/arm64/kernel/vdso
	arch/riscv/kernel/vdso
	arch/csky/kernel/vdso
	arch/x86/um/vdso
	arch/x86/entry/vdso
	arch/powerpc/kernel/vdso
	arch/arm/vdso
	arch/loongarch/vdso

Don't hard-code vdso sources location in selftest Makefile. Instead
create a vdso/ symbolic link in tools/arch/$arch/ and update Makefile
accordingly.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

20a9af05

Aug 07, 2024

tools/include: Sync arm64 headers with the kernel sources · d5b85489

Namhyung Kim authored 7 months ago


To pick up changes from:

  9ef54a38 arm64: cputype: Add Cortex-A725 definitions
  58d245e0 arm64: cputype: Add Cortex-X1C definitions
  fd2ff5f0 arm64: cputype: Add Cortex-X925 definitions
  add332c4 arm64: cputype: Add Cortex-A720 definitions
  be5a6f23 arm64: cputype: Add Cortex-X3 definitions

This should be used to beautify x86 syscall arguments and it addresses
these tools/perf build warnings:

  Warning: Kernel ABI header differences:
  diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Please see tools/include/uapi/README for details (it's in the first patch
of this series).

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

d5b85489

tools/include: Sync x86 headers with the kernel sources · f6d9883f

Namhyung Kim authored 7 months ago


To pick up changes from:

  149fd471 perf/x86/intel: Support Perfmon MSRs aliasing
  21b362cc x86/resctrl: Enable shared RMID mode on Sub-NUMA Cluster (SNC) systems
  4f460bff cpufreq: acpi: move MSR_K7_HWCR_CPB_DIS_BIT into msr-index.h
  7ea81936 x86/cpufeatures: Add HWP highest perf change feature flag
  78ce84b9 x86/cpufeatures: Flip the /proc/cpuinfo appearance logic
  1beb348d x86/sev: Provide SVSM discovery support

This should be used to beautify x86 syscall arguments and it addresses
these tools/perf build warnings:

  Warning: Kernel ABI header differences:
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

Please see tools/include/uapi/README for details (it's in the first patch
of this series).

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

f6d9883f

Aug 06, 2024

tools/include: Sync uapi/linux/kvm.h with the kernel sources · a625df39

Namhyung Kim authored 7 months ago

And other arch-specific UAPI headers to pick up changes from:

  4b23e0c1 KVM: Ensure new code that references immediate_exit gets extra scrutiny
  85542adb KVM: x86: Add KVM_RUN_X86_GUEST_MODE kvm_run flag
  6fef5185 KVM: x86: Add a capability to configure bus frequency for APIC timer
  34ff6590 x86/sev: Use kernel provided SVSM Calling Areas
  5dcc1e76 Merge tag 'kvm-x86-misc-6.11' of https://github.com/kvm-x86/linux

 into HEAD
  9a0d2f49 KVM: PPC: Book3S HV: Add one-reg interface for HASHPKEYR register
  e9eb790b KVM: PPC: Book3S HV: Add one-reg interface for HASHKEYR register
  1a1e6865 KVM: PPC: Book3S HV: Add one-reg interface for DEXCR register

This should be used to beautify KVM syscall arguments and it addresses
these tools/perf build warnings:

  Warning: Kernel ABI header differences:
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
  diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
  diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h
  diff -u tools/arch/powerpc/include/uapi/asm/kvm.h arch/powerpc/include/uapi/asm/kvm.h

Please see tools/include/uapi/README for details (it's in the first patch
of this series).

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

a625df39

Aug 02, 2024

tools/x86/kcpuid: Introduce a complete cpuid bitfields CSV file · cbbd847d

Ahmed S. Darwish authored 8 months ago

For parsing the cpuid bitfields, kcpuid uses an incomplete CSV file with
300+ bitfields.

Use an auto-generated CSV file from the x86-cpuid.org project instead.
It provides complete bitfields coverage: 830+ bitfields, all with proper
descriptions.

The auto-generated file has the following blurb automatically added:

   # SPDX-License-Identifier: CC0-1.0
   # Generator: x86-cpuid-db v1.0

The generator tag includes the project's workspace "git describe"
version string.  It is intended for projects like KernelCI, to aid in
verifying that the auto-generated files have not been tampered with.

The file also has the blurb:

   # Auto-generated file.
   # Please submit all updates and bugfixes to https://x86-cpuid.org



It's thus kindly requested that the Linux kernel's x86 tree maintainers
enforce sending all updates to x86-cpuid.org's upstream database first,
thus benefiting the whole ecosystem.

Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://gitlab.com/x86-cpuid.org/x86-cpuid-db/-/blob/v1.0/LICENSE.rst
Link: https://gitlab.com/x86-cpuid.org/x86-cpuid-db
Link: https://lore.kernel.org/all/20240718134755.378115-9-darwi@linutronix.de

cbbd847d

tools/x86/kcpuid: Parse subleaf ranges if provided · 58921443

Ahmed S. Darwish authored 8 months ago


It's a common pattern in cpuid leaves to have the same bitfields format
repeated across a number of subleaves.  Typically, this is used for
enumerating hierarchial structures like cache and TLB levels, CPU
topology levels, etc.

Modify kcpuid.c to handle subleaf ranges in the CSV file subleaves
column.  For example, make it able to parse lines in the form:

 # LEAF, SUBLEAVES,  reg,    bits,    short_name             , ...
    0xb,       1:0,  eax,     4:0,    x2apic_id_shift        , ...
    0xb,       1:0,  ebx,    15:0,    domain_lcpus_count     , ...
    0xb,       1:0,  ecx,     7:0,    domain_nr              , ...

This way, full output can be printed to the user.

Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240718134755.378115-8-darwi@linutronix.de

58921443

tools/x86/kcpuid: Recognize all leaves with subleaves · b0a59d14

Ahmed S. Darwish authored 8 months ago


cpuid.csv will be extended in further commits with all-publicly-known
CPUID leaves and bitfields.  Thus, modify has_subleafs() to identify all
known leaves with subleaves.

Remove the redundant "is_amd" check since all x86 vendors already report
the maxium supported extended leaf at leaf 0x80000000 EAX register.

The extra mentioned leaves are:

  - Leaf 0x12, Intel Software Guard Extensions (SGX) enumeration
  - Leaf 0x14, Intel process trace (PT) enumeration
  - Leaf 0x17, Intel SoC vendor attributes enumeration
  - Leaf 0x1b, Intel PCONFIG (Platform configuration) enumeration
  - Leaf 0x1d, Intel AMX (Advanced Matrix Extensions) tile information
  - Leaf 0x1f, Intel v2 extended topology enumeration
  - Leaf 0x23, Intel ArchPerfmonExt (Architectural PMU ext) enumeration
  - Leaf 0x80000020, AMD Platform QoS extended features enumeration
  - Leaf 0x80000026, AMD v2 extended topology enumeration

Set the 'max_subleaf' variable for all the newly marked leaves with extra
subleaves.  Ideally, this should be fetched from the CSV file instead,
but the current kcpuid code architecture has two runs: one run to
serially invoke the cpuid instructions and save all the output in-memory,
and one run to parse this in-memory output through the CSV specification.

Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240718134755.378115-7-darwi@linutronix.de

b0a59d14

tools/x86/kcpuid: Strip bitfield names leading/trailing whitespace · 9ecbc60a

Ahmed S. Darwish authored 8 months ago


While parsing and saving bitfield names from the CSV file, an extra
leading space is copied verbatim.  That extra space is not a big issue
now, but further commits will add a new CSV file with much more padding
for the bitfield's name column.

Strip leading/trailing whitespaces while saving bitfield names.

Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240718134755.378115-6-darwi@linutronix.de

9ecbc60a

tools/x86/kcpuid: Protect against faulty "max subleaf" values · cf96ab1a

Ahmed S. Darwish authored 8 months ago


Protect against the kcpuid code parsing faulty max subleaf numbers
through a min() expression.  Thus, ensuring that max_subleaf will always
be ≤ MAX_SUBLEAF_NUM.

Use "u32" for the subleaf numbers since kcpuid is compiled with -Wextra,
which includes signed/unsigned comparisons warnings.

Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240718134755.378115-5-darwi@linutronix.de

cf96ab1a

tools/x86/kcpuid: Set max possible subleaves count to 64 · 5dd7ca42

Ahmed S. Darwish authored 8 months ago


cpuid.csv will be extended in further commits with all-publicly-known
CPUID leaves and bitfields.  One of the new leaves is 0xd for extended
CPU state enumeration.  Depending on XCR0 dword bits, it can export up to
64 subleaves.

Set kcpuid.c MAX_SUBLEAF_NUM to 64.

Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240718134755.378115-4-darwi@linutronix.de

5dd7ca42

tools/x86/kcpuid: Properly align long-description columns · a52e735f

Ahmed S. Darwish authored 8 months ago


When kcpuid is invoked with "--all --details", the detailed description
column is not properly aligned for all bitfield rows:

CPUID_0x4_ECX[0x0]:
	 cache_level        	: 0x1       	- Cache Level ...
	 cache_self_init     - Cache Self Initialization

This is due to differences in output handling between boolean single-bit
"bitflags" and multi-bit bitfields.  For the former, the bitfield's value
is not outputted as it is implied to be true by just outputting the
bitflag's name in its respective line.

If long descriptions were requested through the --all parameter, properly
align the bitflag's description columns through extra tabs.  With that,
the sample output above becomes:

CPUID_0x4_ECX[0x0]:
	 cache_level        	: 0x1       	- Cache Level ...
	 cache_self_init     			- Cache Self Initialization

Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240718134755.378115-3-darwi@linutronix.de

a52e735f

tools/x86/kcpuid: Remove unused variable · 39e47005

Ahmed S. Darwish authored 8 months ago


Global variable "num_leafs" is set in multiple places but is never read
anywhere.  Remove it.

Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240718134755.378115-2-darwi@linutronix.de

39e47005

Jul 10, 2024

clone3: drop __ARCH_WANT_SYS_CLONE3 macro · 505d66d1

Arnd Bergmann authored 10 months ago


When clone3() was introduced, it was not obvious how each architecture
deals with setting up the stack and keeping the register contents in
a fork()-like system call, so this was left for the architecture
maintainers to implement, with __ARCH_WANT_SYS_CLONE3 defined by those
that already implement it.

Five years later, we still have a few architectures left that are missing
clone3(), and the macro keeps getting in the way as it's fundamentally
different from all the other __ARCH_WANT_SYS_* macros that are meant
to provide backwards-compatibility with applications using older
syscalls that are no longer provided by default.

Address this by reversing the polarity of the macro, adding an
__ARCH_BROKEN_SYS_CLONE3 macro to all architectures that don't
already provide the syscall, and remove __ARCH_WANT_SYS_CLONE3
from all the other ones.

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

505d66d1

Jun 12, 2024

tools/x86/kcpuid: Add missing dir via Makefile · 1d2a03d2

Christian Heusel authored 9 months ago

So far the Makefile just installed the csv into $(HWDATADIR)/cpuid.csv, which
made it unaware about $DESTDIR. Add $DESTDIR to the install command and while
at it also create the directory, should it not exist already. This eases the
packaging of kcpuid and allows i.e. for the install on Arch to look like this:

  $ make BINDIR=/usr/bin DESTDIR="$pkgdir" -C tools/arch/x86/kcpuid install

Some background on DESTDIR:

DESTDIR is commonly used in packaging for staged installs (regardless of the
used package manager):

  https://www.gnu.org/prep/standards/html_node/DESTDIR.html



So the package is built and installed into a directory which the package
manager later picks up and creates some archive from it.

What is specific to Arch Linux here is only the usage of $pkgdir in the
example, DESTDIR itself is widely used.

  [ bp: Extend the commit message with Christian's info on DESTDIR as a GNU
    coding standards thing. ]

Signed-off-by: Christian Heusel <christian@heusel.eu>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240531111757.719528-2-christian@heusel.eu

1d2a03d2

Jun 04, 2024

tools headers arm64: Sync arm64's cputype.h with the kernel sources · dc6abbbd

Arnaldo Carvalho de Melo authored 9 months ago


To get the changes in:

  0ce85db6 ("arm64: cputype: Add Neoverse-V3 definitions")
  02a0a046 ("arm64: cputype: Add Cortex-X4 definitions")
  f4d9d9dc ("arm64: Add Neoverse-V2 part")

That makes this perf source code to be rebuilt:

  CC      /tmp/build/perf-tools/util/arm-spe.o

The changes in the above patch add MIDR_NEOVERSE_V[23] and
MIDR_NEOVERSE_V1 is used in arm-spe.c, so probably we need to add those
and perhaps MIDR_CORTEX_X4 to that array? Or maybe we need to leave this
for later when this is all tested on those machines?

  static const struct midr_range neoverse_spe[] = {
          MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
          MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
          MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
          {},
  };

Mark Rutland recommended about arm-spe.c:

"I would not touch this for now -- someone would have to go audit the
TRMs to check that those other cores have the same encoding, and I think
it'd be better to do that as a follow-up."

That addresses this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Besar Wicaksono <bwicaksono@nvidia.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/lkml/Zl8cYk0Tai2fs7aM@x1


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

dc6abbbd

May 28, 2024

tools headers UAPI: Sync kvm headers with the kernel sources · 88e52051

Arnaldo Carvalho de Melo authored 10 months ago

To pick the changes in:

  4af663c2 ("KVM: SEV: Allow per-guest configuration of GHCB protocol version")
  4f5defae ("KVM: SEV: introduce KVM_SEV_INIT2 operation")
  26c44aa9 ("KVM: SEV: define VM types for SEV and SEV-ES")
  ac5c4802 ("KVM: SEV: publish supported VMSA features")
  651d61bc ("KVM: PPC: Fix documentation for ppc mmu caps")

That don't change functionality in tools/perf, as no new ioctl is added
for the 'perf trace' scripts to harvest.

This addresses these perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
    diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Roth <michael.roth@amd.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/lkml/ZlYxAdHjyAkvGtMW@x1


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

88e52051

tools arch x86: Sync the msr-index.h copy with the kernel sources · ac4b0690

Arnaldo Carvalho de Melo authored 10 months ago

To pick up the changes from these csets:

  53bc516a ("x86/msr: Move ARCH_CAP_XAPIC_DISABLE bit definition to its rightful place")

That patch just move definitions around, so this just silences this perf
build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Link: https://lore.kernel.org/lkml/ZlYe8jOzd1_DyA7X@x1


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

ac4b0690

May 14, 2024

tools arch x86: Add dell-uart-backlight-emulator · d9bab776

Hans de Goede authored 10 months ago


Dell All In One (AIO) models released after 2017 use a backlight controller
board connected to an UART.

Add a small emulator to allow development and testing of
the drivers/platform/x86/dell/dell-uart-backlight.c driver for
this board, without requiring access to an actual Dell All In One.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240513144603.93874-3-hdegoede@redhat.com

d9bab776

May 02, 2024

x86/insn: Add support for APX EVEX instructions to the opcode map · 690ca3a3

Adrian Hunter authored 10 months ago and

Ingo Molnar committed 10 months ago


To support APX functionality, the EVEX prefix is used to:

 - promote legacy instructions
 - promote VEX instructions
 - add new instructions

Promoted VEX instructions require no extra annotation because the opcodes
do not change and the permissive nature of the instruction decoder already
allows them to have an EVEX prefix.

Promoted legacy instructions and new instructions are placed in map 4 which
has not been used before.

Create a new table for map 4 and add APX instructions.

Annotate SCALABLE instructions with "(es)" - refer to patch "x86/insn: Add
support for APX EVEX to the instruction decoder logic". SCALABLE
instructions must be represented in both no-prefix (NP) and 66 prefix
forms.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-9-adrian.hunter@intel.com

690ca3a3

x86/insn: Add support for APX EVEX to the instruction decoder logic · 87bbaf1a

Adrian Hunter authored 10 months ago and

Ingo Molnar committed 10 months ago


Intel Advanced Performance Extensions (APX) extends the EVEX prefix to
support:

 - extended general purpose registers (EGPRs) i.e. r16 to r31
 - Push-Pop Acceleration (PPX) hints
 - new data destination (NDD) register
 - suppress status flags writes (NF) of common instructions
 - new instructions

Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture
Specification for details.

The extended EVEX prefix does not need amended instruction decoder logic,
except in one area. Some instructions are defined as SCALABLE which means
the EVEX.W bit and EVEX.pp bits are used to determine operand size.
Specifically, if an instruction is SCALABLE and EVEX.W is zero, then
EVEX.pp value 0 (representing no prefix NP) means default operand size,
whereas EVEX.pp value 1 (representing 66 prefix) means operand size
override i.e. 16 bits

Add an attribute (INAT_EVEX_SCALABLE) to identify such instructions, and
amend the logic appropriately.

Amend the awk script that generates the attribute tables from the opcode
map, to recognise "(es)" as attribute INAT_EVEX_SCALABLE.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-8-adrian.hunter@intel.com

87bbaf1a

x86/insn: x86/insn: Add support for REX2 prefix to the instruction decoder opcode map · 159039af

Adrian Hunter authored 10 months ago and

Ingo Molnar committed 10 months ago

Support for REX2 has been added to the instruction decoder logic and the
awk script that generates the attribute tables from the opcode map.

Add REX2 prefix byte (0xD5) to the opcode map.

Add annotation (!REX2) for map 0/1 opcodes that are reserved under REX2.

Add JMPABS to the opcode map and add annotation (REX2) to identify that it
has a mandatory REX2 prefix. A separate opcode attribute table is not
needed at this time because JMPABS has the same attribute encoding as the
MOV instruction that it shares an opcode with i.e. INAT_MOFFSET.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-7-adrian.hunter@intel.com

159039af

x86/insn: Add support for REX2 prefix to the instruction decoder logic · eada38d5

Adrian Hunter authored 10 months ago and

Ingo Molnar committed 10 months ago

Intel Advanced Performance Extensions (APX) uses a new 2-byte prefix named
REX2 to select extended general purpose registers (EGPRs) i.e. r16 to r31.

The REX2 prefix is effectively an extended version of the REX prefix.

REX2 and EVEX are also used with PUSH/POP instructions to provide a
Push-Pop Acceleration (PPX) hint. With PPX hints, a CPU will attempt to
fast-forward register data between matching PUSH and POP instructions.

REX2 is valid only with opcodes in maps 0 and 1. Similar extension for
other maps is provided by the EVEX prefix, covered in a separate patch.

Some opcodes in maps 0 and 1 are reserved under REX2. One of these is used
for a new 64-bit absolute direct jump instruction JMPABS.

Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture
Specification for details.

Define a code value for the REX2 prefix (INAT_PFX_REX2), and add attribute
flags for opcodes reserved under REX2 (INAT_NO_REX2) and to identify
opcodes (only JMPABS) that require a mandatory REX2 prefix
(INAT_REX2_VARIANT).

Amend logic to read the REX2 prefix and get the opcode attribute for the
map number (0 or 1) encoded in the REX2 prefix.

Amend the awk script that generates the attribute tables from the opcode
map, to recognise "REX2" as attribute INAT_PFX_REX2, and "(!REX2)"
as attribute INAT_NO_REX2, and "(REX2)" as attribute INAT_REX2_VARIANT.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-6-adrian.hunter@intel.com

eada38d5

x86/insn: Add misc new Intel instructions · 9dd36128

Adrian Hunter authored 10 months ago and

Ingo Molnar committed 10 months ago


The x86 instruction decoder is used not only for decoding kernel
instructions. It is also used by perf uprobes (user space probes) and by
perf tools Intel Processor Trace decoding. Consequently, it needs to
support instructions executed by user space also.

Add instructions documented in Intel Architecture Instruction Set
Extensions and Future Features Programming Reference March 2024
319433-052, that have not been added yet:

	AADD
	AAND
	AOR
	AXOR
	CMPccXADD
	PBNDKB
	RDMSRLIST
	URDMSR
	UWRMSR
	VBCSTNEBF162PS
	VBCSTNESH2PS
	VCVTNEEBF162PS
	VCVTNEEPH2PS
	VCVTNEOBF162PS
	VCVTNEOPH2PS
	VCVTNEPS2BF16
	VPDPB[SU,UU,SS]D[,S]
	VPDPW[SU,US,UU]D[,S]
	VPMADD52HUQ
	VPMADD52LUQ
	VSHA512MSG1
	VSHA512MSG2
	VSHA512RNDS2
	VSM3MSG1
	VSM3MSG2
	VSM3RNDS2
	VSM4KEY4
	VSM4RNDS4
	WRMSRLIST
	TCMMIMFP16PS
	TCMMRLFP16PS
	TDPFP16PS
	PREFETCHIT1
	PREFETCHIT0

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-5-adrian.hunter@intel.com

9dd36128

x86/insn: Add VEX versions of VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS · b8000264

Adrian Hunter authored 10 months ago and

Ingo Molnar committed 10 months ago


The x86 instruction decoder is used not only for decoding kernel
instructions. It is also used by perf uprobes (user space probes) and by
perf tools Intel Processor Trace decoding. Consequently, it needs to
support instructions executed by user space also.

Intel Architecture Instruction Set Extensions and Future Features manual
number 319433-044 of May 2021, documented VEX versions of instructions
VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS, but the opcode map has them
listed as EVEX only.

Remove EVEX-only (ev) annotation from instructions VPDPBUSD, VPDPBUSDS,
VPDPWSSD and VPDPWSSDS, which allows them to be decoded with either a VEX
or EVEX prefix.

Fixes: 0153d98f ("x86/insn: Add misc instructions to x86 instruction decoder")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-4-adrian.hunter@intel.com

b8000264

x86/insn: Fix PUSH instruction in x86 instruction decoder opcode map · 59162e0c

Adrian Hunter authored 10 months ago and

Ingo Molnar committed 10 months ago


The x86 instruction decoder is used not only for decoding kernel
instructions. It is also used by perf uprobes (user space probes) and by
perf tools Intel Processor Trace decoding. Consequently, it needs to
support instructions executed by user space also.

Opcode 0x68 PUSH instruction is currently defined as 64-bit operand size
only i.e. (d64). That was based on Intel SDM Opcode Map. However that is
contradicted by the Instruction Set Reference section for PUSH in the
same manual.

Remove 64-bit operand size only annotation from opcode 0x68 PUSH
instruction.

Example:

  $ cat pushw.s
  .global  _start
  .text
  _start:
          pushw   $0x1234
          mov     $0x1,%eax   # system call number (sys_exit)
          int     $0x80
  $ as -o pushw.o pushw.s
  $ ld -s -o pushw pushw.o
  $ objdump -d pushw | tail -4
  0000000000401000 <.text>:
    401000:       66 68 34 12             pushw  $0x1234
    401004:       b8 01 00 00 00          mov    $0x1,%eax
    401009:       cd 80                   int    $0x80
  $ perf record -e intel_pt//u ./pushw
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.014 MB perf.data ]

 Before:

  $ perf script --insn-trace=disasm
  Warning:
  1 instruction trace errors
           pushw   10349 [000] 10586.869237014:            401000 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw)           pushw $0x1234
           pushw   10349 [000] 10586.869237014:            401006 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw)           addb %al, (%rax)
           pushw   10349 [000] 10586.869237014:            401008 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw)           addb %cl, %ch
           pushw   10349 [000] 10586.869237014:            40100a [unknown] (/home/ahunter/git/misc/rtit-tests/pushw)           addb $0x2e, (%rax)
   instruction trace error type 1 time 10586.869237224 cpu 0 pid 10349 tid 10349 ip 0x40100d code 6: Trace doesn't match instruction

 After:

  $ perf script --insn-trace=disasm
             pushw   10349 [000] 10586.869237014:            401000 [unknown] (./pushw)           pushw $0x1234
             pushw   10349 [000] 10586.869237014:            401004 [unknown] (./pushw)           movl $1, %eax

Fixes: eb13296c ("x86: Instruction decoder API")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-3-adrian.hunter@intel.com

59162e0c

x86/insn: Add Key Locker instructions to the opcode map · a5dd673a

Chang S. Bae authored 10 months ago and

Ingo Molnar committed 10 months ago


The x86 instruction decoder needs to know these new instructions that
are going to be used in the crypto library as well as the x86 core
code. Add the following:

LOADIWKEY:
	Load a CPU-internal wrapping key.

ENCODEKEY128:
	Wrap a 128-bit AES key to a key handle.

ENCODEKEY256:
	Wrap a 256-bit AES key to a key handle.

AESENC128KL:
	Encrypt a 128-bit block of data using a 128-bit AES key
	indicated by a key handle.

AESENC256KL:
	Encrypt a 128-bit block of data using a 256-bit AES key
	indicated by a key handle.

AESDEC128KL:
	Decrypt a 128-bit block of data using a 128-bit AES key
	indicated by a key handle.

AESDEC256KL:
	Decrypt a 128-bit block of data using a 256-bit AES key
	indicated by a key handle.

AESENCWIDE128KL:
	Encrypt 8 128-bit blocks of data using a 128-bit AES key
	indicated by a key handle.

AESENCWIDE256KL:
	Encrypt 8 128-bit blocks of data using a 256-bit AES key
	indicated by a key handle.

AESDECWIDE128KL:
	Decrypt 8 128-bit blocks of data using a 128-bit AES key
	indicated by a key handle.

AESDECWIDE256KL:
	Decrypt 8 128-bit blocks of data using a 256-bit AES key
	indicated by a key handle.

The detail can be found in Intel Software Developer Manual.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20240502105853.5338-2-adrian.hunter@intel.com

a5dd673a

Apr 29, 2024

tools/arch/x86/intel_sdsi: Add current meter support · f2464458

David E. Box authored 11 months ago


Add support to read the 'meter_current' file. The display is the same as
the 'meter_certificate', but will show the current snapshot of the
counters.

Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240411025856.2782476-10-david.e.box@linux.intel.com


Signed-off-by: Hans de Goede <hdegoede@redhat.com>

f2464458

tools/arch/x86/intel_sdsi: Simplify ascii printing · 53310fe9

David E. Box authored 11 months ago


Add #define for feature length and move NUL assignment from callers to
get_feature().

Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240411025856.2782476-9-david.e.box@linux.intel.com


Signed-off-by: Hans de Goede <hdegoede@redhat.com>

53310fe9

tools/arch/x86/intel_sdsi: Fix meter_certificate decoding · 09d70ded

David E. Box authored 11 months ago

Fix errors in the calculation of the start position of the counters and in
the display loop. While here, use a #define for the bundle count and size.

Fixes: 7fdc03a7 ("tools/arch/x86: intel_sdsi: Add support for reading meter certificates")
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240411025856.2782476-8-david.e.box@linux.intel.com

Signed-off-by: Hans de Goede <hdegoede@redhat.com>

09d70ded

tools/arch/x86/intel_sdsi: Fix meter_show display · 76f2bc17

David E. Box authored 11 months ago

Fixes sdsi_meter_cert_show() to correctly decode and display the meter
certificate output. Adds and displays a missing version field, displays the
ASCII name of the signature, and fixes the print alignment.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>

76f2bc17

tools/arch/x86/intel_sdsi: Fix maximum meter bundle length · a66f962f

David E. Box authored 11 months ago

The maximum number of bundles in the meter certificate was set to 8 which
is much less than the maximum. Instead, since the bundles appear at the end
of the file, set it based on the remaining file size from the bundle start
position.

Fixes: 7fdc03a7 ("tools/arch/x86: intel_sdsi: Add support for reading meter certificates")
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240411025856.2782476-6-david.e.box@linux.intel.com

Signed-off-by: Hans de Goede <hdegoede@redhat.com>

a66f962f

Apr 28, 2024

arm64/sysreg: Update PIE permission encodings · 12d712dc

Shiqi Liu authored 11 months ago


Fix left shift overflow issue when the parameter idx is greater than or
equal to 8 in the calculation of perm in PIRx_ELx_PERM macro.

Fix this by modifying the encoding to use a long integer type.

Signed-off-by: Shiqi Liu <shiqiliu@hust.edu.cn>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20240421063328.29710-1-shiqiliu@hust.edu.cn


Signed-off-by: Will Deacon <will@kernel.org>

12d712dc

Apr 27, 2024

tools headers x86 cpufeatures: Sync with the kernel sources to pick BHI mitigation changes · 8f211643

Arnaldo Carvalho de Melo authored 11 months ago

To pick the changes from:

  95a6ccbd ("x86/bhi: Mitigate KVM by default")
  ec9404e4 ("x86/bhi: Add BHI mitigation knob")
  be482ff9 ("x86/bhi: Enumerate Branch History Injection (BHI) bug")
  0f4a8376 ("x86/bhi: Define SPEC_CTRL_BHI_DIS_S")
  7390db8a ("x86/bhi: Add support for clearing branch history at syscall entry")

This causes these perf files to be rebuilt and brings some X86_FEATURE
that will be used when updating the copies of
tools/arch/x86/lib/mem{cpy,set}_64.S with the kernel sources:

      CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
      CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/ZirIx4kPtJwGFZS0@x1


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

8f211643

Apr 23, 2024

tools arch x86: Sync the msr-index.h copy with the kernel sources · b29781af

Arnaldo Carvalho de Melo authored 1 year ago

To pick up the changes from these csets:

  be482ff9 ("x86/bhi: Enumerate Branch History Injection (BHI) bug")
  0f4a8376 ("x86/bhi: Define SPEC_CTRL_BHI_DIS_S")

That cause no changes to tooling:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > x86_msr.before
  $ objdump -dS /tmp/build/perf-tools-next/util/amd-sample-raw.o > amd-sample-raw.o.before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ make -C tools/perf O=/tmp/build/perf-tools-next
  <SNIP>
  CC      /tmp/build/perf-tools-next/trace/beauty/tracepoints/x86_msr.o
  <SNIP>
  CC      /tmp/build/perf-tools-next/util/amd-sample-raw.o
  <SNIP>
  $ objdump -dS /tmp/build/perf-tools-next/util/amd-sample-raw.o > amd-sample-raw.o.after
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > x86_msr.after
  $ diff -u x86_msr.before x86_msr.after
  $ diff -u amd-sample-raw.o.before amd-sample-raw.o.after

Just silences this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/ZifCnEZFx5MZQuIW@x1


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

b29781af

Admin message

Admin message