  1. Oct 06, 2024
    • x86/reboot: emergency callbacks are now registered by common KVM code · 2a5fe5a0
      Paolo Bonzini authored
      
      Guard them with CONFIG_KVM_X86_COMMON rather than with the two vendor modules.
      In practice this is not a functional change, because CONFIG_KVM_X86_COMMON
      is set if and only if at least one vendor-specific module is being built.
      However, it is cleaner to use CONFIG_KVM_X86_COMMON for functions that
      are used in kvm.ko.
      
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Fixes: 590b09b1 ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
      Fixes: 6d55a942 ("x86/reboot: Unconditionally define cpu_emergency_virt_cb typedef")
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
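      As a sketch only: the guard change being described, using the declarations
      quoted verbatim in the build-fix commit further down; the header context is
      assumed, not taken from this log:

      	/* arch/x86/include/asm/reboot.h -- sketch of the new guard */
      	#if IS_ENABLED(CONFIG_KVM_X86_COMMON)	/* was: CONFIG_KVM_INTEL || CONFIG_KVM_AMD */
      	void cpu_emergency_register_virt_callback(cpu_emergency_virt_cb *callback);
      	void cpu_emergency_unregister_virt_callback(cpu_emergency_virt_cb *callback);
      	#endif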
    • KVM: x86: leave kvm.ko out of the build if no vendor module is requested · ea4290d7
      Paolo Bonzini authored
      
      kvm.ko is nothing but library code shared by kvm-intel.ko and kvm-amd.ko.
      It provides no functionality on its own and it is unnecessary unless one
      of the vendor-specific modules is compiled.  In particular, /dev/kvm is
      not created until one of kvm-intel.ko or kvm-amd.ko is loaded.
      
      Use CONFIG_KVM to decide if it is built-in or a module, but use the
      vendor-specific modules for the actual decision on whether to build it.
      
      This also fixes a build failure when CONFIG_KVM_INTEL and CONFIG_KVM_AMD
      are both disabled.  The cpu_emergency_register_virt_callback() function
      is called from kvm.ko, but it is only defined if at least one of
      CONFIG_KVM_INTEL and CONFIG_KVM_AMD is provided.
      
      Fixes: 590b09b1 ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  2. Oct 02, 2024
    • move asm/unaligned.h to linux/unaligned.h · 5f60d5f6
      Al Viro authored
      asm/unaligned.h is always an include of asm-generic/unaligned.h;
      might as well move that thing to linux/unaligned.h and include
      that - there's nothing arch-specific in that header.
      
      auto-generated by the following:
      
      for i in `git grep -l -w asm/unaligned.h`; do
      	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
      done
      for i in `git grep -l -w asm-generic/unaligned.h`; do
      	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
      done
      git mv include/asm-generic/unaligned.h include/linux/unaligned.h
      git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
      sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
      sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
    • arc: get rid of private asm/unaligned.h · 00429083
      Al Viro authored
      
      Declarations local to arch/*/kernel/*.c are better off *not* in a public
      header - arch/arc/kernel/unaligned.h is just fine for those
      bits.
      
      Unlike the parisc case, here we have an extra twist - asm/mmu.h
      has an implicit dependency on struct pt_regs, and in some users
      that used to be satisfied by include of asm/ptrace.h from
      asm/unaligned.h (note that asm/mmu.h itself did _not_ pull asm/unaligned.h
      - it relied upon the users having pulled asm/unaligned.h before asm/mmu.h
      got there).
      
      Seeing that asm/mmu.h only wants struct pt_regs * arguments in
      an extern, just pre-declare it there - less brittle that way.
      
      With that done _all_ asm/unaligned.h instances are reduced to include
      of asm-generic/unaligned.h and can be removed - unaligned.h is in
      mandatory-y in include/asm-generic/Kbuild.
      
      What's more, we can move asm-generic/unaligned.h to linux/unaligned.h
      and switch includes of <asm/unaligned.h> to <linux/unaligned.h>; that's
      better off as an auto-generated commit, though, to be done by Linus
      at -rc1 time next cycle.
      
      Acked-by: Vineet Gupta <vgupta@kernel.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
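      A sketch of the pre-declaration trick described above; the extern shown here
      is hypothetical, only the struct pt_regs * shape comes from the commit:

      	/* arch/arc/include/asm/mmu.h -- sketch */
      	struct pt_regs;	/* forward declaration suffices for pointer arguments */
      	extern int do_misaligned_access(unsigned long address, struct pt_regs *regs);	/* hypothetical name */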
    • parisc: get rid of private asm/unaligned.h · 134d9882
      Al Viro authored
      
      Declarations local to arch/*/kernel/*.c are better off *not* in a public
      header - arch/parisc/kernel/unaligned.h is just fine for those
      bits.
      
      With that done parisc asm/unaligned.h is reduced to include
      of asm-generic/unaligned.h and can be removed - unaligned.h is in
      mandatory-y in include/asm-generic/Kbuild.
      
      Acked-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  3. Sep 29, 2024
    • x86: kvm: fix build error · 3f749bef
      Linus Torvalds authored
      
      The cpu_emergency_register_virt_callback() function is used
      unconditionally by the x86 kvm code, but it is declared (and defined)
      conditionally:
      
        #if IS_ENABLED(CONFIG_KVM_INTEL) || IS_ENABLED(CONFIG_KVM_AMD)
        void cpu_emergency_register_virt_callback(cpu_emergency_virt_cb *callback);
        ...
      
      leading to a build error when neither KVM_INTEL nor KVM_AMD support is
      enabled:
      
        arch/x86/kvm/x86.c: In function ‘kvm_arch_enable_virtualization’:
        arch/x86/kvm/x86.c:12517:9: error: implicit declaration of function ‘cpu_emergency_register_virt_callback’ [-Wimplicit-function-declaration]
        12517 |         cpu_emergency_register_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
              |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        arch/x86/kvm/x86.c: In function ‘kvm_arch_disable_virtualization’:
        arch/x86/kvm/x86.c:12522:9: error: implicit declaration of function ‘cpu_emergency_unregister_virt_callback’ [-Wimplicit-function-declaration]
        12522 |         cpu_emergency_unregister_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
              |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Fix the build by defining empty helper functions the same way the old
      cpu_emergency_disable_virtualization() function was dealt with for the
      same situation.
      
      Maybe we could instead have made the call sites conditional, since the
      callers (kvm_arch_{en,dis}able_virtualization()) have an empty weak
      fallback.  I'll leave that to the kvm people to argue about; this at
      least gets the build going for that particular config.
      
      Fixes: 590b09b1 ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Kai Huang <kai.huang@intel.com>
      Cc: Chao Gao <chao.gao@intel.com>
      Cc: Farrah Chen <farrah.chen@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
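      A sketch of the fix as described, mirroring the quoted declarations with
      empty inline stubs so the call sites compile away in this config (exact
      header layout assumed):

      	#if IS_ENABLED(CONFIG_KVM_INTEL) || IS_ENABLED(CONFIG_KVM_AMD)
      	void cpu_emergency_register_virt_callback(cpu_emergency_virt_cb *callback);
      	void cpu_emergency_unregister_virt_callback(cpu_emergency_virt_cb *callback);
      	#else
      	static inline void cpu_emergency_register_virt_callback(cpu_emergency_virt_cb *callback) {}
      	static inline void cpu_emergency_unregister_virt_callback(cpu_emergency_virt_cb *callback) {}
      	#endif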
  4. Sep 27, 2024
    • [tree-wide] finally take no_llseek out · cb787f4a
      Al Viro authored
      
      no_llseek had been defined to NULL two years ago, in commit 868941b1
      ("fs: remove no_llseek")
      
      To quote that commit,
      
        At -rc1 we'll need do a mechanical removal of no_llseek -
      
        git grep -l -w no_llseek | grep -v porting.rst | while read i; do
      	sed -i '/\<no_llseek\>/d' $i
        done
      
        would do it.
      
      Unfortunately, that hadn't been done.  Linus, could you do that now, so
      that we could finally put that thing to rest? All instances are of the
      form
      	.llseek = no_llseek,
      so it's obviously safe.
      
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
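      For context, every removed instance has this shape (my_fops and my_read are
      hypothetical names); since no_llseek is #defined to NULL, deleting the line
      leaves behavior unchanged -- a NULL ->llseek already marks the file as
      non-seekable:

      	static const struct file_operations my_fops = {
      		.owner	= THIS_MODULE,
      		.read	= my_read,
      		.llseek	= no_llseek,	/* the line being deleted tree-wide */
      	};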
  5. Sep 25, 2024
    • x86/pvh: Add 64bit relocation page tables · 47ffe057
      Jason Andryuk authored
      
      The PVH entry point is 32bit.  For a 64bit kernel, the entry point must
      switch to 64bit mode, which requires a set of page tables.  In the past,
      PVH used init_top_pgt.
      
      This works fine when the kernel is loaded at LOAD_PHYSICAL_ADDR, as the
      page tables are prebuilt for this address.  If the kernel is loaded at a
      different address, they need to be adjusted.
      
      __startup_64() adjusts the prebuilt page tables for the physical load
      address, but it is 64bit code.  The 32bit PVH entry code can't call it
      to adjust the page tables, so it can't readily be re-used.
      
      64bit PVH entry needs page tables set up for the identity map, the kernel
      high map and the direct map.  pvh_start_xen() is entered identity-mapped.
      Inside xen_prepare_pvh(), it jumps through a pv_ops function pointer
      into the highmap.  The direct map is used for __va() on the initramfs
      and other guest physical addresses.
      
      Add a dedicated set of prebuilt page tables for PVH entry.  They are
      adjusted in assembly before loading.
      
      Add XEN_ELFNOTE_PHYS32_RELOC to indicate support for relocation
      along with the kernel's loading constraints.  The maximum load address,
      KERNEL_IMAGE_SIZE - 1, is determined by a single pvh_level2_ident_pgt
      page.  It could be larger with more pages.
      
      Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Message-ID: <20240823193630.2583107-6-jason.andryuk@amd.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
    • x86/kernel: Move page table macros to header · e3e8cd90
      Jason Andryuk authored
      
      The PVH entry point will need an additional set of prebuilt page tables.
      Move the macros and defines to pgtable_64.h, so they can be re-used.
      
      Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
      Message-ID: <20240823193630.2583107-5-jason.andryuk@amd.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
    • x86/pvh: Set phys_base when calling xen_prepare_pvh() · b464b461
      Jason Andryuk authored
      
      phys_base needs to be set for __pa() to work in xen_pvh_init() when
      finding the hypercall page.  Set it before calling into
      xen_prepare_pvh(), which calls xen_pvh_init().  Clear it afterward to
      avoid __startup_64() adding to it and creating an incorrect value.
      
      Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Message-ID: <20240823193630.2583107-4-jason.andryuk@amd.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
    • x86/pvh: Make PVH entrypoint PIC for x86-64 · 1db29f99
      Jason Andryuk authored
      
      The PVH entrypoint is 32bit non-PIC code running the uncompressed
      vmlinux at its load address CONFIG_PHYSICAL_START - default 0x1000000
      (16MB).  The kernel is loaded at that physical address inside the VM by
      the VMM software (Xen/QEMU).
      
      When running a Xen PVH Dom0, the host reserved addresses are mapped 1-1
      into the PVH container.  There exist system firmwares (Coreboot/EDK2)
      with reserved memory at 16MB.  This creates a conflict where the PVH
      kernel cannot be loaded at that address.
      
      Modify the PVH entrypoint to be position-independent to allow flexibility
      in the load address.  Only the 64bit entry path is converted.  A 32bit
      kernel is not PIC, so calling into other parts of the kernel, like
      xen_prepare_pvh() and mk_pgtable_32(), doesn't work properly when
      relocated.
      
      This makes the code PIC, but the page tables need to be updated as well
      to handle running from the kernel high map.
      
      The UNWIND_HINT_END_OF_STACK is to silence:
      vmlinux.o: warning: objtool: pvh_start_xen+0x7f: unreachable instruction
      after the lret into 64bit code.
      
      Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Message-ID: <20240823193630.2583107-3-jason.andryuk@amd.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
    • xen/pvh: Setup gsi for passthrough device · b166b8ab
      Jiqian Chen authored
      
      In PVH dom0, the gsis don't get registered, but the gsi of
      a passthrough device must be configured so that it can be
      mapped into a domU.

      When assigning a device for passthrough, proactively set up the gsi
      of the device during that process.
      
      Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
      Signed-off-by: Huang Rui <ray.huang@amd.com>
      Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
      Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
      Message-ID: <20240924061437.2636766-3-Jiqian.Chen@amd.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
  6. Sep 24, 2024
    • LoongArch: vDSO: Tune chacha implementation · 9805f39d
      Xi Ruoyao authored and Jason A. Donenfeld committed
      
      As Christophe pointed out, tuning the chacha implementation by
      scheduling the instructions the way GCC does can improve the
      performance.
      
      The tuning does not introduce too much complexity (basically it's just
      reordering some instructions), and it does not hurt readability
      too much: actually the tuned code looks even more similar to a
      textbook-style implementation based on 128-bit vectors.  So overall it's
      a good deal to me.
      
      Tested with vdso_test_chacha and benchmarked with vdso_test_getrandom.
      On a LA664 the speedup is 5%, and I expect a larger speedup on LA[2-4]64
      with a lower issue rate.
      
      Suggested-by: default avatarChristophe Leroy <christophe.leroy@csgroup.eu>
      Link: https://lore.kernel.org/all/77655d9e-fc05-4300-8f0d-7b2ad840d091@csgroup.eu/
      
      
      Signed-off-by: Xi Ruoyao <xry111@xry111.site>
      Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • LoongArch: Remove posix_types.h include from sigcontext.h · 64c35d6c
      Xi Ruoyao authored
      Nothing in sigcontext.h seems to require anything from
      linux/posix_types.h.  This include seems to be a MIPS relic that
      originated from an error in Linux 2.6.11-rc2 (in 2005).

      The unneeded include was found while debugging a vDSO selftest build
      failure (it's not the root cause though).
      
      Link: https://lore.kernel.org/linux-mips/20240828030413.143930-2-xry111@xry111.site/
      Link: https://lore.kernel.org/loongarch/0b540679ec8cfccec75aeb3463810924f6ff71e6.camel@xry111.site/
      
      
      Signed-off-by: Xi Ruoyao <xry111@xry111.site>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    • LoongArch: Fix memleak in pci_acpi_scan_root() · 5016c3a3
      Wentao Guan authored
      
      Add kfree(root_ops) to avoid a memory leak: root_ops
      leaks when pci_find_bus() returns non-NULL.
      
      Signed-off-by: Yuli Wang <wangyuli@uniontech.com>
      Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
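      A hedged sketch of the fixed path (names follow arch/loongarch/pci/acpi.c
      as understood here; the surrounding logic is abbreviated):

      	bus = pci_find_bus(domain, busnum);
      	if (bus) {
      		/* bus already scanned: nothing takes ownership of root_ops */
      		memcpy(bus->sysdata, info->cfg, sizeof(struct pci_config_window));
      		kfree(root_ops);	/* the fix: free it instead of leaking it */
      	} else {
      		/* normal path: acpi_pci_root_create() takes ownership of root_ops */
      	}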
    • LoongArch: Simplify _percpu_read() and _percpu_write() · d4f31acf
      Uroš Bizjak authored
      Now the _percpu_read() and _percpu_write() macros call the __percpu_read()
      and __percpu_write() static inline functions, each of which results in a
      single assembly instruction.  However, the percpu infrastructure expects
      its leaf definitions to encode the size of the percpu variable, so the
      patch merges all the asm clauses from the static inline functions into
      the corresponding leaf macros.

      The secondary effect of this change is to avoid explicit __percpu
      annotations for function arguments.  Currently, the __percpu macro is
      defined in include/linux/compiler_types.h, but with the proposed patch [1],
      the __percpu definition will need macros from include/asm-generic/percpu.h,
      creating a forward dependency loop.

      The proposed solution is the same one the x86 architecture uses.
      
      [1] https://lore.kernel.org/lkml/20240812115945.484051-4-ubizjak@gmail.com/
      
      
      
      Tested-by: Xi Ruoyao <xry111@xry111.site>
      Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
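      An illustrative sketch of the resulting shape (the mnemonic helpers and
      operand constraints are approximations, not copied from the tree): the leaf
      macro receives the size and selects the instruction directly, with $r21 as
      the percpu base register:

      	#define __pcpu_op_4(op)	op ".w "	/* 32-bit variant */
      	#define __pcpu_op_8(op)	op ".d "	/* 64-bit variant */

      	#define _percpu_read(size, _pcp)					\
      	({									\
      		typeof(_pcp) __pcp_ret;						\
      		__asm__ __volatile__(						\
      			__pcpu_op_##size("ldx") "%[ret], $r21, %[ptr]\n"	\
      			: [ret] "=&r"(__pcp_ret)				\
      			: [ptr] "r"(&(_pcp))					\
      			: "memory");						\
      		__pcp_ret;							\
      	})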
    • LoongArch: Improve hardware page table walker · f93f67d0
      Huacai Chen authored
      
      LoongArch has problems similar to those explained in commit 7f0b1bf0
      ("arm64: Fix barriers used for page table modifications"): when the
      hardware page table walker (PTW) is enabled, speculative accesses may
      cause spurious page faults in kernel space.  Theoretically, in order to
      completely avoid spurious page faults we need a "dbar + ibar" pair between
      the page table modifications and the subsequent memory accesses using the
      corresponding virtual address.  But "ibar" is too heavy for performance,
      so we only use a "dbar 0b11000" in set_pte(), and let spurious_fault()
      filter out the remaining rare spurious page faults which would otherwise
      need "ibar".

      Besides, we replace the llsc loop with amo in set_pte(), which performs
      better, and refactor mmu_context.h to 1) avoid any load/store/branch
      instructions between the writes of CSR.ASID & CSR.PGDL, and 2) ensure the
      TLB flush operation happens after the ASID update.
      
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
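      To make the ordering concrete, an illustrative sketch of the idea only (the
      real set_pte() also handles the _PAGE_GLOBAL bit, now with an amo instead
      of an llsc loop):

      	static inline void set_pte(pte_t *ptep, pte_t pteval)
      	{
      		WRITE_ONCE(*ptep, pteval);
      		/* let the PTW observe the new PTE before later accesses;
      		   cheaper than the theoretically complete "dbar + ibar" pair */
      		__asm__ __volatile__("dbar 0b11000" : : : "memory");
      	}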
    • LoongArch: Add ARCH_HAS_SET_DIRECT_MAP support · f04de6d8
      Huacai Chen authored
      
      Add set_direct_map_*() functions for setting the direct map alias for
      the page to its default permissions and to an invalid state that cannot
      be cached in a TLB. (See d253ca0c ("x86/mm/cpa: Add set_direct_map_*()
      functions")) Add a similar implementation for LoongArch.
      
      This fixes the KFENCE warnings during hibernation:
      
       ==================================================================
       BUG: KFENCE: invalid read in swsusp_save+0x368/0x4d8
      
       Invalid read at 0x00000000f7b89a3c:
        swsusp_save+0x368/0x4d8
        hibernation_snapshot+0x3f0/0x4e0
        hibernate+0x20c/0x440
        state_store+0x128/0x140
        kernfs_fop_write_iter+0x160/0x260
        vfs_write+0x2c0/0x520
        ksys_write+0x74/0x160
        do_syscall+0xb0/0x160
      
       CPU: 0 UID: 0 PID: 812 Comm: bash Tainted: G    B              6.11.0-rc1+ #1566
       Tainted: [B]=BAD_PAGE
       Hardware name: Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.0 10/21/2022
       ==================================================================
      
      Note: We can only set permissions for KVRANGE/XKVRANGE kernel addresses.
      
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
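      The hooks being implemented are the generic ones declared in
      include/linux/set_memory.h:

      	int set_direct_map_default_noflush(struct page *page);	/* restore default permissions */
      	int set_direct_map_invalid_noflush(struct page *page);	/* make the direct-map alias invalid */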
    • LoongArch: Add ARCH_HAS_SET_MEMORY support · e86935f7
      Huacai Chen authored
      
      Add set_memory_ro/rw/x/nx architecture hooks to change page
      attributes.

      Use our own set_memory.h rather than the generic one (i.e.
      include/asm-generic/set_memory.h), because we want to add other function
      prototypes there.
      
      Note: We can only set attributes for KVRANGE/XKVRANGE kernel addresses.
      
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
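      The corresponding prototypes, matching the generic set_memory API:

      	int set_memory_ro(unsigned long addr, int numpages);
      	int set_memory_rw(unsigned long addr, int numpages);
      	int set_memory_x(unsigned long addr, int numpages);
      	int set_memory_nx(unsigned long addr, int numpages);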
    • LoongArch: Rework CPU feature probe from CPUCFG/IOCSR · 34e3c450
      Jiaxun Yang authored
      
      Probe the ISA level, TLB and IOCSR information from CPUCFG to improve
      kernel resilience to different core implementations.

      BTW, the IOCSR register definition appears to be a platform-specific spec
      rather than an architecture spec; even for Loongson CPUs there is no
      guarantee that IOCSR will always be present.

      Thus it's dangerous to perform IOCSR probing without checking the CPU
      type and instruction availability.
      
      Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    • LoongArch: Enable ACPI BGRT handling · d0bb0b60
      Bibo Mao authored
      
      Add ACPI BGRT support on LoongArch so it can display the image provided
      by the ACPI table at boot stage and then switch to the graphical UI
      smoothly.
      
      Signed-off-by: Bibo Mao <maobibo@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    • LoongArch: Enable generic CPU vulnerabilites support · e8dd556c
      Tiezhu Yang authored
      Currently, many architectures support generic CPU vulnerabilities
      reporting, such as x86, arm64 and riscv:
      
       commit 61dc0f55 ("x86/cpu: Implement CPU vulnerabilites sysfs functions")
       commit 61ae1321 ("arm64: enable generic CPU vulnerabilites support")
       commit 0e3f3649 ("riscv: Enable generic CPU vulnerabilites support")
      
      All LoongArch CPUs (since Loongson-3A5000) implement a special mechanism
      in the processor core to prevent "Meltdown" and "Spectre" attacks, so we
      can enable generic CPU vulnerabilities support for LoongArch too.

      Without this patch, there is no user interface for checking
      vulnerabilities on LoongArch.  The output of those files reflects the
      state of the CPUs in the system; the value "Not affected" means "CPU is
      not affected by the vulnerability".
      
      Before:
      
       # cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
       cat: /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow: No such file or directory
       # cat /sys/devices/system/cpu/vulnerabilities/spec_store_bypass
       cat: /sys/devices/system/cpu/vulnerabilities/spec_store_bypass: No such file or directory
       # cat /sys/devices/system/cpu/vulnerabilities/meltdown
       cat: /sys/devices/system/cpu/vulnerabilities/meltdown: No such file or directory
      
      After:
      
       # cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
       Not affected
       # cat /sys/devices/system/cpu/vulnerabilities/spec_store_bypass
       Not affected
       # cat /sys/devices/system/cpu/vulnerabilities/meltdown
       Not affected
      
      Link: https://www.loongson.cn/EN/news/show?id=633
      
      
      Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
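      For reference, with CONFIG_GENERIC_CPU_VULNERABILITIES the weak default
      handlers in drivers/base/cpu.c already report "Not affected"; an
      architecture only overrides a handler where it is affected.  A sketch of
      one such handler, signature as in include/linux/cpu.h:

      	ssize_t cpu_show_meltdown(struct device *dev,
      				  struct device_attribute *attr, char *buf)
      	{
      		return sysfs_emit(buf, "Not affected\n");
      	}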