  1. Mar 12, 2025
  2. Mar 11, 2025
  3. Mar 10, 2025
    • Linus Torvalds's avatar
      Merge tag 'x86-urgent-2025-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4d872d51
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
      
       - Fix out-of-bounds access on CPU-less AMD NUMA systems by the
         microcode code
      
       - Make the kernel SGX CPU init code less passive-aggressive about
         non-working SGX features: warn explicitly instead of silently
         keeping the driver disabled, since people are running into this.
         This doesn't affect functionality; it's a sysadmin QoL fix
      
      * tag 'x86-urgent-2025-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/microcode/AMD: Fix out-of-bounds on systems with CPU-less NUMA nodes
        x86/sgx: Warn explicitly if X86_FEATURE_SGX_LC is not enabled
      4d872d51
    • Michael Kelley's avatar
      Drivers: hv: vmbus: Don't release fb_mmio resource in vmbus_free_mmio() · 73fe9073
      Michael Kelley authored
      
      The VMBus driver manages the MMIO space it owns via the hyperv_mmio
      resource tree. Because the synthetic video framebuffer portion of the
      MMIO space is initially set up by the Hyper-V host for each guest, the
      VMBus driver does an early reserve of that portion of MMIO space in the
      hyperv_mmio resource tree. It saves a pointer to that resource in
      fb_mmio. When a VMBus driver requests MMIO space and passes "true"
      for the "fb_overlap_ok" argument, the reserved framebuffer space is
      used if possible. In that case it's not necessary to do another request
      against the "shadow" hyperv_mmio resource tree because that resource
      was already requested in the early reserve steps.
      
      However, the vmbus_free_mmio() function currently does no special
      handling for the fb_mmio resource. When a framebuffer device is
      removed, or the driver is unbound, the current code for
      vmbus_free_mmio() releases the reserved resource, leaving fb_mmio
      pointing to memory that has been freed. If the same or another
      driver is subsequently bound to the device, vmbus_allocate_mmio()
      checks against fb_mmio, and potentially gets garbage. Furthermore,
      a second unbind operation produces this "nonexistent resource" error
      because of the unbalanced behavior between vmbus_allocate_mmio() and
      vmbus_free_mmio():
      
      [   55.499643] resource: Trying to free nonexistent
      			resource <0x00000000f0000000-0x00000000f07fffff>
      
      Fix this by adding logic to vmbus_free_mmio() to recognize when
      MMIO space in the fb_mmio reserved area would be released, and don't
      release it. This filtering ensures the fb_mmio resource always exists,
      and makes vmbus_free_mmio() more parallel with vmbus_allocate_mmio().
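
      The filtering described above can be sketched as a range-containment
      test. This is a simplified user-space model, not the kernel patch
      itself: struct mmio_range and range_within_fb_mmio() are hypothetical
      names, and the struct only mimics the inclusive start/end fields of the
      kernel's struct resource.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct resource: an inclusive address range. */
struct mmio_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return true when every byte of [start, start + size - 1] lies inside the
 * reserved framebuffer range. In that case the caller would skip releasing
 * the shadow resource, because no matching request was made for it.
 * A NULL fb means no framebuffer area was reserved.
 * Assumption: start + size does not wrap around 64 bits.
 */
static bool range_within_fb_mmio(const struct mmio_range *fb,
				 uint64_t start, uint64_t size)
{
	if (!fb || size == 0)
		return false;

	return start >= fb->start && start + size - 1 <= fb->end;
}
```

      With the range from the error message above (0xf0000000-0xf07fffff) as
      the reserved area, freeing 0x800000 bytes at 0xf0000000 is fully
      contained, so the reserved resource would be kept.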
      
      Fixes: be000f93 ("drivers:hv: Track allocations of children of hv_vmbus in private resource tree")
      Signed-off-by: Michael Kelley <mhklinux@outlook.com>
      Tested-by: Saurabh Sengar <ssengar@linux.microsoft.com>
      Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
      Link: https://lore.kernel.org/r/20250310035208.275764-1-mhklinux@outlook.com

      Signed-off-by: Wei Liu <wei.liu@kernel.org>
      Message-ID: <20250310035208.275764-1-mhklinux@outlook.com>
      73fe9073
    • Florent Revest's avatar
      x86/microcode/AMD: Fix out-of-bounds on systems with CPU-less NUMA nodes · e3e89178
      Florent Revest authored
      
      Currently, load_microcode_amd() iterates over all NUMA nodes, retrieves their
      CPU masks and unconditionally accesses per-CPU data for the first CPU of each
      mask.
      
      According to Documentation/admin-guide/mm/numaperf.rst:
      
        "Some memory may share the same node as a CPU, and others are provided as
        memory only nodes."
      
      Therefore, some node CPU masks may be empty and wouldn't have a "first CPU".
      
      On a machine with far memory (and therefore CPU-less NUMA nodes):
      - cpumask_of_node(nid) is 0
      - cpumask_first(0) is CONFIG_NR_CPUS
      - cpu_data(CONFIG_NR_CPUS) accesses the cpu_info per-CPU array at an
        index that is 1 out of bounds
      
      This does not have any security implications since flashing microcode is
      a privileged operation but I believe this has reliability implications by
      potentially corrupting memory while flashing a microcode update.
      
      When booting with CONFIG_UBSAN_BOUNDS=y on an AMD machine that flashes
      a microcode update, I get the following splat:
      
        UBSAN: array-index-out-of-bounds in arch/x86/kernel/cpu/microcode/amd.c:X:Y
        index 512 is out of range for type 'unsigned long[512]'
        [...]
        Call Trace:
         dump_stack
         __ubsan_handle_out_of_bounds
         load_microcode_amd
         request_microcode_amd
         reload_store
         kernfs_fop_write_iter
         vfs_write
         ksys_write
         do_syscall_64
         entry_SYSCALL_64_after_hwframe
      
      Change the loop to go over only NUMA nodes which have CPUs before determining
      whether the first CPU on the respective node needs microcode update.
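
      The failure mode and the fix can be illustrated with a small user-space
      model. Everything here is an assumption made for illustration:
      SKETCH_NR_CPUS stands in for CONFIG_NR_CPUS, a 32-bit integer stands in
      for a cpumask, and the function names are hypothetical. The point is
      only that an empty mask yields an index one past the end of the
      per-CPU array, so such nodes have to be skipped.

```c
#include <stdint.h>

#define SKETCH_NR_CPUS 8u	/* hypothetical stand-in for CONFIG_NR_CPUS */

/*
 * Index of the lowest set bit, or SKETCH_NR_CPUS when the mask is empty,
 * modeled on how an empty CPU mask reports "no first CPU".
 */
static unsigned int sketch_cpumask_first(uint32_t mask)
{
	for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return SKETCH_NR_CPUS;
}

/*
 * Touch the per-CPU slot of the first CPU of each node, skipping nodes
 * whose mask is empty. Returns how many nodes were visited. Because of the
 * skip, per_cpu_hits[] is never indexed with SKETCH_NR_CPUS.
 */
static unsigned int visit_nodes_with_cpus(const uint32_t *node_masks,
					  unsigned int nr_nodes,
					  unsigned int per_cpu_hits[SKETCH_NR_CPUS])
{
	unsigned int visited = 0;

	for (unsigned int nid = 0; nid < nr_nodes; nid++) {
		unsigned int cpu = sketch_cpumask_first(node_masks[nid]);

		if (cpu >= SKETCH_NR_CPUS)
			continue;	/* memory-only node */

		per_cpu_hits[cpu]++;
		visited++;
	}
	return visited;
}
```

      Without the skip, a memory-only node would index the array at
      SKETCH_NR_CPUS, which is the out-of-bounds access UBSAN reports above.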
      
        [ bp: Massage commit message, fix typo. ]
      
      Fixes: 7ff6edf4 ("x86/microcode/AMD: Fix mixed steppings support")
      Signed-off-by: Florent Revest <revest@chromium.org>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20250310144243.861978-1-revest@chromium.org
      e3e89178
    • Vladis Dronov's avatar
      x86/sgx: Warn explicitly if X86_FEATURE_SGX_LC is not enabled · 65be5c95
      Vladis Dronov authored and Ingo Molnar committed
      The kernel requires X86_FEATURE_SGX_LC to be able to create SGX enclaves,
      not just X86_FEATURE_SGX.
      
      Quite a lot of hardware has X86_FEATURE_SGX but not
      X86_FEATURE_SGX_LC. A kernel running on such hardware silently does
      not create the /dev/sgx_enclave file.
      
      Explicitly warn if X86_FEATURE_SGX_LC is not enabled to properly notify
      users that the kernel disabled the SGX driver.
      
      The X86_FEATURE_SGX_LC, a.k.a. SGX Launch Control, is a CPU feature
      that enables LE (Launch Enclave) hash MSRs to be writable (with
      additional opt-in required in the 'feature control' MSR) when running
      enclaves, i.e. using a custom root key rather than the Intel proprietary
      key for enclave signing.
      
      I've hit this issue myself and have spent some time researching where
      my /dev/sgx_enclave file went on SGX-enabled hardware.
      
      Related links:
      
        https://github.com/intel/linux-sgx/issues/837
        https://patchwork.kernel.org/project/platform-driver-x86/patch/20180827185507.17087-3-jarkko.sakkinen@linux.intel.com/
      
      
      
      [ mingo: Made the error message a bit more verbose, and added other cases
               where the kernel fails to create the /dev/sgx_enclave device node. ]
      
      Signed-off-by: Vladis Dronov <vdronov@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Kai Huang <kai.huang@intel.com>
      Cc: Jarkko Sakkinen <jarkko@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20250309172215.21777-2-vdronov@redhat.com
      65be5c95
    • Michael Kelley's avatar
      x86/hyperv: Fix output argument to hypercall that changes page visibility · 09beefef
      Michael Kelley authored
      
      The hypercall in hv_mark_gpa_visibility() is invoked with an input
      argument and an output argument. The output argument ostensibly returns
      the number of pages that were processed. But in fact, the hypercall does
      not provide any output, so the output argument is spurious.
      
      The spurious argument is harmless because Hyper-V ignores it, but in the
      interest of correctness and to avoid the potential for future problems,
      remove it.
      
      Signed-off-by: Michael Kelley <mhklinux@outlook.com>
      Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
      Link: https://lore.kernel.org/r/20250226200612.2062-2-mhklinux@outlook.com

      Signed-off-by: Wei Liu <wei.liu@kernel.org>
      Message-ID: <20250226200612.2062-2-mhklinux@outlook.com>
      09beefef
  4. Mar 09, 2025
    • Saurabh Sengar's avatar
      fbdev: hyperv_fb: Allow graceful removal of framebuffer · ea2f45ab
      Saurabh Sengar authored
      
      When a Hyper-V framebuffer device is unbound, the hyperv_fb driver tries
      to release the framebuffer forcefully. If this framebuffer is in use, it
      produces the following WARN, and hence the framebuffer is never released.
      
      [   44.111220] WARNING: CPU: 35 PID: 1882 at drivers/video/fbdev/core/fb_info.c:70 framebuffer_release+0x2c/0x40
      < snip >
      [   44.111289] Call Trace:
      [   44.111290]  <TASK>
      [   44.111291]  ? show_regs+0x6c/0x80
      [   44.111295]  ? __warn+0x8d/0x150
      [   44.111298]  ? framebuffer_release+0x2c/0x40
      [   44.111300]  ? report_bug+0x182/0x1b0
      [   44.111303]  ? handle_bug+0x6e/0xb0
      [   44.111306]  ? exc_invalid_op+0x18/0x80
      [   44.111308]  ? asm_exc_invalid_op+0x1b/0x20
      [   44.111311]  ? framebuffer_release+0x2c/0x40
      [   44.111313]  ? hvfb_remove+0x86/0xa0 [hyperv_fb]
      [   44.111315]  vmbus_remove+0x24/0x40 [hv_vmbus]
      [   44.111323]  device_remove+0x40/0x80
      [   44.111325]  device_release_driver_internal+0x20b/0x270
      [   44.111327]  ? bus_find_device+0xb3/0xf0
      
      Fix this by moving the release of framebuffer and associated memory
      to fb_ops.fb_destroy function, so that framebuffer framework handles
      it gracefully.
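
      The idea of deferring the release to a destroy callback can be sketched
      with a toy model. This is an assumption-laden illustration, not the
      fbdev implementation: struct sketch_fb and all sketch_* functions are
      hypothetical, and the destroy pointer merely plays the role that
      fb_ops.fb_destroy plays in the text above. The memory is released only
      once the device is unbound and no user still holds it open.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a framebuffer with users and a destroy hook. */
struct sketch_fb {
	int open_count;		/* users still holding the device open */
	bool registered;	/* still bound to the driver */
	bool released;		/* backing memory has been freed */
	void (*destroy)(struct sketch_fb *fb);
};

/* Destroy hook: the only place where the memory is released. */
static void sketch_destroy(struct sketch_fb *fb)
{
	fb->released = true;
}

/* Run the destroy hook once nothing refers to the framebuffer anymore. */
static void sketch_maybe_destroy(struct sketch_fb *fb)
{
	if (!fb->registered && fb->open_count == 0 &&
	    !fb->released && fb->destroy != NULL)
		fb->destroy(fb);
}

/* Unbinding no longer frees anything directly. */
static void sketch_unbind(struct sketch_fb *fb)
{
	fb->registered = false;
	sketch_maybe_destroy(fb);
}

/* The last close after an unbind triggers the release. */
static void sketch_close(struct sketch_fb *fb)
{
	if (fb->open_count > 0)
		fb->open_count--;
	sketch_maybe_destroy(fb);
}
```

      In this model, unbinding while one user has the device open leaves the
      memory in place; the subsequent close releases it.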
      
      While we fix this, also replace manual registration/unregistration of
      framebuffer with devm_register_framebuffer.
      
      Fixes: 68a2d20b ("drivers/video: add Hyper-V Synthetic Video Frame Buffer Driver")
      
      Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
      Reviewed-by: Michael Kelley <mhklinux@outlook.com>
      Tested-by: Michael Kelley <mhklinux@outlook.com>
      Link: https://lore.kernel.org/r/1740845791-19977-3-git-send-email-ssengar@linux.microsoft.com

      Signed-off-by: Wei Liu <wei.liu@kernel.org>
      Message-ID: <1740845791-19977-3-git-send-email-ssengar@linux.microsoft.com>
      ea2f45ab
    • Saurabh Sengar's avatar
      fbdev: hyperv_fb: Simplify hvfb_putmem · f5e728a5
      Saurabh Sengar authored
      
      The device object required in 'hvfb_release_phymem' function
      for 'dma_free_coherent' can also be obtained from the 'info'
      pointer, making 'hdev' parameter in 'hvfb_putmem' redundant.
      Remove the unnecessary 'hdev' argument from 'hvfb_putmem'.
      
      Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
      Reviewed-by: Michael Kelley <mhklinux@outlook.com>
      Tested-by: Michael Kelley <mhklinux@outlook.com>
      Link: https://lore.kernel.org/r/1740845791-19977-2-git-send-email-ssengar@linux.microsoft.com

      Signed-off-by: Wei Liu <wei.liu@kernel.org>
      Message-ID: <1740845791-19977-2-git-send-email-ssengar@linux.microsoft.com>
      f5e728a5
    • Michael Kelley's avatar
      fbdev: hyperv_fb: Fix hang in kdump kernel when on Hyper-V Gen 2 VMs · 30438637
      Michael Kelley authored
      Gen 2 Hyper-V VMs boot via EFI and have a standard EFI framebuffer
      device. When the kdump kernel runs in such a VM, loading the efifb
      driver may hang because of accessing the framebuffer at the wrong
      memory address.
      
      The scenario occurs when the hyperv_fb driver in the original kernel
      moves the framebuffer to a different MMIO address because of conflicts
      with an already-running efifb or simplefb driver. The hyperv_fb driver
      then informs Hyper-V of the change, which is allowed by the Hyper-V FB
      VMBus device protocol. However, when the kexec command loads the kdump
      kernel into crash memory via the kexec_file_load() system call, the
      system call doesn't know the framebuffer has moved, and it sets up the
      kdump screen_info using the original framebuffer address. The transition
      to the kdump kernel does not go through the Hyper-V host, so Hyper-V
      does not reset the framebuffer address like it would do on a reboot.
      When efifb tries to run, it accesses a non-existent framebuffer
      address, which traps to the Hyper-V host. After many such accesses,
      the Hyper-V host thinks the guest is being malicious, and throttles
      the guest to the point that it runs very slowly or appears to have hung.
      
      When the kdump kernel is loaded into crash memory via the kexec_load()
      system call, the problem does not occur. In this case, the kexec command
      builds the screen_info table itself in user space from data returned
      by the FBIOGET_FSCREENINFO ioctl against /dev/fb0, which gives it the
      new framebuffer location.
      
      This problem was originally reported in 2020 [1], resulting in commit
      3cb73bc3 ("hyperv_fb: Update screen_info after removing old
      framebuffer"). This commit solved the problem by setting orig_video_isVGA
      to 0, so the kdump kernel was unaware of the EFI framebuffer. The efifb
      driver did not try to load, and no hang occurred. But in 2024, commit
      c25a19af ("fbdev/hyperv_fb: Do not clear global screen_info")
      effectively reverted 3cb73bc3. Commit c25a19af has no reference
      to 3cb73bc3, so perhaps it was done without knowing the implications
      that were reported with 3cb73bc3. In any case, as of commit
      c25a19af, the original problem came back again.
      
      Interestingly, the hyperv_drm driver does not have this problem because
      it never moves the framebuffer. The difference is that the hyperv_drm
      driver removes any conflicting framebuffers *before* allocating an MMIO
      address, while the hyperv_fb driver removes conflicting framebuffers
      *after* allocating an MMIO address. With the "after" ordering, hyperv_fb
      may encounter a conflict and move the framebuffer to a different MMIO
      address. But the conflict is essentially bogus because it is removed
      a few lines of code later.
      
      Rather than fix the problem with the approach from 2020 in commit
      3cb73bc3, instead slightly reorder the steps in hyperv_fb so
      conflicting framebuffers are removed before allocating an MMIO address.
      Then the default framebuffer MMIO address should always be available, and
      there's never any confusion about which framebuffer address the kdump
      kernel should use -- it's always the original address provided by
      the Hyper-V host. This approach is already used by the hyperv_drm
      driver, and is consistent with the usage guidelines at the head of
      the module with the function aperture_remove_conflicting_devices().
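
      The effect of the two orderings can be shown with a toy model. All
      names and both addresses below are hypothetical, chosen only for
      illustration: a flag records whether a firmware framebuffer still
      claims the default range, and allocation falls back to an alternate
      address while that claim exists.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_DEFAULT_FB 0x10000000ull	/* hypothetical default address */
#define SKETCH_ALT_FB     0x20000000ull	/* hypothetical fallback address */

/* Hypothetical MMIO state: is the default range still claimed? */
struct sketch_mmio {
	bool firmware_fb_present;
};

/* Remove the firmware framebuffer's claim on the default range. */
static void sketch_remove_conflicting(struct sketch_mmio *m)
{
	m->firmware_fb_present = false;
}

/* Allocate the default address if free, otherwise move elsewhere. */
static uint64_t sketch_alloc_fb(const struct sketch_mmio *m)
{
	return m->firmware_fb_present ? SKETCH_ALT_FB : SKETCH_DEFAULT_FB;
}

/* "After" ordering: allocate first, remove the conflict afterwards. */
static uint64_t sketch_probe_alloc_first(struct sketch_mmio *m)
{
	uint64_t addr = sketch_alloc_fb(m);

	sketch_remove_conflicting(m);
	return addr;
}

/* "Before" ordering: remove the conflict, then allocate. */
static uint64_t sketch_probe_remove_first(struct sketch_mmio *m)
{
	sketch_remove_conflicting(m);
	return sketch_alloc_fb(m);
}
```

      Starting from the same state, the "after" ordering ends up at the
      fallback address even though the conflict is gone by the time probing
      finishes, while the "before" ordering keeps the default address.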
      
      This approach also solves a related minor problem when kexec_load()
      is used to load the kdump kernel. With current code, unbinding and
      rebinding the hyperv_fb driver could result in the framebuffer moving
      back to the default framebuffer address, because on the rebind there
      are no conflicts. If such a move is done after the kdump kernel is
      loaded with the new framebuffer address, at kdump time it could again
      have the wrong address.
      
      This problem and fix are described in terms of the kdump kernel, but
      it can also occur with any kernel started via kexec.
      
      See extensive discussion of the problem and solution at [2].
      
      [1] https://lore.kernel.org/linux-hyperv/20201014092429.1415040-1-kasong@redhat.com/
      [2] https://lore.kernel.org/linux-hyperv/BLAPR10MB521793485093FDB448F7B2E5FDE92@BLAPR10MB5217.namprd10.prod.outlook.com/
      
      
      
      Reported-by: Thomas Tai <thomas.tai@oracle.com>
      Fixes: c25a19af ("fbdev/hyperv_fb: Do not clear global screen_info")
      Signed-off-by: Michael Kelley <mhklinux@outlook.com>
      Link: https://lore.kernel.org/r/20250218230130.3207-1-mhklinux@outlook.com

      Signed-off-by: Wei Liu <wei.liu@kernel.org>
      Message-ID: <20250218230130.3207-1-mhklinux@outlook.com>
      30438637
    • Michael Kelley's avatar
      drm/hyperv: Fix address space leak when Hyper-V DRM device is removed · aed70935
      Michael Kelley authored
      
      When a Hyper-V DRM device is probed, the driver allocates MMIO space for
      the vram, and maps it cacheable. If the device is removed, or in the error
      path for device probing, the MMIO space is released but no unmap is done.
      Consequently the kernel address space for the mapping is leaked.
      
      Fix this by adding iounmap() calls in the device removal path, and in the
      error path during device probing.
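
      The pairing of map and unmap on both paths can be sketched with
      counters. This is a user-space model under stated assumptions: struct
      sketch_dev and the sketch_* functions are hypothetical, and the
      increments and decrements stand in for allocating or releasing MMIO
      space and for mapping or unmapping the vram.

```c
#include <stdbool.h>

/* Hypothetical device state: outstanding MMIO allocations and mappings. */
struct sketch_dev {
	int mmio_allocated;
	int mappings;
};

/*
 * Probe: allocate MMIO space, map it, then possibly fail in a later step.
 * The error path undoes both steps in reverse order.
 */
static int sketch_probe(struct sketch_dev *d, bool fail_late)
{
	d->mmio_allocated++;	/* allocate MMIO space */
	d->mappings++;		/* map the vram */

	if (fail_late)
		goto err_unmap;

	return 0;

err_unmap:
	d->mappings--;		/* the unmap that must not be forgotten */
	d->mmio_allocated--;	/* release MMIO space */
	return -1;
}

/* Remove: unmap before releasing the MMIO space. */
static void sketch_remove(struct sketch_dev *d)
{
	d->mappings--;
	d->mmio_allocated--;
}
```

      If either path dropped the mappings decrement, the counter would stay
      above zero after the device is gone, which corresponds to the leaked
      kernel address space described above.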
      
      Fixes: f1f63cbb ("drm/hyperv: Fix an error handling path in hyperv_vmbus_probe()")
      Fixes: a0ab5abc ("drm/hyperv : Removing the restruction of VRAM allocation with PCI bar size")
      Signed-off-by: Michael Kelley <mhklinux@outlook.com>
      Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
      Tested-by: Saurabh Sengar <ssengar@linux.microsoft.com>
      Link: https://lore.kernel.org/r/20250210193441.2414-1-mhklinux@outlook.com

      Signed-off-by: Wei Liu <wei.liu@kernel.org>
      Message-ID: <20250210193441.2414-1-mhklinux@outlook.com>
      aed70935
    • Linus Torvalds's avatar
      Merge tag 'kbuild-fixes-v6.14-3' of... · 9712d38c
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Use the specified $(LD) when building userprogs with Clang
      
       - Pass the correct target triple when compile-testing UAPI headers
         with Clang
      
       - Fix pacman-pkg build error with KBUILD_OUTPUT
      
      * tag 'kbuild-fixes-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: install-extmod-build: Fix build when specifying KBUILD_OUTPUT
        docs: Kconfig: fix defconfig description
        kbuild: hdrcheck: fix cross build with clang
        kbuild: userprogs: use correct lld when linking through clang
      9712d38c
    • Linus Torvalds's avatar
      Merge tag 'usb-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 0dc1f314
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are some small USB driver fixes for some reported issues. These
        contain:
      
         - typec driver fixes
      
         - dwc3 driver fixes
      
         - xhci driver fixes
      
         - renesas controller fixes
      
         - gadget driver fixes
      
         - a new USB quirk added
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'usb-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: typec: ucsi: Fix NULL pointer access
        usb: quirks: Add DELAY_INIT and NO_LPM for Prolific Mass Storage Card Reader
        usb: xhci: Fix host controllers "dying" after suspend and resume
        usb: dwc3: Set SUSPENDENABLE soon after phy init
        usb: hub: lack of clearing xHC resources
        usb: renesas_usbhs: Flush the notify_hotplug_work
        usb: renesas_usbhs: Use devm_usb_get_phy()
        usb: renesas_usbhs: Call clk_put()
        usb: dwc3: gadget: Prevent irq storm when TH re-executes
        usb: gadget: Check bmAttributes only if configuration is valid
        xhci: Restrict USB4 tunnel detection for USB3 devices to Intel hosts
        usb: xhci: Enable the TRB overfetch quirk on VIA VL805
        usb: gadget: Fix setting self-powered state on suspend
        usb: typec: ucsi: increase timeout for PPM reset operations
        acpi: typec: ucsi: Introduce a ->poll_cci method
        usb: typec: tcpci_rt1711h: Unmask alert interrupts to fix functionality
        usb: gadget: Set self-powered based on MaxPower and bmAttributes
        usb: gadget: u_ether: Set is_suspend flag if remote wakeup fails
        usb: atm: cxacru: fix a flaw in existing endpoint checks
      0dc1f314
    • Linus Torvalds's avatar
      Merge tag 'driver-core-6.14-rc6' of... · 51b38f3c
      Linus Torvalds authored
      Merge tag 'driver-core-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core fix from Greg KH:
       "Here is a single driver core fix that resolves a reported memory leak.
      
        It's been in linux-next for 2 weeks now with no reported problems"
      
      * tag 'driver-core-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        drivers: core: fix device leak in __fw_devlink_relax_cycles()
      51b38f3c
    • Linus Torvalds's avatar
      Merge tag 'char-misc-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 2cc699b3
      Linus Torvalds authored
      Pull char/misc/IIO driver fixes from Greg KH:
       "Here are a number of misc and char and iio driver fixes that have been
        sitting in my tree for way too long. They contain:
      
         - iio driver fixes for reported issues
      
         - regression fix for rtsx_usb card reader
      
         - mei and mhi driver fixes
      
         - small virt driver fixes
      
         - ntsync permissions fix
      
         - other tiny driver fixes for reported problems.
      
        All of these have been in linux-next for quite a while with no
        reported issues"
      
      * tag 'char-misc-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (30 commits)
        Revert "drivers/card_reader/rtsx_usb: Restore interrupt based detection"
        ntsync: Check wait count based on byte size.
        bus: simple-pm-bus: fix forced runtime PM use
        char: misc: deallocate static minor in error path
        eeprom: digsy_mtc: Make GPIO lookup table match the device
        drivers: virt: acrn: hsm: Use kzalloc to avoid info leak in pmcmd_ioctl
        binderfs: fix use-after-free in binder_devices
        slimbus: messaging: Free transaction ID in delayed interrupt scenario
        vbox: add HAS_IOPORT dependency
        cdx: Fix possible UAF error in driver_override_show()
        intel_th: pci: Add Panther Lake-P/U support
        intel_th: pci: Add Panther Lake-H support
        intel_th: pci: Add Arrow Lake support
        intel_th: msu: Fix less trivial kernel-doc warnings
        intel_th: msu: Fix kernel-doc warnings
        MAINTAINERS: change maintainer for FSI
        ntsync: Set the permissions to be 0666
        bus: mhi: host: pci_generic: Use pci_try_reset_function() to avoid deadlock
        mei: vsc: Use "wakeuphostint" when getting the host wakeup GPIO
        mei: me: add panther lake P DID
        ...
      2cc699b3
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · a382b06d
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "arm64:
      
         - Fix a couple of bugs affecting pKVM's PSCI relay implementation
           when running in the hVHE mode, resulting in the host being entered
           with the MMU in an unknown state, and EL2 being in the wrong mode
      
        x86:
      
         - Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow
      
         - Ensure DEBUGCTL is context switched on AMD to avoid running the
           guest with the host's value, which can lead to unexpected bus lock
           #DBs
      
         - Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't
           properly emulate BTF. KVM's lack of context switching has meant BTF
           has always been broken to some extent
      
         - Always save DR masks for SNP vCPUs if DebugSwap is *supported*, as
           the guest can enable DebugSwap without KVM's knowledge
      
         - Fix a bug in mmu_stress_tests where a vCPU could finish the "writes
           to RO memory" phase without actually generating a write-protection
           fault
      
         - Fix a printf() goof in the SEV smoke test that causes build
           failures with -Werror
      
         - Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when
           PERFMON_V2 isn't supported by KVM"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: x86: Explicitly zero EAX and EBX when PERFMON_V2 isn't supported by KVM
        KVM: selftests: Fix printf() format goof in SEV smoke test
        KVM: selftests: Ensure all vCPUs hit -EFAULT during initial RO stage
        KVM: SVM: Don't rely on DebugSwap to restore host DR0..DR3
        KVM: SVM: Save host DR masks on CPUs with DebugSwap
        KVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu()
        KVM: arm64: Initialize HCR_EL2.E2H early
        KVM: x86: Snapshot the host's DEBUGCTL after disabling IRQs
        KVM: SVM: Manually context switch DEBUGCTL if LBR virtualization is disabled
        KVM: x86: Snapshot the host's DEBUGCTL in common x86
        KVM: SVM: Suppress DEBUGCTL.BTF on AMD
        KVM: SVM: Drop DEBUGCTL[5:2] from guest's effective value
        KVM: selftests: Assert that STI blocking isn't set after event injection
        KVM: SVM: Set RFLAGS.IF=1 in C code, to get VMRUN out of the STI shadow
      a382b06d
    • Paolo Bonzini's avatar
      Merge tag 'kvm-x86-fixes-6.14-rcN.2' of https://github.com/kvm-x86/linux into HEAD · ea9bd29a
      Paolo Bonzini authored
      KVM x86 fixes for 6.14-rcN #2
      
       - Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow.
      
       - Ensure DEBUGCTL is context switched on AMD to avoid running the guest with
         the host's value, which can lead to unexpected bus lock #DBs.
      
       - Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't properly
         emulate BTF.  KVM's lack of context switching has meant BTF has always been
         broken to some extent.
      
       - Always save DR masks for SNP vCPUs if DebugSwap is *supported*, as the guest
         can enable DebugSwap without KVM's knowledge.
      
       - Fix a bug in mmu_stress_tests where a vCPU could finish the "writes to RO
         memory" phase without actually generating a write-protection fault.
      
       - Fix a printf() goof in the SEV smoke test that causes build failures with
         -Werror.
      
       - Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when PERFMON_V2
         isn't supported by KVM.
      ea9bd29a
    • Paolo Bonzini's avatar
      Merge tag 'kvmarm-fixes-6.14-4' of... · 1cdad678
      Paolo Bonzini authored
      Merge tag 'kvmarm-fixes-6.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
      
      KVM/arm64 fixes for 6.14, take #4
      
      - Fix a couple of bugs affecting pKVM's PSCI relay implementation
        when running in the hVHE mode, resulting in the host being entered
        with the MMU in an unknown state, and EL2 being in the wrong mode.
      1cdad678
    • Linus Torvalds's avatar
      Merge tag 'mm-hotfixes-stable-2025-03-08-16-27' of... · 1110ce6a
      Linus Torvalds authored
      Merge tag 'mm-hotfixes-stable-2025-03-08-16-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
      
      Pull misc fixes from Andrew Morton:
       "33 hotfixes. 24 are cc:stable and the remainder address post-6.13
        issues or aren't considered necessary for -stable kernels.
      
        26 are for MM and 7 are for non-MM.
      
         - "mm: memory_failure: unmap poisoned folio during migrate properly"
           from Ma Wupeng fixes a couple of two year old bugs involving the
           migration of hwpoisoned folios.
      
         - "selftests/damon: three fixes for false results" from SeongJae Park
           fixes three one year old bugs in the DAMON selftest code.
      
        The remainder are singletons and doubletons. Please see the individual
        changelogs for details"
      
      * tag 'mm-hotfixes-stable-2025-03-08-16-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (33 commits)
        mm/page_alloc: fix uninitialized variable
        rapidio: add check for rio_add_net() in rio_scan_alloc_net()
        rapidio: fix an API misues when rio_add_net() fails
        MAINTAINERS: .mailmap: update Sumit Garg's email address
        Revert "mm/page_alloc.c: don't show protection in zone's ->lowmem_reserve[] for empty zone"
        mm: fix finish_fault() handling for large folios
        mm: don't skip arch_sync_kernel_mappings() in error paths
        mm: shmem: remove unnecessary warning in shmem_writepage()
        userfaultfd: fix PTE unmapping stack-allocated PTE copies
        userfaultfd: do not block on locking a large folio with raised refcount
        mm: zswap: use ATOMIC_LONG_INIT to initialize zswap_stored_pages
        mm: shmem: fix potential data corruption during shmem swapin
        mm: fix kernel BUG when userfaultfd_move encounters swapcache
        selftests/damon/damon_nr_regions: sort collected regiosn before checking with min/max boundaries
        selftests/damon/damon_nr_regions: set ops update for merge results check to 100ms
        selftests/damon/damos_quota: make real expectation of quota exceeds
        include/linux/log2.h: mark is_power_of_2() with __always_inline
        NFS: fix nfs_release_folio() to not deadlock via kcompactd writeback
        mm, swap: avoid BUG_ON in relocate_cluster()
        mm: swap: use correct step in loop to wait all clusters in wait_for_allocation()
        ...
      1110ce6a
  5. Mar 08, 2025
    • Linus Torvalds's avatar
      Merge tag 'x86-urgent-2025-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b7c90e3e
      Linus Torvalds authored
      Pull more x86 fixes from Ingo Molnar:
      
       - Add more model IDs to the AMD microcode version check, more people
         are hitting these checks
      
       - Fix a Xen guest boot warning related to AMD northbridge setup
      
       - Fix SEV guest bugs related to recent changes in its locking logic
      
       - Fix a missing definition of PTRS_PER_PMD that assembly builds can hit
      
      * tag 'x86-urgent-2025-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/microcode/AMD: Add some forgotten models to the SHA check
        x86/mm: Define PTRS_PER_PMD for assembly code too
        virt: sev-guest: Move SNP Guest Request data pages handling under snp_cmd_mutex
        virt: sev-guest: Allocate request data dynamically
        x86/amd_nb: Use rdmsr_safe() in amd_get_mmconfig_range()
      b7c90e3e
    • Borislav Petkov (AMD)'s avatar
      x86/microcode/AMD: Add some forgotten models to the SHA check · 058a6bec
      Borislav Petkov (AMD) authored and Ingo Molnar committed
      
      Add some more forgotten models to the SHA check.
      
      Fixes: 50cef76d ("x86/microcode/AMD: Load only SHA256-checksummed patches")
      Reported-by: Toralf Förster <toralf.foerster@gmx.de>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Tested-by: Toralf Förster <toralf.foerster@gmx.de>
      Link: https://lore.kernel.org/r/20250307220256.11816-1-bp@kernel.org
      058a6bec
    • Linus Torvalds's avatar
      Merge tag 'loongarch-fixes-6.14-2' of... · 2e51e0ac
      Linus Torvalds authored
      Merge tag 'loongarch-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
      
      Pull LoongArch fixes from Huacai Chen:
       "Fix bugs in kernel build, hibernation, memory management and KVM"
      
      * tag 'loongarch-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
        LoongArch: KVM: Fix GPA size issue about VM
        LoongArch: KVM: Reload guest CSR registers after sleep
        LoongArch: KVM: Add interrupt checking for AVEC
        LoongArch: Set hugetlb mmap base address aligned with pmd size
        LoongArch: Set max_pfn with the PFN of the last page
        LoongArch: Use polling play_dead() when resuming from hibernation
        LoongArch: Eliminate superfluous get_numa_distances_cnt()
        LoongArch: Convert unreachable() to BUG()
      2e51e0ac
    • Bibo Mao's avatar
      LoongArch: KVM: Fix GPA size issue about VM · 6bdbb73d
      Bibo Mao authored
      
      The physical address space is 48 bits wide on a Loongson-3A5000 physical
      machine, but only 47 bits wide for a VM on a Loongson-3A5000 system. The
      size of a VM's physical address space equals the size of the user
      virtual address space (one half) of the physical machine.
      
      The variable cpu_vabits represents the user address space only; the
      kernel address space is not included (user space and kernel space each
      take half of the total). So cpu_vabits, rather than cpu_vabits - 1,
      represents the size of the guest physical address space.
      
      Also add strict checking of the page fault GPA: inject an error if it is
      larger than the maximum GPA of the VM.
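      
      The size relationship described above can be sketched in C as follows.
      This is only an illustration, not the kernel code: the helper names
      gpa_size() and gpa_is_valid() are hypothetical, and the value 47 for
      cpu_vabits is an assumption taken from the Loongson-3A5000 figures
      quoted in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: user virtual address bits on a 3A5000 host. */
static const unsigned int cpu_vabits = 47;

/* Size in bytes of the guest physical address space: 2^vabits. */
static uint64_t gpa_size(unsigned int vabits)
{
	return UINT64_C(1) << vabits;
}

/* Hypothetical check: a faulting GPA must lie below the VM's GPA limit. */
static bool gpa_is_valid(uint64_t gpa, unsigned int vabits)
{
	return gpa < gpa_size(vabits);
}
```

      Using cpu_vabits - 1 instead would halve the limit, so addresses in the
      upper half of the 47-bit range would be rejected by such a check.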
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Bibo Mao <maobibo@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      6bdbb73d
    • Bibo Mao's avatar
      LoongArch: KVM: Reload guest CSR registers after sleep · 78d7bc5a
      Bibo Mao authored
      
      On the host, the hardware guest CSR registers are lost after a suspend
      and resume operation. Because last_vcpu of the boot CPU still records
      the latest vCPU pointer, reloading of the guest CSR registers is skipped
      when the boot CPU resumes and the vCPU is scheduled.
      
      Clear last_vcpu so that the guest CSR registers are reloaded from the
      scheduled vCPU context after suspend and resume.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Bibo Mao <maobibo@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      78d7bc5a
    • Bibo Mao's avatar
      LoongArch: KVM: Add interrupt checking for AVEC · 6fb1867d
      Bibo Mao authored
      
      The newly added macro INT_AVEC names bit 14 of the CSR ESTAT register,
      which is used for LoongArch AVEC support. The macro CSR_ESTAT_IS covers
      the AVEC interrupt status bit 14, so replace the hard-coded value 0x1fff
      with CSR_ESTAT_IS so that KVM also supports the AVEC interrupt status.
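      
      A small sketch of why the hard-coded mask misses AVEC: 0x1fff only
      covers bits 0 to 12, so a status word with bit 14 set is masked to
      zero. The mask EXAMPLE_ESTAT_IS below is an assumed illustrative value
      that is wide enough to include bit 14; the real CSR_ESTAT_IS definition
      lives in the kernel headers and is not reproduced here. The helper
      avec_pending() is hypothetical as well.

```c
#include <assert.h>
#include <stdint.h>

#define INT_AVEC           14               /* bit number stated in the message */
#define OLD_HARDCODED_MASK UINT32_C(0x1fff) /* covers bits 0..12 only */
/* Assumption for illustration: a mask that spans bits 0..14. */
#define EXAMPLE_ESTAT_IS   UINT32_C(0x7fff)

/* Return 1 if the AVEC status bit survives masking, 0 otherwise. */
static int avec_pending(uint32_t estat, uint32_t mask)
{
	return (int)(((estat & mask) >> INT_AVEC) & 1u);
}
```

      With estat = 1 << INT_AVEC, the old mask drops the bit while the wider
      mask keeps it.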
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Bibo Mao <maobibo@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      6fb1867d
    • Bibo Mao's avatar
      LoongArch: Set hugetlb mmap base address aligned with pmd size · 3109d5ff
      Bibo Mao authored
      
      With ltp test case "testcases/bin/hugefork02", there is a dmesg error
      report message such as:
      
       kernel BUG at mm/hugetlb.c:5550!
       Oops - BUG[#1]:
       CPU: 0 UID: 0 PID: 1517 Comm: hugefork02 Not tainted 6.14.0-rc2+ #241
       Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022
       pc 90000000004eaf1c ra 9000000000485538 tp 900000010edbc000 sp 900000010edbf940
       a0 900000010edbfb00 a1 9000000108d20280 a2 00007fffe9474000 a3 00007ffff3474000
       a4 0000000000000000 a5 0000000000000003 a6 00000000003cadd3 a7 0000000000000000
       t0 0000000001ffffff t1 0000000001474000 t2 900000010ecd7900 t3 00007fffe9474000
       t4 00007fffe9474000 t5 0000000000000040 t6 900000010edbfb00 t7 0000000000000001
       t8 0000000000000005 u0 90000000004849d0 s9 900000010edbfa00 s0 9000000108d20280
       s1 00007fffe9474000 s2 0000000002000000 s3 9000000108d20280 s4 9000000002b38b10
       s5 900000010edbfb00 s6 00007ffff3474000 s7 0000000000000406 s8 900000010edbfa08
          ra: 9000000000485538 unmap_vmas+0x130/0x218
         ERA: 90000000004eaf1c __unmap_hugepage_range+0x6f4/0x7d0
        PRMD: 00000004 (PPLV0 +PIE -PWE)
        EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
        ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
       ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
       PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
       Process hugefork02 (pid: 1517, threadinfo=00000000a670eaf4, task=000000007a95fc64)
       Call Trace:
       [<90000000004eaf1c>] __unmap_hugepage_range+0x6f4/0x7d0
       [<9000000000485534>] unmap_vmas+0x12c/0x218
       [<9000000000494068>] exit_mmap+0xe0/0x308
       [<900000000025fdc4>] mmput+0x74/0x180
       [<900000000026a284>] do_exit+0x294/0x898
       [<900000000026aa30>] do_group_exit+0x30/0x98
       [<900000000027bed4>] get_signal+0x83c/0x868
       [<90000000002457b4>] arch_do_signal_or_restart+0x54/0xfa0
       [<90000000015795e8>] irqentry_exit_to_user_mode+0xb8/0x138
       [<90000000002572d0>] tlb_do_page_fault_1+0x114/0x1b4
      
      The problem is that the base address allocated from hugetlbfs is not
      aligned to the pmd size. Add a check for hugetlbfs and align the base
      address to the pmd size. With this patch, the test case
      "testcases/bin/hugefork02" passes.
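      
      The alignment step can be illustrated with a short C sketch. It is not
      the patch itself: EXAMPLE_PMD_SIZE (32 MiB, which corresponds to 16 KiB
      base pages) is an assumption, and align_up() and is_aligned() are
      hypothetical helpers. The sample address 0x7fffe9474000 is borrowed
      from the register dump above purely as an example of an unaligned
      value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 32 MiB PMD size (16 KiB base pages). */
#define EXAMPLE_PMD_SIZE UINT64_C(0x2000000)

/* Round addr up to the next multiple of size (size must be a power of 2). */
static uint64_t align_up(uint64_t addr, uint64_t size)
{
	return (addr + size - 1) & ~(size - 1);
}

/* True if addr is a multiple of size (size must be a power of 2). */
static bool is_aligned(uint64_t addr, uint64_t size)
{
	return (addr & (size - 1)) == 0;
}
```

      For example, align_up(0x7fffe9474000, EXAMPLE_PMD_SIZE) yields
      0x7fffea000000, which is a multiple of 32 MiB.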
      
      This is similar to the commit 7f24cbc9 ("mm/mmap: teach
      generic_get_unmapped_area{_topdown} to handle hugetlb mappings").
      
      Cc: stable@vger.kernel.org  # 6.13+
      Signed-off-by: Bibo Mao <maobibo@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      3109d5ff
    • Bibo Mao's avatar
      LoongArch: Set max_pfn with the PFN of the last page · c8477bb0
      Bibo Mao authored
      
      The current max_pfn is zero. As a result, users cannot get some page
      information, such as kpagecount, through the /proc filesystem. The
      following message is displayed by the stress-ng test suite with the
      command "stress-ng --verbose --physpage 1 -t 1".
      
       # stress-ng --verbose --physpage 1 -t 1
       stress-ng: error: [1691] physpage: cannot read page count for address 0x134ac000 in /proc/kpagecount, errno=22 (Invalid argument)
       stress-ng: error: [1691] physpage: cannot read page count for address 0x7ffff207c3a8 in /proc/kpagecount, errno=22 (Invalid argument)
       stress-ng: error: [1691] physpage: cannot read page count for address 0x134b0000 in /proc/kpagecount, errno=22 (Invalid argument)
       ...
      
      After applying this patch, the kernel can pass the test.
      
       # stress-ng --verbose --physpage 1 -t 1
       stress-ng: debug: [1701] physpage: [1701] started (instance 0 on CPU 3)
       stress-ng: debug: [1701] physpage: [1701] exited (instance 0 on CPU 3)
       stress-ng: debug: [1700] physpage: [1701] terminated (success)
      
      Cc: stable@vger.kernel.org  # 6.8+
      Fixes: ff6c3d81 ("NUMA: optimize detection of memory with no node id assigned by firmware")
      Signed-off-by: Bibo Mao <maobibo@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      c8477bb0
    • Huacai Chen's avatar
      LoongArch: Use polling play_dead() when resuming from hibernation · c9117434
      Huacai Chen authored
      
      When CONFIG_RANDOM_KMALLOC_CACHES or other randomization infrastructure
      is enabled, the idle_task's stack may differ between the booting kernel
      and the target kernel. So when resuming from hibernation, an
      ACTION_BOOT_CPU IPI wakes up the idle instruction in
      arch_cpu_idle_dead() and jumps to the interrupt handler. But since the
      stack pointer has changed, the interrupt handler cannot restore the
      correct context.
      
      So rename the current arch_cpu_idle_dead() to idle_play_dead(), make it
      the default version of play_dead(), and have the new
      arch_cpu_idle_dead() call play_dead() directly. For hibernation,
      implement an arch-specific hibernate_resume_nonboot_cpu_disable() that
      uses the polling version of play_dead() (the idle instruction is
      replaced by nop, and irq is disabled), i.e. poll_play_dead(), to avoid
      the IPI handler corrupting the idle_task's stack when resuming from
      hibernation.
      
      This solution is a little similar to commit 406f992e ("x86 /
      hibernate: Use hlt_play_dead() when resuming from hibernation").
      
      Cc: stable@vger.kernel.org
      Tested-by: Erpeng Xu <xuerpeng@uniontech.com>
      Tested-by: Yuli Wang <wangyuli@uniontech.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      c9117434
    • Avenger-285714's avatar
      LoongArch: Eliminate superfluous get_numa_distances_cnt() · a0d3c8bc
      Avenger-285714 authored
      In LoongArch, get_numa_distances_cnt() isn't in use, resulting in a
      compiler warning.
      
      Fix the following error with clang-18 when building with W=1e:
      
      arch/loongarch/kernel/acpi.c:259:28: error: unused function 'get_numa_distances_cnt' [-Werror,-Wunused-function]
        259 | static inline unsigned int get_numa_distances_cnt(struct acpi_table_slit *slit)
            |                            ^~~~~~~~~~~~~~~~~~~~~~
      1 error generated.
      
      Link: https://lore.kernel.org/all/Z7bHPVUH4lAezk0E@kernel.org/
      
      
      Signed-off-by: Yuli Wang <wangyuli@uniontech.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      a0d3c8bc
    • Tiezhu Yang's avatar
      LoongArch: Convert unreachable() to BUG() · da64a235
      Tiezhu Yang authored
      
      When compiling on LoongArch, there exists the following objtool warning
      in arch/loongarch/kernel/machine_kexec.o:
      
        kexec_reboot() falls through to next function crash_shutdown_secondary()
      
      Avoid using unreachable() as it can (and will in the absence of UBSAN)
      generate fall-through code. Use BUG() so we get a "break BRK_BUG" trap
      (with unreachable annotation).
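      
      The idea can be sketched in user-space C, assuming a GCC or Clang
      compiler. EXAMPLE_BUG() is a hypothetical stand-in for the kernel's
      BUG(), built on the compiler builtin __builtin_trap(); the function
      mode_to_code() and its values are invented for the example. An
      unreachable hint lets the compiler emit nothing at that point, so
      control could run into whatever code follows, whereas a trap emits a
      real instruction that stops execution.

```c
#include <assert.h>

/* Hypothetical stand-in for BUG(): emit a trap instead of nothing. */
#define EXAMPLE_BUG() __builtin_trap()

/* Map a mode to a code; any other mode is a bug and must not fall through. */
static int mode_to_code(int mode)
{
	switch (mode) {
	case 0:
		return 10;
	case 1:
		return 20;
	}
	/* A bare __builtin_unreachable() here may generate no code at all. */
	EXAMPLE_BUG();
}
```

      Calling mode_to_code(0) or mode_to_code(1) returns normally; any other
      argument terminates the program at the trap rather than continuing
      into unrelated code.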
      
      Cc: stable@vger.kernel.org  # 6.12+
      Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      da64a235
    • Linus Torvalds's avatar
      Merge tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 2a520073
      Linus Torvalds authored
      Pull s390 fixes from Vasily Gorbik:
      
       - Fix return address recovery of traced function in ftrace to ensure
         reliable stack unwinding
      
       - Fix compiler warnings and runtime crashes of vDSO selftests on s390
         by introducing a dedicated GNU hash bucket pointer with correct
         32-bit entry size
      
       - Fix test_monitor_call() inline asm, which misses CC clobber, by
         switching to an instruction that doesn't modify CC
      
      * tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/ftrace: Fix return address recovery of traced function
        selftests/vDSO: Fix GNU hash table entry size for s390x
        s390/traps: Fix test_monitor_call() inline assembly
      2a520073
  6. Mar 07, 2025