  1. Oct 22, 2022
  2. Oct 21, 2022
    • x86/fpu: Fix copy_xstate_to_uabi() to copy init states correctly · 471f0aa7
      Chang S. Bae authored
      
      When an extended state component is not present in fpstate but is in
      its init state, the function copies it from init_fpstate via copy_feature().

      However, dynamic states are not present in init_fpstate because their
      init states are all zeros. Retrieving them from init_fpstate then
      triggers a NULL pointer dereference:
      
       BUG: kernel NULL pointer dereference, address: 0000000000000000
       ...
       RIP: 0010:memcpy_erms+0x6/0x10
        ? __copy_xstate_to_uabi_buf+0x381/0x870
        fpu_copy_guest_fpstate_to_uabi+0x28/0x80
        kvm_arch_vcpu_ioctl+0x14c/0x1460 [kvm]
        ? __this_cpu_preempt_check+0x13/0x20
        ? vmx_vcpu_put+0x2e/0x260 [kvm_intel]
        kvm_vcpu_ioctl+0xea/0x6b0 [kvm]
        ? kvm_vcpu_ioctl+0xea/0x6b0 [kvm]
        ? __fget_light+0xd4/0x130
        __x64_sys_ioctl+0xe3/0x910
        ? debug_smp_processor_id+0x17/0x20
        ? fpregs_assert_state_consistent+0x27/0x50
        do_syscall_64+0x3f/0x90
        entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Adjust the 'mask' to zero out the userspace buffer for the features that
      are available from neither fpstate nor init_fpstate.
      
      The dynamic features depend on the compacted XSAVE format. Ensure it is
      enabled before reading XCOMP_BV in init_fpstate.
      
      Fixes: 2308ee57 ("x86/fpu/amx: Enable the AMX feature in 64-bit mode")
      Reported-by: Yuan Yao <yuan.yao@intel.com>
      Suggested-by: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Tested-by: Yuan Yao <yuan.yao@intel.com>
      Link: https://lore.kernel.org/lkml/BYAPR11MB3717EDEF2351C958F2C86EED95259@BYAPR11MB3717.namprd11.prod.outlook.com/
      Link: https://lkml.kernel.org/r/20221021185844.13472-1-chang.seok.bae@intel.com
      471f0aa7
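The mask adjustment described in the x86/fpu commit above can be sketched roughly as follows. This is a minimal user-space illustration, not the kernel patch itself: the helper name `uabi_copy_mask()` and its parameters are hypothetical, and only the feature bit positions follow the architectural XSAVE layout.

```c
#include <stdint.h>

/* XSAVE feature bits (architectural bit positions). */
#define XFEATURE_MASK_FP          (1ULL << 0)
#define XFEATURE_MASK_SSE         (1ULL << 1)
#define XFEATURE_MASK_YMM         (1ULL << 2)
#define XFEATURE_MASK_XTILE_DATA  (1ULL << 18)  /* AMX tile data, a dynamic feature */

/*
 * Hypothetical helper: compute which features may be copied to the
 * user buffer. Every feature left out of the returned mask is meant
 * to be zeroed in the user buffer instead of being read from a source
 * that does not hold it.
 *
 *   user_xfeatures    - features the user buffer can carry
 *   fpstate_xfeatures - features currently present in fpstate
 *   init_xfeatures    - features actually stored in the init state
 *   dynamic_enabled   - nonzero when dynamic features are in use
 */
static uint64_t uabi_copy_mask(uint64_t user_xfeatures,
                               uint64_t fpstate_xfeatures,
                               uint64_t init_xfeatures,
                               int dynamic_enabled)
{
    uint64_t mask = user_xfeatures;

    /* Drop features that neither source can provide. */
    if (dynamic_enabled)
        mask &= fpstate_xfeatures | init_xfeatures;

    return mask;
}
```

With AMX tile data requested by the user buffer but present in neither source, the returned mask omits `XFEATURE_MASK_XTILE_DATA`, so a caller would zero that region rather than copy it.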
    • x86/unwind/orc: Fix unreliable stack dump with gcov · 230db824
      Chen Zhongjin authored
      
      When a console stack dump is initiated with CONFIG_GCOV_PROFILE_ALL
      enabled, show_trace_log_lvl() gets out of sync with the ORC unwinder,
      causing the stack trace to show all text addresses as unreliable:
      
        # echo l > /proc/sysrq-trigger
        [  477.521031] sysrq: Show backtrace of all active CPUs
        [  477.523813] NMI backtrace for cpu 0
        [  477.524492] CPU: 0 PID: 1021 Comm: bash Not tainted 6.0.0 #65
        [  477.525295] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014
        [  477.526439] Call Trace:
        [  477.526854]  <TASK>
        [  477.527216]  ? dump_stack_lvl+0xc7/0x114
        [  477.527801]  ? dump_stack+0x13/0x1f
        [  477.528331]  ? nmi_cpu_backtrace.cold+0xb5/0x10d
        [  477.528998]  ? lapic_can_unplug_cpu+0xa0/0xa0
        [  477.529641]  ? nmi_trigger_cpumask_backtrace+0x16a/0x1f0
        [  477.530393]  ? arch_trigger_cpumask_backtrace+0x1d/0x30
        [  477.531136]  ? sysrq_handle_showallcpus+0x1b/0x30
        [  477.531818]  ? __handle_sysrq.cold+0x4e/0x1ae
        [  477.532451]  ? write_sysrq_trigger+0x63/0x80
        [  477.533080]  ? proc_reg_write+0x92/0x110
        [  477.533663]  ? vfs_write+0x174/0x530
        [  477.534265]  ? handle_mm_fault+0x16f/0x500
        [  477.534940]  ? ksys_write+0x7b/0x170
        [  477.535543]  ? __x64_sys_write+0x1d/0x30
        [  477.536191]  ? do_syscall_64+0x6b/0x100
        [  477.536809]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
        [  477.537609]  </TASK>
      
      This happens when the compiled code for show_stack() has a single word
      on the stack, and doesn't use a tail call to show_stack_log_lvl().
      (CONFIG_GCOV_PROFILE_ALL=y is the only known case of this.)  Then the
      __unwind_start() skip logic hits an off-by-one bug and fails to unwind
      all the way to the intended starting frame.
      
      Fix it by reverting the following commit:
      
        f1d9a2ab ("x86/unwind/orc: Don't skip the first frame for inactive tasks")
      
      The original justification for that commit no longer exists.  That
      original issue was later fixed in a different way, with the following
      commit:
      
        f2ac57a4 ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels")
      
      Fixes: f1d9a2ab ("x86/unwind/orc: Don't skip the first frame for inactive tasks")
      Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
      [jpoimboe: rewrite commit log]
      Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      230db824
    • iommu/vt-d: Allow NVS regions in arch_rmrr_sanity_check() · 5566e68d
      Charlotte Tan authored
      arch_rmrr_sanity_check() warns if the RMRR is not covered by an ACPI
      Reserved region, but it seems like it should accept an NVS region as
      well. The ACPI spec
      https://uefi.org/specs/ACPI/6.5/15_System_Address_Map_Interfaces.html
      uses similar wording for "Reserved" and "NVS" region types; for NVS
      regions it says "This range of addresses is in use or reserved by the
      system and must not be used by the operating system."
      
      There is an old comment on this mailing list that also suggests NVS
      regions should pass the arch_rmrr_sanity_check() test:
      
       The warnings come from arch_rmrr_sanity_check() since it checks whether
       the region is E820_TYPE_RESERVED. However, if the purpose of the check
       is to detect RMRR has regions that may be used by OS as free memory,
       isn't  E820_TYPE_NVS safe, too?
      
      This patch overlaps with another proposed patch that would add the region
      type to the log since sometimes the bug reporter sees this log on the
      console but doesn't know to include the kernel log:
      
      https://lore.kernel.org/lkml/20220611204859.234975-3-atomlin@redhat.com/
      
      Here's an example of the "Firmware Bug" apparent false positive (wrapped
      for line length):
      
       DMAR: [Firmware Bug]: No firmware reserved region can cover this RMRR
             [0x000000006f760000-0x000000006f762fff], contact BIOS vendor for
             fixes
       DMAR: [Firmware Bug]: Your BIOS is broken; bad RMRR
             [0x000000006f760000-0x000000006f762fff]
      
      This is the snippet from the e820 table:
      
       BIOS-e820: [mem 0x0000000068bff000-0x000000006ebfefff] reserved
       BIOS-e820: [mem 0x000000006ebff000-0x000000006f9fefff] ACPI NVS
       BIOS-e820: [mem 0x000000006f9ff000-0x000000006fffefff] ACPI data
      
      Fixes: f036c7fa ("iommu/vt-d: Check VT-d RMRR region in BIOS is reported as reserved")
      Cc: Will Mortensen <will@extrahop.com>
      Link: https://lore.kernel.org/linux-iommu/64a5843d-850d-e58c-4fc2-0a0eeeb656dc@nec.com/
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216443
      Signed-off-by: Charlotte Tan <charlotte@extrahop.com>
      Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
      Link: https://lore.kernel.org/r/20220929044449.32515-1-charlotte@extrahop.com
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      5566e68d
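The relaxed RMRR check in the iommu/vt-d commit above can be illustrated with a small stand-alone sketch. The types and the function `rmrr_range_ok()` are hypothetical stand-ins rather than kernel APIs; the address ranges are the three e820 entries and the RMRR range quoted in the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum region_type { REGION_RESERVED, REGION_ACPI_NVS, REGION_ACPI_DATA };

struct region {
    uint64_t start;         /* first byte of the region */
    uint64_t end;           /* last byte of the region (inclusive) */
    enum region_type type;
};

/* The e820 snippet quoted in the commit message. */
static const struct region e820_snippet[] = {
    { 0x0000000068bff000ULL, 0x000000006ebfefffULL, REGION_RESERVED },
    { 0x000000006ebff000ULL, 0x000000006f9fefffULL, REGION_ACPI_NVS },
    { 0x000000006f9ff000ULL, 0x000000006fffefffULL, REGION_ACPI_DATA },
};

/*
 * Hypothetical check: the RMRR [start, end] passes when a single map
 * entry of an acceptable type covers it entirely. With allow_nvs set,
 * ACPI NVS entries are accepted in addition to reserved ones.
 */
static bool rmrr_range_ok(uint64_t start, uint64_t end, bool allow_nvs)
{
    size_t n = sizeof(e820_snippet) / sizeof(e820_snippet[0]);

    for (size_t i = 0; i < n; i++) {
        const struct region *r = &e820_snippet[i];

        if (r->start <= start && end <= r->end) {
            if (r->type == REGION_RESERVED)
                return true;
            return allow_nvs && r->type == REGION_ACPI_NVS;
        }
    }
    return false;   /* no single entry covers the range */
}
```

For the RMRR in the report, `rmrr_range_ok(0x6f760000, 0x6f762fff, false)` fails because the covering entry is ACPI NVS, while the same call with `allow_nvs` set to true passes.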
    • RISC-V: KVM: Fix kvm_riscv_vcpu_timer_pending() for Sstc · cea8896b
      Anup Patel authored
      
      kvm_riscv_vcpu_timer_pending() checks the per-VCPU next_cycles
      and the per-VCPU software-injected VS timer interrupt. This function
      returns an incorrect value when Sstc is available because the per-VCPU
      next_cycles is only updated by kvm_riscv_vcpu_timer_save(), which is
      called from kvm_arch_vcpu_put(). As a result, when Sstc is available
      the VCPU does not block properly upon WFI traps.
      
      To fix the above issue, we introduce kvm_riscv_vcpu_timer_sync()
      which will update per-VCPU next_cycles upon every VM exit instead
      of kvm_riscv_vcpu_timer_save().
      
      Fixes: 8f5cb44b ("RISC-V: KVM: Support sstc extension")
      Signed-off-by: Anup Patel <apatel@ventanamicro.com>
      Reviewed-by: Atish Patra <atishp@rivosinc.com>
      Signed-off-by: Anup Patel <anup@brainfault.org>
      cea8896b
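The stale next_cycles problem in the RISC-V KVM commit above can be modelled with a toy example. Everything here is a simplified assumption for illustration: the struct, `timer_sync()` and `timer_pending()` are hypothetical, and the `hw_vstimecmp` argument stands in for the guest-programmed compare value that the real code would read from hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical per-VCPU timer state. */
struct vcpu_timer {
    uint64_t next_cycles;    /* cached copy of the next timer deadline */
    bool sstc;               /* guest programs the compare value directly */
    bool sw_irq_pending;     /* software-injected VS timer interrupt */
};

/*
 * Refresh the cached deadline from the hardware compare value.
 * In this sketch it is meant to be called on every VM exit.
 */
static void timer_sync(struct vcpu_timer *t, uint64_t hw_vstimecmp)
{
    if (t->sstc)
        t->next_cycles = hw_vstimecmp;
}

/* A timer event is pending if the deadline passed or an IRQ was injected. */
static bool timer_pending(const struct vcpu_timer *t, uint64_t now)
{
    return now >= t->next_cycles || t->sw_irq_pending;
}
```

If the cached deadline is 100, the guest has since programmed 1000, and the current time is 500, `timer_pending()` reports true until `timer_sync()` runs, which is the kind of wrong answer that would keep a VCPU from blocking on WFI.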
    • RISC-V: Fix compilation without RISCV_ISA_ZICBOM · 5c20a3a9
      Andrew Jones authored
      
      riscv_cbom_block_size and riscv_init_cbom_blocksize() should always
      be available and riscv_init_cbom_blocksize() should always be
      invoked, even when compiling without RISCV_ISA_ZICBOM enabled. This
      is because disabling RISCV_ISA_ZICBOM means "don't use zicbom
      instructions in the kernel" not "pretend there isn't zicbom, even
      when there is". When zicbom is available, whether the kernel enables
      its use with RISCV_ISA_ZICBOM or not, KVM will offer it to guests.
      Ensure we can build KVM and that the block size is initialized even
      when compiling without RISCV_ISA_ZICBOM.
      
      Fixes: 8f7e001e ("RISC-V: Clean up the Zicbom block size probing")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
      Signed-off-by: Anup Patel <apatel@ventanamicro.com>
      Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
      Reviewed-by: Heiko Stuebner <heiko@sntech.de>
      Tested-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Anup Patel <anup@brainfault.org>
      5c20a3a9
  3. Oct 20, 2022
  4. Oct 18, 2022
  5. Oct 17, 2022
  6. Oct 15, 2022
  7. Oct 14, 2022
    • parisc: Fix userspace graphics card breakage due to pgtable special bit · 70be49f2
      Helge Deller authored
      
      Commit df24e178 ("parisc: Add vDSO support") introduced the vDSO
      support, for which a _PAGE_SPECIAL page table flag was needed.  Since we
      wanted to keep every page table entry in 32 bits, that commit re-used the
      existing, but as yet unused, _PAGE_DMB flag (which triggers a hardware break
      if a page is accessed) to store the special bit.
      
      But when graphics card memory is mmapped into userspace, the kernel uses
      vm_iomap_memory() which sets the special flag. So, with the DMB bit
      set, every access to the graphics memory now triggered a hardware
      exception and segfaulted the userspace program.
      
      Fix this breakage by dropping the DMB bit when writing the page
      protection bits to the CPU TLB.
      
      In addition, this patch adds a small optimization: if huge pages aren't
      configured (which is at least the case for 32-bit kernels), then the
      special bit is stored in the hpage (HUGE PAGE) bit instead. That way we
      can skip resetting the DMB bit.
      
      Fixes: df24e178 ("parisc: Add vDSO support")
      Cc: <stable@vger.kernel.org> # 5.18+
      Signed-off-by: Helge Deller <deller@gmx.de>
      70be49f2
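The parisc fix above amounts to masking one software-owned bit out of the protection bits before they reach the TLB. The following is a rough C sketch of that idea only: the bit positions and the helper `pte_to_tlb_prot()` are hypothetical, and this sketch makes no claim about how the real fix is implemented in the kernel's TLB handling code.

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen only for illustration. */
#define PTE_READ     (1u << 0)
#define PTE_WRITE    (1u << 1)
#define PTE_DMB      (1u << 5)   /* hardware: break on data access */
#define PTE_SPECIAL  PTE_DMB     /* software "special" flag shares the bit */

/*
 * Hypothetical helper: derive the protection bits handed to the TLB
 * from a page table entry. The shared bit is cleared so that the
 * software-only "special" marking never arms the hardware break.
 */
static uint32_t pte_to_tlb_prot(uint32_t pte)
{
    return pte & ~PTE_DMB;
}
```

A mapping marked special, such as `PTE_READ | PTE_WRITE | PTE_SPECIAL`, keeps its flag in the page table entry, but the value given to the TLB carries only the read and write bits.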
  8. Oct 13, 2022
  9. Oct 12, 2022