1. 28 Nov, 2018 1 commit
    • Steven Rostedt (VMware)'s avatar
      x86/function_graph: Simplify with function_graph_enter() · 07f7175b
      Steven Rostedt (VMware) authored
      The function_graph_enter() function does the work of calling the function
      graph hook function and the management of the shadow stack, simplifying the
      work done in the architecture dependent prepare_ftrace_return().
      
      Have x86 use the new code, and remove the shadow stack management as well as
      having to set up the trace structure.
      
      This is needed to prepare for a fix of a design bug on how the curr_ret_stack
      is used.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: stable@kernel.org
      Fixes: 03274a3f ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
      Reviewed-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      07f7175b
  2. 27 Nov, 2018 14 commits
    • Jim Mattson's avatar
      kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb · fd65d314
      Jim Mattson authored
      Previously, we only called indirect_branch_prediction_barrier on the
      logical CPU that freed a vmcb. This function should be called on all
      logical CPUs that last loaded the vmcb in question.
      
      Fixes: 15d45071 ("KVM/x86: Add IBPB support")
      Reported-by: default avatarNeel Natu <neelnatu@google.com>
      Signed-off-by: default avatarJim Mattson <jmattson@google.com>
      Reviewed-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      fd65d314
    • Junaid Shahid's avatar
      kvm: mmu: Fix race in emulated page table writes · 0e0fee5c
      Junaid Shahid authored
      When a guest page table is updated via an emulated write,
      kvm_mmu_pte_write() is called to update the shadow PTE using the just
      written guest PTE value. But if two emulated guest PTE writes happened
      concurrently, it is possible that the guest PTE and the shadow PTE end
      up being out of sync. Emulated writes do not mark the shadow page as
      unsync-ed, so this inconsistency will not be resolved even by a guest TLB
      flush (unless the page was marked as unsync-ed at some other point).
      
      This is fixed by re-reading the current value of the guest PTE after the
      MMU lock has been acquired instead of just using the value that was
      written prior to calling kvm_mmu_pte_write().
      Signed-off-by: default avatarJunaid Shahid <junaids@google.com>
      Reviewed-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      0e0fee5c
    • Liran Alon's avatar
      KVM: nVMX: vmcs12 revision_id is always VMCS12_REVISION even when copied from eVMCS · 52ad7eb3
      Liran Alon authored
      vmcs12 represents the per-CPU cache of L1 active vmcs12.
      
      This cache can be loaded by one of the following:
      1) Guest making a vmcs12 active by exeucting VMPTRLD
      2) Guest specifying eVMCS in VP assist page and executing
      VMLAUNCH/VMRESUME.
      
      Either way, vmcs12 should have revision_id of VMCS12_REVISION.
      Which is not equal to eVMCS revision_id which specifies used
      VersionNumber of eVMCS struct (e.g. KVM_EVMCS_VERSION).
      
      Specifically, this causes an issue in restoring a nested VM state
      because vmx_set_nested_state() verifies that vmcs12->revision_id
      is equal to VMCS12_REVISION which was not true in case vmcs12
      was populated from an eVMCS by vmx_get_nested_state() which calls
      copy_enlightened_to_vmcs12().
      Reviewed-by: default avatarDarren Kenny <darren.kenny@oracle.com>
      Signed-off-by: default avatarLiran Alon <liran.alon@oracle.com>
      Reviewed-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      52ad7eb3
    • Liran Alon's avatar
      KVM: nVMX: Verify eVMCS revision id match supported eVMCS version on eVMCS VMPTRLD · 72aeb60c
      Liran Alon authored
      According to TLFS section 16.11.2 Enlightened VMCS, the first u32
      field of eVMCS should specify eVMCS VersionNumber.
      
      This version should be in the range of supported eVMCS versions exposed
      to guest via CPUID.0x4000000A.EAX[0:15].
      The range which KVM expose to guest in this CPUID field should be the
      same as the value returned in vmcs_version by nested_enable_evmcs().
      
      According to the above, eVMCS VMPTRLD should verify that version specified
      in given eVMCS is in the supported range. However, current code
      mistakenly verfies this field against VMCS12_REVISION.
      
      One can also see that when KVM use eVMCS, it makes sure that
      alloc_vmcs_cpu() sets allocated eVMCS revision_id to KVM_EVMCS_VERSION.
      
      Obvious fix should just change eVMCS VMPTRLD to verify first u32 field
      of eVMCS is equal to KVM_EVMCS_VERSION.
      However, it turns out that Microsoft Hyper-V fails to comply to their
      own invented interface: When Hyper-V use eVMCS, it just sets first u32
      field of eVMCS to revision_id specified in MSR_IA32_VMX_BASIC (In our
      case: VMCS12_REVISION). Instead of used eVMCS version number which is
      one of the supported versions specified in CPUID.0x4000000A.EAX[0:15].
      To overcome Hyper-V bug, we accept either a supported eVMCS version
      or VMCS12_REVISION as valid values for first u32 field of eVMCS.
      
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Reviewed-by: default avatarNikita Leshenko <nikita.leshchenko@oracle.com>
      Reviewed-by: default avatarMark Kanda <mark.kanda@oracle.com>
      Signed-off-by: default avatarLiran Alon <liran.alon@oracle.com>
      Reviewed-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      72aeb60c
    • Leonid Shatz's avatar
      KVM: nVMX/nSVM: Fix bug which sets vcpu->arch.tsc_offset to L1 tsc_offset · 326e7425
      Leonid Shatz authored
      Since commit e79f245d ("X86/KVM: Properly update 'tsc_offset' to
      represent the running guest"), vcpu->arch.tsc_offset meaning was
      changed to always reflect the tsc_offset value set on active VMCS.
      Regardless if vCPU is currently running L1 or L2.
      
      However, above mentioned commit failed to also change
      kvm_vcpu_write_tsc_offset() to set vcpu->arch.tsc_offset correctly.
      This is because vmx_write_tsc_offset() could set the tsc_offset value
      in active VMCS to given offset parameter *plus vmcs12->tsc_offset*.
      However, kvm_vcpu_write_tsc_offset() just sets vcpu->arch.tsc_offset
      to given offset parameter. Without taking into account the possible
      addition of vmcs12->tsc_offset. (Same is true for SVM case).
      
      Fix this issue by changing kvm_x86_ops->write_tsc_offset() to return
      actually set tsc_offset in active VMCS and modify
      kvm_vcpu_write_tsc_offset() to set returned value in
      vcpu->arch.tsc_offset.
      In addition, rename write_tsc_offset() callback to write_l1_tsc_offset()
      to make it clear that it is meant to set L1 TSC offset.
      
      Fixes: e79f245d ("X86/KVM: Properly update 'tsc_offset' to represent the running guest")
      Reviewed-by: default avatarLiran Alon <liran.alon@oracle.com>
      Reviewed-by: default avatarMihai Carabas <mihai.carabas@oracle.com>
      Reviewed-by: default avatarKrish Sadhukhan <krish.sadhukhan@oracle.com>
      Signed-off-by: default avatarLeonid Shatz <leonid.shatz@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      326e7425
    • Yi Wang's avatar
      x86/kvm/vmx: fix old-style function declaration · 1e4329ee
      Yi Wang authored
      The inline keyword which is not at the beginning of the function
      declaration may trigger the following build warnings, so let's fix it:
      
      arch/x86/kvm/vmx.c:1309:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
      arch/x86/kvm/vmx.c:5947:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
      arch/x86/kvm/vmx.c:5985:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
      arch/x86/kvm/vmx.c:6023:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
      Signed-off-by: default avatarYi Wang <wang.yi59@zte.com.cn>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      1e4329ee
    • Yi Wang's avatar
      KVM: x86: fix empty-body warnings · 354cb410
      Yi Wang authored
      We get the following warnings about empty statements when building
      with 'W=1':
      
      arch/x86/kvm/lapic.c:632:53: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      arch/x86/kvm/lapic.c:1907:42: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      arch/x86/kvm/lapic.c:1936:65: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      arch/x86/kvm/lapic.c:1975:44: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      
      Rework the debug helper macro to get rid of these warnings.
      Signed-off-by: default avatarYi Wang <wang.yi59@zte.com.cn>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      354cb410
    • Liran Alon's avatar
      KVM: VMX: Update shared MSRs to be saved/restored on MSR_EFER.LMA changes · f48b4711
      Liran Alon authored
      When guest transitions from/to long-mode by modifying MSR_EFER.LMA,
      the list of shared MSRs to be saved/restored on guest<->host
      transitions is updated (See vmx_set_efer() call to setup_msrs()).
      
      On every entry to guest, vcpu_enter_guest() calls
      vmx_prepare_switch_to_guest(). This function should also take care
      of setting the shared MSRs to be saved/restored. However, the
      function does nothing in case we are already running with loaded
      guest state (vmx->loaded_cpu_state != NULL).
      
      This means that even when guest modifies MSR_EFER.LMA which results
      in updating the list of shared MSRs, it isn't being taken into account
      by vmx_prepare_switch_to_guest() because it happens while we are
      running with loaded guest state.
      
      To fix above mentioned issue, add a flag to mark that the list of
      shared MSRs has been updated and modify vmx_prepare_switch_to_guest()
      to set shared MSRs when running with host state *OR* list of shared
      MSRs has been updated.
      
      Note that this issue was mistakenly introduced by commit
      678e315e ("KVM: vmx: add dedicated utility to access guest's
      kernel_gs_base") because previously vmx_set_efer() always called
      vmx_load_host_state() which resulted in vmx_prepare_switch_to_guest() to
      set shared MSRs.
      
      Fixes: 678e315e ("KVM: vmx: add dedicated utility to access guest's kernel_gs_base")
      Reported-by: default avatarEyal Moscovici <eyal.moscovici@oracle.com>
      Reviewed-by: default avatarMihai Carabas <mihai.carabas@oracle.com>
      Reviewed-by: default avatarLiam Merwick <liam.merwick@oracle.com>
      Reviewed-by: default avatarJim Mattson <jmattson@google.com>
      Signed-off-by: default avatarLiran Alon <liran.alon@oracle.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      f48b4711
    • Liran Alon's avatar
      KVM: x86: Fix kernel info-leak in KVM_HC_CLOCK_PAIRING hypercall · bcbfbd8e
      Liran Alon authored
      kvm_pv_clock_pairing() allocates local var
      "struct kvm_clock_pairing clock_pairing" on stack and initializes
      all it's fields besides padding (clock_pairing.pad[]).
      
      Because clock_pairing var is written completely (including padding)
      to guest memory, failure to init struct padding results in kernel
      info-leak.
      
      Fix the issue by making sure to also init the padding with zeroes.
      
      Fixes: 55dd00a7 ("KVM: x86: add KVM_HC_CLOCK_PAIRING hypercall")
      Reported-by: syzbot+a8ef68d71211ba264f56@syzkaller.appspotmail.com
      Reviewed-by: default avatarMark Kanda <mark.kanda@oracle.com>
      Signed-off-by: default avatarLiran Alon <liran.alon@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      bcbfbd8e
    • Liran Alon's avatar
      KVM: nVMX: Fix kernel info-leak when enabling KVM_CAP_HYPERV_ENLIGHTENED_VMCS more than once · 7f9ad1df
      Liran Alon authored
      Consider the case that userspace enables KVM_CAP_HYPERV_ENLIGHTENED_VMCS twice:
      1) kvm_vcpu_ioctl_enable_cap() is called to enable
      KVM_CAP_HYPERV_ENLIGHTENED_VMCS which calls nested_enable_evmcs().
      2) nested_enable_evmcs() sets enlightened_vmcs_enabled to true and fills
      vmcs_version which is then copied to userspace.
      3) kvm_vcpu_ioctl_enable_cap() is called again to enable
      KVM_CAP_HYPERV_ENLIGHTENED_VMCS which calls nested_enable_evmcs().
      4) This time nested_enable_evmcs() just returns 0 as
      enlightened_vmcs_enabled is already true. *Without filling
      vmcs_version*.
      5) kvm_vcpu_ioctl_enable_cap() continues as usual and copies
      *uninitialized* vmcs_version to userspace which leads to kernel info-leak.
      
      Fix this issue by simply changing nested_enable_evmcs() to always fill
      vmcs_version output argument. Even when enlightened_vmcs_enabled is
      already set to true.
      
      Note that SVM's nested_enable_evmcs() should not be modified because it
      always returns a non-zero value (-ENODEV) which results in
      kvm_vcpu_ioctl_enable_cap() skipping the copy of vmcs_version to
      userspace (as it should).
      
      Fixes: 57b119da ("KVM: nVMX: add KVM_CAP_HYPERV_ENLIGHTENED_VMCS capability")
      Reported-by: syzbot+cfbc368e283d381f8cef@syzkaller.appspotmail.com
      Reviewed-by: default avatarKrish Sadhukhan <krish.sadhukhan@oracle.com>
      Reviewed-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarLiran Alon <liran.alon@oracle.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      7f9ad1df
    • Wei Wang's avatar
      svm: Add mutex_lock to protect apic_access_page_done on AMD systems · 30510387
      Wei Wang authored
      There is a race condition when accessing kvm->arch.apic_access_page_done.
      Due to it, x86_set_memory_region will fail when creating the second vcpu
      for a svm guest.
      
      Add a mutex_lock to serialize the accesses to apic_access_page_done.
      This lock is also used by vmx for the same purpose.
      Signed-off-by: default avatarWei Wang <wawei@amazon.de>
      Signed-off-by: default avatarAmadeusz Juskowiak <ajusk@amazon.de>
      Signed-off-by: default avatarJulian Stecklina <jsteckli@amazon.de>
      Signed-off-by: default avatarSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Reviewed-by: default avatarJoerg Roedel <jroedel@suse.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      30510387
    • Wanpeng Li's avatar
      KVM: X86: Fix scan ioapic use-before-initialization · e97f852f
      Wanpeng Li authored
      Reported by syzkaller:
      
       BUG: unable to handle kernel NULL pointer dereference at 00000000000001c8
       PGD 80000003ec4da067 P4D 80000003ec4da067 PUD 3f7bfa067 PMD 0
       Oops: 0000 [#1] PREEMPT SMP PTI
       CPU: 7 PID: 5059 Comm: debug Tainted: G           OE     4.19.0-rc5 #16
       RIP: 0010:__lock_acquire+0x1a6/0x1990
       Call Trace:
        lock_acquire+0xdb/0x210
        _raw_spin_lock+0x38/0x70
        kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
        vcpu_enter_guest+0x167e/0x1910 [kvm]
        kvm_arch_vcpu_ioctl_run+0x35c/0x610 [kvm]
        kvm_vcpu_ioctl+0x3e9/0x6d0 [kvm]
        do_vfs_ioctl+0xa5/0x690
        ksys_ioctl+0x6d/0x80
        __x64_sys_ioctl+0x1a/0x20
        do_syscall_64+0x83/0x6e0
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The reason is that the testcase writes hyperv synic HV_X64_MSR_SINT6 msr
      and triggers scan ioapic logic to load synic vectors into EOI exit bitmap.
      However, irqchip is not initialized by this simple testcase, ioapic/apic
      objects should not be accessed.
      This can be triggered by the following program:
      
          #define _GNU_SOURCE
      
          #include <endian.h>
          #include <stdint.h>
          #include <stdio.h>
          #include <stdlib.h>
          #include <string.h>
          #include <sys/syscall.h>
          #include <sys/types.h>
          #include <unistd.h>
      
          uint64_t r[3] = {0xffffffffffffffff, 0xffffffffffffffff, 0xffffffffffffffff};
      
          int main(void)
          {
          	syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0);
          	long res = 0;
          	memcpy((void*)0x20000040, "/dev/kvm", 9);
          	res = syscall(__NR_openat, 0xffffffffffffff9c, 0x20000040, 0, 0);
          	if (res != -1)
          		r[0] = res;
          	res = syscall(__NR_ioctl, r[0], 0xae01, 0);
          	if (res != -1)
          		r[1] = res;
          	res = syscall(__NR_ioctl, r[1], 0xae41, 0);
          	if (res != -1)
          		r[2] = res;
          	memcpy(
          			(void*)0x20000080,
          			"\x01\x00\x00\x00\x00\x5b\x61\xbb\x96\x00\x00\x40\x00\x00\x00\x00\x01\x00"
          			"\x08\x00\x00\x00\x00\x00\x0b\x77\xd1\x78\x4d\xd8\x3a\xed\xb1\x5c\x2e\x43"
          			"\xaa\x43\x39\xd6\xff\xf5\xf0\xa8\x98\xf2\x3e\x37\x29\x89\xde\x88\xc6\x33"
          			"\xfc\x2a\xdb\xb7\xe1\x4c\xac\x28\x61\x7b\x9c\xa9\xbc\x0d\xa0\x63\xfe\xfe"
          			"\xe8\x75\xde\xdd\x19\x38\xdc\x34\xf5\xec\x05\xfd\xeb\x5d\xed\x2e\xaf\x22"
          			"\xfa\xab\xb7\xe4\x42\x67\xd0\xaf\x06\x1c\x6a\x35\x67\x10\x55\xcb",
          			106);
          	syscall(__NR_ioctl, r[2], 0x4008ae89, 0x20000080);
          	syscall(__NR_ioctl, r[2], 0xae80, 0);
          	return 0;
          }
      
      This patch fixes it by bailing out scan ioapic if ioapic is not initialized in
      kernel.
      Reported-by: default avatarWei Wu <ww9210@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Wei Wu <ww9210@gmail.com>
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      e97f852f
    • Wanpeng Li's avatar
      KVM: LAPIC: Fix pv ipis use-before-initialization · 38ab012f
      Wanpeng Li authored
      Reported by syzkaller:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000014
       PGD 800000040410c067 P4D 800000040410c067 PUD 40410d067 PMD 0
       Oops: 0000 [#1] PREEMPT SMP PTI
       CPU: 3 PID: 2567 Comm: poc Tainted: G           OE     4.19.0-rc5 #16
       RIP: 0010:kvm_pv_send_ipi+0x94/0x350 [kvm]
       Call Trace:
        kvm_emulate_hypercall+0x3cc/0x700 [kvm]
        handle_vmcall+0xe/0x10 [kvm_intel]
        vmx_handle_exit+0xc1/0x11b0 [kvm_intel]
        vcpu_enter_guest+0x9fb/0x1910 [kvm]
        kvm_arch_vcpu_ioctl_run+0x35c/0x610 [kvm]
        kvm_vcpu_ioctl+0x3e9/0x6d0 [kvm]
        do_vfs_ioctl+0xa5/0x690
        ksys_ioctl+0x6d/0x80
        __x64_sys_ioctl+0x1a/0x20
        do_syscall_64+0x83/0x6e0
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The reason is that the apic map has not yet been initialized, the testcase
      triggers pv_send_ipi interface by vmcall which results in kvm->arch.apic_map
      is dereferenced. This patch fixes it by checking whether or not apic map is
      NULL and bailing out immediately if that is the case.
      
      Fixes: 4180bf1b (KVM: X86: Implement "send IPI" hypercall)
      Reported-by: default avatarWei Wu <ww9210@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Wei Wu <ww9210@gmail.com>
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      38ab012f
    • Luiz Capitulino's avatar
      KVM: VMX: re-add ple_gap module parameter · a87c99e6
      Luiz Capitulino authored
      Apparently, the ple_gap parameter was accidentally removed
      by commit c8e88717. Add it
      back.
      Signed-off-by: default avatarLuiz Capitulino <lcapitulino@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: c8e88717Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      a87c99e6
  3. 22 Nov, 2018 3 commits
    • Jiri Olsa's avatar
      perf/x86/intel: Disallow precise_ip on BTS events · 472de49f
      Jiri Olsa authored
      Vince reported a crash in the BTS flush code when touching the callchain
      data, which was supposed to be initialized as an 'early' callchain,
      but intel_pmu_drain_bts_buffer() does not do that:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
        ...
        Call Trace:
         <IRQ>
         intel_pmu_drain_bts_buffer+0x151/0x220
         ? intel_get_event_constraints+0x219/0x360
         ? perf_assign_events+0xe2/0x2a0
         ? select_idle_sibling+0x22/0x3a0
         ? __update_load_avg_se+0x1ec/0x270
         ? enqueue_task_fair+0x377/0xdd0
         ? cpumask_next_and+0x19/0x20
         ? load_balance+0x134/0x950
         ? check_preempt_curr+0x7a/0x90
         ? ttwu_do_wakeup+0x19/0x140
         x86_pmu_stop+0x3b/0x90
         x86_pmu_del+0x57/0x160
         event_sched_out.isra.106+0x81/0x170
         group_sched_out.part.108+0x51/0xc0
         __perf_event_disable+0x7f/0x160
         event_function+0x8c/0xd0
         remote_function+0x3c/0x50
         flush_smp_call_function_queue+0x35/0xe0
         smp_call_function_single_interrupt+0x3a/0xd0
         call_function_single_interrupt+0xf/0x20
         </IRQ>
      
      It was triggered by fuzzer but can be easily reproduced by:
      
        # perf record -e cpu/branch-instructions/pu -g -c 1
      
      Peter suggested not to allow branch tracing for precise events:
      
       > Now arguably, this is really stupid behaviour. Who in his right mind
       > wants callchain output on BTS entries. And even if they do, BTS +
       > precise_ip is nonsensical.
       >
       > So in my mind disallowing precise_ip on BTS would be the simplest fix.
      Suggested-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Reported-by: default avatarVince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@vger.kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 6cbc304f ("perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)")
      Link: http://lkml.kernel.org/r/20181121101612.16272-3-jolsa@kernel.orgSigned-off-by: Ingo Molnar's avatarIngo Molnar <mingo@kernel.org>
      472de49f
    • Jiri Olsa's avatar
      perf/x86/intel: Add generic branch tracing check to intel_pmu_has_bts() · 67266c10
      Jiri Olsa authored
      Currently we check the branch tracing only by checking for the
      PERF_COUNT_HW_BRANCH_INSTRUCTIONS event of PERF_TYPE_HARDWARE
      type. But we can define the same event with the PERF_TYPE_RAW
      type.
      
      Changing the intel_pmu_has_bts() code to check on event's final
      hw config value, so both HW types are covered.
      
      Adding unlikely to intel_pmu_has_bts() condition calls, because
      it was used in the original code in intel_bts_constraints.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@vger.kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/20181121101612.16272-2-jolsa@kernel.orgSigned-off-by: Ingo Molnar's avatarIngo Molnar <mingo@kernel.org>
      67266c10
    • Jiri Olsa's avatar
      perf/x86/intel: Move branch tracing setup to the Intel-specific source file · ed6101bb
      Jiri Olsa authored
      Moving branch tracing setup to Intel core object into separate
      intel_pmu_bts_config function, because it's Intel specific.
      Suggested-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@vger.kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/20181121101612.16272-1-jolsa@kernel.orgSigned-off-by: Ingo Molnar's avatarIngo Molnar <mingo@kernel.org>
      ed6101bb
  4. 20 Nov, 2018 1 commit
  5. 12 Nov, 2018 2 commits
  6. 09 Nov, 2018 3 commits
  7. 06 Nov, 2018 6 commits
    • Eial Czerwacki's avatar
      x86/vsmp: Remove dependency on pv_irq_ops · a48777fd
      Eial Czerwacki authored
      vSMP dependency on pv_irq_ops has been removed some years ago, but the code
      still deals with pv_irq_ops.
      
      In short, "cap & ctl & (1 << 4)" is always returning 0, so all
      PARAVIRT/PARAVIRT_XXL code related to that can be removed.
      
      However, the rest of the code depends on CONFIG_PCI, so fix it accordingly.
      
      Rename set_vsmp_pv_ops to set_vsmp_ctl as the original name does not make
      sense anymore.
      Signed-off-by: default avatarEial Czerwacki <eial@scalemp.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarShai Fultheim <shai@scalemp.com>
      Cc: Juergen Gross <jgross@suse.com>
      Link: https://lkml.kernel.org/r/1541439114-28297-1-git-send-email-eial@scalemp.com
      a48777fd
    • Kirill A. Shutemov's avatar
      x86/ldt: Remove unused variable in map_ldt_struct() · b082f2dd
      Kirill A. Shutemov authored
      Splitting out the sanity check in map_ldt_struct() moved page table syncing
      into a separate function, which made the pgd variable unused. Remove it.
      
      [ tglx: Massaged changelog ]
      
      Fixes: 9bae3197 ("x86/ldt: Split out sanity check in map_ldt_struct()")
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: dave.hansen@linux.intel.com
      Cc: peterz@infradead.org
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: bhe@redhat.com
      Cc: willy@infradead.org
      Cc: linux-mm@kvack.org
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181026122856.66224-4-kirill.shutemov@linux.intel.com
      b082f2dd
    • Kirill A. Shutemov's avatar
      x86/ldt: Unmap PTEs for the slot before freeing LDT pages · a0e6e083
      Kirill A. Shutemov authored
      modify_ldt(2) leaves the old LDT mapped after switching over to the new
      one. The old LDT gets freed and the pages can be re-used.
      
      Leaving the mapping in place can have security implications. The mapping is
      present in the userspace page tables and Meltdown-like attacks can read
      these freed and possibly reused pages.
      
      It's relatively simple to fix: unmap the old LDT and flush TLB before
      freeing the old LDT memory.
      
      This further allows to avoid flushing the TLB in map_ldt_struct() as the
      slot is unmapped and flushed by unmap_ldt_struct() or has never been mapped
      at all.
      
      [ tglx: Massaged changelog and removed the needless line breaks ]
      
      Fixes: f55f0501 ("x86/pti: Put the LDT in its own PGD if PTI is on")
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: dave.hansen@linux.intel.com
      Cc: luto@kernel.org
      Cc: peterz@infradead.org
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: bhe@redhat.com
      Cc: willy@infradead.org
      Cc: linux-mm@kvack.org
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181026122856.66224-3-kirill.shutemov@linux.intel.com
      a0e6e083
    • Kirill A. Shutemov's avatar
      x86/mm: Move LDT remap out of KASLR region on 5-level paging · d52888aa
      Kirill A. Shutemov authored
      On 5-level paging the LDT remap area is placed in the middle of the KASLR
      randomization region and it can overlap with the direct mapping, the
      vmalloc or the vmap area.
      
      The LDT mapping is per mm, so it cannot be moved into the P4D page table
      next to the CPU_ENTRY_AREA without complicating PGD table allocation for
      5-level paging.
      
      The 4 PGD slot gap just before the direct mapping is reserved for
      hypervisors, so it cannot be used.
      
      Move the direct mapping one slot deeper and use the resulting gap for the
      LDT remap area. The resulting layout is the same for 4 and 5 level paging.
      
      [ tglx: Massaged changelog ]
      
      Fixes: f55f0501 ("x86/pti: Put the LDT in its own PGD if PTI is on")
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: dave.hansen@linux.intel.com
      Cc: peterz@infradead.org
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: bhe@redhat.com
      Cc: willy@infradead.org
      Cc: linux-mm@kvack.org
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181026122856.66224-2-kirill.shutemov@linux.intel.com
      d52888aa
    • Vishal Verma's avatar
      acpi/nfit, x86/mce: Validate a MCE's address before using it · e8a308e5
      Vishal Verma authored
      The NFIT machine check handler uses the physical address from the mce
      structure, and compares it against information in the ACPI NFIT table
      to determine whether that location lies on an NVDIMM. The mce->addr
      field however may not always be valid, and this is indicated by the
      MCI_STATUS_ADDRV bit in the status field.
      
      Export mce_usable_address() which already performs validation for the
      address, and use it in the NFIT handler.
      
      Fixes: 6839a6d9 ("nfit: do an ARS scrub on hitting a latent media error")
      Reported-by: default avatarRobert Elliott <elliott@hpe.com>
      Signed-off-by: default avatarVishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      CC: Arnd Bergmann <arnd@arndb.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      CC: Dave Jiang <dave.jiang@intel.com>
      CC: elliott@hpe.com
      CC: "H. Peter Anvin" <hpa@zytor.com>
      CC: Ingo Molnar <mingo@redhat.com>
      CC: Len Brown <lenb@kernel.org>
      CC: linux-acpi@vger.kernel.org
      CC: linux-edac <linux-edac@vger.kernel.org>
      CC: linux-nvdimm@lists.01.org
      CC: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
      CC: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      CC: Ross Zwisler <zwisler@kernel.org>
      CC: stable <stable@vger.kernel.org>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Tony Luck <tony.luck@intel.com>
      CC: x86-ml <x86@kernel.org>
      CC: Yazen Ghannam <yazen.ghannam@amd.com>
      Link: http://lkml.kernel.org/r/20181026003729.8420-2-vishal.l.verma@intel.com
      e8a308e5
    • Vishal Verma's avatar
      acpi/nfit, x86/mce: Handle only uncorrectable machine checks · 5d96c934
      Vishal Verma authored
      The MCE handler for nfit devices is called for memory errors on a
      Non-Volatile DIMM and adds the error location to a 'badblocks' list.
      This list is used by the various NVDIMM drivers to avoid consuming known
      poison locations during IO.
      
      The MCE handler gets called for both corrected and uncorrectable errors.
      Until now, both kinds of errors have been added to the badblocks list.
      However, corrected memory errors indicate that the problem has already
      been fixed by hardware, and the resulting interrupt is merely a
      notification to Linux.
      
      As far as future accesses to that location are concerned, it is
      perfectly fine to use, and thus doesn't need to be included in the above
      badblocks list.
      
      Add a check in the nfit MCE handler to filter out corrected mce events,
      and only process uncorrectable errors.
      
      Fixes: 6839a6d9 ("nfit: do an ARS scrub on hitting a latent media error")
      Reported-by: default avatarOmar Avelar <omar.avelar@intel.com>
      Signed-off-by: default avatarVishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      CC: Arnd Bergmann <arnd@arndb.de>
      CC: Dan Williams <dan.j.williams@intel.com>
      CC: Dave Jiang <dave.jiang@intel.com>
      CC: elliott@hpe.com
      CC: "H. Peter Anvin" <hpa@zytor.com>
      CC: Ingo Molnar <mingo@redhat.com>
      CC: Len Brown <lenb@kernel.org>
      CC: linux-acpi@vger.kernel.org
      CC: linux-edac <linux-edac@vger.kernel.org>
      CC: linux-nvdimm@lists.01.org
      CC: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
      CC: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      CC: Ross Zwisler <zwisler@kernel.org>
      CC: stable <stable@vger.kernel.org>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Tony Luck <tony.luck@intel.com>
      CC: x86-ml <x86@kernel.org>
      CC: Yazen Ghannam <yazen.ghannam@amd.com>
      Link: http://lkml.kernel.org/r/20181026003729.8420-1-vishal.l.verma@intel.com
      5d96c934
  8. 05 Nov, 2018 2 commits
  9. 04 Nov, 2018 1 commit
    • Michael Kelley's avatar
      x86/hyper-v: Enable PIT shutdown quirk · 1de72c70
      Michael Kelley authored
      Hyper-V emulation of the PIT has a quirk such that the normal PIT shutdown
      path doesn't work, because clearing the counter register restarts the
      timer.
      
      Disable the counter clearing on PIT shutdown.
      Signed-off-by: default avatarMichael Kelley <mikelley@microsoft.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>
      Cc: "devel@linuxdriverproject.org" <devel@linuxdriverproject.org>
      Cc: "daniel.lezcano@linaro.org" <daniel.lezcano@linaro.org>
      Cc: "virtualization@lists.linux-foundation.org" <virtualization@lists.linux-foundation.org>
      Cc: "jgross@suse.com" <jgross@suse.com>
      Cc: "akataria@vmware.com" <akataria@vmware.com>
      Cc: "olaf@aepfle.de" <olaf@aepfle.de>
      Cc: "apw@canonical.com" <apw@canonical.com>
      Cc: vkuznets <vkuznets@redhat.com>
      Cc: "jasowang@redhat.com" <jasowang@redhat.com>
      Cc: "marcelo.cerri@canonical.com" <marcelo.cerri@canonical.com>
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/1541303219-11142-3-git-send-email-mikelley@microsoft.com
      1de72c70
  10. 03 Nov, 2018 1 commit
    • Peter Zijlstra's avatar
      x86/qspinlock: Fix compile error · b987ffc1
      Peter Zijlstra authored
      With a compiler that has asm-goto but not asm-cc-output and
      CONFIG_PROFILE_ALL_BRANCHES=y we get a compiler error:
      
        arch/x86/include/asm/rmwcc.h:23:17: error: jump into statement expression
      
      Fix this by writing the if() as a boolean multiplication instead.
      Reported-by: default avatarkbuild test robot <lkp@intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-kernel@vger.kernel.org
      Fixes: 7aa54be2 ("locking/qspinlock, x86: Provide liveness guarantee")
      Signed-off-by: Ingo Molnar's avatarIngo Molnar <mingo@kernel.org>
      b987ffc1
  11. 01 Nov, 2018 1 commit
    • Dmitry Safonov's avatar
      x86/compat: Adjust in_compat_syscall() to generic code under !COMPAT · a846446b
      Dmitry Safonov authored
      The result of in_compat_syscall() can be pictured as:
      
      x86 platform:
          ---------------------------------------------------
          |  Arch\syscall  |  64-bit  |   ia32   |   x32    |
          |-------------------------------------------------|
          |     x86_64     |  false   |   true   |   true   |
          |-------------------------------------------------|
          |      i686      |          |  <true>  |          |
          ---------------------------------------------------
      
      Other platforms:
          -------------------------------------------
          |  Arch\syscall  |  64-bit  |   compat    |
          |-----------------------------------------|
          |     64-bit     |  false   |    true     |
          |-----------------------------------------|
          |    32-bit(?)   |          |   <false>   |
          -------------------------------------------
      
      As seen, the result of in_compat_syscall() on generic 32-bit platform
      differs from i686.
      
      There is no reason for in_compat_syscall() == true on native i686.  It also
      easy to misread code if the result on native 32-bit platform differs
      between arches.
      
      Because of that non arch-specific code has many places with:
          if (IS_ENABLED(CONFIG_COMPAT) && in_compat_syscall())
      in different variations.
      
      It looks-like the only non-x86 code which uses in_compat_syscall() not
      under CONFIG_COMPAT guard is in amd/amdkfd. But according to the commit
      a18069c1 ("amdkfd: Disable support for 32-bit user processes"), it
      actually should be disabled on native i686.
      
      Rename in_compat_syscall() to in_32bit_syscall() for x86-specific code
      and make in_compat_syscall() false under !CONFIG_COMPAT.
      
      A follow on patch will clean up generic users which were forced to check
      IS_ENABLED(CONFIG_COMPAT) with in_compat_syscall().
      Signed-off-by: default avatarDmitry Safonov <dima@arista.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Stephen Boyd <sboyd@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: linux-efi@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181012134253.23266-2-dima@arista.com
      a846446b
  12. 31 Oct, 2018 5 commits
    • Mike Rapoport's avatar
      memblock: stop using implicit alignment to SMP_CACHE_BYTES · 7e1c4e27
      Mike Rapoport authored
      When a memblock allocation APIs are called with align = 0, the alignment
      is implicitly set to SMP_CACHE_BYTES.
      
      Implicit alignment is done deep in the memblock allocator and it can
      come as a surprise.  Not that such an alignment would be wrong even
      when used incorrectly but it is better to be explicit for the sake of
      clarity and the prinicple of the least surprise.
      
      Replace all such uses of memblock APIs with the 'align' parameter
      explicitly set to SMP_CACHE_BYTES and stop implicit alignment assignment
      in the memblock internal allocation functions.
      
      For the case when memblock APIs are used via helper functions, e.g.  like
      iommu_arena_new_node() in Alpha, the helper functions were detected with
      Coccinelle's help and then manually examined and updated where
      appropriate.
      
      The direct memblock APIs users were updated using the semantic patch below:
      
      @@
      expression size, min_addr, max_addr, nid;
      @@
      (
      |
      - memblock_alloc_try_nid_raw(size, 0, min_addr, max_addr, nid)
      + memblock_alloc_try_nid_raw(size, SMP_CACHE_BYTES, min_addr, max_addr,
      nid)
      |
      - memblock_alloc_try_nid_nopanic(size, 0, min_addr, max_addr, nid)
      + memblock_alloc_try_nid_nopanic(size, SMP_CACHE_BYTES, min_addr, max_addr,
      nid)
      |
      - memblock_alloc_try_nid(size, 0, min_addr, max_addr, nid)
      + memblock_alloc_try_nid(size, SMP_CACHE_BYTES, min_addr, max_addr, nid)
      |
      - memblock_alloc(size, 0)
      + memblock_alloc(size, SMP_CACHE_BYTES)
      |
      - memblock_alloc_raw(size, 0)
      + memblock_alloc_raw(size, SMP_CACHE_BYTES)
      |
      - memblock_alloc_from(size, 0, min_addr)
      + memblock_alloc_from(size, SMP_CACHE_BYTES, min_addr)
      |
      - memblock_alloc_nopanic(size, 0)
      + memblock_alloc_nopanic(size, SMP_CACHE_BYTES)
      |
      - memblock_alloc_low(size, 0)
      + memblock_alloc_low(size, SMP_CACHE_BYTES)
      |
      - memblock_alloc_low_nopanic(size, 0)
      + memblock_alloc_low_nopanic(size, SMP_CACHE_BYTES)
      |
      - memblock_alloc_from_nopanic(size, 0, min_addr)
      + memblock_alloc_from_nopanic(size, SMP_CACHE_BYTES, min_addr)
      |
      - memblock_alloc_node(size, 0, nid)
      + memblock_alloc_node(size, SMP_CACHE_BYTES, nid)
      )
      
      [mhocko@suse.com: changelog update]
      [akpm@linux-foundation.org: coding-style fixes]
      [rppt@linux.ibm.com: fix missed uses of implicit alignment]
        Link: http://lkml.kernel.org/r/20181016133656.GA10925@rapoport-lnx
      Link: http://lkml.kernel.org/r/1538687224-17535-1-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: default avatarMike Rapoport <rppt@linux.vnet.ibm.com>
      Suggested-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: Paul Burton <paul.burton@mips.com>	[MIPS]
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>	[powerpc]
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7e1c4e27
    • Mike Rapoport's avatar
      mm: remove include/linux/bootmem.h · 57c8a661
      Mike Rapoport authored
      Move remaining definitions and declarations from include/linux/bootmem.h
      into include/linux/memblock.h and remove the redundant header.
      
      The includes were replaced with the semantic patch below and then
      semi-automated removal of duplicated '#include <linux/memblock.h>
      
      @@
      @@
      - #include <linux/bootmem.h>
      + #include <linux/memblock.h>
      
      [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
        Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
      [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
        Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
      [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
        Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
      Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: default avatarMike Rapoport <rppt@linux.vnet.ibm.com>
      Signed-off-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      57c8a661
    • Mike Rapoport's avatar
      memblock: replace BOOTMEM_ALLOC_* with MEMBLOCK variants · 97ad1087
      Mike Rapoport authored
      Drop BOOTMEM_ALLOC_ACCESSIBLE and BOOTMEM_ALLOC_ANYWHERE in favor of
      identical MEMBLOCK definitions.
      
      Link: http://lkml.kernel.org/r/1536927045-23536-29-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: default avatarMike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      97ad1087
    • Mike Rapoport's avatar
      memblock: rename free_all_bootmem to memblock_free_all · c6ffc5ca
      Mike Rapoport authored
      The conversion is done using
      
      sed -i 's@free_all_bootmem@memblock_free_all@' \
          $(git grep -l free_all_bootmem)
      
      Link: http://lkml.kernel.org/r/1536927045-23536-26-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: default avatarMike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c6ffc5ca
    • Mike Rapoport's avatar
      memblock: replace free_bootmem_late with memblock_free_late · 53ab85eb
      Mike Rapoport authored
      The free_bootmem_late and memblock_free_late do exactly the same thing:
      they iterate over a range and give pages to the page allocator.
      
      Replace calls to free_bootmem_late with calls to memblock_free_late and
      remove the bootmem variant.
      
      Link: http://lkml.kernel.org/r/1536927045-23536-25-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: default avatarMike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      53ab85eb