1. 10 Feb, 2010 1 commit
    • x86, apic: Don't use logical-flat mode when CPU hotplug may exceed 8 CPUs · 681ee44d
      Suresh Siddha authored
      We need to fall back from logical-flat APIC mode to physical-flat mode
      when we have more than 8 CPUs.  However, in the presence of CPU
      hotplug (with the BIOS listing not-enabled but possible CPUs as disabled
      CPUs in the MADT), we have to consider the number of possible CPUs rather than
      the number of current CPUs; otherwise we may cross the 8-CPU boundary
      when CPUs are added later.
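      The decision described above can be sketched as follows. This is a minimal
      illustration, not the kernel's code: pick_apic_mode(), the enum and the
      parameter names are hypothetical. It relies only on the fact that
      logical-flat mode addresses CPUs through an 8-bit mask, one bit per CPU.

```c
#include <assert.h>

/* Logical-flat mode addresses CPUs with an 8-bit mask, one bit per CPU. */
#define LOGICAL_FLAT_MAX_CPUS 8u

enum apic_mode { APIC_LOGICAL_FLAT, APIC_PHYSICAL_FLAT };

/*
 * Hypothetical helper: choose the mode from the number of CPUs that may
 * ever be online, i.e. the enabled ones plus those the MADT lists as
 * disabled (hot-pluggable), not from the CPUs present at boot.
 */
enum apic_mode pick_apic_mode(unsigned enabled_cpus, unsigned disabled_cpus)
{
    unsigned possible_cpus = enabled_cpus + disabled_cpus;

    assert(enabled_cpus >= 1); /* the boot CPU is always enabled */
    return possible_cpus > LOGICAL_FLAT_MAX_CPUS ? APIC_PHYSICAL_FLAT
                                                 : APIC_LOGICAL_FLAT;
}
```

      With 4 enabled and 8 disabled CPUs the sketch picks physical-flat mode,
      even though only 4 CPUs are running at boot.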
      The 32-bit APIC code could use more cleanups (like the removal of vendor checks in
      the 32-bit default_setup_apic_routing()) and more unification with the 64-bit code.
      Yinghai has some patches in the works already. This patch addresses the boot issue
      that was reported in the virtualization guest context.
      [ hpa: incorporated function annotation feedback from Yinghai Lu ]
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1265767304.2833.19.camel@sbs-t61.sc.intel.com>
      Acked-by: Shaohui Zheng <shaohui.zheng@intel.com>
      Reviewed-by: Yinghai Lu <yinghai@kernel.org>
      Cc: <stable@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  2. 11 Dec, 2009 1 commit
    • x86: Limit the number of processor bootup messages · 2eaad1fd
      Mike Travis authored
      When a system has a large number of processors, an excessive
      number of messages is sent to the system console.
      It's estimated that with 4096 processors in a system, and the
      console baudrate set to 56K, the startup messages will take
      about 84 minutes to clear the serial port.
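      As a rough cross-check of that estimate, the sketch below works out how
      much console output 84 minutes represents. It assumes "56K" means
      56000 bit/s and 10 bits on the wire per character (8N1 framing); neither
      assumption is stated in the message, and the function names are
      illustrative only.

```c
#include <assert.h>

/* Assumption: 8N1 framing, so each character costs 10 bits on the wire. */
#define BITS_PER_CHAR 10UL

/* Characters a serial console can drain in the given number of minutes. */
unsigned long console_chars(unsigned long baud, unsigned long minutes)
{
    return baud / BITS_PER_CHAR * 60UL * minutes;
}

/* Average console output per processor implied by the estimate. */
unsigned long chars_per_cpu(unsigned long baud, unsigned long minutes,
                            unsigned long cpus)
{
    assert(cpus > 0);
    return console_chars(baud, minutes) / cpus;
}
```

      Under these assumptions, 84 minutes corresponds to about 28 million
      characters, or roughly 6,900 characters of boot output per processor
      on a 4096-processor system.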
      This set of patches limits the number of repetitious messages
      which contain no additional information.  Much of this information
      is obtainable from /proc and sysfs.  Some of the messages
      are also sent to the kernel log buffer as KERN_DEBUG messages so
      dmesg can be used to examine more closely any details specific to
      a problem.
      The new cpu bootup sequence for system_state == SYSTEM_BOOTING:
      Booting Node   0, Processors  #1 #2 #3 #4 #5 #6 #7 Ok.
      Booting Node   1, Processors  #8 #9 #10 #11 #12 #13 #14 #15 Ok.
      Booting Node   3, Processors  #56 #57 #58 #59 #60 #61 #62 #63 Ok.
      Brought up 64 CPUs
      After the system is running, a single-line boot message is displayed
      when CPUs are hotplugged:
          Booting Node %d Processor %d APIC 0x%x
      Status of the following lines:
          CPU: Physical Processor ID:       printed once (for boot cpu)
          CPU: Processor Core ID:           printed once (for boot cpu)
          CPU: Hyper-Threading is disabled  printed once (for boot cpu)
          CPU: Thermal monitoring enabled   printed once (for boot cpu)
          CPU %d/0x%x -> Node %d:           removed
          CPU %d is now offline:            only if system_state == RUNNING
          Initializing CPU#%d:              KERN_DEBUG
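      The condensed per-node boot lines shown earlier in this message can be
      produced by a loop along these lines. This is a user-space sketch, not
      the kernel's printk code: emit_boot_lines(), append() and the
      node_of_cpu[] table are hypothetical, and CPU 0 is assumed to be the
      boot CPU, which is not announced.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append formatted text to buf; asserts that the buffer is large enough. */
void append(char *buf, size_t cap, size_t *len, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(*len < cap);
    va_start(ap, fmt);
    n = vsnprintf(buf + *len, cap - *len, fmt, ap);
    va_end(ap);
    assert(n >= 0 && (size_t)n < cap - *len);
    *len += (size_t)n;
}

/*
 * Write one line per node instead of one line per CPU.
 * node_of_cpu[i] is the NUMA node of CPU i; CPUs are assumed to be
 * numbered so that each node's CPUs are contiguous.
 */
size_t emit_boot_lines(const int *node_of_cpu, int ncpus, char *buf, size_t cap)
{
    size_t len = 0;
    int current_node = -1;

    assert(cap > 0);
    buf[0] = '\0';
    for (int cpu = 1; cpu < ncpus; cpu++) {
        if (node_of_cpu[cpu] != current_node) {
            if (current_node != -1)
                append(buf, cap, &len, " Ok.\n"); /* close previous node */
            current_node = node_of_cpu[cpu];
            append(buf, cap, &len, "Booting Node %3d, Processors ",
                   current_node);
        }
        append(buf, cap, &len, " #%d", cpu);
    }
    if (current_node != -1)
        append(buf, cap, &len, " Ok.\n");
    append(buf, cap, &len, "Brought up %d CPUs\n", ncpus);
    return len;
}
```

      The point of the design is that the output grows with the number of
      nodes rather than with the number of CPUs.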
      Signed-off-by: Mike Travis <travis@sgi.com>
      LKML-Reference: <4B219E28.8080601@sgi.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  3. 02 Dec, 2009 1 commit
  4. 15 Nov, 2009 1 commit
  5. 08 Nov, 2009 1 commit
    • hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf events · 24f1e32c
      Frederic Weisbecker authored
      This patch rebases the implementation of the breakpoints API on top of
      perf event instances.
      Each breakpoint is now a perf event that handles
      register scheduling, thread/cpu attachment, etc.
      The new layering is now made as follows:
             ptrace       kgdb      ftrace   perf syscall
                \          |          /         /
                 \         |         /         /
                  Core breakpoint API        /
                           |               /
                           |              /
                    Breakpoints perf events
                     Breakpoints PMU ---- Debug Register constraints handling
                                          (Part of core breakpoint API)
                   Hardware debug registers
      Reasons for this rewrite:
      - Use the centralized/optimized PMU register scheduling,
        implying an easier arch integration
      - More powerful register handling: perf attributes (pinned/flexible
        events, exclusive/non-exclusive, tunable period, etc...)
      - New perf ABI: the hardware breakpoints counters
      - Ptrace breakpoints setting remains tricky and still needs some per
        thread breakpoints references.
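      To illustrate the "perf syscall" path in the diagram, here is a minimal
      user-space sketch that describes a write watchpoint as a perf event. It
      uses the perf_event_attr breakpoint fields as found in current Linux UAPI
      headers, which may differ in detail from the ABI at the time of this
      patch; fill_write_watchpoint() and open_watchpoint() are hypothetical
      helper names.

```c
#include <assert.h>
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Describe a 4-byte write watchpoint on addr as a perf event. */
void fill_write_watchpoint(struct perf_event_attr *attr, const void *addr)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_BREAKPOINT;   /* breakpoint PMU */
    attr->size = sizeof(*attr);
    attr->bp_type = HW_BREAKPOINT_W;     /* trigger on writes */
    attr->bp_addr = (uintptr_t)addr;
    attr->bp_len = HW_BREAKPOINT_LEN_4;  /* watch 4 bytes */
    attr->sample_period = 1;             /* count every hit */
    attr->exclude_kernel = 1;
    attr->exclude_hv = 1;
}

/*
 * Ask the kernel for the event on the given task (pid 0 = calling thread),
 * on any CPU. Returns a file descriptor, or -1 with errno set, for example
 * when perf events are not permitted in the current environment.
 */
int open_watchpoint(const struct perf_event_attr *attr, pid_t pid)
{
    return (int)syscall(SYS_perf_event_open, attr, pid, -1, -1, 0UL);
}
```

      Register allocation and per-thread attachment are then the kernel's job,
      which is the point of layering breakpoints on perf events.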
      Todo (in order):
      - Support breakpoints perf counter events for perf tools (ie: implement
      - Support from perf tools
      Changes in v2:
      - Follow the perf "event" rename
      - The ptrace regression has been fixed (ptrace breakpoint perf events
        weren't released when a task ended)
      - Drop the struct hw_breakpoint and store generic fields in
      - Separate core and arch specific headers, drop
        asm-generic/hw_breakpoint.h and create linux/hw_breakpoint.h
      - Use new generic len/type for breakpoint
      - Handle off case: when breakpoints api is not supported by an arch
      Changes in v3:
      - Fix broken CONFIG_KVM, we need to propagate the breakpoint api
        changes to kvm when we exit the guest and restore the bp registers
        to the host.
      Changes in v4:
      - Drop the hw_breakpoint_restore() stub as it is only used by KVM
      - EXPORT_SYMBOL_GPL hw_breakpoint_restore() as KVM can be built as a
      - Restore the breakpoints unconditionally on kvm guest exit:
        TIF_DEBUG_THREAD no longer covers every case of running
        breakpoints and vcpu->arch.switch_db_regs might not always be
        set when the guest used debug registers.
        (Waiting for a reliable optimization)
      Changes in v5:
      - Split-up the asm-generic/hw-breakpoint.h moving to
        linux/hw_breakpoint.h into a separate patch
      - Optimize the breakpoints restoring while switching from kvm guest
        to host. We only want to restore the state if we have active
        breakpoints to the host, otherwise we don't care about messed-up
        address registers.
      - Add asm/hw_breakpoint.h to Kbuild
      - Fix bad breakpoint type in trace_selftest.c
      Changes in v6:
      - Fix wrong header inclusion in trace.h (triggered a build
        error with CONFIG_FTRACE_SELFTEST)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jan Kiszka <jan.kiszka@web.de>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
  6. 24 Sep, 2009 1 commit
  7. 03 Sep, 2009 1 commit
    • x86, sched: Workaround broken sched domain creation for AMD Magny-Cours · 5a925b42
      Andreas Herrmann authored
      The current sched domain creation code can't handle multi-node processors.
      When switching to power_savings scheduling, errors show up and the
      system might hang later on (due to a broken sched domain hierarchy):
        # echo 0  >> /sys/devices/system/cpu/sched_mc_power_savings
        CPU0 attaching sched-domain:
         domain 0: span 0-5 level MC
          groups: 0 1 2 3 4 5
          domain 1: span 0-23 level NODE
           groups: 0-5 6-11 18-23 12-17
        # echo 1  >> /sys/devices/system/cpu/sched_mc_power_savings
        CPU0 attaching sched-domain:
         domain 0: span 0-11 level MC
          groups: 0 1 2 3 4 5 6 7 8 9 10 11
        ERROR: parent span is not a superset of domain->span
          domain 1: span 0-5 level CPU
        ERROR: domain->groups does not contain CPU0
           groups: 6-11 (__cpu_power = 12288)
        ERROR: groups don't span domain->span
           domain 2: span 0-23 level NODE
        ERROR: domain->cpu_power not set
        ERROR: groups don't span domain->span
      Fixing all aspects of power-savings scheduling for Magny-Cours needs
      some larger changes in the sched domain creation code.
      As a short-term, temporary workaround, avoid the problems by
      extending "the worst possible hack" ;-(
      and always using llc_shared_map on AMD Magny-Cours when the MC domain span
      is calculated.
      With this I get:
        # echo 1  >> /sys/devices/system/cpu/sched_mc_power_savings
        CPU0 attaching sched-domain:
         domain 0: span 0-5 level MC
          groups: 0 1 2 3 4 5
          domain 1: span 0-5 level CPU
           groups: 0-5 (__cpu_power = 6144)
           domain 2: span 0-23 level NODE
            groups: 0-5 (__cpu_power = 6144) 6-11 (__cpu_power = 6144) 18-23 (__cpu_power = 6144) 12-17 (__cpu_power = 6144)
      I.e. no errors during sched domain creation, no system hangs, and also
      mc_power_savings scheduling works to a certain extent.
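      The two invariants behind the ERROR lines above (a parent domain must
      span a superset of its child, and a domain's groups must together cover
      its span) can be modelled with plain bitmasks. This is an illustrative
      sketch with hypothetical helper names, limited to 32 CPUs; it is not the
      scheduler's own checking code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitmask with bits first..last (inclusive) set; one bit per CPU. */
uint32_t cpu_range(unsigned first, unsigned last)
{
    uint32_t upto_last, below_first;

    assert(first <= last && last < 32);
    upto_last = (last == 31) ? UINT32_MAX
                             : ((UINT32_C(1) << (last + 1)) - 1);
    below_first = (UINT32_C(1) << first) - 1;
    return upto_last & ~below_first;
}

/* A parent domain must contain every CPU of its child domain. */
bool span_is_superset(uint32_t parent_span, uint32_t child_span)
{
    return (child_span & ~parent_span) == 0;
}

/* The union of a domain's groups must equal the domain's span. */
bool groups_cover_span(const uint32_t *groups, size_t ngroups, uint32_t span)
{
    uint32_t covered = 0;

    for (size_t i = 0; i < ngroups; i++)
        covered |= groups[i];
    return covered == span;
}
```

      In the failing output, the MC level spans CPUs 0-11 while its parent CPU
      level spans only 0-5, so span_is_superset(cpu_range(0, 5),
      cpu_range(0, 11)) is false. With the workaround both levels span 0-5 and
      the check passes.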
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  8. 02 Sep, 2009 1 commit
  9. 31 Aug, 2009 1 commit
  10. 21 Aug, 2009 1 commit
    • x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init · d0af9eed
      Suresh Siddha authored
      SDM Vol 3a section titled "MTRR considerations in MP systems" specifies
      the need for synchronizing the logical cpu's while initializing/updating
      Currently the Linux kernel synchronizes all CPUs only when
      a single MTRR register is programmed/updated. During an AP online
      (during boot/cpu-online/resume), where we initialize all the MTRR/PAT registers,
      we don't follow this synchronization algorithm.
      This can lead to scenarios where, during a dynamic cpu online, that logical cpu
      is initializing MTRR/PAT with cache disabled (cr0.cd=1) etc. while the other logical
      HT sibling continues to run (also with cache disabled because of cr0.cd=1
      on its sibling).
      Starting from Westmere, VMX transitions with cr0.cd=1 don't work properly
      (because of some VMX performance optimizations) and the above scenario
      (with one logical cpu doing VMX activity and another logical cpu coming online)
      can result in a system crash.
      Fix the MTRR initialization by doing a rendezvous of all the cpus. During
      boot and resume, we delay the MTRR/PAT init for APs until all the
      logical cpus come online; the rendezvous process at the end of the APs' bringup
      will then initialize the MTRR/PAT for all APs.
      For dynamic single cpu online, we synchronize all the logical cpus and
      do the MTRR/PAT init on the AP that is coming online.
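      The two cases described above (one batched rendezvous after boot/resume
      bringup, an immediate rendezvous for a single hotplugged CPU) can be
      modelled as a small state machine. This is a bookkeeping sketch only:
      the struct and function names are hypothetical, and the rendezvous
      itself (stopping all CPUs and programming the registers) is reduced to a
      counter.

```c
#include <assert.h>
#include <stdbool.h>

struct mtrr_model {
    bool delayed_init;      /* true while boot/resume is bringing up APs */
    unsigned deferred_aps;  /* APs whose MTRR/PAT init is still pending */
    unsigned rendezvous;    /* number of all-CPU rendezvous performed */
};

/* Boot or resume starts bringing up APs: defer their MTRR/PAT init. */
void aps_bringup_begin(struct mtrr_model *m)
{
    m->delayed_init = true;
    m->deferred_aps = 0;
}

/* An AP comes online, either during bringup or as a later hotplug. */
void ap_online(struct mtrr_model *m)
{
    if (m->delayed_init) {
        m->deferred_aps++;  /* handled later, in one batch */
        return;
    }
    m->rendezvous++;        /* hotplug: synchronize all CPUs right now */
}

/* All APs are up: one rendezvous initializes MTRR/PAT for every AP. */
void aps_bringup_end(struct mtrr_model *m)
{
    assert(m->delayed_init);
    m->delayed_init = false;
    if (m->deferred_aps > 0) {
        m->rendezvous++;
        m->deferred_aps = 0;
    }
}
```

      In this model, bringing up three APs at boot costs a single rendezvous,
      and each later hotplug costs one more.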
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  11. 21 Jul, 2009 1 commit
    • x86, intel_txt: Intel TXT Sx shutdown support · 86886e55
      Joseph Cihula authored
      Support for graceful handling of sleep states (S3/S4/S5) after an Intel(R) TXT launch.
      Without this patch, attempting to place the system in one of the ACPI sleep
      states (S3/S4/S5) will cause the TXT hardware to treat this as an attack and
      will cause a system reset, with memory locked.  Not only may the subsequent
      memory scrub take some time, but the platform will be unable to enter the
      requested power state.
      This patch calls back into tboot so that it may properly and securely clean
      up system state and clear the secrets-in-memory flag, after which it will place
      the system into the requested sleep state using ACPI information passed by the kernel.
       arch/x86/kernel/smpboot.c     |    2 ++
       drivers/acpi/acpica/hwsleep.c |    3 +++
       kernel/cpu.c                  |    7 ++++++-
       3 files changed, 11 insertions(+), 1 deletion(-)
      Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
      Signed-off-by: Shane Wang <shane.wang@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  12. 12 Jun, 2009 1 commit
    • x86: make zap_low_mapping could be used early · 55cd6367
      Yinghai Lu authored
      Only one cpu is there, just call __flush_tlb for it. Fixes the following boot
      warning on x86:
        [    0.000000] Memory: 885032k/915540k available (5993k kernel code, 29844k reserved, 3842k data, 428k init, 0k highmem)
        [    0.000000] virtual kernel memory layout:
        [    0.000000]     fixmap  : 0xffe17000 - 0xfffff000   (1952 kB)
        [    0.000000]     vmalloc : 0xf8615000 - 0xffe15000   ( 120 MB)
        [    0.000000]     lowmem  : 0xc0000000 - 0xf7e15000   ( 894 MB)
        [    0.000000]       .init : 0xc19a5000 - 0xc1a10000   ( 428 kB)
        [    0.000000]       .data : 0xc15da4bb - 0xc199af6c   (3842 kB)
        [    0.000000]       .text : 0xc1000000 - 0xc15da4bb   (5993 kB)
        [    0.000000] Checking if this processor honours the WP bit even in supervisor mode...Ok.
        [    0.000000] ------------[ cut here ]------------
        [    0.000000] WARNING: at kernel/smp.c:369 smp_call_function_many+0x50/0x1b0()
        [    0.000000] Hardware name: System Product Name
        [    0.000000] Modules linked in:
        [    0.000000] Pid: 0, comm: swapper Not tainted 2.6.30-tip #52504
        [    0.000000] Call Trace:
        [    0.000000]  [<c104aa16>] warn_slowpath_common+0x65/0x95
        [    0.000000]  [<c104aa58>] warn_slowpath_null+0x12/0x15
        [    0.000000]  [<c1073bbe>] smp_call_function_many+0x50/0x1b0
        [    0.000000]  [<c1037615>] ? do_flush_tlb_all+0x0/0x41
        [    0.000000]  [<c1037615>] ? do_flush_tlb_all+0x0/0x41
        [    0.000000]  [<c1073d4f>] smp_call_function+0x31/0x58
        [    0.000000]  [<c1037615>] ? do_flush_tlb_all+0x0/0x41
        [    0.000000]  [<c104f635>] on_each_cpu+0x26/0x65
        [    0.000000]  [<c10374b5>] flush_tlb_all+0x19/0x1b
        [    0.000000]  [<c1032ab3>] zap_low_mappings+0x4d/0x56
        [    0.000000]  [<c15d64b5>] ? printk+0x14/0x17
        [    0.000000]  [<c19b42a8>] mem_init+0x23d/0x245
        [    0.000000]  [<c19a56a1>] start_kernel+0x17a/0x2d5
        [    0.000000]  [<c19a5347>] ? unknown_bootoption+0x0/0x19a
        [    0.000000]  [<c19a5039>] __init_begin+0x39/0x41
        [    0.000000] ---[ end trace 4eaa2a86a8e2da22 ]---
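      The idea of the fix (flush only the local TLB while a single CPU is
      running, and use the cross-CPU flush otherwise) can be sketched as
      below. This is a user-space model with stub flush functions that only
      count calls; the `early` parameter and all names are assumptions made
      for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct flush_stats {
    unsigned local;      /* flushes of the current CPU's TLB only */
    unsigned cross_cpu;  /* flushes broadcast to every CPU */
};

/* Stand-in for a local TLB flush. */
void flush_local(struct flush_stats *s) { s->local++; }

/* Stand-in for a flush that calls into every online CPU. */
void flush_all_cpus(struct flush_stats *s) { s->cross_cpu++; }

/*
 * Model of removing the low mappings. Clearing the page-table entries is
 * omitted; only the choice of flush is shown. During early boot there is
 * one CPU and the cross-CPU call machinery must not be used yet.
 */
void zap_low_mappings_model(bool early, struct flush_stats *s)
{
    assert(s != NULL);
    if (early)
        flush_local(s);
    else
        flush_all_cpus(s);
}
```

      Called with early set, the model never reaches the cross-CPU path, which
      is the path that raised the warning in the trace above.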
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
  13. 07 Jun, 2009 1 commit
    • x86, apic: Fix dummy apic read operation together with broken MP handling · 103428e5
      Cyrill Gorcunov authored
      Ingo Molnar reported that read_apic is buggy nowadays:
      [    0.000000] Using APIC driver default
      [    0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
      [    0.000000] Local APIC disabled by BIOS -- you can enable it with "lapic"
      [    0.000000] APIC: disable apic facility
      [    0.000000] ------------[ cut here ]------------
      [    0.000000] WARNING: at arch/x86/kernel/apic/apic.c:254 native_apic_read_dummy+0x2d/0x3b()
      [    0.000000] Hardware name: HP OmniBook PC
      Indeed, we still rely on the apic->read operation for SMP-compiled
      kernels. Instead of disfiguring the SMP code with #ifdef, we
      allow apic->read to be called. To capture any unexpected results,
      we check via WARN_ON_ONCE that apic->read is being called for a
      sane reason, but(!) instead of OR we should use the AND logical
      operation (thanks Yinghai for spotting the root of the problem).
      Along with that, we could have a bad MP table, and we fix that
      case so that no SMP is started and there are no complaints about
      a BIOS bug if the APIC was just disabled via the command line.
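      The OR-versus-AND point is easiest to see as a truth table. The sketch
      below assumes the warning condition combines two flags, here called
      cpu_has_apic and disable_apic; both names and the exact expressions are
      assumptions for illustration, since the message does not quote the
      condition itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition with OR: fires in three of the four flag combinations. */
bool warn_with_or(bool cpu_has_apic, bool disable_apic)
{
    return cpu_has_apic || !disable_apic;
}

/* Condition with AND: fires only if an APIC exists and was not disabled. */
bool warn_with_and(bool cpu_has_apic, bool disable_apic)
{
    return cpu_has_apic && !disable_apic;
}

/* Count the flag combinations in which a condition fires. */
unsigned count_firing(bool (*cond)(bool, bool))
{
    unsigned n = 0;

    for (int has = 0; has <= 1; has++)
        for (int dis = 0; dis <= 1; dis++)
            if (cond(has != 0, dis != 0))
                n++;
    return n;
}
```

      With OR, the dummy read warns even when there is no usable APIC and
      nothing was disabled on the command line; with AND it warns only in the
      one case where a real read would have been expected.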
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <20090607124840.GD4547@lenovo>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  14. 02 Jun, 2009 1 commit
  15. 19 Apr, 2009 1 commit
  16. 08 Apr, 2009 2 commits
  17. 13 Mar, 2009 6 commits
  18. 08 Mar, 2009 1 commit
  19. 26 Feb, 2009 2 commits
  20. 17 Feb, 2009 4 commits
  21. 15 Feb, 2009 1 commit
  22. 05 Feb, 2009 1 commit
  23. 31 Jan, 2009 2 commits
  24. 30 Jan, 2009 2 commits
  25. 29 Jan, 2009 4 commits