Skip to content
Snippets Groups Projects
  1. Oct 09, 2023
  2. Aug 24, 2023
    • Eric DeVolder's avatar
      crash: hotplug support for kexec_load() · a72bbec7
      Eric DeVolder authored
      The hotplug support for kexec_load() requires changes to the userspace
      kexec-tools and a little extra help from the kernel.
      
      Given a kdump capture kernel loaded via kexec_load(), and a subsequent
      hotplug event, the crash hotplug handler finds the elfcorehdr and rewrites
      it to reflect the hotplug change.  That is the desired outcome, however,
      at kernel panic time, the purgatory integrity check fails (because the
      elfcorehdr changed), and the capture kernel does not boot and no vmcore is
      generated.
      
      Therefore, the userspace kexec-tools/kexec must indicate to the kernel
      that the elfcorehdr can be modified (because the kexec excluded the
      elfcorehdr from the digest, and sized the elfcorehdr memory buffer
      appropriately).
      
      To facilitate hotplug support with kexec_load():
       - a new kexec flag KEXEC_UPATE_ELFCOREHDR indicates that it is
         safe for the kernel to modify the kexec_load()'d elfcorehdr
       - the /sys/kernel/crash_elfcorehdr_size node communicates the
         preferred size of the elfcorehdr memory buffer
       - The sysfs crash_hotplug nodes (ie.
         /sys/devices/system/[cpu|memory]/crash_hotplug) dynamically
         take into account kexec_file_load() vs kexec_load() and
         KEXEC_UPDATE_ELFCOREHDR.
         This is critical so that the udev rule processing of crash_hotplug
         is all that is needed to determine if the userspace unload-then-load
         of the kdump image is to be skipped, or not. The proposed udev
         rule change looks like:
         # The kernel updates the crash elfcorehdr for CPU and memory changes
         SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
         SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
      
      The table below indicates the behavior of kexec_load()'d kdump image
      updates (with the new udev crash_hotplug rule in place):
      
       Kernel |Kexec
       -------+-----+----
       Old    |Old  |New
              |  a  | a
       -------+-----+----
       New    |  a  | b
       -------+-----+----
      
      where kexec 'old' and 'new' delineate kexec-tools has the needed
      modifications for the crash hotplug feature, and kernel 'old' and 'new'
      delineate the kernel supports this crash hotplug feature.
      
      Behavior 'a' indicates the unload-then-reload of the entire kdump image. 
      For the kexec 'old' column, the unload-then-reload occurs due to the
      missing flag KEXEC_UPDATE_ELFCOREHDR.  An 'old' kernel (with 'new' kexec)
      does not present the crash_hotplug sysfs node, which leads to the
      unload-then-reload of the kdump image.
      
      Behavior 'b' indicates the desired optimized behavior of the kernel
      directly modifying the elfcorehdr and avoiding the unload-then-reload of
      the kdump image.
      
      If the udev rule is not updated with crash_hotplug node check, then no
      matter any combination of kernel or kexec is new or old, the kdump image
      continues to be unload-then-reload on hotplug changes.
      
      To fully support crash hotplug feature, there needs to be a rollout of
      kernel, kexec-tools and udev rule changes.  However, the order of the
      rollout of these pieces does not matter; kexec_load()'d kdump images still
      function for hotplug as-is.
      
      Link: https://lkml.kernel.org/r/20230814214446.6659-7-eric.devolder@oracle.com
      
      
      Signed-off-by: default avatarEric DeVolder <eric.devolder@oracle.com>
      Suggested-by: default avatarHari Bathini <hbathini@linux.ibm.com>
      Acked-by: default avatarHari Bathini <hbathini@linux.ibm.com>
      Acked-by: default avatarBaoquan He <bhe@redhat.com>
      Cc: Akhil Raj <lf32.dev@gmail.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Borislav Petkov (AMD) <bp@alien8.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Mimi Zohar <zohar@linux.ibm.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Weißschuh <linux@weissschuh.net>
      Cc: Valentin Schneider <vschneid@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      a72bbec7
  3. Feb 03, 2023
  4. Sep 12, 2022
    • Valentin Schneider's avatar
      panic, kexec: make __crash_kexec() NMI safe · 05c62574
      Valentin Schneider authored
      Attempting to get a crash dump out of a debug PREEMPT_RT kernel via an NMI
      panic() doesn't work.  The cause of that lies in the PREEMPT_RT definition
      of mutex_trylock():
      
      	if (IS_ENABLED(CONFIG_DEBUG_RT_MUTEXES) && WARN_ON_ONCE(!in_task()))
      		return 0;
      
      This prevents an nmi_panic() from executing the main body of
      __crash_kexec() which does the actual kexec into the kdump kernel.  The
      warning and return are explained by:
      
        6ce47fd9 ("rtmutex: Warn if trylock is called from hard/softirq context")
        [...]
        The reasons for this are:
      
            1) There is a potential deadlock in the slowpath
      
            2) Another cpu which blocks on the rtmutex will boost the task
      	 which allegedly locked the rtmutex, but that cannot work
      	 because the hard/softirq context borrows the task context.
      
      Furthermore, grabbing the lock isn't NMI safe, so do away with kexec_mutex
      and replace it with an atomic variable.  This is somewhat overzealous as
      *some* callsites could keep using a mutex (e.g.  the sysfs-facing ones
      like crash_shrink_memory()), but this has the benefit of involving a
      single unified lock and preventing any future NMI-related surprises.
      
      Tested by triggering NMI panics via:
      
        $ echo 1 > /proc/sys/kernel/panic_on_unrecovered_nmi
        $ echo 1 > /proc/sys/kernel/unknown_nmi_panic
        $ echo 1 > /proc/sys/kernel/panic
      
        $ ipmitool power diag
      
      Link: https://lkml.kernel.org/r/20220630223258.4144112-3-vschneid@redhat.com
      
      
      Fixes: 6ce47fd9 ("rtmutex: Warn if trylock is called from hard/softirq context")
      Signed-off-by: default avatarValentin Schneider <vschneid@redhat.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: Juri Lelli <jlelli@redhat.com>
      Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
      Cc: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      05c62574
  5. Sep 08, 2021
  6. Oct 05, 2020
    • Kees Cook's avatar
      LSM: Introduce kernel_post_load_data() hook · b64fcae7
      Kees Cook authored
      
      There are a few places in the kernel where LSMs would like to have
      visibility into the contents of a kernel buffer that has been loaded or
      read. While security_kernel_post_read_file() (which includes the
      buffer) exists as a pairing for security_kernel_read_file(), no such
      hook exists to pair with security_kernel_load_data().
      
      Earlier proposals for just using security_kernel_post_read_file() with a
      NULL file argument were rejected (i.e. "file" should always be valid for
      the security_..._file hooks, but it appears at least one case was
      left in the kernel during earlier refactoring. (This will be fixed in
      a subsequent patch.)
      
      Since not all cases of security_kernel_load_data() can have a single
      contiguous buffer made available to the LSM hook (e.g. kexec image
      segments are separately loaded), there needs to be a way for the LSM to
      reason about its expectations of the hook coverage. In order to handle
      this, add a "contents" argument to the "kernel_load_data" hook that
      indicates if the newly added "kernel_post_load_data" hook will be called
      with the full contents once loaded. That way, LSMs requiring full contents
      can choose to unilaterally reject "kernel_load_data" with contents=false
      (which is effectively the existing hook coverage), but when contents=true
      they can allow it and later evaluate the "kernel_post_load_data" hook
      once the buffer is loaded.
      
      With this change, LSMs can gain coverage over non-file-backed data loads
      (e.g. init_module(2) and firmware userspace helper), which will happen
      in subsequent patches.
      
      Additionally prepare IMA to start processing these cases.
      
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Reviewed-by: default avatarKP Singh <kpsingh@google.com>
      Link: https://lore.kernel.org/r/20201002173828.2099543-9-keescook@chromium.org
      
      
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b64fcae7
  7. Jan 08, 2020
  8. Aug 20, 2019
  9. Jun 19, 2019
  10. Jul 16, 2018
  11. Apr 02, 2018
  12. Jul 12, 2017
    • Xunlei Pang's avatar
      kdump: protect vmcoreinfo data under the crash memory · 1229384f
      Xunlei Pang authored
      Currently vmcoreinfo data is updated at boot time subsys_initcall(), it
      has the risk of being modified by some wrong code during system is
      running.
      
      As a result, vmcore dumped may contain the wrong vmcoreinfo.  Later on,
      when using "crash", "makedumpfile", etc utility to parse this vmcore, we
      probably will get "Segmentation fault" or other unexpected errors.
      
      E.g.  1) wrong code overwrites vmcoreinfo_data; 2) further crashes the
      system; 3) trigger kdump, then we obviously will fail to recognize the
      crash context correctly due to the corrupted vmcoreinfo.
      
      Now except for vmcoreinfo, all the crash data is well
      protected(including the cpu note which is fully updated in the crash
      path, thus its correctness is guaranteed).  Given that vmcoreinfo data
      is a large chunk prepared for kdump, we better protect it as well.
      
      To solve this, we relocate and copy vmcoreinfo_data to the crash memory
      when kdump is loading via kexec syscalls.  Because the whole crash
      memory will be protected by existing arch_kexec_protect_crashkres()
      mechanism, we naturally protect vmcoreinfo_data from write(even read)
      access under kernel direct mapping after kdump is loaded.
      
      Since kdump is usually loaded at the very early stage after boot, we can
      trust the correctness of the vmcoreinfo data copied.
      
      On the other hand, we still need to operate the vmcoreinfo safe copy
      when crash happens to generate vmcoreinfo_note again, we rely on vmap()
      to map out a new kernel virtual address and update to use this new one
      instead in the following crash_save_vmcoreinfo().
      
      BTW, we do not touch vmcoreinfo_note, because it will be fully updated
      using the protected vmcoreinfo_data after crash which is surely correct
      just like the cpu crash note.
      
      Link: http://lkml.kernel.org/r/1493281021-20737-3-git-send-email-xlpang@redhat.com
      
      
      Signed-off-by: default avatarXunlei Pang <xlpang@redhat.com>
      Tested-by: default avatarMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1229384f
  13. Aug 02, 2016
    • Russell King's avatar
      kexec: allow architectures to override boot mapping · 43546d86
      Russell King authored
      kexec physical addresses are the boot-time view of the system.  For
      certain ARM systems (such as Keystone 2), the boot view of the system
      does not match the kernel's view of the system: the boot view uses a
      special alias in the lower 4GB of the physical address space.
      
      To cater for these kinds of setups, we need to translate between the
      boot view physical addresses and the normal kernel view physical
      addresses.  This patch extracts the current transation points into
      linux/kexec.h, and allows an architecture to override the functions.
      
      Due to the translations required, we unfortunately end up with six
      translation functions, which are reduced down to four that the
      architecture can override.
      
      [akpm@linux-foundation.org: kexec.h needs asm/io.h for phys_to_virt()]
      Link: http://lkml.kernel.org/r/E1b8koP-0004HZ-Vf@rmk-PC.armlinux.org.uk
      
      
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Cc: Keerthy <j-keerthy@ti.com>
      Cc: Pratyush Anand <panand@redhat.com>
      Cc: Vitaly Andrianov <vitalya@ti.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Simon Horman <horms@verge.net.au>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      43546d86
  14. May 24, 2016
    • Xunlei Pang's avatar
      s390/kexec: consolidate crash_map/unmap_reserved_pages() and... · 7a0058ec
      Xunlei Pang authored
      s390/kexec: consolidate crash_map/unmap_reserved_pages() and arch_kexec_protect(unprotect)_crashkres()
      
      Commit 3f625002581b ("kexec: introduce a protection mechanism for the
      crashkernel reserved memory") is a similar mechanism for protecting the
      crash kernel reserved memory to previous crash_map/unmap_reserved_pages()
      implementation, the new one is more generic in name and cleaner in code
      (besides, some arch may not be allowed to unmap the pgtable).
      
      Therefore, this patch consolidates them, and uses the new
      arch_kexec_protect(unprotect)_crashkres() to replace former
      crash_map/unmap_reserved_pages() which by now has been only used by
      S390.
      
      The consolidation work needs the crash memory to be mapped initially,
      this is done in machine_kdump_pm_init() which is after
      reserve_crashkernel().  Once kdump kernel is loaded, the new
      arch_kexec_protect_crashkres() implemented for S390 will actually
      unmap the pgtable like before.
      
      Signed-off-by: default avatarXunlei Pang <xlpang@redhat.com>
      Signed-off-by: default avatarMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Acked-by: default avatarMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Minfei Huang <mhuang@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7a0058ec
    • Minfei Huang's avatar
      kexec: do a cleanup for function kexec_load · 0eea0867
      Minfei Huang authored
      
      There are a lof of work to be done in function kexec_load, not only for
      allocating structs and loading initram, but also for some misc.
      
      To make it more clear, wrap a new function do_kexec_load which is used
      to allocate structs and load initram.  And the pre-work will be done in
      kexec_load.
      
      Signed-off-by: default avatarMinfei Huang <mnfhuang@gmail.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Xunlei Pang <xlpang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0eea0867
    • Minfei Huang's avatar
      kexec: make a pair of map/unmap reserved pages in error path · 917a3560
      Minfei Huang authored
      
      For some arch, kexec shall map the reserved pages, then use them, when
      we try to start the kdump service.
      
      kexec may return directly, without unmaping the reserved pages, if it
      fails during starting service.  To fix it, we make a pair of map/unmap
      reserved pages both in generic path and error path.
      
      This patch only affects s390.  Other architecturess don't implement the
      interface of crash_unmap_reserved_pages and crash_map_reserved_pages.
      
      It isn't a urgent patch.  Kernel can work well without any risk,
      although the reserved pages are not unmapped before returning in error
      path.
      
      Signed-off-by: default avatarMinfei Huang <mnfhuang@gmail.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Xunlei Pang <xlpang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      917a3560
    • Xunlei Pang's avatar
      kexec: introduce a protection mechanism for the crashkernel reserved memory · 9b492cf5
      Xunlei Pang authored
      
      For the cases that some kernel (module) path stamps the crash reserved
      memory(already mapped by the kernel) where has been loaded the second
      kernel data, the kdump kernel will probably fail to boot when panic
      happens (or even not happens) leaving the culprit at large, this is
      unacceptable.
      
      The patch introduces a mechanism for detecting such cases:
      
      1) After each crash kexec loading, it simply marks the reserved memory
         regions readonly since we no longer access it after that.  When someone
         stamps the region, the first kernel will panic and trigger the kdump.
         The weak arch_kexec_protect_crashkres() is introduced to do the actual
         protection.
      
      2) To allow multiple loading, once 1) was done we also need to remark
         the reserved memory to readwrite each time a system call related to
         kdump is made.  The weak arch_kexec_unprotect_crashkres() is introduced
         to do the actual protection.
      
      The architecture can make its specific implementation by overriding
      arch_kexec_protect_crashkres() and arch_kexec_unprotect_crashkres().
      
      Signed-off-by: default avatarXunlei Pang <xlpang@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Minfei Huang <mhuang@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9b492cf5
  15. Jan 21, 2016
  16. Nov 07, 2015
  17. Sep 10, 2015
    • Dave Young's avatar
      kexec: split kexec_load syscall from kexec core code · 2965faa5
      Dave Young authored
      
      There are two kexec load syscalls, kexec_load another and kexec_file_load.
       kexec_file_load has been splited as kernel/kexec_file.c.  In this patch I
      split kexec_load syscall code to kernel/kexec.c.
      
      And add a new kconfig option KEXEC_CORE, so we can disable kexec_load and
      use kexec_file_load only, or vice verse.
      
      The original requirement is from Ted Ts'o, he want kexec kernel signature
      being checked with CONFIG_KEXEC_VERIFY_SIG enabled.  But kexec-tools use
      kexec_load syscall can bypass the checking.
      
      Vivek Goyal proposed to create a common kconfig option so user can compile
      in only one syscall for loading kexec kernel.  KEXEC/KEXEC_FILE selects
      KEXEC_CORE so that old config files still work.
      
      Because there's general code need CONFIG_KEXEC_CORE, so I updated all the
      architecture Kconfig with a new option KEXEC_CORE, and let KEXEC selects
      KEXEC_CORE in arch Kconfig.  Also updated general kernel code with to
      kexec_load syscall.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: default avatarDave Young <dyoung@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Petr Tesarik <ptesarik@suse.cz>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      2965faa5
    • Dave Young's avatar
      kexec: split kexec_file syscall code to kexec_file.c · a43cac0d
      Dave Young authored
      
      Split kexec_file syscall related code to another file kernel/kexec_file.c
      so that the #ifdef CONFIG_KEXEC_FILE in kexec.c can be dropped.
      
      Sharing variables and functions are moved to kernel/kexec_internal.h per
      suggestion from Vivek and Petr.
      
      [akpm@linux-foundation.org: fix bisectability]
      [akpm@linux-foundation.org: declare the various arch_kexec functions]
      [akpm@linux-foundation.org: fix build]
      Signed-off-by: default avatarDave Young <dyoung@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Petr Tesarik <ptesarik@suse.cz>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a43cac0d
  18. Jul 01, 2015
  19. Apr 23, 2015
    • Martin Schwidefsky's avatar
      kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFP · 7e01b5ac
      Martin Schwidefsky authored
      
      Introduce KEXEC_CONTROL_MEMORY_GFP to allow the architecture code
      to override the gfp flags of the allocation for the kexec control
      page. The loop in kimage_alloc_normal_control_pages allocates pages
      with GFP_KERNEL until a page is found that happens to have an
      address smaller than the KEXEC_CONTROL_MEMORY_LIMIT. On systems
      with a large memory size but a small KEXEC_CONTROL_MEMORY_LIMIT
      the loop will keep allocating memory until the oom killer steps in.
      
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      7e01b5ac
  20. Feb 17, 2015
  21. Jan 26, 2015
  22. Dec 13, 2014
  23. Oct 14, 2014
  24. Aug 29, 2014
    • Vivek Goyal's avatar
      kexec: create a new config option CONFIG_KEXEC_FILE for new syscall · 74ca317c
      Vivek Goyal authored
      
      Currently new system call kexec_file_load() and all the associated code
      compiles if CONFIG_KEXEC=y.  But new syscall also compiles purgatory
      code which currently uses gcc option -mcmodel=large.  This option seems
      to be available only gcc 4.4 onwards.
      
      Hiding new functionality behind a new config option will not break
      existing users of old gcc.  Those who wish to enable new functionality
      will require new gcc.  Having said that, I am trying to figure out how
      can I move away from using -mcmodel=large but that can take a while.
      
      I think there are other advantages of introducing this new config
      option.  As this option will be enabled only on x86_64, other arches
      don't have to compile generic kexec code which will never be used.  This
      new code selects CRYPTO=y and CRYPTO_SHA256=y.  And all other arches had
      to do this for CONFIG_KEXEC.  Now with introduction of new config
      option, we can remove crypto dependency from other arches.
      
      Now CONFIG_KEXEC_FILE is available only on x86_64.  So whereever I had
      CONFIG_X86_64 defined, I got rid of that.
      
      For CONFIG_KEXEC_FILE, instead of doing select CRYPTO=y, I changed it to
      "depends on CRYPTO=y".  This should be safer as "select" is not
      recursive.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Tested-by: default avatarShaun Ruffell <sruffell@digium.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      74ca317c
  25. Aug 08, 2014
    • Vivek Goyal's avatar
      kexec: verify the signature of signed PE bzImage · 8e7d8381
      Vivek Goyal authored
      
      This is the final piece of the puzzle of verifying kernel image signature
      during kexec_file_load() syscall.
      
      This patch calls into PE file routines to verify signature of bzImage.  If
      signature are valid, kexec_file_load() succeeds otherwise it fails.
      
      Two new config options have been introduced.  First one is
      CONFIG_KEXEC_VERIFY_SIG.  This option enforces that kernel has to be
      validly signed otherwise kernel load will fail.  If this option is not
      set, no signature verification will be done.  Only exception will be when
      secureboot is enabled.  In that case signature verification should be
      automatically enforced when secureboot is enabled.  But that will happen
      when secureboot patches are merged.
      
      Second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG.  This option
      enables signature verification support on bzImage.  If this option is not
      set and previous one is set, kernel image loading will fail because kernel
      does not have support to verify signature of bzImage.
      
      I tested these patches with both "pesign" and "sbsign" signed bzImages.
      
      I used signing_key.priv key and signing_key.x509 cert for signing as
      generated during kernel build process (if module signing is enabled).
      
      Used following method to sign bzImage.
      
      pesign
      ======
      - Convert DER format cert to PEM format cert
      openssl x509 -in signing_key.x509 -inform DER -out signing_key.x509.PEM -outform
      PEM
      
      - Generate a .p12 file from existing cert and private key file
      openssl pkcs12 -export -out kernel-key.p12 -inkey signing_key.priv -in
      signing_key.x509.PEM
      
      - Import .p12 file into pesign db
      pk12util -i /tmp/kernel-key.p12 -d /etc/pki/pesign
      
      - Sign bzImage
      pesign -i /boot/vmlinuz-3.16.0-rc3+ -o /boot/vmlinuz-3.16.0-rc3+.signed.pesign
      -c "Glacier signing key - Magrathea" -s
      
      sbsign
      ======
      sbsign --key signing_key.priv --cert signing_key.x509.PEM --output
      /boot/vmlinuz-3.16.0-rc3+.signed.sbsign /boot/vmlinuz-3.16.0-rc3+
      
      Patch details:
      
      Well all the hard work is done in previous patches.  Now bzImage loader
      has just call into that code and verify whether bzImage signature are
      valid or not.
      
      Also create two config options.  First one is CONFIG_KEXEC_VERIFY_SIG.
      This option enforces that kernel has to be validly signed otherwise kernel
      load will fail.  If this option is not set, no signature verification will
      be done.  Only exception will be when secureboot is enabled.  In that case
      signature verification should be automatically enforced when secureboot is
      enabled.  But that will happen when secureboot patches are merged.
      
      Second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG.  This option
      enables signature verification support on bzImage.  If this option is not
      set and previous one is set, kernel image loading will fail because kernel
      does not have support to verify signature of bzImage.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Matt Fleming <matt@console-pimps.org>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8e7d8381
    • Vivek Goyal's avatar
      kexec: support for kexec on panic using new system call · dd5f7260
      Vivek Goyal authored
      
      This patch adds support for loading a kexec on panic (kdump) kernel usning
      new system call.
      
      It prepares ELF headers for memory areas to be dumped and for saved cpu
      registers.  Also prepares the memory map for second kernel and limits its
      boot to reserved areas only.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dd5f7260
    • Vivek Goyal's avatar
      kexec-bzImage64: support for loading bzImage using 64bit entry · 27f48d3e
      Vivek Goyal authored
      
      This is loader specific code which can load bzImage and set it up for
      64bit entry.  This does not take care of 32bit entry or real mode entry.
      
      32bit mode entry can be implemented if somebody needs it.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      27f48d3e
    • Vivek Goyal's avatar
      kexec: load and relocate purgatory at kernel load time · 12db5562
      Vivek Goyal authored
      
      Load purgatory code in RAM and relocate it based on the location.
      Relocation code has been inspired by module relocation code and purgatory
      relocation code in kexec-tools.
      
      Also compute the checksums of loaded kexec segments and store them in
      purgatory.
      
      Arch independent code provides this functionality so that arch dependent
      bootloaders can make use of it.
      
      Helper functions are provided to get/set symbol values in purgatory which
      are used by bootloaders later to set things like stack and entry point of
      second kernel etc.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      12db5562
    • Vivek Goyal's avatar
      kexec: implementation of new syscall kexec_file_load · cb105258
      Vivek Goyal authored
      
      Previous patch provided the interface definition and this patch prvides
      implementation of new syscall.
      
      Previously segment list was prepared in user space.  Now user space just
      passes kernel fd, initrd fd and command line and kernel will create a
      segment list internally.
      
      This patch contains generic part of the code.  Actual segment preparation
      and loading is done by arch and image specific loader.  Which comes in
      next patch.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cb105258
    • Vivek Goyal's avatar
      kexec: new syscall kexec_file_load() declaration · f0895685
      Vivek Goyal authored
      
      This is the new syscall kexec_file_load() declaration/interface.  I have
      reserved the syscall number only for x86_64 so far.  Other architectures
      (including i386) can reserve syscall number when they enable the support
      for this new syscall.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f0895685
    • Vivek Goyal's avatar
      kexec: use common function for kimage_normal_alloc() and kimage_crash_alloc() · 255aedd9
      Vivek Goyal authored
      
      kimage_normal_alloc() and kimage_crash_alloc() are doing lot of similar
      things and differ only little.  So instead of having two separate
      functions create a common function kimage_alloc_init() and pass it the
      "flags" argument which tells whether it is normal kexec or kexec_on_panic.
       And this function should be able to deal with both the cases.
      
      This consolidation also helps later where we can use a common function
      kimage_file_alloc_init() to handle normal and crash cases for new file
      based kexec syscall.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      255aedd9
Loading