  1. Sep 17, 2024
    • objtool: Handle frame pointer related instructions · da5b2ad1
      Tiezhu Yang authored
      
      After commit a0f7085f ("LoongArch: Add RANDOMIZE_KSTACK_OFFSET
      support"), do_syscall() contains three new instructions for the
      secondary stack: "addi.d $fp, $sp, 32", "sub.d $sp, $sp, $t0" and
      "addi.d $sp, $fp, -32". As a result, objtool warns "return with
      modified stack frame", and handle_syscall(), the previous frame of
      do_syscall(), is missing from the call trace when executing the
      command "echo l > /proc/sysrq-trigger".
      
      objdump shows something like this:
      
      0000000000000000 <do_syscall>:
         0:   02ff8063        addi.d          $sp, $sp, -32
         4:   29c04076        st.d            $fp, $sp, 16
         8:   29c02077        st.d            $s0, $sp, 8
         c:   29c06061        st.d            $ra, $sp, 24
        10:   02c08076        addi.d          $fp, $sp, 32
        ...
        74:   0011b063        sub.d           $sp, $sp, $t0
        ...
        a8:   4c000181        jirl            $ra, $t0, 0
        ...
        dc:   02ff82c3        addi.d          $sp, $fp, -32
        e0:   28c06061        ld.d            $ra, $sp, 24
        e4:   28c04076        ld.d            $fp, $sp, 16
        e8:   28c02077        ld.d            $s0, $sp, 8
        ec:   02c08063        addi.d          $sp, $sp, 32
        f0:   4c000020        jirl            $zero, $ra, 0
      
      The instruction "sub.d $sp, $sp, $t0" changes the stack bottom, and the
      new stack size is a random value. To find the return address of
      do_syscall(), which is stored in the original stack frame, after
      executing "jirl $ra, $t0, 0", objtool should use fp, which points to
      the original stack top.
      
      The initial idea was to decode the secondary stack instruction
      "sub.d $sp, $sp, $t0" and mark it with a label, then check this label
      for the two frame pointer instructions to change the cfa base and cfa
      offset while the secondary stack is in use in update_cfi_state().
      This works for GCC but not for Clang, because ClangBuiltLinux on
      LoongArch emits different secondary stack instructions, something
      like this:
      
      0000000000000000 <do_syscall>:
        ...
        88:   00119064        sub.d           $a0, $sp, $a0
        8c:   00150083        or              $sp, $a0, $zero
        ...
      
      This is equivalent to the single instruction "sub.d $sp, $sp, $a0",
      but there is no proper condition to recognize it as a label the way
      there is for GCC, so the initial idea is not a good approach.
      
      Essentially, there are two special frame pointer instructions which are
      "addi.d $fp, $sp, imm" and "addi.d $sp, $fp, imm", the first one points
      fp to the original stack top and the second one restores the original
      stack bottom from fp.
      
      Based on the above analysis, in order to avoid adding an arch-specific
      update_cfi_state(), we just add a member "frame_pointer" to "struct
      symbol" as a label so that the current normal case is not affected,
      and set it to true only if there is "addi.d $sp, $fp, imm". Finally,
      check this label for the two frame pointer instructions to change the
      cfa base and cfa offset in update_cfi_state().
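
The effect of the two frame pointer instructions on CFA tracking can be sketched with a small stand-alone model. This is only an illustration under stated assumptions: the types `cfa_state` and `stack_op` and the function `cfa_step()` are hypothetical and are not objtool's real data structures or its update_cfi_state().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: the CFA is always "base register + offset". */
enum cfa_base { CFA_SP, CFA_FP };

struct cfa_state {
	enum cfa_base base;
	long offset;
};

enum op_kind {
	OP_ADD_SP_IMM,		/* addi.d $sp, $sp, imm */
	OP_SET_FP_FROM_SP,	/* addi.d $fp, $sp, imm */
	OP_SUB_SP_REG,		/* sub.d  $sp, $sp, reg (unknown amount) */
	OP_SET_SP_FROM_FP,	/* addi.d $sp, $fp, imm */
};

struct stack_op {
	enum op_kind kind;
	long imm;
};

/*
 * Apply one stack operation. Returns false when the CFA can no longer
 * be expressed, which models the situation that triggers a warning.
 * "frame_pointer" plays the role of the per-symbol flag from the text.
 */
static bool cfa_step(struct cfa_state *cfa, bool frame_pointer,
		     const struct stack_op *op)
{
	assert(cfa != NULL && op != NULL);

	switch (op->kind) {
	case OP_ADD_SP_IMM:
		/* sp moves by imm, so a sp-based CFA offset moves by -imm. */
		if (cfa->base == CFA_SP)
			cfa->offset -= op->imm;
		return true;
	case OP_SET_FP_FROM_SP:
		/* fp = sp + imm, so CFA = fp + (offset - imm). */
		if (frame_pointer && cfa->base == CFA_SP) {
			cfa->base = CFA_FP;
			cfa->offset -= op->imm;
		}
		return true;
	case OP_SUB_SP_REG:
		/* Harmless only while the CFA is expressed through fp. */
		return cfa->base == CFA_FP;
	case OP_SET_SP_FROM_FP:
		/* sp = fp + imm, so CFA = sp + (offset - imm). */
		if (cfa->base != CFA_FP)
			return false;
		cfa->base = CFA_SP;
		cfa->offset -= op->imm;
		return true;
	}
	return false;
}
```

Feeding the do_syscall() sequence from the objdump listing through this model keeps the CFA known across "sub.d $sp, $sp, $t0" when the flag is set, and loses it when the flag is clear.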
      
      Tested with the following two configs:
      (1) CONFIG_RANDOMIZE_KSTACK_OFFSET=y &&
          CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=n
      (2) CONFIG_RANDOMIZE_KSTACK_OFFSET=y &&
          CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=y
      
      By the way, this patch has no effect on x86; tested on an x86 machine
      running Fedora 40.
      
      Cc: stable@vger.kernel.org # 6.9+
      Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
  2. Aug 18, 2024
    • objtool/rust: list `noreturn` Rust functions · 56d680dd
      Miguel Ojeda authored
      Rust functions may be `noreturn` (i.e. diverging) by returning the
      "never" type, `!`, e.g.
      
          fn f() -> ! {
              loop {}
          }
      
      Thus list the known `noreturn` functions to avoid such warnings.
      
      Without this, `objtool` would complain if enabled for Rust, e.g.:
      
          rust/core.o: warning: objtool:
          _R...9panic_fmt() falls through to next function _R...18panic_nounwind_fmt()
      
          rust/alloc.o: warning: objtool:
          .text: unexpected end of section
      
      In order to do so, we cannot match symbols' names exactly, for two
      reasons:
      
        - Rust mangling scheme [1] contains disambiguators [2] which we
          cannot predict (e.g. they may vary depending on the compiler version).
      
          One possibility to solve this would be to parse v0 and ignore/zero
          those before comparison.
      
        - Some of the diverging functions come from `core`, i.e. the Rust
          standard library, which may change with each compiler version
          since they are implementation details (e.g. `panic_internals`).
      
      Thus, to work around both issues, only part of the symbols are matched,
      instead of using the `NORETURN` macro in `noreturns.h`.
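
Partial matching of this kind can be sketched as a suffix test on the mangled name, so the unpredictable disambiguators never take part in the comparison. This is a minimal sketch, not objtool's actual code: the helper names are hypothetical, and the two suffixes are only illustrative entries derived from the warnings quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool str_ends_with(const char *s, const char *suffix)
{
	size_t slen = strlen(s);
	size_t xlen = strlen(suffix);

	return slen >= xlen && strcmp(s + slen - xlen, suffix) == 0;
}

/* Symbols mangled with the v0 scheme start with "_R". */
static bool is_rust_symbol(const char *name)
{
	return strncmp(name, "_R", 2) == 0;
}

/* Match only the stable tail of the symbol, ignoring what precedes it. */
static bool is_rust_noreturn(const char *name)
{
	/* Illustrative subset; a real list would be longer. */
	static const char *const suffixes[] = {
		"_4core9panicking9panic_fmt",
		"_4core9panicking18panic_nounwind_fmt",
	};
	size_t i;

	assert(name != NULL);
	if (!is_rust_symbol(name))
		return false;

	for (i = 0; i < sizeof(suffixes) / sizeof(suffixes[0]); i++) {
		if (str_ends_with(name, suffixes[i]))
			return true;
	}
	return false;
}
```

Two builds that differ only in the disambiguator (for example the made-up names `_RNvNtCsaaaaaa_4core9panicking9panic_fmt` and `_RNvNtCsbbbbbb_4core9panicking9panic_fmt`) both match, while a plain C symbol named `panic_fmt` does not.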
      
      Ideally, just like for the C side, we should have a better solution. For
      instance, the compiler could give us the list via something like:
      
          $ rustc --emit=noreturns ...
      
      [ Kees agrees this should be automated and Peter says:
      
          So it would be fairly simple to make objtool consume a magic section
          emitted by the compiler.. I think we've asked the compiler folks
          for that at some point even, but I don't have clear recollections.
      
        We will ask upstream Rust about it. And if they agree, then perhaps
        we can get Clang/GCC to implement something similar too -- for this
        sort of thing we can take advantage of the shorter cycles of `rustc`
        as well as their unstable features concept to experiment.
      
        Gary proposed using DWARF (though it would need to be available), and
        wrote a proof of concept script using the `object` and `gimli` crates:
        https://gist.github.com/nbdd0121/449692570622c2f46a29ad9f47c3379a
      
          - Miguel ]
      
      Link: https://rust-lang.github.io/rfcs/2603-rust-symbol-name-mangling-v0.html [1]
      Link: https://doc.rust-lang.org/rustc/symbol-mangling/v0.html#disambiguator [2]
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Alice Ryhl <aliceryhl@google.com>
      Reviewed-by: Kees Cook <kees@kernel.org>
      Tested-by: Benno Lossin <benno.lossin@proton.me>
      Link: https://lore.kernel.org/r/20240725183325.122827-6-ojeda@kernel.org
      [ Added `len_mismatch_fail` symbol for new `kernel` crate code merged
        since then as well as 3 more `core::panicking` symbols that appear
        in `RUST_DEBUG_ASSERTIONS=y` builds.  - Miguel ]
      Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
  3. Jul 04, 2024
    • kmsan: allow disabling KMSAN checks for the current task · ec3e837d
      Ilya Leoshkevich authored
      Like for KASAN, it's useful to temporarily disable KMSAN checks around,
      e.g., redzone accesses.  Introduce kmsan_disable_current() and
      kmsan_enable_current(), which are similar to their KASAN counterparts.
      
      Make them reentrant in order to handle memory allocations in interrupt
      context.  Repurpose the allow_reporting field for this.
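
Reentrancy here amounts to counting nested disable requests instead of storing a boolean. The following user-space sketch shows the idea only; `struct task_ctx` and the `checks_*()` helpers are hypothetical names and do not reproduce the kernel's KMSAN context or its functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task context: a nesting depth instead of a flag. */
struct task_ctx {
	unsigned int depth;
};

/* Each disable increments the depth, so nested callers compose. */
static void checks_disable(struct task_ctx *ctx)
{
	ctx->depth++;
}

/* Each enable undoes exactly one earlier disable. */
static void checks_enable(struct task_ctx *ctx)
{
	assert(ctx->depth > 0);
	ctx->depth--;
}

/* Checks run only when no disable request is outstanding. */
static bool checks_active(const struct task_ctx *ctx)
{
	return ctx->depth == 0;
}
```

With a plain boolean, an inner enable (for instance from an allocation in interrupt context) would switch checking back on while the outer caller still expects it to be off; the counter avoids that.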
      
      Link: https://lkml.kernel.org/r/20240621113706.315500-12-iii@linux.ibm.com
      Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Reviewed-by: Alexander Potapenko <glider@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: <kasan-dev@googlegroups.com>
      Cc: Marco Elver <elver@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  4. Jul 03, 2024
  5. Jul 01, 2024
    • x86/alternatives, kvm: Fix a couple of CALLs without a frame pointer · 0d3db1f1
      Borislav Petkov (AMD) authored
      objtool complains:
      
        arch/x86/kvm/kvm.o: warning: objtool: .altinstr_replacement+0xc5: call without frame pointer save/setup
        vmlinux.o: warning: objtool: .altinstr_replacement+0x2eb: call without frame pointer save/setup
      
      Make sure %rSP is an output operand to the respective asm() statements.
      
      The test_cc() hunk and ALT_OUTPUT_SP() courtesy of peterz. Also from him
      add some helpful debugging info to the documentation.
      
      Now on to the explanations:
      
      tl;dr: The alternatives macros are pretty fragile.
      
      If I do ALT_OUTPUT_SP(output) in order to be able to package in a %rsp
      reference for objtool so that a stack frame gets properly generated, the
      inline asm input operand with positional argument 0 in clear_page():
      
      	"0" (page)
      
      gets "renumbered" due to the added
      
      	: "+r" (current_stack_pointer), "=D" (page)
      
      and then gcc says:
      
        ./arch/x86/include/asm/page_64.h:53:9: error: inconsistent operand constraints in an ‘asm’
      
      The fix is to use an explicit "D" constraint which points to a singleton
      register class (gcc terminology) which ends up doing what is expected
      here: the page pointer - input and output - should be in the same %rdi
      register.
      
      Other register classes have more than one register in them - example:
      "r" and "=r" or "A":
      
        ‘A’
      	The ‘a’ and ‘d’ registers.  This class is used for
      	instructions that return double word results in the ‘ax:dx’
      	register pair.  Single word values will be allocated either in
      	‘ax’ or ‘dx’.
      
      so using "D" and "=D" just works in this particular case.
      
      And yes, one would say, sure, why don't you do "+D" but then:
      
        : "+r" (current_stack_pointer), "+D" (page)
        : [old] "i" (clear_page_orig), [new1] "i" (clear_page_rep), [new2] "i" (clear_page_erms),
        : "cc", "memory", "rax", "rcx")
      
      now find the Waldo^Wcomma which throws a wrench into all this.
      
      Because that silly macro has an "input..." consume-all last macro arg
      and in it, one is supposed to supply input *and* clobbers, leading to
      silly syntax snafus.
      
      Yap, they need to be cleaned up, one fine day...
      
      Closes: https://lore.kernel.org/oe-kbuild-all/202406141648.jO9qNGLa-lkp@intel.com/
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Acked-by: Sean Christopherson <seanjc@google.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240625112056.GDZnqoGDXgYuWBDUwu@fat_crate.local
  6. Jun 28, 2024
  7. Jun 11, 2024
    • x86/alternatives: Add nested alternatives macros · d2a793da
      Peter Zijlstra authored
      
      Instead of making increasingly complicated ALTERNATIVE_n()
      implementations, use a nested alternative expression.
      
      The only difference between:
      
        ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)
      
      and
      
        ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),
                    newinst2, flag2)
      
      is that the outer alternative can add additional padding when the inner
      alternative is the shorter one, which then results in
      alt_instr::instrlen being inconsistent.
      
      However, this is easily remedied since the alt_instr entries will be
      consecutive and it is trivial to compute the max(alt_instr::instrlen) at
      runtime while patching.
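
The remedy described above can be sketched as a pass over consecutive entries that patch the same instruction address, giving each of them the largest length in the group. This is a simplified model under stated assumptions: `struct alt_entry` and `unify_nested_lengths()` are hypothetical and carry only the two fields needed for the illustration, not the kernel's alt_instr layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of one alternatives entry. */
struct alt_entry {
	uintptr_t instr_va;	/* address of the instruction being patched */
	uint8_t instrlen;	/* length recorded for the original site */
};

/*
 * Entries of a nested alternative are consecutive and share the same
 * address. Give every entry in such a run the maximum length of the run.
 */
static void unify_nested_lengths(struct alt_entry *e, size_t n)
{
	size_t i = 0;

	assert(e != NULL || n == 0);

	while (i < n) {
		size_t j = i;
		size_t k;
		uint8_t max = 0;

		/* Find the end of the run and its largest length. */
		while (j < n && e[j].instr_va == e[i].instr_va) {
			if (e[j].instrlen > max)
				max = e[j].instrlen;
			j++;
		}

		/* Write the maximum back to each entry of the run. */
		for (k = i; k < j; k++)
			e[k].instrlen = max;

		i = j;
	}
}
```

For entries with lengths 2 and 5 at the same address followed by an unrelated entry of length 3, the first two both end up with length 5 and the third is left alone.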
      
      Specifically, after this the ALTERNATIVE_2 macro, after CPP expansion
      (and manual layout), looks like this:
      
        .macro ALTERNATIVE_2 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2
        740:
        740: \oldinstr ;
        741: .skip -(((744f-743f)-(741b-740b)) > 0) * ((744f-743f)-(741b-740b)),0x90 ;
        742: .pushsection .altinstructions,"a" ;
        	altinstr_entry 740b,743f,\ft_flags1,742b-740b,744f-743f ;
        .popsection ;
        .pushsection .altinstr_replacement,"ax" ;
        743: \newinstr1 ;
        744: .popsection ; ;
        741: .skip -(((744f-743f)-(741b-740b)) > 0) * ((744f-743f)-(741b-740b)),0x90 ;
        742: .pushsection .altinstructions,"a" ;
        altinstr_entry 740b,743f,\ft_flags2,742b-740b,744f-743f ;
        .popsection ;
        .pushsection .altinstr_replacement,"ax" ;
        743: \newinstr2 ;
        744: .popsection ;
        .endm
      
      The only label that is ambiguous is 740, however they all reference the
      same spot, so that doesn't matter.
      
      NOTE: obviously only @oldinstr may be an alternative; making @newinstr
      an alternative would mean patching .altinstr_replacement which very
      likely isn't what is intended, also the labels will be confused in that
      case.
      
        [ bp: Debug an issue where it would match the wrong two insns and
          consider them nested due to the same signed offsets in the
          .alternative section and use instr_va() to compare the full virtual
          addresses instead.

          - Use new labels to denote that the new, nested
            alternatives are being used when staring at preprocessed output.
      
          - Use the %c constraint everywhere instead of %P and document the
            difference for future reference. ]
      
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20230628104952.GA2439977@hirez.programming.kicks-ass.net
  8. Mar 30, 2024
    • objtool: Fix compile failure when using the x32 compiler · 6205125b
      Mikulas Patocka authored and Ingo Molnar committed
      
      When compiling the v6.9-rc1 kernel with the x32 compiler, the following
      errors are reported. The reason is that we take an "unsigned long"
      variable and print it using the "PRIx64" format string.
      
      	In file included from check.c:16:
      	check.c: In function ‘add_dead_ends’:
      	/usr/src/git/linux-2.6/tools/objtool/include/objtool/warn.h:46:17: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 5 has type ‘long unsigned int’ [-Werror=format=]
      	   46 |                 "%s: warning: objtool: " format "\n",   \
      	      |                 ^~~~~~~~~~~~~~~~~~~~~~~~
      	check.c:613:33: note: in expansion of macro ‘WARN’
      	  613 |                                 WARN("can't find unreachable insn at %s+0x%" PRIx64,
      	      |                                 ^~~~
      	...
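
The mismatch arises because, per the error above, PRIx64 expands to "llx" on x32 while the argument is an unsigned long. Two generic ways to keep format and argument in agreement are sketched below; the function names are hypothetical, and this is not a claim about which variant the patch itself chose.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Option 1: pick the conversion that matches the argument type. */
static int format_offset_lx(char *buf, size_t len, const char *sec,
			    unsigned long off)
{
	assert(buf != NULL && sec != NULL);
	return snprintf(buf, len, "%s+0x%lx", sec, off);
}

/* Option 2: keep PRIx64 and widen the argument explicitly. */
static int format_offset_u64(char *buf, size_t len, const char *sec,
			     unsigned long off)
{
	assert(buf != NULL && sec != NULL);
	return snprintf(buf, len, "%s+0x%" PRIx64, sec, (uint64_t)off);
}

/* Both options produce the same text for the same input. */
static bool formats_agree(const char *sec, unsigned long off)
{
	char a[64], b[64];

	format_offset_lx(a, sizeof(a), sec, off);
	format_offset_u64(b, sizeof(b), sec, off);
	return strcmp(a, b) == 0;
}
```

Either form compiles cleanly with -Werror=format regardless of whether unsigned long is 32 or 64 bits wide.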
      
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: linux-kernel@vger.kernel.org
  9. Mar 11, 2024
  10. Mar 01, 2024
  11. Feb 29, 2024
    • fortify: Split reporting and avoid passing string pointer · 475ddf1f
      Kees Cook authored
      
      In preparation for KUnit testing and further improvements in fortify
      failure reporting, split out the report and encode the function and access
      failure (read or write overflow) into a single u8 argument. This mainly
      ends up saving a tiny bit of space in the data segment. For a defconfig
      with FORTIFY_SOURCE enabled:
      
      $ size gcc/vmlinux.before gcc/vmlinux.after
          text    data     bss      dec     hex filename
      26132309 9760658 2195460 38088427 2452eeb gcc/vmlinux.before
      26132386 9748382 2195460 38076228 244ff44 gcc/vmlinux.after
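
Packing the function and the access direction into one byte can be sketched as follows. The bit layout (bit 0 for read versus write, bits 7..1 for a function index) and the helper names are assumptions made for this illustration, not a statement of the kernel's actual encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical function indices; any value below 128 fits in 7 bits. */
enum fortify_func_id {
	FUNC_MEMCPY = 1,
	FUNC_STRCPY = 2,
};

/* Assumed layout: bits 7..1 hold the function, bit 0 is set for writes. */
static uint8_t reason_encode(unsigned int func, bool write)
{
	assert(func < 128);
	return (uint8_t)((func << 1) | (write ? 1u : 0u));
}

static unsigned int reason_func(uint8_t reason)
{
	return reason >> 1;
}

static bool reason_is_write(uint8_t reason)
{
	return (reason & 1u) != 0;
}
```

The reporting side can then map the decoded index back to a name from a single table instead of every call site passing its own string pointer.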
      
      Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
  12. Feb 22, 2024
  13. Jan 31, 2024
  14. Jan 10, 2024
  15. Dec 15, 2023
  16. Nov 17, 2023
  17. Oct 20, 2023
    • objtool: Fix return thunk patching in retpolines · 34de4fe7
      Josh Poimboeuf authored
      
      With CONFIG_RETHUNK enabled, the compiler replaces every RET with a tail
      call to a return thunk ('JMP __x86_return_thunk').  Objtool annotates
      all such return sites so they can be patched during boot by
      apply_returns().
      
      The implementation of __x86_return_thunk() is just a bare RET.  It's
      only meant to be used temporarily until apply_returns() patches all
      return sites with either a JMP to another return thunk or an actual RET.
      
      Removing the .text..__x86.return_thunk section would break objtool's
      detection of return sites in retpolines.  Since retpolines and return
      thunks would land in the same section, the compiler no longer uses
      relocations for the intra-section jumps between the retpolines and the
      return thunk, causing objtool to overlook them.
      
      As a result, none of the retpolines' return sites would get patched.
      Each one stays at 'JMP __x86_return_thunk', effectively a bare RET.
      
      Fix it by teaching objtool to detect when a non-relocated jump target is
      a return thunk (or retpoline).
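
Detecting such a jump without a relocation means computing the destination from the instruction itself and looking up what lives there. A rough sketch, assuming a flat symbol table within one section; `struct sym`, `jump_dest()` and `jump_targets_return_thunk()` are hypothetical names, not objtool's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical symbol record within a single section. */
struct sym {
	const char *name;
	unsigned long start;	/* section offset of the symbol */
	bool return_thunk;	/* set for return thunk symbols */
};

/* A relative jump lands at: end of the instruction + displacement. */
static unsigned long jump_dest(unsigned long insn_off, unsigned int insn_len,
			       long rel)
{
	return insn_off + insn_len + (unsigned long)rel;
}

/* True if the computed destination is the start of a return thunk. */
static bool jump_targets_return_thunk(const struct sym *syms, size_t n,
				      unsigned long insn_off,
				      unsigned int insn_len, long rel)
{
	unsigned long dest = jump_dest(insn_off, insn_len, rel);
	size_t i;

	assert(syms != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (syms[i].start == dest)
			return syms[i].return_thunk;
	}
	return false;
}
```

A site identified this way can be recorded as a return site just like one found through a relocation.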
      
        [ bp: Massage the commit message now that the offending commit
          removing the .text..__x86.return_thunk section has been zapped.
          Still keep the objtool change here as it makes objtool more robust
          wrt handling such intra-TU jumps without relocations, should some
          toolchain and/or config generate them in the future. ]
      
      Reported-by: David Kaplan <david.kaplan@amd.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231012024737.eg5phclogp67ik6x@treble
  18. Oct 19, 2023
  19. Oct 06, 2023
  20. Oct 03, 2023
  21. Sep 18, 2023
  22. Sep 12, 2023
    • objtool: Fix _THIS_IP_ detection for cold functions · 72178d5d
      Josh Poimboeuf authored and Ingo Molnar committed
      
      Cold functions and their non-cold counterparts can use _THIS_IP_ to
      reference each other.  Don't warn about !ENDBR in that case.
      
      Note that for GCC this is currently irrelevant in light of the following
      commit
      
        c27cd083 ("Compiler attributes: GCC cold function alignment workarounds")
      
      which disabled cold functions in the kernel.  However this may still be
      possible with Clang.
      
      Fixes several warnings like the following:
      
        drivers/scsi/bnx2i/bnx2i.prelink.o: warning: objtool: bnx2i_hw_ep_disconnect+0x19d: relocation to !ENDBR: bnx2i_hw_ep_disconnect.cold+0x0
        drivers/net/ipvlan/ipvlan.prelink.o: warning: objtool: ipvlan_addr4_event.cold+0x28: relocation to !ENDBR: ipvlan_addr4_event+0xda
        drivers/net/ipvlan/ipvlan.prelink.o: warning: objtool: ipvlan_addr6_event.cold+0x26: relocation to !ENDBR: ipvlan_addr6_event+0xb7
        drivers/net/ethernet/broadcom/tg3.prelink.o: warning: objtool: tg3_set_ringparam.cold+0x17: relocation to !ENDBR: tg3_set_ringparam+0x115
        drivers/net/ethernet/broadcom/tg3.prelink.o: warning: objtool: tg3_self_test.cold+0x17: relocation to !ENDBR: tg3_self_test+0x2e1
        drivers/target/iscsi/cxgbit/cxgbit.prelink.o: warning: objtool: __cxgbit_free_conn.cold+0x24: relocation to !ENDBR: __cxgbit_free_conn+0xfb
        net/can/can.prelink.o: warning: objtool: can_rx_unregister.cold+0x2c: relocation to !ENDBR: can_rx_unregister+0x11b
        drivers/net/ethernet/qlogic/qed/qed.prelink.o: warning: objtool: qed_spq_post+0xc0: relocation to !ENDBR: qed_spq_post.cold+0x9a
        drivers/net/ethernet/qlogic/qed/qed.prelink.o: warning: objtool: qed_iwarp_ll2_comp_syn_pkt.cold+0x12f: relocation to !ENDBR: qed_iwarp_ll2_comp_syn_pkt+0x34b
        net/tipc/tipc.prelink.o: warning: objtool: tipc_nametbl_publish.cold+0x21: relocation to !ENDBR: tipc_nametbl_publish+0xa6
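
In every warning above the two symbols differ only by a ".cold" suffix. The relationship can be sketched as a name comparison that ignores that suffix; this is only an illustration with hypothetical helper names, and it makes no claim about how objtool actually links a cold part to its parent function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Length of the name with a trailing ".cold" removed, if present. */
static size_t parent_name_len(const char *name)
{
	static const char suffix[] = ".cold";
	size_t len = strlen(name);
	size_t slen = sizeof(suffix) - 1;

	if (len > slen && strcmp(name + len - slen, suffix) == 0)
		return len - slen;
	return len;
}

/* True if both names refer to the same function or its cold part. */
static bool same_parent_function(const char *a, const char *b)
{
	size_t la, lb;

	assert(a != NULL && b != NULL);

	la = parent_name_len(a);
	lb = parent_name_len(b);
	return la == lb && strncmp(a, b, la) == 0;
}
```

A reference from `tg3_self_test.cold` into `tg3_self_test` passes this test and would be exempt from the !ENDBR warning, whereas a reference into an unrelated function would not.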
      
      Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/d8f1ab6a23a6105bc023c132b105f245c7976be6.1694476559.git.jpoimboe@kernel.org
  23. Aug 16, 2023
    • objtool/x86: Fixup frame-pointer vs rethunk · dbf46008
      Peter Zijlstra authored
      
      For stack-validation of a frame-pointer build, objtool validates that
      every CALL instruction is preceded by a frame-setup. The new SRSO
      return thunks violate this with their RSB stuffing trickery.
      
      Extend the __fentry__ exception to also cover the embedded_insn case
      used for this. This cures:
      
        vmlinux.o: warning: objtool: srso_untrain_ret+0xd: call without frame pointer save/setup
      
      Fixes: 4ae68b26 ("objtool/x86: Fix SRSO mess")
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Link: https://lore.kernel.org/r/20230816115921.GH980931@hirez.programming.kicks-ass.net
    • x86/cpu: Rename original retbleed methods · d025b7ba
      Peter Zijlstra authored
      
      Rename the original retbleed return thunk and untrain_ret to
      retbleed_return_thunk() and retbleed_untrain_ret().
      
      No functional changes.
      
      Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Link: https://lore.kernel.org/r/20230814121148.909378169@infradead.org
    • x86/cpu: Clean up SRSO return thunk mess · d43490d0
      Peter Zijlstra authored
      
      Use the existing configurable return thunk. There is absolutely no
      justification for having created this __x86_return_thunk alternative.
      
      To clarify, the whole thing looks like:
      
      Zen3/4 does:
      
        srso_alias_untrain_ret:
      	  nop2
      	  lfence
      	  jmp srso_alias_return_thunk
      	  int3
      
        srso_alias_safe_ret: // aliases srso_alias_untrain_ret just so
      	  add $8, %rsp
      	  ret
      	  int3
      
        srso_alias_return_thunk:
      	  call srso_alias_safe_ret
      	  ud2
      
      While Zen1/2 does:
      
        srso_untrain_ret:
      	  movabs $foo, %rax
      	  lfence
      	  call srso_safe_ret           (jmp srso_return_thunk ?)
      	  int3
      
        srso_safe_ret: // embedded in movabs instruction
      	  add $8,%rsp
      	  ret
      	  int3
      
        srso_return_thunk:
      	  call srso_safe_ret
      	  ud2
      
      While retbleed does:
      
        zen_untrain_ret:
      	  test $0xcc, %bl
      	  lfence
      	  jmp zen_return_thunk
      	  int3

        zen_return_thunk: // embedded in the test instruction
      	  ret
      	  int3
      
      Where Zen1/2 flush the BTB entry using the instruction decoder trick
      (test,movabs) Zen3/4 use BTB aliasing. SRSO adds a return sequence
      (srso_safe_ret()) which forces the function return instruction to
      speculate into a trap (UD2).  This RET will then mispredict and
      execution will continue at the return site read from the top of the
      stack.
      
      Pick one of three options at boot (every function can only ever return
      once).
      
        [ bp: Fixup commit message uarch details and add them in a comment in
          the code too. Add a comment about the srso_select_mitigation()
          dependency on retbleed_select_mitigation(). Add moar ifdeffery for
          32-bit builds. Add a dummy srso_untrain_ret_alias() definition for
          32-bit alternatives needing the symbol. ]
      
      Fixes: fb3bd914 ("x86/srso: Add a Speculative RAS Overflow mitigation")
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Link: https://lore.kernel.org/r/20230814121148.842775684@infradead.org
    • objtool/x86: Fix SRSO mess · 4ae68b26
      Peter Zijlstra authored
      
      Objtool --rethunk does two things:
      
       - it collects all (tail) calls of __x86_return_thunk and places them
         into .return_sites. These are typically compiler generated, but
         RET also emits the same.
      
       - it fudges the validation of the __x86_return_thunk symbol; because
         this symbol is inside another instruction, it can't actually find
         the instruction pointed to by the symbol offset and gets upset.
      
      Because these two things pertained to the same symbol, there was no
      pressing need to separate these two separate things.
      
      However, alas, along comes SRSO and more crazy things to deal with
      appeared.
      
      The SRSO patch itself added the following symbol names to identify as
      rethunk:
      
        'srso_untrain_ret', 'srso_safe_ret' and '__ret'
      
      Where '__ret' is the old retbleed return thunk, 'srso_safe_ret' is a
      new similarly embedded return thunk, and 'srso_untrain_ret' is
      completely unrelated to anything the above does (and was only included
      because of that INT3 vs UD2 issue fixed previously).
      
      Clear things up by adding a second category for the embedded instruction
      thing.
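
The split into two categories can be sketched as two separate predicates over symbol names. The assignment of names below follows the description in this message ('__ret' and 'srso_safe_ret' are embedded, 'srso_untrain_ret' belongs to neither); the function names themselves are hypothetical and the sketch is not the patch's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Category 1: the symbol whose (tail) calls go into .return_sites. */
static bool is_rethunk(const char *name)
{
	assert(name != NULL);
	return strcmp(name, "__x86_return_thunk") == 0;
}

/*
 * Category 2: symbols that sit inside another instruction, so the
 * instruction at the symbol offset cannot be found by normal decoding.
 */
static bool is_embedded_insn(const char *name)
{
	assert(name != NULL);
	return strcmp(name, "__ret") == 0 ||
	       strcmp(name, "srso_safe_ret") == 0;
}
```

Keeping the predicates separate lets return-site collection and the validation fudge be applied independently, which a single "is rethunk" test could not express.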
      
      Fixes: fb3bd914 ("x86/srso: Add a Speculative RAS Overflow mitigation")
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Link: https://lore.kernel.org/r/20230814121148.704502245@infradead.org
  24. Aug 14, 2023
  25. Jul 27, 2023
    • x86/srso: Add a Speculative RAS Overflow mitigation · fb3bd914
      Borislav Petkov (AMD) authored
      
      Add a mitigation for the speculative return address stack overflow
      vulnerability found on AMD processors.
      
      The mitigation works by ensuring all RET instructions speculate to
      a controlled location, similar to how speculation is controlled in the
      retpoline sequence.  To accomplish this, the __x86_return_thunk forces
      the CPU to mispredict every function return using a 'safe return'
      sequence.
      
      To ensure the safety of this mitigation, the kernel must ensure that the
      safe return sequence is itself free from attacker interference.  In Zen3
      and Zen4, this is accomplished by creating a BTB alias between the
      untraining function srso_untrain_ret_alias() and the safe return
      function srso_safe_ret_alias() which results in evicting a potentially
      poisoned BTB entry and using that safe one for all function returns.
      
      In older Zen1 and Zen2, this is accomplished using a reinterpretation
      technique similar to the Retbleed one: srso_untrain_ret() and
      srso_safe_ret().
      
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
  26. Jul 10, 2023
  27. Jun 30, 2023
    • objtool: Remove btrfs_assertfail() from the noreturn exceptions list · 06697ca6
      Ingo Molnar authored
      
      The objtool merge in commit 6f612579 ("Merge tag 'objtool-core ...")
      generated a semantic conflict that was not resolved.
      
      The btrfs_assertfail() entry was removed from the noreturn list in
      commit b831306b ("btrfs: print assertion failure report and stack
      trace from the same line") because btrfs_assertfail() was changed from a
      noreturn function into a macro.
      
      The noreturn list was then moved from check.c to noreturns.h in commit
      6245ce4a ("objtool: Move noreturn function list to separate file"),
      and the entry should be removed from there post-merge as well.
      
      Do it explicitly.
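
A list kept as macro invocations, as with the NORETURN entries in noreturns.h, is typically turned into a string table and searched by exact name. The single-file sketch below shows that pattern under stated assumptions: `NORETURN_LIST`, `AS_STRING` and `is_global_noreturn()` are hypothetical, and the two entries are just a tiny illustrative subset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset of a noreturn list; removing a line drops the entry. */
#define NORETURN_LIST(X)	\
	X(panic)		\
	X(do_exit)

/* Expand each entry to a string literal followed by a comma. */
#define AS_STRING(func) #func,

static const char *const global_noreturns[] = {
	NORETURN_LIST(AS_STRING)
};

/* Exact-name lookup in the generated table. */
static bool is_global_noreturn(const char *name)
{
	size_t n = sizeof(global_noreturns) / sizeof(global_noreturns[0]);
	size_t i;

	assert(name != NULL);

	for (i = 0; i < n; i++) {
		if (strcmp(name, global_noreturns[i]) == 0)
			return true;
	}
	return false;
}
```

Because the table is generated from the list, an entry that survives a merge by accident keeps its function treated as noreturn until the line is removed explicitly.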
      
      Cc: David Sterba <dsterba@suse.com>
      Cc: Josh Poimboeuf <jpoimboe@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  28. Jun 19, 2023
    • btrfs: print assertion failure report and stack trace from the same line · b831306b
      David Sterba authored
      
      Assertion reports are split into two parts, the exact file and location
      of the condition and then the stack trace printed from
      btrfs_assertfail(). This means all the stack traces report the same line
      and this is what's typically reported by various tools, making it harder
      to distinguish the reports.
      
        [403.2467] assertion failed: refcount_read(&block_group->refs) == 1, in fs/btrfs/block-group.c:4259
        [403.2479] ------------[ cut here ]------------
        [403.2484] kernel BUG at fs/btrfs/messages.c:259!
        [403.2488] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
        [403.2493] CPU: 2 PID: 23202 Comm: umount Not tainted 6.2.0-rc4-default+ #67
        [403.2499] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552-rebuilt.opensuse.org 04/01/2014
        [403.2509] RIP: 0010:btrfs_assertfail+0x19/0x1b [btrfs]
        ...
        [403.2595] Call Trace:
        [403.2598]  <TASK>
        [403.2601]  btrfs_free_block_groups.cold+0x52/0xae [btrfs]
        [403.2608]  close_ctree+0x6c2/0x761 [btrfs]
        [403.2613]  ? __wait_for_common+0x2b8/0x360
        [403.2618]  ? btrfs_cleanup_one_transaction.cold+0x7a/0x7a [btrfs]
        [403.2626]  ? mark_held_locks+0x6b/0x90
        [403.2630]  ? lockdep_hardirqs_on_prepare+0x13d/0x200
        [403.2636]  ? __call_rcu_common.constprop.0+0x1ea/0x3d0
        [403.2642]  ? trace_hardirqs_on+0x2d/0x110
        [403.2646]  ? __call_rcu_common.constprop.0+0x1ea/0x3d0
        [403.2652]  generic_shutdown_super+0xb0/0x1c0
        [403.2657]  kill_anon_super+0x1e/0x40
        [403.2662]  btrfs_kill_super+0x25/0x30 [btrfs]
        [403.2668]  deactivate_locked_super+0x4c/0xc0
      
      By making btrfs_assertfail a macro we'll get the same line number for
      the BUG output:
      
        [63.5736] assertion failed: 0, in fs/btrfs/super.c:1572
        [63.5758] ------------[ cut here ]------------
        [63.5782] kernel BUG at fs/btrfs/super.c:1572!
        [63.5807] invalid opcode: 0000 [#2] PREEMPT SMP KASAN
        [63.5831] CPU: 0 PID: 859 Comm: mount Tainted: G      D            6.3.0-rc7-default+ #2062
        [63.5868] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
        [63.5905] RIP: 0010:btrfs_mount+0x24/0x30 [btrfs]
        [63.5964] RSP: 0018:ffff88800e69fcd8 EFLAGS: 00010246
        [63.5982] RAX: 000000000000002d RBX: ffff888008fc1400 RCX: 0000000000000000
        [63.6004] RDX: 0000000000000000 RSI: ffffffffb90fd868 RDI: ffffffffbcc3ff20
        [63.6026] RBP: ffffffffc081b200 R08: 0000000000000001 R09: ffff88800e69fa27
        [63.6046] R10: ffffed1001cd3f44 R11: 0000000000000001 R12: ffff888005a3c370
        [63.6062] R13: ffffffffc058e830 R14: 0000000000000000 R15: 00000000ffffffff
        [63.6081] FS:  00007f7b3561f800(0000) GS:ffff88806c600000(0000) knlGS:0000000000000000
        [63.6105] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [63.6120] CR2: 00007fff83726e10 CR3: 0000000002a9e000 CR4: 00000000000006b0
        [63.6137] Call Trace:
        [63.6143]  <TASK>
        [63.6148]  legacy_get_tree+0x80/0xd0
        [63.6158]  vfs_get_tree+0x43/0x120
        [63.6166]  do_new_mount+0x1f3/0x3d0
        [63.6176]  ? do_add_mount+0x140/0x140
        [63.6187]  ? cap_capable+0xa4/0xe0
        [63.6197]  path_mount+0x223/0xc10
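
      The effect can be reproduced in a userspace sketch. This is not the
      btrfs code; struct fail_site, assertfail_func() and ASSERTFAIL_MACRO()
      are hypothetical names. __LINE__ is expanded where it textually
      appears, so a helper function always reports its own line, while a
      macro reports the line of whoever invoked it:

```c
/* Location recorded at the point where __FILE__/__LINE__ get expanded. */
struct fail_site {
	const char *file;
	int line;
};

/* Function variant: __LINE__ expands here, inside the helper itself. */
static struct fail_site assertfail_func(void)
{
	struct fail_site s = { __FILE__, __LINE__ };

	return s;
}

/* Macro variant: __LINE__ expands at the line of the invocation. */
#define ASSERTFAIL_MACRO() ((struct fail_site){ __FILE__, __LINE__ })

/* Returns 1 if the macro reported the line it was invoked on. */
static int macro_reports_call_site(void)
{
	int expected = __LINE__ + 1;
	struct fail_site s = ASSERTFAIL_MACRO();

	return s.line == expected;
}

/* Returns 1 if the function reported the caller's line (it does not). */
static int func_reports_call_site(void)
{
	int expected = __LINE__ + 1;
	struct fail_site s = assertfail_func();

	return s.line == expected;
}
```

      Here macro_reports_call_site() returns 1 and func_reports_call_site()
      returns 0, which mirrors the difference between the two BUG lines
      above.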
      
      This comes at the cost of bloating the final btrfs.ko module due to all
      the inlining, as long as assertions are compiled in. Assertions are a
      must for debugging builds but are often enabled on release builds too.
      
      Release build:
      
         text    data     bss     dec     hex filename
      1251676   20317   16088 1288081  13a791 pre/btrfs.ko
      1260612   29473   16088 1306173  13ee3d post/btrfs.ko
      
      DELTA (text): +8936
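
      The reported delta matches the text column of the size output. The
      arithmetic for all columns can be checked with a small sketch; struct
      size_row and its helpers are hypothetical names, the numbers are the
      ones from the table above:

```c
/* One row of `size` output: section sizes in bytes. */
struct size_row {
	long text;
	long data;
	long bss;
};

/* Numbers copied from the release build table. */
static const struct size_row size_pre  = { 1251676, 20317, 16088 };
static const struct size_row size_post = { 1260612, 29473, 16088 };

/* Sum of all columns, i.e. the "dec" column of `size`. */
static long size_total(struct size_row r)
{
	return r.text + r.data + r.bss;
}

/* Growth of the text column between two builds. */
static long text_delta(struct size_row pre, struct size_row post)
{
	return post.text - pre.text;
}

/* Growth of the data column between two builds. */
static long data_delta(struct size_row pre, struct size_row post)
{
	return post.data - pre.data;
}
```

      text_delta(size_pre, size_post) gives 8936. The data column grows by
      a further 9156 bytes, so the total module size grows by 18092 bytes.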
      
      CC: Josh Poimboeuf <jpoimboe@kernel.org>
      Signed-off-by: David Sterba <dsterba@suse.com>
      b831306b
  29. Jun 07, 2023
    • objtool: Skip reading DWARF section data · b4c96ef0
      Josh Poimboeuf authored
      Objtool doesn't use DWARF at all, and the DWARF sections' data take up a
      lot of memory.  Skip reading them.
      
      Note this only skips the DWARF base sections, not the rela sections.
      The relas are needed because their symbol references may need to be
      reindexed if any local symbols get added by elf_create_symbol().
      
      Also note the DWARF data will eventually be read by libelf anyway, when
      writing the object file.  But that's fine, the goal here is to reduce
      *peak* memory usage, and the previous patch (which freed insn memory)
      gave some breathing room.  So the allocation gets shifted to a later
      time, resulting in lower peak memory usage.
      
      With allyesconfig + CONFIG_DEBUG_INFO:
      
      - Before: peak heap memory consumption: 29.93G
      - After:  peak heap memory consumption: 25.47G
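
      A minimal sketch of the base-versus-rela distinction described above,
      assuming DWARF sections are recognized by a ".debug_" name prefix.
      That prefix test and the name is_dwarf_base_section() are assumptions
      for illustration; the message does not say how sections are
      identified:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Assumption: a DWARF base section is one whose name starts with
 * ".debug_". The matching relocation section is named ".rela.debug_*",
 * does not share that prefix, and is therefore still read.
 */
static bool is_dwarf_base_section(const char *name)
{
	return strncmp(name, ".debug_", 7) == 0;
}
```

      Under that assumption ".debug_info" would be skipped, while
      ".rela.debug_info" and ".text" would still be read.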
      
      Link: https://lore.kernel.org/r/52a9698835861dd35f2ec35c49f96d0bb39fb177.1685464332.git.jpoimboe@kernel.org
      
      
      Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
      b4c96ef0