  1. Oct 04, 2024
  2. Oct 02, 2024
    • move asm/unaligned.h to linux/unaligned.h · 5f60d5f6
      Al Viro authored
      asm/unaligned.h is always an include of asm-generic/unaligned.h;
      might as well move that thing to linux/unaligned.h and include
      that - there's nothing arch-specific in that header.
      
      auto-generated by the following:
      
      for i in `git grep -l -w asm/unaligned.h`; do
      	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
      done
      for i in `git grep -l -w asm-generic/unaligned.h`; do
      	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
      done
      git mv include/asm-generic/unaligned.h include/linux/unaligned.h
      git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
      sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
      sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
      5f60d5f6
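      The claim that nothing in the header is arch-specific can be illustrated
      in userspace. The following is a minimal sketch, not the kernel header:
      demo_get_unaligned_le32() and demo_put_unaligned_le32() are hypothetical
      names modelled on the kernel's get_unaligned_le32() and
      put_unaligned_le32() helpers, and byte-wise access is only one possible
      way to implement them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-ins for unaligned little-endian accessors.
 * Byte-wise access works at any address on any architecture, so no
 * arch-specific code is needed.
 */
static inline uint32_t demo_get_unaligned_le32(const void *p)
{
	const uint8_t *b = p;

	assert(p != NULL);
	return (uint32_t)b[0] |
	       ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) |
	       ((uint32_t)b[3] << 24);
}

static inline void demo_put_unaligned_le32(uint32_t v, void *p)
{
	uint8_t *b = p;

	assert(p != NULL);
	b[0] = (uint8_t)(v & 0xff);
	b[1] = (uint8_t)((v >> 8) & 0xff);
	b[2] = (uint8_t)((v >> 16) & 0xff);
	b[3] = (uint8_t)((v >> 24) & 0xff);
}
```

      Writing a value at an odd offset such as buf + 1 and reading it back
      yields the same value regardless of host alignment rules or byte order.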
  3. Sep 28, 2024
  4. Sep 25, 2024
    • tomoyo: fallback to realpath if symlink's pathname does not exist · ada1986d
      Tetsuo Handa authored
      
      Alfred Agrell found that TOMOYO cannot handle execveat(AT_EMPTY_PATH)
      inside a chroot environment where /dev and /proc are not mounted, because
      commit 51f39a1f ("syscalls: implement execveat() system call") missed
      that TOMOYO tries to canonicalize argv[0] when the filename fed to the
      executed program as argv[0] is supplied using a potentially nonexistent
      pathname.
      
      Since "/dev/fd/<fd>" has already lost the symlink information used for
      obtaining that <fd>, it is too late to reconstruct the symlink's pathname.
      Although the <filename> part of "/dev/fd/<fd>/<filename>" might not be
      canonicalized, TOMOYO cannot use tomoyo_realpath_nofollow() when /dev or
      /proc is not mounted. Therefore, fall back to tomoyo_realpath_from_path()
      when tomoyo_realpath_nofollow() fails.
      
      Reported-by: Alfred Agrell <blubban@gmail.com>
      Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1082001
      Fixes: 51f39a1f ("syscalls: implement execveat() system call")
      Cc: stable@vger.kernel.org # v3.19+
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      ada1986d
  5. Sep 24, 2024
    • tomoyo: allow building as a loadable LSM module · 8b985bbf
      Tetsuo Handa authored
      One concern about enabling TOMOYO in prebuilt kernels is that distributors
      want to avoid bloating kernel packages. Although boot-time kernel command
      line options allow selecting which built-in LSMs to enable, the increase in
      vmlinux file size and memory footprint caused by built-in-but-not-enabled
      LSMs remains. If it becomes possible to append LSMs dynamically after boot
      using loadable kernel modules, these problems will go away.
      
      Another concern about enabling TOMOYO in prebuilt kernels is who can
      provide support when the distributor cannot. Due to the "those who
      compiled kernel code are expected to provide support for that kernel code"
      spell, TOMOYO is failing to get enabled in the Fedora distribution [1]. The
      point of a loadable kernel module is to share the workload. If it becomes
      possible to append LSMs dynamically after boot using loadable kernel
      modules, just as people can use device drivers that are not supported by
      distributors but are provided by third-party device vendors, we can break
      this spell and lower the barrier to using TOMOYO.
      
      This patch is intended to demonstrate that there is nothing difficult
      about supporting TOMOYO-like loadable LSM modules. For now we need to live
      with a mixture of a built-in part and a loadable part, because fully
      loadable LSM modules have not been supported since Linux 2.6.24 [2] and the
      number of LSMs which can reserve static call slots is determined at
      compile time in Linux 6.12.
      
      Major changes in this patch are described below.
      There are no behavior changes as long as TOMOYO is built into vmlinux.
      
      Add CONFIG_SECURITY_TOMOYO_LKM as "bool" instead of changing
      CONFIG_SECURITY_TOMOYO from "bool" to "tristate", because something went
      wrong with how the Makefile is evaluated when "tristate" is chosen.
      
      Add proxy.c to serve as a bridge between vmlinux and tomoyo.ko.
      Move callback functions from init.c to proxy.c when building as a loadable
      LSM module. init.c is the built-in part and remains for reserving static
      call slots. proxy.c contains the module's init function and tells init.c
      the location of the callback functions, making it possible to use static
      calls for tomoyo.ko.
      
      Because initialization of "struct tomoyo_task" is deferred until tomoyo.ko
      is loaded, threads created between the time init.c reserves LSM hooks and
      the time proxy.c updates them will have NULL "struct tomoyo_task"
      instances. Assuming that tomoyo.ko is loaded by the time the global init
      process starts, initialize the "struct tomoyo_task" instance for the
      current thread as a kernel thread when tomoyo_task(current) is called for
      the first time.
      
      There is a hack for exporting currently not-exported functions.
      This hack will be removed after all relevant functions are exported.
      
      Link: https://bugzilla.redhat.com/show_bug.cgi?id=542986 [1]
      Link: https://lkml.kernel.org/r/caafb609-8bef-4840-a080-81537356fc60@I-love.SAKURA.ne.jp [2]
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      8b985bbf
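      The split between a built-in part that reserves a hook and a loadable
      part that later supplies the callback can be sketched in userspace. This
      is only an illustration of the idea under simplifying assumptions: every
      demo_* name is hypothetical, a plain function pointer stands in for a
      static call slot, and no locking or module unloading is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*demo_hook_fn)(const char *pathname);

/* Built-in part: a placeholder that permits everything until a module loads. */
static int demo_noop_hook(const char *pathname)
{
	(void)pathname;
	return 0;
}

/* The reserved slot, owned by the built-in part. */
static demo_hook_fn demo_file_open_hook = demo_noop_hook;

/* Entry point used by the rest of the "kernel". */
static int demo_security_file_open(const char *pathname)
{
	assert(pathname != NULL);
	return demo_file_open_hook(pathname);
}

/* Loadable part: its init function tells the built-in part where the callback is. */
static int demo_register_hook(demo_hook_fn fn)
{
	if (fn == NULL || demo_file_open_hook != demo_noop_hook)
		return -1;	/* invalid callback, or slot already taken */
	demo_file_open_hook = fn;
	return 0;
}

/* Example policy supplied by the loadable part: deny anything under /etc/. */
static int demo_deny_etc(const char *pathname)
{
	return strncmp(pathname, "/etc/", 5) == 0 ? -1 : 0;
}
```

      Before registration every request is allowed; after
      demo_register_hook(demo_deny_etc) succeeds, the same entry point starts
      enforcing the module's policy.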
  6. Sep 23, 2024
  7. Sep 19, 2024
  8. Sep 16, 2024
  9. Sep 13, 2024
  10. Sep 11, 2024
  11. Sep 09, 2024
    • security: Update file_set_fowner documentation · 19c9d55d
      Mickaël Salaün authored
      
      Highlight that the file_set_fowner hook is now called with a lock held.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Casey Schaufler <casey@schaufler-ca.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: James Morris <jmorris@namei.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: Ondrej Mosnacek <omosnace@redhat.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Serge E. Hallyn <serge@hallyn.com>
      Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
      Signed-off-by: Mickaël Salaün <mic@digikod.net>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      19c9d55d
  12. Sep 03, 2024
    • selinux: fix style problems in security/selinux/include/audit.h · d19a9e25
      Paul Moore authored
      
      Remove the needless indent in the function comment header blocks.
      
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      d19a9e25
    • smackfs: Use rcu_assign_pointer() to ensure safe assignment in smk_set_cipso · 2749749a
      Jiawei Ye authored
      
      In the `smk_set_cipso` function, the `skp->smk_netlabel.attr.mls.cat`
      field is directly assigned to a new value without using the appropriate
      RCU pointer assignment functions. According to RCU usage rules, this is
      illegal and can lead to unpredictable behavior, including data
      inconsistencies and impossible-to-diagnose memory corruption issues.
      
      This possible bug was identified using a static analysis tool developed
      by myself, specifically designed to detect RCU-related issues.
      
      To address this, the assignment is now done using rcu_assign_pointer(),
      which ensures that the pointer assignment is done safely, with the
      necessary memory barriers and synchronization. This change prevents
      potential RCU dereference issues by ensuring that the `cat` field is
      safely updated while still adhering to RCU's requirements.
      
      Fixes: 0817534f ("smackfs: Fix use-after-free in netlbl_catmap_walk()")
      Signed-off-by: Jiawei Ye <jiawei.ye@foxmail.com>
      Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
      2749749a
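      The publish-with-barrier idea behind rcu_assign_pointer() can be
      approximated in userspace with C11 atomics. This is a sketch under stated
      assumptions, not the smackfs code: the demo_* types and functions are
      hypothetical, a release store stands in for rcu_assign_pointer(), an
      acquire load stands in for rcu_dereference(), and grace-period handling
      (waiting for readers before freeing the old object) is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_catmap {
	int bits;
};

struct demo_label {
	_Atomic(struct demo_catmap *) cat;
};

/*
 * Writer side: publish a fully initialised map with release ordering so a
 * reader that observes the new pointer also observes the map's contents.
 * Returns the previous map; the caller must not free it while readers may
 * still be using it.
 */
static struct demo_catmap *demo_publish(struct demo_label *label,
					struct demo_catmap *new_map)
{
	struct demo_catmap *old;

	assert(label != NULL);
	old = atomic_load_explicit(&label->cat, memory_order_relaxed);
	atomic_store_explicit(&label->cat, new_map, memory_order_release);
	return old;
}

/* Reader side: the acquire load pairs with the writer's release store. */
static int demo_read_bits(struct demo_label *label)
{
	struct demo_catmap *map;

	assert(label != NULL);
	map = atomic_load_explicit(&label->cat, memory_order_acquire);
	return map ? map->bits : -1;
}
```

      A plain assignment to the pointer would give the compiler and CPU no
      ordering constraint between initialising the map and making it visible,
      which is the class of problem the commit addresses.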
  13. Aug 30, 2024
  14. Aug 29, 2024
  15. Aug 28, 2024
    • selinux,smack: don't bypass permissions check in inode_setsecctx hook · 76a0e79b
      Scott Mayhew authored
      
      Marek Gresko reports that the root user on an NFS client is able to
      change the security labels on files on an NFS filesystem that is
      exported with root squashing enabled.
      
      The end of the kerneldoc comment for __vfs_setxattr_noperm() states:
      
       *  This function requires the caller to lock the inode's i_mutex before it
       *  is executed. It also assumes that the caller will make the appropriate
       *  permission checks.
      
      nfsd_setattr() does do permissions checking via fh_verify() and
      nfsd_permission(), but those don't do all the same permissions checks
      that are done by security_inode_setxattr() and its related LSM hooks.
      
      Since nfsd_setattr() is the only consumer of security_inode_setsecctx(),
      the simplest solution appears to be to replace the call to
      __vfs_setxattr_noperm() with a call to __vfs_setxattr_locked().  This
      fixes the above issue and has the added benefit of causing nfsd to
      recall conflicting delegations on a file when a client tries to change
      its security label.
      
      Cc: stable@kernel.org
      Reported-by: Marek Gresko <marek.gresko@protonmail.com>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=218809
      Signed-off-by: Scott Mayhew <smayhew@redhat.com>
      Tested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
      Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com>
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Acked-by: Casey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      76a0e79b
    • selinux: simplify avc_xperms_audit_required() · 68cfb283
      Zhen Lei authored
      
      By the associative and commutative laws, both 'audited' computations
      evaluate to zero. Take the second 'audited' as an example:
        1) audited = requested & avd->auditallow;
        2) audited &= ~requested;
        ==> audited = ~requested & (requested & avd->auditallow);
        ==> audited = (~requested & requested) & avd->auditallow;
        ==> audited = 0 & avd->auditallow;
        ==> audited = 0;
      
      In fact, it is more readable to directly write zero. The value of the
      first 'audited' is 0 because AUDIT is not allowed. The second 'audited'
      is zero because there is no AUDITALLOW permission.
      
      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      68cfb283
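      The algebra in the commit message can be checked mechanically. The sketch
      below reproduces the two statements for the second 'audited' computation
      in a hypothetical helper, demo_audited_before(), and sweeps a range of
      bit patterns; the function names and the choice of test patterns are
      illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* The two statements from the commit message, applied in order. */
static uint32_t demo_audited_before(uint32_t requested, uint32_t auditallow)
{
	uint32_t audited = requested & auditallow;

	audited &= ~requested;
	return audited;
}

/*
 * Returns 1 if the result is zero for every tested pair of inputs.
 * Each 8-bit pattern is replicated into all four bytes of a 32-bit word.
 */
static int demo_audited_always_zero(void)
{
	uint32_t r, a;

	for (r = 0; r < 256; r++) {
		for (a = 0; a < 256; a++) {
			if (demo_audited_before(r * 0x01010101u,
						a * 0x01010101u) != 0)
				return 0;
		}
	}
	return 1;
}
```

      Since (~requested & requested) is zero for any value, the sweep never
      finds a non-zero result, which is why writing zero directly is
      equivalent.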
    • selinux: mark both IPv4 and IPv6 accepted connection sockets as labeled · a3422eb4
      Guido Trentalancia authored
      
      The current partial labeling was introduced in 389fb800 ("netlabel:
      Label incoming TCP connections correctly in SELinux") due to the fact
      that IPv6 labeling was not supported yet at the time.
      
      Signed-off-by: Guido Trentalancia <guido@trentalancia.com>
      [PM: properly format the referenced commit ID, adjust subject]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      a3422eb4
    • file: reclaim 24 bytes from f_owner · 1934b212
      Christian Brauner authored
      We embed struct fown_struct into struct file, letting it take up 32
      bytes in total. We could tweak struct fown_struct to be more compact but
      really it shouldn't even be embedded in struct file in the first place.
      
      Instead, actual users of struct fown_struct should allocate the struct
      on demand. This frees up 24 bytes in struct file.
      
      That will have some potentially user-visible changes for the ownership
      fcntl()s. Some of them can now fail due to allocation failures.
      Practically, that probably will almost never happen as the allocations
      are small and they only happen once per file.
      
      The fown_struct is used during kill_fasync() which is used by e.g.,
      pipes to generate a SIGIO signal. Sending of such signals is conditional
      on userspace having set an owner for the file using one of the F_OWNER
      fcntl()s. Such users will be unaffected if struct fown_struct is
      allocated during the fcntl() call.
      
      There are a few subsystems that call __f_setown() expecting
      file->f_owner to be allocated:
      
      (1) tun devices
          file->f_op->fasync::tun_chr_fasync()
          -> __f_setown()
      
          There are no callers of tun_chr_fasync().
      
      (2) tty devices
      
          file->f_op->fasync::tty_fasync()
          -> __tty_fasync()
             -> __f_setown()
      
          tty_fasync() has no additional callers but __tty_fasync() has. Note
          that __tty_fasync() only calls __f_setown() if the @on argument is
          true. It's called from:
      
          file->f_op->release::tty_release()
          -> tty_release()
             -> __tty_fasync()
                -> __f_setown()
      
          tty_release() calls __tty_fasync() with @on false
          => __f_setown() is never called from tty_release().
             => All callers of tty_release() are safe as well.
      
          file->f_op->release::tty_open()
          -> tty_release()
             -> __tty_fasync()
                -> __f_setown()
      
          __tty_hangup() calls __tty_fasync() with @on false
          => __f_setown() is never called from tty_release().
             => All callers of __tty_hangup() are safe as well.
      
      From the callchains it's obvious that (1) and (2) end up getting called
      via file->f_op->fasync(). That can happen either through the F_SETFL
      fcntl() with the FASYNC flag raised or via the FIOASYNC ioctl(). If
      FASYNC is requested and the file isn't already FASYNC then
      file->f_op->fasync() is called with @on true which ends up causing both
      (1) and (2) to call __f_setown().
      
      (1) and (2) are the only subsystems that call __f_setown() from the
      file->f_op->fasync() handler. So both (1) and (2) have been updated to
      allocate a struct fown_struct prior to calling fasync_helper() to
      register with the fasync infrastructure. That's safe as they both call
      fasync_helper() which also does allocations if @on is true.
      
      The other interesting case is file leases:
      
      (3) file leases
          lease_manager_ops->lm_setup::lease_setup()
          -> __f_setown()
      
          Which in turn is called from:
      
          generic_add_lease()
          -> lease_manager_ops->lm_setup::lease_setup()
             -> __f_setown()
      
      So here again we can simply make generic_add_lease() allocate struct
      fown_struct prior to the lease_manager_ops->lm_setup::lease_setup()
      which happens under a spinlock.
      
      With that the two remaining subsystems that call __f_setown() are:
      
      (4) dnotify
      (5) sockets
      
      Both have their own custom ioctls to set struct fown_struct and both
      have been converted to allocate a struct fown_struct on demand from
      their respective ioctls.
      
      Interactions with O_PATH are fine as well e.g., when opening a /dev/tty
      as O_PATH then no file->f_op->open() happens thus no file->f_owner is
      allocated. That's fine as no file operation will be set for those and
      the device has never been opened. fcntl()s called on such things will
      just allocate a ->f_owner on demand. Although I have zero idea why'd you
      care about f_owner on an O_PATH fd.
      
      Link: https://lore.kernel.org/r/20240813-work-f_owner-v2-1-4e9343a79f9f@kernel.org
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      1934b212
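      The on-demand allocation described above can be sketched in userspace as
      follows. This is a simplified model, not the kernel implementation: the
      demo_* structures and functions are hypothetical, calloc() stands in for
      the kernel allocator, and locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_fown {
	int pid;
	int signum;
};

/* Only a pointer lives in the file object; the owner data is allocated lazily. */
struct demo_file {
	struct demo_fown *f_owner;
};

/* Allocate the owner structure once per file, on first use. */
static int demo_f_owner_allocate(struct demo_file *f)
{
	assert(f != NULL);
	if (f->f_owner)
		return 0;
	f->f_owner = calloc(1, sizeof(*f->f_owner));
	return f->f_owner ? 0 : -ENOMEM;
}

/* Setting an owner may now fail with -ENOMEM, as the commit message notes. */
static int demo_setown(struct demo_file *f, int pid)
{
	int err = demo_f_owner_allocate(f);

	if (err)
		return err;
	f->f_owner->pid = pid;
	return 0;
}

/* Files that never had an owner set report "no owner" without allocating. */
static int demo_getown(const struct demo_file *f)
{
	return f->f_owner ? f->f_owner->pid : 0;
}

static void demo_file_release(struct demo_file *f)
{
	free(f->f_owner);
	f->f_owner = NULL;
}
```

      Files that never use the ownership interface pay only for the pointer,
      which is where the space saving in the file object comes from.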
  16. Aug 27, 2024
  17. Aug 26, 2024
  18. Aug 25, 2024
    • apparmor: fix policy_unpack_test on big endian systems · 98c0cc48
      Guenter Roeck authored
      
      policy_unpack_test fails on big endian systems because data byte order
      is expected to be little endian but is generated in host byte order.
      This results in test failures such as:
      
       # policy_unpack_test_unpack_array_with_null_name: EXPECTATION FAILED at security/apparmor/policy_unpack_test.c:150
          Expected array_size == (u16)16, but
              array_size == 4096 (0x1000)
              (u16)16 == 16 (0x10)
          # policy_unpack_test_unpack_array_with_null_name: pass:0 fail:1 skip:0 total:1
          not ok 3 policy_unpack_test_unpack_array_with_null_name
          # policy_unpack_test_unpack_array_with_name: EXPECTATION FAILED at security/apparmor/policy_unpack_test.c:164
          Expected array_size == (u16)16, but
              array_size == 4096 (0x1000)
              (u16)16 == 16 (0x10)
          # policy_unpack_test_unpack_array_with_name: pass:0 fail:1 skip:0 total:1
      
      Add the missing endianness conversions when generating test data.
      
      Fixes: 4d944bcd ("apparmor: add AppArmor KUnit tests for policy unpack")
      Cc: Brendan Higgins <brendanhiggins@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      98c0cc48
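      The 16 versus 4096 (0x1000) mismatch in the failure output is what a
      byte-order mix-up on a 16-bit value looks like. The sketch below uses
      hypothetical demo_* helpers, not the KUnit test code, to show that data
      stored in big-endian byte order and then parsed as little-endian turns
      16 into 4096.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 16-bit value in little-endian byte order, independent of the host. */
static void demo_put_le16(uint8_t *dst, uint16_t v)
{
	assert(dst != NULL);
	dst[0] = (uint8_t)(v & 0xff);
	dst[1] = (uint8_t)(v >> 8);
}

/* Store a 16-bit value the way a big-endian host lays it out in memory. */
static void demo_put_be16(uint8_t *dst, uint16_t v)
{
	assert(dst != NULL);
	dst[0] = (uint8_t)(v >> 8);
	dst[1] = (uint8_t)(v & 0xff);
}

/* Parse two bytes as a little-endian 16-bit value, as the unpacker expects. */
static uint16_t demo_get_le16(const uint8_t *src)
{
	assert(src != NULL);
	return (uint16_t)(src[0] | (src[1] << 8));
}
```

      Generating the test data with an explicit little-endian store, as the
      first helper does, gives the same bytes on every host.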
  19. Aug 22, 2024
    • security: smack: Fix indentation in smack_netfilter.c · eabc10e6
      GiSeong Ji authored
      
      Aligned parameters in the function declaration of smack_ip_output
      to adhere to the Linux kernel coding style guidelines.
      
      The parameters of the smack_ip_output function were previously misaligned,
      with the second and third parameters not aligned under the first parameter.
      This change corrects the indentation, improving code readability and
      maintaining consistency with the rest of the codebase.
      
      Signed-off-by: GiSeong Ji <jiggyjiggy0323@gmail.com>
      Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
      eabc10e6
    • ipe: Remove duplicated include in ipe.c · f5dafb89
      Yang Li authored
      
      The header file eval.h is included twice in ipe.c,
      so one of the inclusions can be removed.
      
      Reported-by: Abaci Robot <abaci@linux.alibaba.com>
      Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9796
      Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      f5dafb89
    • lsm: replace indirect LSM hook calls with static calls · 417c5643
      KP Singh authored
      
      LSM hooks are currently invoked from a linked list as indirect calls
      which are invoked using retpolines as a mitigation for speculative
      attacks (Branch History / Target injection) and add extra overhead which
      is especially bad in kernel hot paths:
      
      security_file_ioctl:
         0xff...0320 <+0>:	endbr64
         0xff...0324 <+4>:	push   %rbp
         0xff...0325 <+5>:	push   %r15
         0xff...0327 <+7>:	push   %r14
         0xff...0329 <+9>:	push   %rbx
         0xff...032a <+10>:	mov    %rdx,%rbx
         0xff...032d <+13>:	mov    %esi,%ebp
         0xff...032f <+15>:	mov    %rdi,%r14
         0xff...0332 <+18>:	mov    $0xff...7030,%r15
         0xff...0339 <+25>:	mov    (%r15),%r15
         0xff...033c <+28>:	test   %r15,%r15
         0xff...033f <+31>:	je     0xff...0358 <security_file_ioctl+56>
         0xff...0341 <+33>:	mov    0x18(%r15),%r11
         0xff...0345 <+37>:	mov    %r14,%rdi
         0xff...0348 <+40>:	mov    %ebp,%esi
         0xff...034a <+42>:	mov    %rbx,%rdx
      
         0xff...034d <+45>:	call   0xff...2e0 <__x86_indirect_thunk_array+352>
         			       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      
          Indirect calls that use retpolines lead to overhead, not just due
          to the extra instructions but also due to branch misses.
      
         0xff...0352 <+50>:	test   %eax,%eax
         0xff...0354 <+52>:	je     0xff...0339 <security_file_ioctl+25>
         0xff...0356 <+54>:	jmp    0xff...035a <security_file_ioctl+58>
         0xff...0358 <+56>:	xor    %eax,%eax
         0xff...035a <+58>:	pop    %rbx
         0xff...035b <+59>:	pop    %r14
         0xff...035d <+61>:	pop    %r15
         0xff...035f <+63>:	pop    %rbp
         0xff...0360 <+64>:	jmp    0xff...47c4 <__x86_return_thunk>
      
      The indirect calls are not really needed as one knows the addresses of
      enabled LSM callbacks at boot time and only the order can possibly
      change at boot time with the lsm= kernel command line parameter.
      
      An array of static calls is defined per LSM hook and the static calls
      are updated at boot time once the order has been determined.
      
      With the hook now exposed as a static call, one can see that the
      retpolines are no longer there and the LSM callbacks are invoked
      directly:
      
      security_file_ioctl:
         0xff...0ca0 <+0>:	endbr64
         0xff...0ca4 <+4>:	nopl   0x0(%rax,%rax,1)
         0xff...0ca9 <+9>:	push   %rbp
         0xff...0caa <+10>:	push   %r14
         0xff...0cac <+12>:	push   %rbx
         0xff...0cad <+13>:	mov    %rdx,%rbx
         0xff...0cb0 <+16>:	mov    %esi,%ebp
         0xff...0cb2 <+18>:	mov    %rdi,%r14
         0xff...0cb5 <+21>:	jmp    0xff...0cc7 <security_file_ioctl+39>
        			       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         Static key enabled for SELinux
      
         0xffffffff818f0cb7 <+23>:	jmp    0xff...0cde <security_file_ioctl+62>
         				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      
         Static key enabled for BPF LSM. This is something that is changed to
         default to false to avoid the existing side effect issues of BPF LSM
         [1] in a subsequent patch.
      
         0xff...0cb9 <+25>:	xor    %eax,%eax
         0xff...0cbb <+27>:	xchg   %ax,%ax
         0xff...0cbd <+29>:	pop    %rbx
         0xff...0cbe <+30>:	pop    %r14
         0xff...0cc0 <+32>:	pop    %rbp
         0xff...0cc1 <+33>:	cs jmp 0xff...0000 <__x86_return_thunk>
         0xff...0cc7 <+39>:	endbr64
         0xff...0ccb <+43>:	mov    %r14,%rdi
         0xff...0cce <+46>:	mov    %ebp,%esi
         0xff...0cd0 <+48>:	mov    %rbx,%rdx
         0xff...0cd3 <+51>:	call   0xff...3230 <selinux_file_ioctl>
         			       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         Direct call to SELinux.
      
         0xff...0cd8 <+56>:	test   %eax,%eax
         0xff...0cda <+58>:	jne    0xff...0cbd <security_file_ioctl+29>
         0xff...0cdc <+60>:	jmp    0xff...0cb7 <security_file_ioctl+23>
         0xff...0cde <+62>:	endbr64
         0xff...0ce2 <+66>:	mov    %r14,%rdi
         0xff...0ce5 <+69>:	mov    %ebp,%esi
         0xff...0ce7 <+71>:	mov    %rbx,%rdx
         0xff...0cea <+74>:	call   0xff...e220 <bpf_lsm_file_ioctl>
         			       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         Direct call to BPF LSM.
      
         0xff...0cef <+79>:	test   %eax,%eax
         0xff...0cf1 <+81>:	jne    0xff...0cbd <security_file_ioctl+29>
         0xff...0cf3 <+83>:	jmp    0xff...0cb9 <security_file_ioctl+25>
         0xff...0cf5 <+85>:	endbr64
         0xff...0cf9 <+89>:	mov    %r14,%rdi
         0xff...0cfc <+92>:	mov    %ebp,%esi
         0xff...0cfe <+94>:	mov    %rbx,%rdx
         0xff...0d01 <+97>:	pop    %rbx
         0xff...0d02 <+98>:	pop    %r14
         0xff...0d04 <+100>:	pop    %rbp
         0xff...0d05 <+101>:	ret
         0xff...0d06 <+102>:	int3
         0xff...0d07 <+103>:	int3
         0xff...0d08 <+104>:	int3
         0xff...0d09 <+105>:	int3
      
      This patch uses static_branch_unlikely, indicating that an LSM hook
      is likely to be not present. In most cases this is still the better
      choice, because even when an LSM with a single hook is added, empty slots
      are created for all LSM hooks (especially when many LSMs that do not
      initialize most hooks are present on the system).
      
      There are some hooks that don't use the call_int_hook or
      call_void_hook. These hooks are updated to use a new macro called
      lsm_for_each_hook where the lsm_callback is directly invoked as an
      indirect call.
      
      Below are results of the relevant Unixbench system benchmarks with BPF LSM
      and SELinux enabled with default policies enabled with and without these
      patches.
      
      Benchmark                                          Delta(%): (+ is better)
      ==========================================================================
      Execl Throughput                                             +1.9356
      File Write 1024 bufsize 2000 maxblocks                       +6.5953
      Pipe Throughput                                              +9.5499
      Pipe-based Context Switching                                 +3.0209
      Process Creation                                             +2.3246
      Shell Scripts (1 concurrent)                                 +1.4975
      System Call Overhead                                         +2.7815
      System Benchmarks Index Score (Partial Only):                +3.4859
      
      In the best case, some syscalls like eventfd_create benefited by about
      10%.
      
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Acked-by: Song Liu <song@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: KP Singh <kpsingh@kernel.org>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      417c5643
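      The dispatch structure described above, a fixed array of slots per hook
      with a per-slot enable flag, can be modelled in userspace. This is only a
      sketch of the data layout and call order: real static calls patch the
      call sites at runtime, which plain C cannot express, so ordinary function
      pointers and an integer flag stand in for static calls and static keys.
      All demo_* names and the slot count are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_LSM 3	/* slot count fixed at compile time */

typedef int (*demo_file_ioctl_fn)(unsigned int cmd);

struct demo_slot {
	demo_file_ioctl_fn fn;	/* stands in for a static call target */
	int enabled;		/* stands in for a static key */
};

static struct demo_slot demo_file_ioctl_slots[DEMO_MAX_LSM];

/* Fill the next free slot; done once at "boot" when the LSM order is known. */
static int demo_register(demo_file_ioctl_fn fn)
{
	size_t i;

	assert(fn != NULL);
	for (i = 0; i < DEMO_MAX_LSM; i++) {
		if (!demo_file_ioctl_slots[i].enabled) {
			demo_file_ioctl_slots[i].fn = fn;
			demo_file_ioctl_slots[i].enabled = 1;
			return 0;
		}
	}
	return -1;	/* no slot left */
}

/* Walk the slots in order; the first non-zero return value wins. */
static int demo_security_file_ioctl(unsigned int cmd)
{
	size_t i;

	for (i = 0; i < DEMO_MAX_LSM; i++) {
		int rc;

		if (!demo_file_ioctl_slots[i].enabled)
			continue;	/* empty slot: skipped */
		rc = demo_file_ioctl_slots[i].fn(cmd);
		if (rc != 0)
			return rc;
	}
	return 0;
}

/* Two toy callbacks used for illustration. */
static int demo_allow_all(unsigned int cmd)
{
	(void)cmd;
	return 0;
}

static int demo_deny_odd(unsigned int cmd)
{
	return (cmd & 1) ? -1 : 0;
}
```

      With no callbacks registered every slot is skipped and the hook returns
      0, which mirrors the fall-through path in the disassembly above.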
  20. Aug 20, 2024
    • ipe: kunit test for parser · 10ca05a7
      Deven Bowers authored
      Add various happy/unhappy unit tests for IPE's policy parser.
      
      Besides, a test suite for IPE functionality is available at
      https://github.com/microsoft/ipe/tree/test-suite
      
      
      
      Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
      Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      10ca05a7
    • scripts: add boot policy generation program · ba199dc9
      Deven Bowers authored
      
      Enables an IPE policy to be enforced from kernel start, enabling access
      control based on trust from kernel startup. This is accomplished by
      transforming an IPE policy indicated by CONFIG_IPE_BOOT_POLICY into a
      c-string literal that is parsed at kernel startup as an unsigned policy.
      
      Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
      Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      ba199dc9
    • ipe: enable support for fs-verity as a trust provider · 31f8c868
      Fan Wu authored
      
      Enable IPE policy authors to indicate trust for a singular fsverity
      file, identified by the digest information, through "fsverity_digest"
      and all files using valid fsverity builtin signatures via
      "fsverity_signature".
      
      This enables file-level integrity claims to be expressed in IPE,
      allowing individual files to be authorized, giving some flexibility
      for policy authors. Such file-level claims are important to be expressed
      for enforcing the integrity of packages, as well as address some of the
      scalability issues in a sole dm-verity based solution (# of loop back
      devices, etc).
      
      This solution cannot be done in userspace as the minimum threat that
      IPE should mitigate is an attacker downloading a malicious payload with
      all required dependencies. These dependencies can lack the userspace
      check, bypassing the protection entirely. A similar attack succeeds if
      the userspace component is replaced with a version that does not
      perform the check. As a result, this can only be done in the common
      entry point - the kernel.
      
      Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
      Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      31f8c868
    • lsm: add security_inode_setintegrity() hook · fb55e177
      Fan Wu authored
      
      This patch introduces a new hook to save inode's integrity
      data. For example, for fsverity enabled files, LSMs can use this hook to
      save the existence of verified fsverity builtin signature into the inode's
      security blob, and LSMs can make access decisions based on this data.
      
      Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
      [PM: subject line tweak, removed changelog]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      fb55e177
    • ipe: add support for dm-verity as a trust provider · e155858d
      Deven Bowers authored
      
      Allows author of IPE policy to indicate trust for a singular dm-verity
      volume, identified by roothash, through "dmverity_roothash" and all
      signed and validated dm-verity volumes, through "dmverity_signature".
      
      Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
      Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
      [PM: fixed some line length issues in the comments]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      e155858d
    • block,lsm: add LSM blob and new LSM hooks for block devices · b55d26bd
      Deven Bowers authored
      
      This patch introduces a new LSM blob to the block_device structure,
      enabling the security subsystem to store security-sensitive data related
      to block devices. Currently, for a device mapper's mapped device containing
      a dm-verity target, critical security information such as the roothash and
      its signing state are not readily accessible. Specifically, while the
      dm-verity volume creation process passes the dm-verity roothash and its
      signature from userspace to the kernel, the roothash is stored privately
      within the dm-verity target, and its signature is discarded
      post-verification. This makes it extremely hard for the security subsystem
      to utilize these data.
      
      With the addition of the LSM blob to the block_device structure, the
      security subsystem can now retain and manage important security metadata
      such as the roothash and the signing state of a dm-verity by storing them
      inside the blob. Access decisions can then be based on these stored data.
      
      The implementation follows the same approach used for security blobs in
      other structures like struct file, struct inode, and struct superblock.
      The initialization of the security blob occurs after the creation of the
      struct block_device, performed by the security subsystem. Similarly, the
      security blob is freed by the security subsystem before the struct
      block_device is deallocated or freed.
      
      This patch also introduces a new hook security_bdev_setintegrity() to save
      block device's integrity data to the new LSM blob. For example, for
      dm-verity, it can use this hook to expose its roothash and signing state
      to LSMs, then LSMs can save these data into the LSM blob.
      
      Please note that the new hook should be invoked every time the security
      information is updated to keep these data current. For example, in
      dm-verity, if the mapping table is reloaded and configured to use a
      different dm-verity target with a new roothash and signing information,
      the previously stored data in the LSM blob will become obsolete. It is
      crucial to re-invoke the hook to refresh these data and ensure they are up
      to date. This necessity arises from the design of device-mapper, where a
      device-mapper device is first created, and then targets are subsequently
      loaded into it. These targets can be modified multiple times during the
      device's lifetime. Therefore, while the LSM blob is allocated during the
      creation of the block device, its actual contents are not initialized at
      this stage and can change substantially over time. This includes
      alterations from data that the LSM 'trusts' to those it does not, making
      it essential to handle these changes correctly. Failure to address this
      dynamic aspect could potentially allow for bypassing LSM checks.
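
The staleness risk described above can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: the `BlockDevice` class, its `blob` dictionary, and the `reload_table` and `lsm_allows` helpers are hypothetical names invented for illustration. Only the idea of re-invoking a set-integrity hook on every table reload comes from the commit message.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BlockDevice:
    """Toy stand-in for a block device carrying an LSM blob (hypothetical)."""
    # The blob exists from creation, but its contents start out empty.
    blob: dict = field(
        default_factory=lambda: {"roothash": None, "signed": False}
    )


def bdev_setintegrity(bdev: BlockDevice, roothash: Optional[str],
                      signed: bool) -> None:
    """Models the hook: the LSM overwrites whatever it stored before."""
    bdev.blob["roothash"] = roothash
    bdev.blob["signed"] = signed


def reload_table(bdev: BlockDevice, roothash: str, signed: bool,
                 notify_lsm: bool = True) -> None:
    """Models loading a new dm-verity target into an existing device."""
    if notify_lsm:
        bdev_setintegrity(bdev, roothash, signed)


def lsm_allows(bdev: BlockDevice, trusted_roothash: str) -> bool:
    """Access decision based only on what the blob currently says."""
    return bdev.blob["roothash"] == trusted_roothash


dev = BlockDevice()
reload_table(dev, "aaaa", signed=True)
trusted_before = lsm_allows(dev, "aaaa")

# Reload with an untrusted target but skip the hook: the blob is now stale.
reload_table(dev, "ffff", signed=False, notify_lsm=False)
stale_decision = lsm_allows(dev, "aaaa")  # decision based on old data

# Invoking the hook on reload makes the blob reflect the new target.
reload_table(dev, "ffff", signed=False)
fresh_decision = lsm_allows(dev, "aaaa")
```

In this model `stale_decision` still grants access even though the device no longer carries the trusted target, which is the bypass the message warns about.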
      
      Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
      Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
      [PM: merge fuzz, subject line tweaks]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      b55d26bd
    • Deven Bowers's avatar
      ipe: add permissive toggle · a68916ea
      Deven Bowers authored
      
      IPE, like SELinux, supports a permissive mode. This mode allows policy
      authors to test and evaluate IPE policy without it affecting their
      programs. When the mode is changed, an AUDIT_MAC_STATUS (1404) record
      will be reported.
      
      This patch adds the following audit records:
      
          audit: MAC_STATUS enforcing=0 old_enforcing=1 auid=4294967295
            ses=4294967295 enabled=1 old-enabled=1 lsm=ipe res=1
          audit: MAC_STATUS enforcing=1 old_enforcing=0 auid=4294967295
            ses=4294967295 enabled=1 old-enabled=1 lsm=ipe res=1
      
      The audit record is only emitted when the value from the user input is
      different from the current enforce value.
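
The "emit only on change" behaviour can be sketched as follows. This is a user-space illustration, not the kernel code: `set_enforce` is a hypothetical helper, and the `auid`, `ses`, and `enabled` values are simply copied from the sample records above rather than derived from a real session.

```python
from typing import Optional, Tuple


def set_enforce(current: bool, requested: bool) -> Tuple[bool, Optional[str]]:
    """Return the new enforce state and an audit record, if one is due.

    No record is produced when the requested value equals the current one.
    """
    if requested == current:
        return current, None
    record = (
        f"MAC_STATUS enforcing={int(requested)} "
        f"old_enforcing={int(current)} "
        # Fixed values below are taken from the sample records (assumption).
        "auid=4294967295 ses=4294967295 enabled=1 old-enabled=1 "
        "lsm=ipe res=1"
    )
    return requested, record


unchanged_state, unchanged_record = set_enforce(True, True)
new_state, change_record = set_enforce(True, False)
```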
      
      Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
      Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      a68916ea
    • Deven Bowers's avatar
      audit,ipe: add IPE auditing support · f44554b5
      Deven Bowers authored
      
      Users of IPE require a way to identify when and why an operation fails,
      allowing them to both respond to violations of policy and be notified
      of potentially malicious actions on their systems with respect to IPE
      itself.
      
      This patch introduces 3 new audit events.
      
      AUDIT_IPE_ACCESS(1420) indicates the result of an IPE policy evaluation
      of a resource.
      AUDIT_IPE_CONFIG_CHANGE(1421) indicates the current active IPE policy
      has been changed to another loaded policy.
      AUDIT_IPE_POLICY_LOAD(1422) indicates a new IPE policy has been loaded
      into the kernel.
      
      This patch also adds support for success auditing, allowing users to
      identify why an allow decision was made for a resource. However, it is
      recommended to use this option with caution, as it is quite noisy.
      
      Here are some examples of the new audit record types:
      
      AUDIT_IPE_ACCESS(1420):
      
          audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
            pid=297 comm="sh" path="/root/vol/bin/hello" dev="tmpfs"
            ino=3897 rule="op=EXECUTE boot_verified=TRUE action=ALLOW"
      
          audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
            pid=299 comm="sh" path="/mnt/ipe/bin/hello" dev="dm-0"
            ino=2 rule="DEFAULT action=DENY"
      
          audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
            pid=300 path="/tmp/tmpdp2h1lub/deny/bin/hello" dev="tmpfs"
            ino=131 rule="DEFAULT action=DENY"
      
      The above three records were generated when the active IPE policy only
      allowed binaries from the initramfs to run. Three identical `hello`
      binaries were placed at different locations; only the first `hello`,
      from the rootfs (initramfs), was allowed.
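
The decisions in the three records above can be reproduced with a small rule evaluator. This is a minimal sketch, assuming that rules are checked in order and the first match wins, and that properties compare as plain strings; `parse_rule` and `evaluate` are hypothetical helpers and do not reflect how IPE parses or stores policy in the kernel.

```python
def parse_rule(line):
    """Split 'op=EXECUTE boot_verified=TRUE action=ALLOW' into a dict."""
    return dict(token.split("=", 1) for token in line.split())


def evaluate(rules, default_action, op, props):
    """Return (action, rule_text) for the first rule matching op and props."""
    for text in rules:
        rule = parse_rule(text)
        action = rule.pop("action")
        if rule.pop("op") != op:
            continue
        # Every remaining key is a property condition that must hold.
        if all(props.get(key) == value for key, value in rule.items()):
            return action, text
    return default_action, f"DEFAULT action={default_action}"


rules = ["op=EXECUTE boot_verified=TRUE action=ALLOW"]

# A binary on the initramfs versus one on another file system.
from_initramfs = evaluate(rules, "DENY", "EXECUTE", {"boot_verified": "TRUE"})
from_elsewhere = evaluate(rules, "DENY", "EXECUTE", {"boot_verified": "FALSE"})
```

The second element of each result corresponds to the `rule=` field seen in the records.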
      
      Field ipe_op is followed by the IPE operation name associated with the log.
      
      Field ipe_hook is followed by the name of the LSM hook that triggered the
      IPE event.
      
      Field enforcing is followed by the enforcement state of IPE. (It will be
      introduced in the next commit.)
      
      Field pid is followed by the pid of the process that triggered the IPE
      event.
      
      Field comm is followed by the command line program name of the process
      that triggered the IPE event.
      
      Field path is followed by the file's path name.
      
      Field dev is followed by the device name, as found in /dev, of the device
      the file comes from.
      Note that for device mapper devices it will use the name `dm-X` instead
      of the name in /dev/mapper.
      For a file in a temporary file system, which is not backed by a device,
      it will use `tmpfs` for the field.
      The implementation of this part follows an existing use case,
      LSM_AUDIT_DATA_INODE in security/lsm_audit.c.
      
      Field ino is followed by the file's inode number.
      
      Field rule is followed by the IPE rule that made the access decision. The
      whole rule must be audited because the decision is based on the
      combination of all property conditions in the rule.
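
For readers who want to inspect these fields programmatically, the following is a minimal sketch of a parser for a single record joined onto one line. `parse_audit_fields` is a hypothetical helper; it only handles the double-quoted `key="value"` form shown in the samples and is not a general audit log parser.

```python
import shlex


def parse_audit_fields(record: str) -> dict:
    """Collect key=value pairs from one audit record line.

    shlex keeps a quoted value such as rule="DEFAULT action=DENY" together
    as a single token, so the embedded space and '=' survive.
    """
    fields = {}
    for token in shlex.split(record):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens such as 'audit:' that carry no '='
            fields[key] = value
    return fields


# Second sample record from above, joined onto one line.
sample = (
    'audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1 '
    'pid=299 comm="sh" path="/mnt/ipe/bin/hello" dev="dm-0" '
    'ino=2 rule="DEFAULT action=DENY"'
)
fields = parse_audit_fields(sample)
```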
      
      Combined with the syscall audit event, users can learn why a block
      happened. For example:
      
          audit: AUDIT1420 ipe_op=EXECUTE ipe_hook=BPRM_CHECK enforcing=1
            pid=2138 comm="bash" path="/mnt/ipe/bin/hello" dev="dm-0"
            ino=2 rule="DEFAULT action=DENY"
          audit[1956]: SYSCALL arch=c000003e syscall=59
            success=no exit=-13 a0=556790138df0 a1=556790135390 a2=5567901338b0
            a3=ab2a41a67f4f1f4e items=1 ppid=147 pid=1956 auid=4294967295 uid=0
            gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0
            ses=4294967295 comm="bash" exe="/usr/bin/bash" key=(null)
      
      The above two records show that bash used execve to run "hello" and was
      blocked by IPE. Note that the IPE records always precede the SYSCALL
      record.
      
      AUDIT_IPE_CONFIG_CHANGE(1421):
      
          audit: AUDIT1421
            old_active_pol_name="Allow_All" old_active_pol_version=0.0.0
            old_policy_digest=sha256:E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649
            new_active_pol_name="boot_verified" new_active_pol_version=0.0.0
            new_policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F
            auid=4294967295 ses=4294967295 lsm=ipe res=1
      
      The above record shows the active IPE policy switching from
      `Allow_All` to `boot_verified`, along with the version and the hash
      digest of the two policies. Note that IPE can only have one policy
      active at a time; all access decisions are evaluated against the
      currently active policy.
      The normal procedure to deploy a policy is to load it into the kernel
      first, then switch the active policy to it.
      
      AUDIT_IPE_POLICY_LOAD(1422):
      
          audit: AUDIT1422 policy_name="boot_verified" policy_version=0.0.0
            policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F2676
            auid=4294967295 ses=4294967295 lsm=ipe res=1
      
      The above record shows that a new policy has been loaded into the
      kernel, along with the policy name, policy version and policy hash.
      
      Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
      Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
      [PM: subject line tweak]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      f44554b5
    • Deven Bowers's avatar
      ipe: add userspace interface · 2261306f
      Deven Bowers authored
      
      As is typical with LSMs, IPE uses securityfs as its interface with
      userspace. For a complete list of the interfaces and the respective
      inputs/outputs, please see the documentation under
      admin-guide/LSM/ipe.rst.
      
      Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
      Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      2261306f
    • Fan Wu's avatar
      lsm: add new securityfs delete function · 7138679f
      Fan Wu authored
      
      When deleting a directory in the security file system, the existing
      securityfs_remove requires the directory to be empty, otherwise
      it will do nothing. This leads to a potential risk that the security
      file system might be in an unclean state when the intended deletion
      did not happen.
      
      This commit introduces a new function securityfs_recursive_remove
      to recursively delete a directory without leaving an unclean state.
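
The difference between the two removal behaviours can be modelled with an in-memory tree. This is a toy sketch, not the securityfs implementation: the `Node` class and both functions are hypothetical, and the directory names are only illustrative.

```python
class Node:
    """Toy directory entry with a parent link and named children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self


def remove(node):
    """Models the existing behaviour: a non-empty directory is left alone."""
    if node.children:
        return False
    if node.parent is not None:
        del node.parent.children[node.name]
    return True


def recursive_remove(node):
    """Remove children depth-first, then the node itself."""
    for child in list(node.children.values()):
        recursive_remove(child)
    remove(node)


root = Node("securityfs")
ipe = Node("ipe", root)
policies = Node("policies", ipe)
Node("boot_verified", policies)

plain_result = remove(ipe)  # silently does nothing: 'ipe' is not empty
still_there = "ipe" in root.children

recursive_remove(ipe)
gone = "ipe" not in root.children
```

The silent no-op in `remove` is what can leave the tree in the unclean state the message describes.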
      
      Co-developed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
      [PM: subject line tweak]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      7138679f
    • Fan Wu's avatar
      ipe: introduce 'boot_verified' as a trust provider · a8a74df1
      Fan Wu authored
      
      IPE is designed to provide system-level trust guarantees. This usually
      implies that trust starts at bootup with a hardware root of trust,
      which validates the bootloader. After this, the bootloader verifies
      the kernel and the initramfs.
      
      As there is currently no supported integrity method for initramfs, and
      it is typically already verified by the bootloader, this patch
      introduces a new IPE property `boot_verified`, which allows authors of
      IPE policy to indicate trust for files from the initramfs.
      
      The implementation of this feature utilizes the newly added
      `initramfs_populated` hook. This hook marks the superblock of the rootfs
      after the initramfs has been unpacked into it.
      
      Before mounting the real rootfs on top of the initramfs, the initramfs
      script will recursively remove all files and directories on the
      initramfs. This is typically implemented by using switch_root(8)
      (https://man7.org/linux/man-pages/man8/switch_root.8.html).
      Therefore the initramfs will be empty and not accessible after the real
      rootfs takes over. It is advised to switch to a different policy
      that doesn't rely on the `boot_verified` property after this point.
      This ensures that the trust policies remain relevant and effective
      throughout the system's operation.
      
      Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
      Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      a8a74df1
    • Fan Wu's avatar
      initramfs,lsm: add a security hook to do_populate_rootfs() · 2fea0c26
      Fan Wu authored
      
      This patch introduces a new hook to notify the security system that the
      content of the initramfs has been unpacked into the rootfs.
      
      Upon receiving this notification, the security system can activate
      a policy to allow only files that originated from the initramfs to
      execute or load into the kernel during the early stages of booting.
      
      This approach is crucial for minimizing the attack surface by
      ensuring that only trusted files from the initramfs are operational
      in the critical boot phase.
      
      Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
      [PM: subject line tweak]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      2fea0c26