- Apr 15, 2021
-
-
Randy Dunlap authored
Fix kernel-doc notation in commoncap.c. Use correct (matching) function name in comments as in code. Use correct function argument names in kernel-doc comments. Use kernel-doc's "Return:" format for function return values. Fixes these kernel-doc warnings: ../security/commoncap.c:1206: warning: expecting prototype for cap_task_ioprio(). Prototype was for cap_task_setioprio() instead ../security/commoncap.c:1219: warning: expecting prototype for cap_task_ioprio(). Prototype was for cap_task_setnice() instead Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Signed-off-by:
James Morris <jamorris@linux.microsoft.com>
-
- Mar 24, 2021
-
-
Arnd Bergmann authored
gcc-11 introdces a harmless warning for cap_inode_getsecurity: security/commoncap.c: In function ‘cap_inode_getsecurity’: security/commoncap.c:440:33: error: ‘memcpy’ reading 16 bytes from a region of size 0 [-Werror=stringop-overread] 440 | memcpy(&nscap->data, &cap->data, sizeof(__le32) * 2 * VFS_CAP_U32); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The problem here is that tmpbuf is initialized to NULL, so gcc assumes it is not accessible unless it gets set by vfs_getxattr_alloc(). This is a legitimate warning as far as I can tell, but the code is correct since it correctly handles the error when that function fails. Add a separate NULL check to tell gcc about it as well. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Acked-by:
Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by:
James Morris <jamorris@linux.microsoft.com>
-
- Mar 12, 2021
-
-
Eric W. Biederman authored
It turns out that there are in fact userspace implementations that care and this recent change caused a regression. https://github.com/containers/buildah/issues/3071 As the motivation for the original change was future development, and the impact is existing real world code just revert this change and allow the ambiguity in v3 file caps. Cc: stable@vger.kernel.org Fixes: 95ebabde ("capabilities: Don't allow writing ambiguous v3 file capabilities") Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com>
-
- Jan 28, 2021
-
-
Miklos Szeredi authored
If a capability is stored on disk in v2 format cap_inode_getsecurity() will currently return in v2 format unconditionally. This is wrong: v2 cap should be equivalent to a v3 cap with zero rootid, and so the same conversions performed on it. If the rootid cannot be mapped, v3 is returned unconverted. Fix this so that both v2 and v3 return -EOVERFLOW if the rootid (or the owner of the fs user namespace in case of v2) cannot be mapped into the current user namespace. Signed-off-by:
Miklos Szeredi <mszeredi@redhat.com> Acked-by:
"Eric W. Biederman" <ebiederm@xmission.com>
-
- Jan 24, 2021
-
-
Christian Brauner authored
When interacting with user namespace and non-user namespace aware filesystem capabilities the vfs will perform various security checks to determine whether or not the filesystem capabilities can be used by the caller, whether they need to be removed and so on. The main infrastructure for this resides in the capability codepaths but they are called through the LSM security infrastructure even though they are not technically an LSM or optional. This extends the existing security hooks security_inode_removexattr(), security_inode_killpriv(), security_inode_getsecurity() to pass down the mount's user namespace and makes them aware of idmapped mounts. In order to actually get filesystem capabilities from disk the capability infrastructure exposes the get_vfs_caps_from_disk() helper. For user namespace aware filesystem capabilities a root uid is stored alongside the capabilities. In order to determine whether the caller can make use of the filesystem capability or whether it needs to be ignored it is translated according to the superblock's user namespace. If it can be translated to uid 0 according to that id mapping the caller can use the filesystem capabilities stored on disk. If we are accessing the inode that holds the filesystem capabilities through an idmapped mount we map the root uid according to the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts: reading filesystem caps from disk enforces that the root uid associated with the filesystem capability must have a mapping in the superblock's user namespace and that the caller is either in the same user namespace or is a descendant of the superblock's user namespace. For filesystems that are mountable inside user namespace the caller can just mount the filesystem and won't usually need to idmap it. If they do want to idmap it they can create an idmapped mount and mark it with a user namespace they created and which is thus a descendant of s_user_ns. For filesystems that are not mountable inside user namespaces the descendant rule is trivially true because the s_user_ns will be the initial user namespace. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-11-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by:
Christoph Hellwig <hch@lst.de> Acked-by:
James Morris <jamorris@linux.microsoft.com> Signed-off-by:
Christian Brauner <christian.brauner@ubuntu.com>
-
Tycho Andersen authored
When interacting with extended attributes the vfs verifies that the caller is privileged over the inode with which the extended attribute is associated. For posix access and posix default extended attributes a uid or gid can be stored on-disk. Let the functions handle posix extended attributes on idmapped mounts. If the inode is accessed through an idmapped mount we need to map it according to the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts. This has no effect for e.g. security xattrs since they don't store uids or gids and don't perform permission checks on them like posix acls do. Link: https://lore.kernel.org/r/20210121131959.646623-10-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
James Morris <jamorris@linux.microsoft.com> Signed-off-by:
Tycho Andersen <tycho@tycho.pizza> Signed-off-by:
Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
The posix acl permission checking helpers determine whether a caller is privileged over an inode according to the acls associated with the inode. Add helpers that make it possible to handle acls on idmapped mounts. The vfs and the filesystems targeted by this first iteration make use of posix_acl_fix_xattr_from_user() and posix_acl_fix_xattr_to_user() to translate basic posix access and default permissions such as the ACL_USER and ACL_GROUP type according to the initial user namespace (or the superblock's user namespace) to and from the caller's current user namespace. Adapt these two helpers to handle idmapped mounts whereby we either map from or into the mount's user namespace depending on in which direction we're translating. Similarly, cap_convert_nscap() is used by the vfs to translate user namespace and non-user namespace aware filesystem capabilities from the superblock's user namespace to the caller's user namespace. Enable it to handle idmapped mounts by accounting for the mount's user namespace. In addition the fileystems targeted in the first iteration of this patch series make use of the posix_acl_chmod() and, posix_acl_update_mode() helpers. Both helpers perform permission checks on the target inode. Let them handle idmapped mounts. These two helpers are called when posix acls are set by the respective filesystems to handle this case we extend the ->set() method to take an additional user namespace argument to pass the mount's user namespace down. Link: https://lore.kernel.org/r/20210121131959.646623-9-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
In order to determine whether a caller holds privilege over a given inode the capability framework exposes the two helpers privileged_wrt_inode_uidgid() and capable_wrt_inode_uidgid(). The former verifies that the inode has a mapping in the caller's user namespace and the latter additionally verifies that the caller has the requested capability in their current user namespace. If the inode is accessed through an idmapped mount map it into the mount's user namespace. Afterwards the checks are identical to non-idmapped inodes. If the initial user namespace is passed all operations are a nop so non-idmapped mounts will not see a change in behavior. Link: https://lore.kernel.org/r/20210121131959.646623-5-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
James Morris <jamorris@linux.microsoft.com> Acked-by:
Serge Hallyn <serge@hallyn.com> Signed-off-by:
Christian Brauner <christian.brauner@ubuntu.com>
-
- Dec 29, 2020
-
-
Eric W. Biederman authored
The v3 file capabilities have a uid field that records the filesystem uid of the root user of the user namespace the file capabilities are valid in. When someone is silly enough to have the same underlying uid as the root uid of multiple nested containers a v3 filesystem capability can be ambiguous. In the spirit of don't do that then, forbid writing a v3 filesystem capability if it is ambiguous. Fixes: 8db6c34f ("Introduce v3 namespaced file capabilities") Reviewed-by:
Andrew G. Morgan <morgan@kernel.org> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com>
-
- Dec 14, 2020
-
-
Miklos Szeredi authored
cap_convert_nscap() does permission checking as well as conversion of the xattr value conditionally based on fs's user-ns. This is needed by overlayfs and probably other layered fs (ecryptfs) and is what vfs_foo() is supposed to do anyway. Signed-off-by:
Miklos Szeredi <mszeredi@redhat.com> Acked-by:
James Morris <jamorris@linux.microsoft.com>
-
- May 30, 2020
-
-
Eric W. Biederman authored
Move the computation of creds from prepare_binfmt into begin_new_exec so that the creds need only be computed once. This is just code reorganization no semantic changes of any kind are made. Moving the computation is safe. I have looked through the kernel and verified none of the binfmts look at bprm->cred directly, and that there are no helpers that look at bprm->cred indirectly. Which means that it is not a problem to compute the bprm->cred later in the execution flow as it is not used until it becomes current->cred. A new function bprm_creds_from_file is added to contain the work that needs to be done. bprm_creds_from_file first computes which file bprm->executable or most likely bprm->file that the bprm->creds will be computed from. The funciton bprm_fill_uid is updated to receive the file instead of accessing bprm->file. The now unnecessary work needed to reset the bprm->cred->euid, and bprm->cred->egid is removed from brpm_fill_uid. A small comment to document that bprm_fill_uid now only deals with the work to handle suid and sgid files. The default case is already heandled by prepare_exec_creds. The function security_bprm_repopulate_creds is renamed security_bprm_creds_from_file and now is explicitly passed the file from which to compute the creds. The documentation of the bprm_creds_from_file security hook is updated to explain when the hook is called and what it needs to do. The file is passed from cap_bprm_creds_from_file into get_file_caps so that the caps are computed for the appropriate file. The now unnecessary work in cap_bprm_creds_from_file to reset the ambient capabilites has been removed. A small comment to document that the work of cap_bprm_creds_from_file is to read capabilities from the files secureity attribute and derive capabilities from the fact the user had uid 0 has been added. Reviewed-by:
Kees Cook <keescook@chromium.org> Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com>
-
Eric W. Biederman authored
There is a small bug in the code that recomputes parts of bprm->cred for every bprm->file. The code never recomputes the part of clear_dangerous_personality_flags it is responsible for. Which means that in practice if someone creates a sgid script the interpreter will not be able to use any of: READ_IMPLIES_EXEC ADDR_NO_RANDOMIZE ADDR_COMPAT_LAYOUT MMAP_PAGE_ZERO. This accentially clearing of personality flags probably does not matter in practice because no one has complained but it does make the code more difficult to understand. Further remaining bug compatible prevents the recomputation from being removed and replaced by simply computing bprm->cred once from the final bprm->file. Making this change removes the last behavior difference between computing bprm->creds from the final file and recomputing bprm->cred several times. Which allows this behavior change to be justified for it's own reasons, and for any but hunts looking into why the behavior changed to wind up here instead of in the code that will follow that computes bprm->cred from the final bprm->file. This small logic bug appears to have existed since the code started clearing dangerous personality bits. History Tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Fixes: 1bb0fa189c6a ("[PATCH] NX: clean up legacy binary support") Reviewed-by:
Kees Cook <keescook@chromium.org> Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com>
-
- May 26, 2020
-
-
Eric W. Biederman authored
An invariant of cap_bprm_set_creds is that every field in the new cred structure that cap_bprm_set_creds might set, needs to be set every time to ensure the fields does not get a stale value. The field cap_ambient is not set every time cap_bprm_set_creds is called, which means that if there is a suid or sgid script with an interpreter that has neither the suid nor the sgid bits set the interpreter should be able to accept ambient credentials. Unfortuantely because cap_ambient is not reset to it's original value the interpreter can not accept ambient credentials. Given that the ambient capability set is expected to be controlled by the caller, I don't think this is particularly serious. But it is definitely worth fixing so the code works correctly. I have tested to verify my reading of the code is correct and the interpreter of a sgid can receive ambient capabilities with this change and cannot receive ambient capabilities without this change. Cc: stable@vger.kernel.org Cc: Andy Lutomirski <luto@kernel.org> Fixes: 58319057 ("capabilities: ambient capabilities") Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com>
-
- May 21, 2020
-
-
Eric W. Biederman authored
Rename bprm->cap_elevated to bprm->active_secureexec and initialize it in prepare_binprm instead of in cap_bprm_set_creds. Initializing bprm->active_secureexec in prepare_binprm allows multiple implementations of security_bprm_repopulate_creds to play nicely with each other. Rename security_bprm_set_creds to security_bprm_reopulate_creds to emphasize that this path recomputes part of bprm->cred. This recomputation avoids the time of check vs time of use problems that are inherent in unix #! interpreters. In short two renames and a move in the location of initializing bprm->active_secureexec. Link: https://lkml.kernel.org/r/87o8qkzrxp.fsf_-_@x220.int.ebiederm.org Acked-by:
Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by:
Kees Cook <keescook@chromium.org> Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com>
-
- Jul 07, 2019
-
-
Carmeli Tamir authored
Using the existing defined XATTR_SECURITY_PREFIX_LEN instead of sizeof(XATTR_SECURITY_PREFIX) - 1. Pretty simple cleanup. Signed-off-by:
Carmeli Tamir <carmeli.tamir@gmail.com> Signed-off-by:
James Morris <jamorris@linux.microsoft.com>
-
- Jun 11, 2019
-
-
YueHaibing authored
Fix sparse warning: security/commoncap.c:1347:27: warning: symbol 'capability_hooks' was not declared. Should it be static? Reported-by:
Hulk Robot <hulkci@huawei.com> Signed-off-by:
YueHaibing <yuehaibing@huawei.com> Signed-off-by:
James Morris <jamorris@linux.microsoft.com>
-
- May 30, 2019
-
-
Thomas Gleixner authored
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Reviewed-by:
Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Feb 25, 2019
-
-
Micah Morton authored
This should have gone in with commit c1a85a00. Signed-off-by:
Micah Morton <mortonm@chromium.org> Signed-off-by:
James Morris <james.morris@microsoft.com>
-
- Jan 25, 2019
-
-
Richard Guy Briggs authored
V3 namespaced file capabilities were introduced in commit 8db6c34f ("Introduce v3 namespaced file capabilities") Add support for these by adding the "frootid" field to the existing fcaps fields in the NAME and BPRM_FCAPS records. Please see github issue https://github.com/linux-audit/audit-kernel/issues/103 Signed-off-by:
Richard Guy Briggs <rgb@redhat.com> Acked-by:
Serge Hallyn <serge@hallyn.com> [PM: comment tweak to fit an 80 char line width] Signed-off-by:
Paul Moore <paul@paul-moore.com>
-
- Jan 10, 2019
-
-
Micah Morton authored
This patch provides a general mechanism for passing flags to the security_capable LSM hook. It replaces the specific 'audit' flag that is used to tell security_capable whether it should log an audit message for the given capability check. The reason for generalizing this flag passing is so we can add an additional flag that signifies whether security_capable is being called by a setid syscall (which is needed by the proposed SafeSetID LSM). Signed-off-by:
Micah Morton <mortonm@chromium.org> Reviewed-by:
Kees Cook <keescook@chromium.org> Signed-off-by:
James Morris <james.morris@microsoft.com>
-
- Jan 08, 2019
-
-
Kees Cook authored
This converts capabilities to use the new LSM_ORDER_FIRST position. Signed-off-by:
Kees Cook <keescook@chromium.org> Reviewed-by:
Casey Schaufler <casey@schaufler-ca.com>
-
- Dec 12, 2018
-
-
Paul Gortmaker authored
Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in removing such instances is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h might have been the implicit source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each instance for the presence of either and replace as needed. Cc: James Morris <jmorris@namei.org> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: John Johansen <john.johansen@canonical.com> Cc: Mimi Zohar <zohar@linux.ibm.com> Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: linux-security-module@vger.kernel.org Cc: linux-integrity@vger.kernel.org Cc: keyrings@vger.kernel.org Signed-off-by:
Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by:
James Morris <james.morris@microsoft.com>
-
- Aug 29, 2018
-
-
Christian Brauner authored
bprm_caps_from_vfs_caps() never returned -EINVAL so remove the rc == -EINVAL check. Signed-off-by:
Christian Brauner <christian@brauner.io> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Signed-off-by:
James Morris <james.morris@microsoft.com>
-
- Aug 11, 2018
-
-
Eddie.Horng authored
The code in cap_inode_getsecurity(), introduced by commit 8db6c34f ("Introduce v3 namespaced file capabilities"), should use d_find_any_alias() instead of d_find_alias() do handle unhashed dentry correctly. This is needed, for example, if execveat() is called with an open but unlinked overlayfs file, because overlayfs unhashes dentry on unlink. This is a regression of real life application, first reported at https://www.spinics.net/lists/linux-unionfs/msg05363.html Below reproducer and setup can reproduce the case. const char* exec="echo"; const char *newargv[] = { "echo", "hello", NULL}; const char *newenviron[] = { NULL }; int fd, err; fd = open(exec, O_PATH); unlink(exec); err = syscall(322/*SYS_execveat*/, fd, "", newargv, newenviron, AT_EMPTY_PATH); if(err<0) fprintf(stderr, "execveat: %s\n", strerror(errno)); gcc compile into ~/test/a.out mount -t overlay -orw,lowerdir=/mnt/l,upperdir=/mnt/u,workdir=/mnt/w none /mnt/m cd /mnt/m cp /bin/echo . ~/test/a.out Expected result: hello Actually result: execveat: Invalid argument dmesg: Invalid argument reading file caps for /dev/fd/3 The 2nd reproducer and setup emulates similar case but for regular filesystem: const char* exec="echo"; int fd, err; char buf[256]; fd = open(exec, O_RDONLY); unlink(exec); err = fgetxattr(fd, "security.capability", buf, 256); if(err<0) fprintf(stderr, "fgetxattr: %s\n", strerror(errno)); gcc compile into ~/test_fgetxattr cd /tmp cp /bin/echo . ~/test_fgetxattr Result: fgetxattr: Invalid argument On regular filesystem, for example, ext4 read xattr from disk and return to execveat(), will not trigger this issue, however, the overlay attr handler pass real dentry to vfs_getxattr() will. This reproducer calls fgetxattr() with an unlinked fd, involkes vfs_getxattr() then reproduced the case that d_find_alias() in cap_inode_getsecurity() can't find the unlinked dentry. Suggested-by:
Amir Goldstein <amir73il@gmail.com> Acked-by:
Amir Goldstein <amir73il@gmail.com> Acked-by:
Serge E. Hallyn <serge@hallyn.com> Fixes: 8db6c34f ("Introduce v3 namespaced file capabilities") Cc: <stable@vger.kernel.org> # v4.14 Signed-off-by:
Eddie Horng <eddie.horng@mediatek.com> Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com>
-
- May 24, 2018
-
-
Eric W. Biederman authored
A privileged user in s_user_ns will generally have the ability to manipulate the backing store and insert security.* xattrs into the filesystem directly. Therefore the kernel must be prepared to handle these xattrs from unprivileged mounts, and it makes little sense for commoncap to prevent writing these xattrs to the filesystem. The capability and LSM code have already been updated to appropriately handle xattrs from unprivileged mounts, so it is safe to loosen this restriction on setting xattrs. The exception to this logic is that writing xattrs to a mounted filesystem may also cause the LSM inode_post_setxattr or inode_setsecurity callbacks to be invoked. SELinux will deny the xattr update by virtue of applying mountpoint labeling to unprivileged userns mounts, and Smack will deny the writes for any user without global CAP_MAC_ADMIN, so loosening the capability check in commoncap is safe in this respect as well. Signed-off-by:
Seth Forshee <seth.forshee@canonical.com> Acked-by:
Serge Hallyn <serge@hallyn.com> Acked-by:
Christian Brauner <christian@brauner.io> Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com>
-
- Apr 11, 2018
-
-
Tetsuo Handa authored
syzbot is reporting NULL pointer dereference at xattr_getsecurity() [1], for cap_inode_getsecurity() is returning sizeof(struct vfs_cap_data) when memory allocation failed. Return -ENOMEM if memory allocation failed. [1] https://syzkaller.appspot.com/bug?id=a55ba438506fe68649a5f50d2d82d56b365e0107 Signed-off-by:
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: 8db6c34f ("Introduce v3 namespaced file capabilities") Reported-by:
syzbot <syzbot+9369930ca44f29e60e2d@syzkaller.appspotmail.com> Cc: stable <stable@vger.kernel.org> # 4.14+ Acked-by:
Serge E. Hallyn <serge@hallyn.com> Acked-by:
James Morris <james.morris@microsoft.com> Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com>
-
- Jan 02, 2018
-
-
Eric Biggers authored
If userspace attempted to set a "security.capability" xattr shorter than 4 bytes (e.g. 'setfattr -n security.capability -v x file'), then cap_convert_nscap() read past the end of the buffer containing the xattr value because it accessed the ->magic_etc field without verifying that the xattr value is long enough to contain that field. Fix it by validating the xattr value size first. This bug was found using syzkaller with KASAN. The KASAN report was as follows (cleaned up slightly): BUG: KASAN: slab-out-of-bounds in cap_convert_nscap+0x514/0x630 security/commoncap.c:498 Read of size 4 at addr ffff88002d8741c0 by task syz-executor1/2852 CPU: 0 PID: 2852 Comm: syz-executor1 Not tainted 4.15.0-rc6-00200-gcc0aac99d977 #253 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0xe3/0x195 lib/dump_stack.c:53 print_address_description+0x73/0x260 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x235/0x350 mm/kasan/report.c:409 cap_convert_nscap+0x514/0x630 security/commoncap.c:498 setxattr+0x2bd/0x350 fs/xattr.c:446 path_setxattr+0x168/0x1b0 fs/xattr.c:472 SYSC_setxattr fs/xattr.c:487 [inline] SyS_setxattr+0x36/0x50 fs/xattr.c:483 entry_SYSCALL_64_fastpath+0x18/0x85 Fixes: 8db6c34f ("Introduce v3 namespaced file capabilities") Cc: <stable@vger.kernel.org> # v4.14+ Signed-off-by:
Eric Biggers <ebiggers@google.com> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Signed-off-by:
James Morris <james.l.morris@oracle.com>
-
- Oct 20, 2017
-
-
Richard Guy Briggs authored
The existing condition tested for process effective capabilities set by file attributes but intended to ignore the change if the result was unsurprisingly an effective full set in the case root is special with a setuid root executable file and we are root. Stated again: - When you execute a setuid root application, it is no surprise and expected that it got all capabilities, so we do not want capabilities recorded. if (pE_grew && !(pE_fullset && (eff_root || real_root) && root_priveleged) ) Now make sure we cover other cases: - If something prevented a setuid root app getting all capabilities and it wound up with one capability only, then it is a surprise and should be logged. When it is a setuid root file, we only want capabilities when the process does not get full capabilities.. root_priveleged && setuid_root && !pE_fullset - Similarly if a non-setuid program does pick up capabilities due to file system based capabilities, then we want to know what capabilities were picked up. When it has file system based capabilities we want the capabilities. !is_setuid && (has_fcap && pP_gained) - If it is a non-setuid file and it gets ambient capabilities, we want the capabilities. !is_setuid && pA_gained - These last two are combined into one due to the common first parameter. Related: https://github.com/linux-audit/audit-kernel/issues/16 Signed-off-by:
Richard Guy Briggs <rgb@redhat.com> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Acked-by:
James Morris <james.l.morris@oracle.com> Acked-by:
Kees Cook <keescook@chromium.org> Acked-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
James Morris <james.l.morris@oracle.com>
-
Richard Guy Briggs authored
Now that the logic is inverted, it is much easier to see that both real root and effective root conditions had to be met to avoid printing the BPRM_FCAPS record with audit syscalls. This meant that any setuid root applications would print a full BPRM_FCAPS record when it wasn't necessary, cluttering the event output, since the SYSCALL and PATH records indicated the presence of the setuid bit and effective root user id. Require only one of effective root or real root to avoid printing the unnecessary record. Ref: commit 3fc689e9 ("Add audit_log_bprm_fcaps/AUDIT_BPRM_FCAPS") See: https://github.com/linux-audit/audit-kernel/issues/16 Signed-off-by:
Richard Guy Briggs <rgb@redhat.com> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Acked-by:
James Morris <james.l.morris@oracle.com> Acked-by:
Kees Cook <keescook@chromium.org> Acked-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
James Morris <james.l.morris@oracle.com>
-
Richard Guy Briggs authored
The way the logic was presented, it was awkward to read and verify. Invert the logic using DeMorgan's Law to be more easily able to read and understand. Signed-off-by:
Richard Guy Briggs <rgb@redhat.com> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Acked-by:
James Morris <james.l.morris@oracle.com> Acked-by:
Kees Cook <keescook@chromium.org> Okay-ished-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
James Morris <james.l.morris@oracle.com>
-
Richard Guy Briggs authored
Remove a layer of conditional logic to make the use of conditions easier to read and analyse. Signed-off-by:
Richard Guy Briggs <rgb@redhat.com> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Acked-by:
James Morris <james.l.morris@oracle.com> Acked-by:
Kees Cook <keescook@chromium.org> Okay-ished-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
James Morris <james.l.morris@oracle.com>
-
Richard Guy Briggs authored
Move the audit log decision logic to its own function to isolate the complexity in one place. Suggested-by:
Serge Hallyn <serge@hallyn.com> Signed-off-by:
Richard Guy Briggs <rgb@redhat.com> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Acked-by:
James Morris <james.l.morris@oracle.com> Acked-by:
Kees Cook <keescook@chromium.org> Okay-ished-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
James Morris <james.l.morris@oracle.com>
-
Richard Guy Briggs authored
Introduce a number of inlines to make the use of the negation of uid_eq() easier to read and analyse. Signed-off-by:
Richard Guy Briggs <rgb@redhat.com> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Acked-by:
James Morris <james.l.morris@oracle.com> Acked-by:
Kees Cook <keescook@chromium.org> Okay-ished-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
James Morris <james.l.morris@oracle.com>
-
Richard Guy Briggs authored
Introduce inline root_privileged() to make use of SECURE_NONROOT easier to read. Suggested-by:
Serge Hallyn <serge@hallyn.com> Signed-off-by:
Richard Guy Briggs <rgb@redhat.com> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Acked-by:
James Morris <james.l.morris@oracle.com> Acked-by:
Kees Cook <keescook@chromium.org> Okay-ished-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
James Morris <james.l.morris@oracle.com>
-
Richard Guy Briggs authored
Rename has_cap to has_fcap to clarify it applies to file capabilities since the entire source file is about capabilities. Signed-off-by:
Richard Guy Briggs <rgb@redhat.com> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Acked-by:
James Morris <james.l.morris@oracle.com> Acked-by:
Kees Cook <keescook@chromium.org> Okay-ished-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
James Morris <james.l.morris@oracle.com>
-
Richard Guy Briggs authored
Introduce macros cap_gained, cap_grew, cap_full to make the use of the negation of is_subset() easier to read and analyse. Signed-off-by:
Richard Guy Briggs <rgb@redhat.com> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Acked-by:
James Morris <james.l.morris@oracle.com> Acked-by:
Kees Cook <keescook@chromium.org> Okay-ished-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
James Morris <james.l.morris@oracle.com>
-
Richard Guy Briggs authored
Factor out the case of privileged root from the function cap_bprm_set_creds() to make the latter easier to read and analyse. Suggested-by:
Serge Hallyn <serge@hallyn.com> Signed-off-by:
Richard Guy Briggs <rgb@redhat.com> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Acked-by:
James Morris <james.l.morris@oracle.com> Acked-by:
Kees Cook <keescook@chromium.org> Okay-ished-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
James Morris <james.l.morris@oracle.com>
-
- Oct 19, 2017
-
-
Colin Ian King authored
The pointer fs_ns is assigned from inode->i_ib->s_user_ns before a null pointer check on inode, hence if inode is actually null we will get a null pointer dereference on this assignment. Fix this by only dereferencing inode after the null pointer check on inode. Detected by CoverityScan CID#1455328 ("Dereference before null check") Fixes: 8db6c34f ("Introduce v3 namespaced file capabilities") Signed-off-by:
Colin Ian King <colin.king@canonical.com> Cc: stable@vger.kernel.org Acked-by:
Serge Hallyn <serge@hallyn.com> Signed-off-by:
James Morris <james.l.morris@oracle.com>
-
- Sep 24, 2017
-
-
Stefan Berger authored
cap_inode_need_killpriv returns 1 if security.capability exists and has a value and inode_killpriv() is required, 0 otherwise. Fix the description of the return value to reflect this. Signed-off-by:
Stefan Berger <stefanb@linux.vnet.ibm.com> Reviewed-by:
Serge Hallyn <serge@hallyn.com> Signed-off-by:
James Morris <james.l.morris@oracle.com>
-
- Sep 01, 2017
-
-
Serge E. Hallyn authored
Root in a non-initial user ns cannot be trusted to write a traditional security.capability xattr. If it were allowed to do so, then any unprivileged user on the host could map his own uid to root in a private namespace, write the xattr, and execute the file with privilege on the host. However supporting file capabilities in a user namespace is very desirable. Not doing so means that any programs designed to run with limited privilege must continue to support other methods of gaining and dropping privilege. For instance a program installer must detect whether file capabilities can be assigned, and assign them if so but set setuid-root otherwise. The program in turn must know how to drop partial capabilities, and do so only if setuid-root. This patch introduces v3 of the security.capability xattr. It builds a vfs_ns_cap_data struct by appending a uid_t rootid to struct vfs_cap_data. This is the absolute uid_t (that is, the uid_t in user namespace which mounted the filesystem, usually init_user_ns) of the root id in whose namespaces the file capabilities may take effect. When a task asks to write a v2 security.capability xattr, if it is privileged with respect to the userns which mounted the filesystem, then nothing should change. Otherwise, the kernel will transparently rewrite the xattr as a v3 with the appropriate rootid. This is done during the execution of setxattr() to catch user-space-initiated capability writes. Subsequently, any task executing the file which has the noted kuid as its root uid, or which is in a descendent user_ns of such a user_ns, will run the file with capabilities. Similarly when asking to read file capabilities, a v3 capability will be presented as v2 if it applies to the caller's namespace. If a task writes a v3 security.capability, then it can provide a uid for the xattr so long as the uid is valid in its own user namespace, and it is privileged with CAP_SETFCAP over its namespace. The kernel will translate that rootid to an absolute uid, and write that to disk. After this, a task in the writer's namespace will not be able to use those capabilities (unless rootid was 0), but a task in a namespace where the given uid is root will. Only a single security.capability xattr may exist at a time for a given file. A task may overwrite an existing xattr so long as it is privileged over the inode. Note this is a departure from previous semantics, which required privilege to remove a security.capability xattr. This check can be re-added if deemed useful. This allows a simple setxattr to work, allows tar/untar to work, and allows us to tar in one namespace and untar in another while preserving the capability, without risking leaking privilege into a parent namespace. Example using tar: $ cp /bin/sleep sleepx $ mkdir b1 b2 $ lxc-usernsexec -m b:0:100000:1 -m b:1:$(id -u):1 -- chown 0:0 b1 $ lxc-usernsexec -m b:0:100001:1 -m b:1:$(id -u):1 -- chown 0:0 b2 $ lxc-usernsexec -m b:0:100000:1000 -- tar --xattrs-include=security.capability --xattrs -cf b1/sleepx.tar sleepx $ lxc-usernsexec -m b:0:100001:1000 -- tar --xattrs-include=security.capability --xattrs -C b2 -xf b1/sleepx.tar $ lxc-usernsexec -m b:0:100001:1000 -- getcap b2/sleepx b2/sleepx = cap_sys_admin+ep # /opt/ltp/testcases/bin/getv3xattr b2/sleepx v3 xattr, rootid is 100001 A patch to linux-test-project adding a new set of tests for this functionality is in the nsfscaps branch at github.com/hallyn/ltp Changelog: Nov 02 2016: fix invalid check at refuse_fcap_overwrite() Nov 07 2016: convert rootid from and to fs user_ns (From ebiederm: mar 28 2017) commoncap.c: fix typos - s/v4/v3 get_vfs_caps_from_disk: clarify the fs_ns root access check nsfscaps: change the code split for cap_inode_setxattr() Apr 09 2017: don't return v3 cap for caps owned by current root. return a v2 cap for a true v2 cap in non-init ns Apr 18 2017: . Change the flow of fscap writing to support s_user_ns writing. . Remove refuse_fcap_overwrite(). The value of the previous xattr doesn't matter. Apr 24 2017: . incorporate Eric's incremental diff . move cap_convert_nscap to setxattr and simplify its usage May 8, 2017: . fix leaking dentry refcount in cap_inode_getsecurity Signed-off-by:
Serge Hallyn <serge@hallyn.com> Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com>
-