    seccomp: Add wait_killable semantic to seccomp user notifier · c2aa2dfe
    Sargun Dhillon authored
    
    This introduces a per-filter flag (SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV).
    With it set, once the supervisor has received a notification, the
    notifying process transitions to wait killable semantics, so that only
    fatal signals interrupt its wait. Although wait killable isn't a set of
    semantics formally exposed to userspace, the concept is searchable. If
    the notifying process is signaled before the userspace agent receives
    the notification, the signal is handled as normal.
    
    One quirk about how this is handled is that the notifying process
    only switches to TASK_KILLABLE if it receives a wakeup from either
    an addfd or a signal. This is to avoid an unnecessary wakeup of
    the notifying task.
    
    The reasons behind switching into wait_killable only after userspace
    receives the notification are:
    * Avoiding unnecessary work - Often, workloads will start work that they
      may abort (request racing comes to mind). This allows syscalls to be
      aborted safely before the notification is received by the supervisor.
      That way, the supervisor doesn't end up doing work that the workload
      does not want to complete anyway.
    * Avoiding side effects - We don't want the syscall to be interruptible
      once the supervisor starts doing work because it may not be trivial
      to reverse the operation. For example, unmounting a file system may
      take a long time, and it's hard to roll back or to treat as
      reentrant.
    * Avoiding breaking runtimes - Various runtimes do not GC while they are
      in a syscall (or while running native code that subsequently calls a
      syscall). If many notifications are blocked and not picked up by the
      supervisor, this can get the application into a bad state.
    
    Signed-off-by: Sargun Dhillon <sargun@sargun.me>
    Signed-off-by: Kees Cook <keescook@chromium.org>
    Link: https://lore.kernel.org/r/20220503080958.20220-2-sargun@sargun.me
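
To illustrate how a supervisor might opt in to the flag described above, here is a minimal userspace sketch in C. It is not taken from the patch: the helper names (wait_killable_listener_flags, install_notify_filter) and the choice of getpid() as the intercepted syscall are assumptions made for illustration. The fallback #defines are only there for older UAPI headers and are meant to mirror the values in include/uapi/linux/seccomp.h. The sketch pairs the new flag with SECCOMP_FILTER_FLAG_NEW_LISTENER, since the flag only affects filters that have a user notifier attached.

```c
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for older UAPI headers (values as in include/uapi/linux/seccomp.h). */
#ifndef SECCOMP_FILTER_FLAG_NEW_LISTENER
#define SECCOMP_FILTER_FLAG_NEW_LISTENER (1UL << 3)
#endif
#ifndef SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV
#define SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV (1UL << 5)
#endif
#ifndef SECCOMP_RET_USER_NOTIF
#define SECCOMP_RET_USER_NOTIF 0x7fc00000U
#endif

/* Hypothetical helper: flags for a notifier filter whose notifying tasks
 * stop being interruptible by non-fatal signals once the supervisor has
 * received their notification. */
unsigned long wait_killable_listener_flags(void)
{
    return SECCOMP_FILTER_FLAG_NEW_LISTENER |
           SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV;
}

/* Hypothetical helper: install a filter that forwards getpid() to the
 * supervisor and allows everything else. Returns the listener fd, or -1
 * with errno set. A production filter should also check seccomp_data.arch;
 * that is omitted here to keep the sketch short. */
int install_notify_filter(void)
{
    struct sock_filter insns[] = {
        /* Load the syscall number. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* If it is getpid, fall through to USER_NOTIF; otherwise skip to ALLOW. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_getpid, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_USER_NOTIF),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(insns) / sizeof(insns[0])),
        .filter = insns,
    };

    /* Required for unprivileged callers before installing a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;

    return (int)syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER,
                        wait_killable_listener_flags(), &prog);
}
```

In a typical setup the filter would be installed in the workload process and the returned listener fd handed to a separate supervisor, which reads notifications with the SECCOMP_IOCTL_NOTIF_RECV ioctl. Calling install_notify_filter() and then getpid() from the same thread would block, because nothing would be left to answer the notification. On kernels that predate this change, the seccomp() call is expected to fail with EINVAL because the flag is unknown.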