Skip to content
Snippets Groups Projects
  1. Oct 27, 2023
    • WangJinchao's avatar
      padata: Fix refcnt handling in padata_free_shell() · 7ddc21e3
      WangJinchao authored
      
      In a high-load arm64 environment, the pcrypt_aead01 test in LTP can lead
      to system UAF (Use-After-Free) issues. Due to the lengthy analysis of
      the pcrypt_aead01 function call, I'll describe the problem scenario
      using a simplified model:
      
      Suppose there's a user of padata named `user_function` that adheres to
      the padata requirement of calling `padata_free_shell` after `serial()`
      has been invoked, as demonstrated in the following code:
      
      ```c
      struct request {
          struct padata_priv padata;
          struct completion *done;
      };
      
      void parallel(struct padata_priv *padata) {
          do_something();
      }
      
      void serial(struct padata_priv *padata) {
          struct request *request = container_of(padata,
          				struct request,
      				padata);
          complete(request->done);
      }
      
      void user_function() {
          DECLARE_COMPLETION(done)
          padata->parallel = parallel;
          padata->serial = serial;
          padata_do_parallel();
          wait_for_completion(&done);
          padata_free_shell();
      }
      ```
      
      In the corresponding padata.c file, there's the following code:
      
      ```c
      static void padata_serial_worker(struct work_struct *serial_work) {
          ...
          cnt = 0;
      
          while (!list_empty(&local_list)) {
              ...
              padata->serial(padata);
              cnt++;
          }
      
          local_bh_enable();
      
          if (refcount_sub_and_test(cnt, &pd->refcnt))
              padata_free_pd(pd);
      }
      ```
      
      Because of the high system load and the accumulation of unexecuted
      softirq at this moment, `local_bh_enable()` in padata takes longer
      to execute than usual. Subsequently, when accessing `pd->refcnt`,
      `pd` has already been released by `padata_free_shell()`, resulting
      in a UAF issue with `pd->refcnt`.
      
      The fix is straightforward: add `refcount_dec_and_test` before calling
      `padata_free_pd` in `padata_free_shell`.
      
      Fixes: 07928d9b ("padata: Remove broken queue flushing")
      
      Signed-off-by: default avatarWangJinchao <wangjinchao@xfusion.com>
      Acked-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Acked-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      7ddc21e3
  2. Sep 15, 2023
    • Lu Jialin's avatar
      crypto: pcrypt - Fix hungtask for PADATA_RESET · 8f4f68e7
      Lu Jialin authored
      
      We found a hungtask bug in test_aead_vec_cfg as follows:
      
      INFO: task cryptomgr_test:391009 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Call trace:
       __switch_to+0x98/0xe0
       __schedule+0x6c4/0xf40
       schedule+0xd8/0x1b4
       schedule_timeout+0x474/0x560
       wait_for_common+0x368/0x4e0
       wait_for_completion+0x20/0x30
       wait_for_completion+0x20/0x30
       test_aead_vec_cfg+0xab4/0xd50
       test_aead+0x144/0x1f0
       alg_test_aead+0xd8/0x1e0
       alg_test+0x634/0x890
       cryptomgr_test+0x40/0x70
       kthread+0x1e0/0x220
       ret_from_fork+0x10/0x18
       Kernel panic - not syncing: hung_task: blocked tasks
      
      For padata_do_parallel, when the return err is 0 or -EBUSY, it will call
      wait_for_completion(&wait->completion) in test_aead_vec_cfg. In normal
      case, aead_request_complete() will be called in pcrypt_aead_serial and the
      return err is 0 for padata_do_parallel. But, when pinst->flags is
      PADATA_RESET, the return err is -EBUSY for padata_do_parallel, and it
      won't call aead_request_complete(). Therefore, test_aead_vec_cfg will
      hung at wait_for_completion(&wait->completion), which will cause
      hungtask.
      
      The problem comes as following:
      (padata_do_parallel)                 |
          rcu_read_lock_bh();              |
          err = -EINVAL;                   |   (padata_replace)
                                           |     pinst->flags |= PADATA_RESET;
          err = -EBUSY                     |
          if (pinst->flags & PADATA_RESET) |
              rcu_read_unlock_bh()         |
              return err
      
      In order to resolve the problem, we replace the return err -EBUSY with
      -EAGAIN, which means parallel_data is changing, and the caller should call
      it again.
      
      v3:
      remove retry and just change the return err.
      v2:
      introduce padata_try_do_parallel() in pcrypt_aead_encrypt and
      pcrypt_aead_decrypt to solve the hungtask.
      
      Signed-off-by: default avatarLu Jialin <lujialin4@huawei.com>
      Signed-off-by: default avatarGuo Zihua <guozihua@huawei.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      8f4f68e7
  3. Mar 14, 2023
  4. Dec 14, 2022
    • Nathan Chancellor's avatar
      padata: Mark padata_work_init() as __ref · 0d24f1b7
      Nathan Chancellor authored
      
      When building arm64 allmodconfig + ThinLTO with clang and a proposed
      modpost update to account for -ffuncton-sections, the following warning
      appears:
      
        WARNING: modpost: vmlinux.o: section mismatch in reference: padata_work_init (section: .text.padata_work_init) -> padata_mt_helper (section: .init.text)
        WARNING: modpost: vmlinux.o: section mismatch in reference: padata_work_init (section: .text.padata_work_init) -> padata_mt_helper (section: .init.text)
      
      LLVM has optimized padata_work_init() to include the address of
      padata_mt_helper() directly because it inlined the other call to
      padata_work_init() with padata_parallel_worker(), meaning the remaining
      uses of padata_work_init() use padata_mt_helper() as the work_fn
      argument. This optimization causes modpost to complain since
      padata_work_init() is not __init, whereas padata_mt_helper() is.
      
      Since padata_work_init() is only called from __init code when
      padata_mt_helper() is passed as the work_fn argument, mark
      padata_work_init() as __ref, which makes it clear to modpost that this
      scenario is okay.
      
      Suggested-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Signed-off-by: default avatarNathan Chancellor <nathan@kernel.org>
      Acked-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      0d24f1b7
  5. Nov 25, 2022
  6. Jan 31, 2022
  7. Aug 27, 2021
  8. Aug 12, 2021
  9. Jul 30, 2021
  10. Sep 04, 2020
    • Daniel Jordan's avatar
      padata: fix possible padata_works_lock deadlock · 1b0df11f
      Daniel Jordan authored
      
      syzbot reports,
      
        WARNING: inconsistent lock state
        5.9.0-rc2-syzkaller #0 Not tainted
        --------------------------------
        inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
        syz-executor.0/26715 takes:
        (padata_works_lock){+.?.}-{2:2}, at: padata_do_parallel kernel/padata.c:220
        {IN-SOFTIRQ-W} state was registered at:
          spin_lock include/linux/spinlock.h:354 [inline]
          padata_do_parallel kernel/padata.c:220
          ...
          __do_softirq kernel/softirq.c:298
          ...
          sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1091
          asm_sysvec_apic_timer_interrupt arch/x86/include/asm/idtentry.h:581
      
         Possible unsafe locking scenario:
      
               CPU0
               ----
          lock(padata_works_lock);
          <Interrupt>
            lock(padata_works_lock);
      
      padata_do_parallel() takes padata_works_lock with softirqs enabled, so a
      deadlock is possible if, on the same CPU, the lock is acquired in
      process context and then softirq handling done in an interrupt leads to
      the same path.
      
      Fix by leaving softirqs disabled while do_parallel holds
      padata_works_lock.
      
      Reported-by: default avatar <syzbot+f4b9f49e38e25eb4ef52@syzkaller.appspotmail.com>
      Fixes: 4611ce22 ("padata: allocate work structures for parallel jobs from a pool")
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      1b0df11f
  11. Jul 23, 2020
  12. Jun 18, 2020
  13. Jun 04, 2020
    • Daniel Jordan's avatar
      padata: add basic support for multithreaded jobs · 004ed426
      Daniel Jordan authored
      
      Sometimes the kernel doesn't take full advantage of system memory
      bandwidth, leading to a single CPU spending excessive time in
      initialization paths where the data scales with memory size.
      
      Multithreading naturally addresses this problem.
      
      Extend padata, a framework that handles many parallel yet singlethreaded
      jobs, to also handle multithreaded jobs by adding support for splitting up
      the work evenly, specifying a minimum amount of work that's appropriate
      for one helper thread to do, load balancing between helpers, and
      coordinating them.
      
      This is inspired by work from Pavel Tatashin and Steve Sistare.
      
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Tested-by: default avatarJosh Triplett <josh@joshtriplett.org>
      Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Robert Elliott <elliott@hpe.com>
      Cc: Shile Zhang <shile.zhang@linux.alibaba.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Zi Yan <ziy@nvidia.com>
      Link: http://lkml.kernel.org/r/20200527173608.2885243-5-daniel.m.jordan@oracle.com
      
      
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      004ed426
    • Daniel Jordan's avatar
      padata: allocate work structures for parallel jobs from a pool · 4611ce22
      Daniel Jordan authored
      
      padata allocates per-CPU, per-instance work structs for parallel jobs.  A
      do_parallel call assigns a job to a sequence number and hashes the number
      to a CPU, where the job will eventually run using the corresponding work.
      
      This approach fit with how padata used to bind a job to each CPU
      round-robin, makes less sense after commit bfde23ce ("padata: unbind
      parallel jobs from specific CPUs") because a work isn't bound to a
      particular CPU anymore, and isn't needed at all for multithreaded jobs
      because they don't have sequence numbers.
      
      Replace the per-CPU works with a preallocated pool, which allows sharing
      them between existing padata users and the upcoming multithreaded user.
      The pool will also facilitate setting NUMA-aware concurrency limits with
      later users.
      
      The pool is sized according to the number of possible CPUs.  With this
      limit, MAX_OBJ_NUM no longer makes sense, so remove it.
      
      If the global pool is exhausted, a parallel job is run in the current task
      instead to throttle a system trying to do too much in parallel.
      
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Tested-by: default avatarJosh Triplett <josh@joshtriplett.org>
      Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Robert Elliott <elliott@hpe.com>
      Cc: Shile Zhang <shile.zhang@linux.alibaba.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Zi Yan <ziy@nvidia.com>
      Link: http://lkml.kernel.org/r/20200527173608.2885243-4-daniel.m.jordan@oracle.com
      
      
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      4611ce22
    • Daniel Jordan's avatar
      padata: initialize earlier · f1b192b1
      Daniel Jordan authored
      
      padata will soon initialize the system's struct pages in parallel, so it
      needs to be ready by page_alloc_init_late().
      
      The error return from padata_driver_init() triggers an initcall warning,
      so add a warning to padata_init() to avoid silent failure.
      
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Tested-by: default avatarJosh Triplett <josh@joshtriplett.org>
      Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Robert Elliott <elliott@hpe.com>
      Cc: Shile Zhang <shile.zhang@linux.alibaba.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Zi Yan <ziy@nvidia.com>
      Link: http://lkml.kernel.org/r/20200527173608.2885243-3-daniel.m.jordan@oracle.com
      
      
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f1b192b1
    • Daniel Jordan's avatar
      padata: remove exit routine · 305dacf7
      Daniel Jordan authored
      
      Patch series "padata: parallelize deferred page init", v3.
      
      Deferred struct page init is a bottleneck in kernel boot--the biggest for
      us and probably others.  Optimizing it maximizes availability for
      large-memory systems and allows spinning up short-lived VMs as needed
      without having to leave them running.  It also benefits bare metal
      machines hosting VMs that are sensitive to downtime.  In projects such as
      VMM Fast Restart[1], where guest state is preserved across kexec reboot,
      it helps prevent application and network timeouts in the guests.
      
      So, multithread deferred init to take full advantage of system memory
      bandwidth.
      
      Extend padata, a framework that handles many parallel singlethreaded jobs,
      to handle multithreaded jobs as well by adding support for splitting up
      the work evenly, specifying a minimum amount of work that's appropriate
      for one helper thread to do, load balancing between helpers, and
      coordinating them.  More documentation in patches 4 and 8.
      
      This series is the first step in a project to address other memory
      proportional bottlenecks in the kernel such as pmem struct page init, vfio
      page pinning, hugetlb fallocate, and munmap.  Deferred page init doesn't
      require concurrency limits, resource control, or priority adjustments like
      these other users will because it happens during boot when the system is
      otherwise idle and waiting for page init to finish.
      
      This has been run on a variety of x86 systems and speeds up kernel boot by
      4% to 49%, saving up to 1.6 out of 4 seconds.  Patch 6 has more numbers.
      
      This patch (of 8):
      
      padata_driver_exit() is unnecessary because padata isn't built as a module
      and doesn't exit.
      
      padata's init routine will soon allocate memory, so getting rid of the
      exit function now avoids pointless code to free it.
      
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Tested-by: default avatarJosh Triplett <josh@joshtriplett.org>
      Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Robert Elliott <elliott@hpe.com>
      Cc: Shile Zhang <shile.zhang@linux.alibaba.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Zi Yan <ziy@nvidia.com>
      Link: http://lkml.kernel.org/r/20200527173608.2885243-1-daniel.m.jordan@oracle.com
      Link: http://lkml.kernel.org/r/20200527173608.2885243-2-daniel.m.jordan@oracle.com
      
      
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      305dacf7
  14. Apr 30, 2020
    • Daniel Jordan's avatar
      padata: add separate cpuhp node for CPUHP_PADATA_DEAD · 3c2214b6
      Daniel Jordan authored
      
      Removing the pcrypt module triggers this:
      
        general protection fault, probably for non-canonical
          address 0xdead000000000122
        CPU: 5 PID: 264 Comm: modprobe Not tainted 5.6.0+ #2
        Hardware name: QEMU Standard PC
        RIP: 0010:__cpuhp_state_remove_instance+0xcc/0x120
        Call Trace:
         padata_sysfs_release+0x74/0xce
         kobject_put+0x81/0xd0
         padata_free+0x12/0x20
         pcrypt_exit+0x43/0x8ee [pcrypt]
      
      padata instances wrongly use the same hlist node for the online and dead
      states, so __padata_free()'s second cpuhp remove call chokes on the node
      that the first poisoned.
      
      cpuhp multi-instance callbacks only walk forward in cpuhp_step->list and
      the same node is linked in both the online and dead lists, so the list
      corruption that results from padata_alloc() adding the node to a second
      list without removing it from the first doesn't cause problems as long
      as no instances are freed.
      
      Avoid the issue by giving each state its own node.
      
      Fixes: 894c9ef9 ("padata: validate cpumask without removed CPU during offline")
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: stable@vger.kernel.org # v5.4+
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      3c2214b6
  15. Mar 06, 2020
  16. Feb 22, 2020
  17. Dec 11, 2019
    • Daniel Jordan's avatar
      padata: update documentation · bfcdcef8
      Daniel Jordan authored
      
      Remove references to unused functions, standardize language, update to
      reflect new functionality, migrate to rst format, and fix all kernel-doc
      warnings.
      
      Fixes: 815613da ("kernel/padata.c: removed unused code")
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Eric Biggers <ebiggers@kernel.org>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      bfcdcef8
    • Daniel Jordan's avatar
      padata: remove reorder_objects · 3facced7
      Daniel Jordan authored
      
      reorder_objects is unused since the rework of padata's flushing, so
      remove it.
      
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Eric Biggers <ebiggers@kernel.org>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      3facced7
    • Daniel Jordan's avatar
      padata: remove cpumask change notifier · 91a71d61
      Daniel Jordan authored
      
      Since commit 63d35788 ("crypto: pcrypt - remove padata cpumask
      notifier") this feature is unused, so get rid of it.
      
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Eric Biggers <ebiggers@kernel.org>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      91a71d61
    • Daniel Jordan's avatar
      padata: always acquire cpu_hotplug_lock before pinst->lock · 38228e88
      Daniel Jordan authored
      
      lockdep complains when padata's paths to update cpumasks via CPU hotplug
      and sysfs are both taken:
      
        # echo 0 > /sys/devices/system/cpu/cpu1/online
        # echo ff > /sys/kernel/pcrypt/pencrypt/parallel_cpumask
      
        ======================================================
        WARNING: possible circular locking dependency detected
        5.4.0-rc8-padata-cpuhp-v3+ #1 Not tainted
        ------------------------------------------------------
        bash/205 is trying to acquire lock:
        ffffffff8286bcd0 (cpu_hotplug_lock.rw_sem){++++}, at: padata_set_cpumask+0x2b/0x120
      
        but task is already holding lock:
        ffff8880001abfa0 (&pinst->lock){+.+.}, at: padata_set_cpumask+0x26/0x120
      
        which lock already depends on the new lock.
      
      padata doesn't take cpu_hotplug_lock and pinst->lock in a consistent
      order.  Which should be first?  CPU hotplug calls into padata with
      cpu_hotplug_lock already held, so it should have priority.
      
      Fixes: 6751fb3c ("padata: Use get_online_cpus/put_online_cpus")
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Eric Biggers <ebiggers@kernel.org>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      38228e88
    • Daniel Jordan's avatar
      padata: validate cpumask without removed CPU during offline · 894c9ef9
      Daniel Jordan authored
      
      Configuring an instance's parallel mask without any online CPUs...
      
        echo 2 > /sys/kernel/pcrypt/pencrypt/parallel_cpumask
        echo 0 > /sys/devices/system/cpu/cpu1/online
      
      ...makes tcrypt mode=215 crash like this:
      
        divide error: 0000 [#1] SMP PTI
        CPU: 4 PID: 283 Comm: modprobe Not tainted 5.4.0-rc8-padata-doc-v2+ #2
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20191013_105130-anatol 04/01/2014
        RIP: 0010:padata_do_parallel+0x114/0x300
        Call Trace:
         pcrypt_aead_encrypt+0xc0/0xd0 [pcrypt]
         crypto_aead_encrypt+0x1f/0x30
         do_mult_aead_op+0x4e/0xdf [tcrypt]
         test_mb_aead_speed.constprop.0.cold+0x226/0x564 [tcrypt]
         do_test+0x28c2/0x4d49 [tcrypt]
         tcrypt_mod_init+0x55/0x1000 [tcrypt]
         ...
      
      cpumask_weight() in padata_cpu_hash() returns 0 because the mask has no
      CPUs.  The problem is __padata_remove_cpu() checks for valid masks too
      early and so doesn't mark the instance PADATA_INVALID as expected, which
      would have made padata_do_parallel() return error before doing the
      division.
      
      Fix by introducing a second padata CPU hotplug state before
      CPUHP_BRINGUP_CPU so that __padata_remove_cpu() sees the online mask
      without @cpu.  No need for the second argument to padata_replace() since
      @cpu is now already missing from the online mask.
      
      Fixes: 33e54450 ("padata: Handle empty padata cpumasks")
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Eric Biggers <ebiggers@kernel.org>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      894c9ef9
    • Herbert Xu's avatar
      crypto: pcrypt - Avoid deadlock by using per-instance padata queues · bbefa1dd
      Herbert Xu authored
      
      If the pcrypt template is used multiple times in an algorithm, then a
      deadlock occurs because all pcrypt instances share the same
      padata_instance, which completes requests in the order submitted.  That
      is, the inner pcrypt request waits for the outer pcrypt request while
      the outer request is already waiting for the inner.
      
      This patch fixes this by allocating a set of queues for each pcrypt
      instance instead of using two global queues.  In order to maintain
      the existing user-space interface, the pinst structure remains global
      so any sysfs modifications will apply to every pcrypt instance.
      
      Note that when an update occurs we have to allocate memory for
      every pcrypt instance.  Should one of the allocations fail we
      will abort the update without rolling back changes already made.
      
      The new per-instance data structure is called padata_shell and is
      essentially a wrapper around parallel_data.
      
      Reproducer:
      
      	#include <linux/if_alg.h>
      	#include <sys/socket.h>
      	#include <unistd.h>
      
      	int main()
      	{
      		struct sockaddr_alg addr = {
      			.salg_type = "aead",
      			.salg_name = "pcrypt(pcrypt(rfc4106-gcm-aesni))"
      		};
      		int algfd, reqfd;
      		char buf[32] = { 0 };
      
      		algfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
      		bind(algfd, (void *)&addr, sizeof(addr));
      		setsockopt(algfd, SOL_ALG, ALG_SET_KEY, buf, 20);
      		reqfd = accept(algfd, 0, 0);
      		write(reqfd, buf, 32);
      		read(reqfd, buf, 16);
      	}
      
      Reported-by: default avatar <syzbot+56c7151cad94eec37c521f0e47d2eee53f9361c4@syzkaller.appspotmail.com>
      Fixes: 5068c7a8 ("crypto: pcrypt - Add pcrypt crypto parallelization wrapper")
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Tested-by: default avatarEric Biggers <ebiggers@kernel.org>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      bbefa1dd
    • Herbert Xu's avatar
      padata: Remove unused padata_remove_cpu · 13380a14
      Herbert Xu authored
      
      The function padata_remove_cpu was supposed to have been removed
      along with padata_add_cpu but somehow it remained behind.  Let's
      kill it now as it doesn't even have a prototype anymore.
      
      Fixes: 815613da ("kernel/padata.c: removed unused code")
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Reviewed-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      13380a14
    • Herbert Xu's avatar
      padata: Remove broken queue flushing · 07928d9b
      Herbert Xu authored
      
      The function padata_flush_queues is fundamentally broken because
      it cannot force padata users to complete the request that is
      underway.  IOW padata has to passively wait for the completion
      of any outstanding work.
      
      As it stands flushing is used in two places.  Its use in padata_stop
      is simply unnecessary because nothing depends on the queues to
      be flushed afterwards.
      
      The other use in padata_replace is more substantial as we depend
      on it to free the old pd structure.  This patch instead uses the
      pd->refcnt to dynamically free the pd structure once all requests
      are complete.
      
      Fixes: 2b73b07a ("padata: Flush the padata queues actively")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Reviewed-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      07928d9b
  18. Sep 13, 2019
    • Daniel Jordan's avatar
      padata: remove cpu_index from the parallel_queue · c51636a3
      Daniel Jordan authored
      
      With the removal of the ENODATA case from padata_get_next, the cpu_index
      field is no longer useful, so it can go away.
      
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Acked-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      c51636a3
    • Daniel Jordan's avatar
      padata: unbind parallel jobs from specific CPUs · bfde23ce
      Daniel Jordan authored
      
      Padata binds the parallel part of a job to a single CPU and round-robins
      over all CPUs in the system for each successive job.  Though the serial
      parts rely on per-CPU queues for correct ordering, they're not necessary
      for parallel work, and it improves performance to run the job locally on
      NUMA machines and let the scheduler pick the CPU within a node on a busy
      system.
      
      So, make the parallel workqueue unbound.
      
      Update the parallel workqueue's cpumask when the instance's parallel
      cpumask changes.
      
      Now that parallel jobs no longer run on max_active=1 workqueues, two or
      more parallel works that hash to the same CPU may run simultaneously,
      finish out of order, and so be serialized out of order.  Prevent this by
      keeping the works sorted on the reorder list by sequence number and
      checking that in the reordering logic.
      
      padata_get_next becomes padata_find_next so it can be reused for the end
      of padata_reorder, where it's used to avoid uselessly queueing work when
      the next job by sequence number isn't finished yet but a later job that
      hashed to the same CPU has.
      
      The ENODATA case in padata_find_next no longer makes sense because
      parallel jobs aren't bound to specific CPUs.  The EINPROGRESS case takes
      care of the scenario where a parallel job is potentially running on the
      same CPU as padata_find_next, and with only one error code left, just
      use NULL instead.
      
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      bfde23ce
    • Daniel Jordan's avatar
      padata: use separate workqueues for parallel and serial work · 45d153c0
      Daniel Jordan authored
      
      padata currently uses one per-CPU workqueue per instance for all work.
      
      Prepare for running parallel jobs on an unbound workqueue by introducing
      dedicated workqueues for parallel and serial work.
      
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Acked-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      45d153c0
    • Daniel Jordan's avatar
      padata, pcrypt: take CPU hotplug lock internally in padata_alloc_possible · cc491d8e
      Daniel Jordan authored
      
      With pcrypt's cpumask no longer used, take the CPU hotplug lock inside
      padata_alloc_possible.
      
      Useful later in the series for avoiding nested acquisition of the CPU
      hotplug lock in padata when padata_alloc_possible is allocating an
      unbound workqueue.
      
      Without this patch, this nested acquisition would happen later in the
      series:
      
            pcrypt_init_padata
              get_online_cpus
              alloc_padata_possible
                alloc_padata
                  alloc_workqueue(WQ_UNBOUND)   // later in the series
                    alloc_and_link_pwqs
                      apply_wqattrs_lock
                        get_online_cpus         // recursive rwsem acquisition
      
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Acked-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      cc491d8e
    • Daniel Jordan's avatar
      padata: make padata_do_parallel find alternate callback CPU · e6ce0e08
      Daniel Jordan authored
      
      padata_do_parallel currently returns -EINVAL if the callback CPU isn't
      in the callback cpumask.
      
      pcrypt tries to prevent this situation by keeping its own callback
      cpumask in sync with padata's and checks that the callback CPU it passes
      to padata is valid.  Make padata handle this instead.
      
      padata_do_parallel now takes a pointer to the callback CPU and updates
      it for the caller if an alternate CPU is used.  Overall behavior in
      terms of which callback CPUs are chosen stays the same.
      
      Prepares for removal of the padata cpumask notifier in pcrypt, which
      will fix a lockdep complaint about nested acquisition of the CPU hotplug
      lock later in the series.
      
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Acked-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      e6ce0e08
    • Daniel Jordan's avatar
      padata: allocate workqueue internally · b128a304
      Daniel Jordan authored
      
      Move workqueue allocation inside of padata to prepare for further
      changes to how padata uses workqueues.
      
      Guarantees the workqueue is created with max_active=1, which padata
      relies on to work correctly.  No functional change.
      
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Acked-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      b128a304
Loading