- Jan 31, 2022
-
-
Yury Norov authored
padata_do_parallel() calls cpumask_weight() to check if any bit of a given cpumask is set. We can do it more efficiently with cpumask_empty() because cpumask_empty() stops traversing the cpumask as soon as it finds first set bit, while cpumask_weight() counts all bits unconditionally. Signed-off-by:
Yury Norov <yury.norov@gmail.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Aug 27, 2021
-
-
Cai Huoqing authored
remove it because SPDX-License-Identifier is already used Signed-off-by:
Cai Huoqing <caihuoqing@baidu.com> Acked-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Aug 12, 2021
-
-
Sebastian Andrzej Siewior authored
The functions get_online_cpus() and put_online_cpus() have been deprecated during the CPU hotplug rework. They map directly to cpus_read_lock() and cpus_read_unlock(). Replace deprecated CPU-hotplug functions with the official version. The behavior remains unchanged. Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: linux-crypto@vger.kernel.org Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Jul 30, 2021
-
-
Xiyu Yang authored
refcount_t type and corresponding API can protect refcounters from accidental underflow and overflow and further use-after-free situations. Signed-off-by:
Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by:
Xin Tan <tanxin.ctf@gmail.com> Acked-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Sep 04, 2020
-
-
Daniel Jordan authored
syzbot reports, WARNING: inconsistent lock state 5.9.0-rc2-syzkaller #0 Not tainted -------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. syz-executor.0/26715 takes: (padata_works_lock){+.?.}-{2:2}, at: padata_do_parallel kernel/padata.c:220 {IN-SOFTIRQ-W} state was registered at: spin_lock include/linux/spinlock.h:354 [inline] padata_do_parallel kernel/padata.c:220 ... __do_softirq kernel/softirq.c:298 ... sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1091 asm_sysvec_apic_timer_interrupt arch/x86/include/asm/idtentry.h:581 Possible unsafe locking scenario: CPU0 ---- lock(padata_works_lock); <Interrupt> lock(padata_works_lock); padata_do_parallel() takes padata_works_lock with softirqs enabled, so a deadlock is possible if, on the same CPU, the lock is acquired in process context and then softirq handling done in an interrupt leads to the same path. Fix by leaving softirqs disabled while do_parallel holds padata_works_lock. Reported-by:
<syzbot+f4b9f49e38e25eb4ef52@syzkaller.appspotmail.com> Fixes: 4611ce22 ("padata: allocate work structures for parallel jobs from a pool") Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Jul 23, 2020
-
-
Daniel Jordan authored
Only its reorder field is actually used now, so remove the struct and embed @reorder directly in parallel_data. No functional change, just a cleanup. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Daniel Jordan authored
There's no reason to have two interfaces when there's only one caller. Removing _possible saves text and simplifies future changes. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Daniel Jordan authored
A padata instance has effective cpumasks that store the user-supplied masks ANDed with the online mask, but this middleman is unnecessary. parallel_data keeps the same information around. Removing this saves text and code churn in future changes. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Daniel Jordan authored
pd_setup_cpumasks() has only one caller. Move its contents inline to prepare for the next cleanup. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Daniel Jordan authored
padata_stop() has two callers and is unnecessary in both cases. When pcrypt calls it before padata_free(), it's being unloaded so there are no outstanding padata jobs[0]. When __padata_free() calls it, it's either along the same path or else pcrypt initialization failed, which of course means there are also no outstanding jobs. Removing it simplifies padata and saves text. [0] https://lore.kernel.org/linux-crypto/20191119225017.mjrak2fwa5vccazl@gondor.apana.org.au/ Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Daniel Jordan authored
padata_start() is only used right after pcrypt allocates an instance with all possible CPUs, when PADATA_INVALID can't happen, so there's no need for a separate "start" step. It can be done during allocation to save text, make using padata easier, and avoid unneeded calls in the future. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Jun 18, 2020
-
-
Daniel Jordan authored
A 5.7 kernel hangs during a tcrypt test of padata that waits for an AEAD request to finish. This is only seen on large machines running many concurrent requests. The issue is that padata never serializes the request. The removal of the reorder_objects atomic missed that the memory barrier in padata_do_serial() depends on it. Upgrade the barrier from smp_mb__after_atomic to smp_mb to get correct ordering again. Fixes: 3facced7 ("padata: remove reorder_objects") Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-kernel@vger.kernel.org Cc: <stable@vger.kernel.org> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Jun 04, 2020
-
-
Daniel Jordan authored
Sometimes the kernel doesn't take full advantage of system memory bandwidth, leading to a single CPU spending excessive time in initialization paths where the data scales with memory size. Multithreading naturally addresses this problem. Extend padata, a framework that handles many parallel yet singlethreaded jobs, to also handle multithreaded jobs by adding support for splitting up the work evenly, specifying a minimum amount of work that's appropriate for one helper thread to do, load balancing between helpers, and coordinating them. This is inspired by work from Pavel Tatashin and Steve Sistare. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Tested-by:
Josh Triplett <josh@joshtriplett.org> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Robert Elliott <elliott@hpe.com> Cc: Shile Zhang <shile.zhang@linux.alibaba.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Tejun Heo <tj@kernel.org> Cc: Zi Yan <ziy@nvidia.com> Link: http://lkml.kernel.org/r/20200527173608.2885243-5-daniel.m.jordan@oracle.com Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Daniel Jordan authored
padata allocates per-CPU, per-instance work structs for parallel jobs. A do_parallel call assigns a job to a sequence number and hashes the number to a CPU, where the job will eventually run using the corresponding work. This approach fit with how padata used to bind a job to each CPU round-robin, makes less sense after commit bfde23ce ("padata: unbind parallel jobs from specific CPUs") because a work isn't bound to a particular CPU anymore, and isn't needed at all for multithreaded jobs because they don't have sequence numbers. Replace the per-CPU works with a preallocated pool, which allows sharing them between existing padata users and the upcoming multithreaded user. The pool will also facilitate setting NUMA-aware concurrency limits with later users. The pool is sized according to the number of possible CPUs. With this limit, MAX_OBJ_NUM no longer makes sense, so remove it. If the global pool is exhausted, a parallel job is run in the current task instead to throttle a system trying to do too much in parallel. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Tested-by:
Josh Triplett <josh@joshtriplett.org> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Robert Elliott <elliott@hpe.com> Cc: Shile Zhang <shile.zhang@linux.alibaba.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Tejun Heo <tj@kernel.org> Cc: Zi Yan <ziy@nvidia.com> Link: http://lkml.kernel.org/r/20200527173608.2885243-4-daniel.m.jordan@oracle.com Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Daniel Jordan authored
padata will soon initialize the system's struct pages in parallel, so it needs to be ready by page_alloc_init_late(). The error return from padata_driver_init() triggers an initcall warning, so add a warning to padata_init() to avoid silent failure. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Tested-by:
Josh Triplett <josh@joshtriplett.org> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Robert Elliott <elliott@hpe.com> Cc: Shile Zhang <shile.zhang@linux.alibaba.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Tejun Heo <tj@kernel.org> Cc: Zi Yan <ziy@nvidia.com> Link: http://lkml.kernel.org/r/20200527173608.2885243-3-daniel.m.jordan@oracle.com Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Daniel Jordan authored
Patch series "padata: parallelize deferred page init", v3. Deferred struct page init is a bottleneck in kernel boot--the biggest for us and probably others. Optimizing it maximizes availability for large-memory systems and allows spinning up short-lived VMs as needed without having to leave them running. It also benefits bare metal machines hosting VMs that are sensitive to downtime. In projects such as VMM Fast Restart[1], where guest state is preserved across kexec reboot, it helps prevent application and network timeouts in the guests. So, multithread deferred init to take full advantage of system memory bandwidth. Extend padata, a framework that handles many parallel singlethreaded jobs, to handle multithreaded jobs as well by adding support for splitting up the work evenly, specifying a minimum amount of work that's appropriate for one helper thread to do, load balancing between helpers, and coordinating them. More documentation in patches 4 and 8. This series is the first step in a project to address other memory proportional bottlenecks in the kernel such as pmem struct page init, vfio page pinning, hugetlb fallocate, and munmap. Deferred page init doesn't require concurrency limits, resource control, or priority adjustments like these other users will because it happens during boot when the system is otherwise idle and waiting for page init to finish. This has been run on a variety of x86 systems and speeds up kernel boot by 4% to 49%, saving up to 1.6 out of 4 seconds. Patch 6 has more numbers. This patch (of 8): padata_driver_exit() is unnecessary because padata isn't built as a module and doesn't exit. padata's init routine will soon allocate memory, so getting rid of the exit function now avoids pointless code to free it. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Tested-by:
Josh Triplett <josh@joshtriplett.org> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Robert Elliott <elliott@hpe.com> Cc: Shile Zhang <shile.zhang@linux.alibaba.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Tejun Heo <tj@kernel.org> Cc: Zi Yan <ziy@nvidia.com> Link: http://lkml.kernel.org/r/20200527173608.2885243-1-daniel.m.jordan@oracle.com Link: http://lkml.kernel.org/r/20200527173608.2885243-2-daniel.m.jordan@oracle.com Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Apr 30, 2020
-
-
Daniel Jordan authored
Removing the pcrypt module triggers this: general protection fault, probably for non-canonical address 0xdead000000000122 CPU: 5 PID: 264 Comm: modprobe Not tainted 5.6.0+ #2 Hardware name: QEMU Standard PC RIP: 0010:__cpuhp_state_remove_instance+0xcc/0x120 Call Trace: padata_sysfs_release+0x74/0xce kobject_put+0x81/0xd0 padata_free+0x12/0x20 pcrypt_exit+0x43/0x8ee [pcrypt] padata instances wrongly use the same hlist node for the online and dead states, so __padata_free()'s second cpuhp remove call chokes on the node that the first poisoned. cpuhp multi-instance callbacks only walk forward in cpuhp_step->list and the same node is linked in both the online and dead lists, so the list corruption that results from padata_alloc() adding the node to a second list without removing it from the first doesn't cause problems as long as no instances are freed. Avoid the issue by giving each state its own node. Fixes: 894c9ef9 ("padata: validate cpumask without removed CPU during offline") Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # v5.4+ Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Mar 06, 2020
-
-
Eric Biggers authored
Simplify the error handling in pcrypt_create_aead() by taking advantage of crypto_grab_aead() now handling an ERR_PTR() name and by taking advantage of crypto_drop_aead() now accepting (as a no-op) a spawn that hasn't been grabbed yet. This required also making padata_free_shell() accept a NULL argument. Signed-off-by:
Eric Biggers <ebiggers@google.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Feb 22, 2020
-
-
Daniel Jordan authored
According to Geert's report[0], kernel/padata.c: warning: 'err' may be used uninitialized in this function [-Wuninitialized]: => 539:2 Warning is seen only with older compilers on certain archs. The runtime effect is potentially returning garbage down the stack when padata's cpumasks are modified before any pcrypt requests have run. Simplest fix is to initialize err to the success value. [0] http://lkml.kernel.org/r/20200210135506.11536-1-geert@linux-m68k.org Reported-by:
Geert Uytterhoeven <geert@linux-m68k.org> Fixes: bbefa1dd ("crypto: pcrypt - Avoid deadlock by using per-instance padata queues") Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Dec 11, 2019
-
-
Daniel Jordan authored
Remove references to unused functions, standardize language, update to reflect new functionality, migrate to rst format, and fix all kernel-doc warnings. Fixes: 815613da ("kernel/padata.c: removed unused code") Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Daniel Jordan authored
reorder_objects is unused since the rework of padata's flushing, so remove it. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Daniel Jordan authored
Since commit 63d35788 ("crypto: pcrypt - remove padata cpumask notifier") this feature is unused, so get rid of it. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Daniel Jordan authored
lockdep complains when padata's paths to update cpumasks via CPU hotplug and sysfs are both taken: # echo 0 > /sys/devices/system/cpu/cpu1/online # echo ff > /sys/kernel/pcrypt/pencrypt/parallel_cpumask ====================================================== WARNING: possible circular locking dependency detected 5.4.0-rc8-padata-cpuhp-v3+ #1 Not tainted ------------------------------------------------------ bash/205 is trying to acquire lock: ffffffff8286bcd0 (cpu_hotplug_lock.rw_sem){++++}, at: padata_set_cpumask+0x2b/0x120 but task is already holding lock: ffff8880001abfa0 (&pinst->lock){+.+.}, at: padata_set_cpumask+0x26/0x120 which lock already depends on the new lock. padata doesn't take cpu_hotplug_lock and pinst->lock in a consistent order. Which should be first? CPU hotplug calls into padata with cpu_hotplug_lock already held, so it should have priority. Fixes: 6751fb3c ("padata: Use get_online_cpus/put_online_cpus") Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Daniel Jordan authored
Configuring an instance's parallel mask without any online CPUs... echo 2 > /sys/kernel/pcrypt/pencrypt/parallel_cpumask echo 0 > /sys/devices/system/cpu/cpu1/online ...makes tcrypt mode=215 crash like this: divide error: 0000 [#1] SMP PTI CPU: 4 PID: 283 Comm: modprobe Not tainted 5.4.0-rc8-padata-doc-v2+ #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20191013_105130-anatol 04/01/2014 RIP: 0010:padata_do_parallel+0x114/0x300 Call Trace: pcrypt_aead_encrypt+0xc0/0xd0 [pcrypt] crypto_aead_encrypt+0x1f/0x30 do_mult_aead_op+0x4e/0xdf [tcrypt] test_mb_aead_speed.constprop.0.cold+0x226/0x564 [tcrypt] do_test+0x28c2/0x4d49 [tcrypt] tcrypt_mod_init+0x55/0x1000 [tcrypt] ... cpumask_weight() in padata_cpu_hash() returns 0 because the mask has no CPUs. The problem is __padata_remove_cpu() checks for valid masks too early and so doesn't mark the instance PADATA_INVALID as expected, which would have made padata_do_parallel() return error before doing the division. Fix by introducing a second padata CPU hotplug state before CPUHP_BRINGUP_CPU so that __padata_remove_cpu() sees the online mask without @cpu. No need for the second argument to padata_replace() since @cpu is now already missing from the online mask. Fixes: 33e54450 ("padata: Handle empty padata cpumasks") Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Herbert Xu authored
If the pcrypt template is used multiple times in an algorithm, then a deadlock occurs because all pcrypt instances share the same padata_instance, which completes requests in the order submitted. That is, the inner pcrypt request waits for the outer pcrypt request while the outer request is already waiting for the inner. This patch fixes this by allocating a set of queues for each pcrypt instance instead of using two global queues. In order to maintain the existing user-space interface, the pinst structure remains global so any sysfs modifications will apply to every pcrypt instance. Note that when an update occurs we have to allocate memory for every pcrypt instance. Should one of the allocations fail we will abort the update without rolling back changes already made. The new per-instance data structure is called padata_shell and is essentially a wrapper around parallel_data. Reproducer: #include <linux/if_alg.h> #include <sys/socket.h> #include <unistd.h> int main() { struct sockaddr_alg addr = { .salg_type = "aead", .salg_name = "pcrypt(pcrypt(rfc4106-gcm-aesni))" }; int algfd, reqfd; char buf[32] = { 0 }; algfd = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(algfd, (void *)&addr, sizeof(addr)); setsockopt(algfd, SOL_ALG, ALG_SET_KEY, buf, 20); reqfd = accept(algfd, 0, 0); write(reqfd, buf, 32); read(reqfd, buf, 16); } Reported-by:
<syzbot+56c7151cad94eec37c521f0e47d2eee53f9361c4@syzkaller.appspotmail.com> Fixes: 5068c7a8 ("crypto: pcrypt - Add pcrypt crypto parallelization wrapper") Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Tested-by:
Eric Biggers <ebiggers@kernel.org> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Herbert Xu authored
The function padata_remove_cpu was supposed to have been removed along with padata_add_cpu but somehow it remained behind. Let's kill it now as it doesn't even have a prototype anymore. Fixes: 815613da ("kernel/padata.c: removed unused code") Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Herbert Xu authored
The function padata_flush_queues is fundamentally broken because it cannot force padata users to complete the request that is underway. IOW padata has to passively wait for the completion of any outstanding work. As it stands flushing is used in two places. Its use in padata_stop is simply unnecessary because nothing depends on the queues to be flushed afterwards. The other use in padata_replace is more substantial as we depend on it to free the old pd structure. This patch instead uses the pd->refcnt to dynamically free the pd structure once all requests are complete. Fixes: 2b73b07a ("padata: Flush the padata queues actively") Cc: <stable@vger.kernel.org> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Sep 13, 2019
-
-
Daniel Jordan authored
With the removal of the ENODATA case from padata_get_next, the cpu_index field is no longer useful, so it can go away. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Acked-by:
Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Daniel Jordan authored
Padata binds the parallel part of a job to a single CPU and round-robins over all CPUs in the system for each successive job. Though the serial parts rely on per-CPU queues for correct ordering, they're not necessary for parallel work, and it improves performance to run the job locally on NUMA machines and let the scheduler pick the CPU within a node on a busy system. So, make the parallel workqueue unbound. Update the parallel workqueue's cpumask when the instance's parallel cpumask changes. Now that parallel jobs no longer run on max_active=1 workqueues, two or more parallel works that hash to the same CPU may run simultaneously, finish out of order, and so be serialized out of order. Prevent this by keeping the works sorted on the reorder list by sequence number and checking that in the reordering logic. padata_get_next becomes padata_find_next so it can be reused for the end of padata_reorder, where it's used to avoid uselessly queueing work when the next job by sequence number isn't finished yet but a later job that hashed to the same CPU has. The ENODATA case in padata_find_next no longer makes sense because parallel jobs aren't bound to specific CPUs. The EINPROGRESS case takes care of the scenario where a parallel job is potentially running on the same CPU as padata_find_next, and with only one error code left, just use NULL instead. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Tejun Heo <tj@kernel.org> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Daniel Jordan authored
padata currently uses one per-CPU workqueue per instance for all work. Prepare for running parallel jobs on an unbound workqueue by introducing dedicated workqueues for parallel and serial work. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Acked-by:
Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Daniel Jordan authored
With pcrypt's cpumask no longer used, take the CPU hotplug lock inside padata_alloc_possible. Useful later in the series for avoiding nested acquisition of the CPU hotplug lock in padata when padata_alloc_possible is allocating an unbound workqueue. Without this patch, this nested acquisition would happen later in the series: pcrypt_init_padata get_online_cpus alloc_padata_possible alloc_padata alloc_workqueue(WQ_UNBOUND) // later in the series alloc_and_link_pwqs apply_wqattrs_lock get_online_cpus // recursive rwsem acquisition Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Acked-by:
Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Daniel Jordan authored
padata_do_parallel currently returns -EINVAL if the callback CPU isn't in the callback cpumask. pcrypt tries to prevent this situation by keeping its own callback cpumask in sync with padata's and checks that the callback CPU it passes to padata is valid. Make padata handle this instead. padata_do_parallel now takes a pointer to the callback CPU and updates it for the caller if an alternate CPU is used. Overall behavior in terms of which callback CPUs are chosen stays the same. Prepares for removal of the padata cpumask notifier in pcrypt, which will fix a lockdep complaint about nested acquisition of the CPU hotplug lock later in the series. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Acked-by:
Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Daniel Jordan authored
Move workqueue allocation inside of padata to prepare for further changes to how padata uses workqueues. Guarantees the workqueue is created with max_active=1, which padata relies on to work correctly. No functional change. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Acked-by:
Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: linux-crypto@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Aug 09, 2019
-
-
Daniel Jordan authored
Exercising CPU hotplug on a 5.2 kernel with recent padata fixes from cryptodev-2.6.git in an 8-CPU kvm guest... # modprobe tcrypt alg="pcrypt(rfc4106(gcm(aes)))" type=3 # echo 0 > /sys/devices/system/cpu/cpu1/online # echo c > /sys/kernel/pcrypt/pencrypt/parallel_cpumask # modprobe tcrypt mode=215 ...caused the following crash: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 2 PID: 134 Comm: kworker/2:2 Not tainted 5.2.0-padata-base+ #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-<snip> Workqueue: pencrypt padata_parallel_worker RIP: 0010:padata_reorder+0xcb/0x180 ... Call Trace: padata_do_serial+0x57/0x60 pcrypt_aead_enc+0x3a/0x50 [pcrypt] padata_parallel_worker+0x9b/0xe0 process_one_work+0x1b5/0x3f0 worker_thread+0x4a/0x3c0 ... In padata_alloc_pd, pd->cpu is set using the user-supplied cpumask instead of the effective cpumask, and in this case cpumask_first picked an offline CPU. The offline CPU's reorder->list.next is NULL in padata_reorder because the list wasn't initialized in padata_init_pqueues, which only operates on CPUs in the effective mask. Fix by using the effective mask in padata_alloc_pd. Fixes: 6fc4dbcf ("padata: Replace delayed timer with immediate workqueue in padata_reorder") Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Jul 27, 2019
-
-
Daniel Jordan authored
With the removal of the padata timer, padata_do_serial no longer needs special CPU handling, so remove it. Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Herbert Xu authored
The function padata_reorder will use a timer when it cannot progress while completed jobs are outstanding (pd->reorder_objects > 0). This is suboptimal as if we do end up using the timer then it would have introduced a gratuitous delay of one second. In fact we can easily distinguish between whether completed jobs are outstanding and whether we can make progress. All we have to do is look at the next pqueue list. This patch does that by replacing pd->processed with pd->cpu so that the next pqueue is more accessible. A work queue is used instead of the original try_again to avoid hogging the CPU. Note that we don't bother removing the work queue in padata_flush_queues because the whole premise is broken. You cannot flush async crypto requests so it makes no sense to even try. A subsequent patch will fix it by replacing it with a ref counting scheme. Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Jul 18, 2019
-
-
Daniel Jordan authored
Testing padata with the tcrypt module on a 5.2 kernel... # modprobe tcrypt alg="pcrypt(rfc4106(gcm(aes)))" type=3 # modprobe tcrypt mode=211 sec=1 ...produces this splat: INFO: task modprobe:10075 blocked for more than 120 seconds. Not tainted 5.2.0-base+ #16 modprobe D 0 10075 10064 0x80004080 Call Trace: ? __schedule+0x4dd/0x610 ? ring_buffer_unlock_commit+0x23/0x100 schedule+0x6c/0x90 schedule_timeout+0x3b/0x320 ? trace_buffer_unlock_commit_regs+0x4f/0x1f0 wait_for_common+0x160/0x1a0 ? wake_up_q+0x80/0x80 { crypto_wait_req } # entries in braces added by hand { do_one_aead_op } { test_aead_jiffies } test_aead_speed.constprop.17+0x681/0xf30 [tcrypt] do_test+0x4053/0x6a2b [tcrypt] ? 0xffffffffa00f4000 tcrypt_mod_init+0x50/0x1000 [tcrypt] ... The second modprobe command never finishes because in padata_reorder, CPU0's load of reorder_objects is executed before the unlocking store in spin_unlock_bh(pd->lock), causing CPU0 to miss CPU1's increment: CPU0 CPU1 padata_reorder padata_do_serial LOAD reorder_objects // 0 INC reorder_objects // 1 padata_reorder TRYLOCK pd->lock // failed UNLOCK pd->lock CPU0 deletes the timer before returning from padata_reorder and since no other job is submitted to padata, modprobe waits indefinitely. Add a pair of full barriers to guarantee proper ordering: CPU0 CPU1 padata_reorder padata_do_serial UNLOCK pd->lock smp_mb() LOAD reorder_objects INC reorder_objects smp_mb__after_atomic() padata_reorder TRYLOCK pd->lock smp_mb__after_atomic is needed so the read part of the trylock operation comes after the INC, as Andrea points out. Thanks also to Andrea for help with writing a litmus test. Fixes: 16295bec ("padata: Generic parallelization/serialization interface") Signed-off-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Cc: <stable@vger.kernel.org> Cc: Andrea Parri <andrea.parri@amarulasolutions.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-arch@vger.kernel.org Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Apr 25, 2019
-
-
Kimberly Brown authored
The kobj_type default_attrs field is being replaced by the default_groups field. Replace padata_attr_type's default_attrs field with default_groups and use the ATTRIBUTE_GROUPS macro to create padata_default_groups. This patch was tested by loading the pcrypt module and verifying that the sysfs files for the attributes in the default groups were created. Signed-off-by:
Kimberly Brown <kimbrownkd@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Nov 16, 2018
-
-
Colin Ian King authored
Trivial fix to clean up an indentation issue Signed-off-by:
Colin Ian King <colin.king@canonical.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Jan 05, 2018
-
-
Cheah Kok Cheong authored
Add SPDX license identifier according to the type of license text found in the file. Cc: Philippe Ombredanne <pombredanne@nexb.com> Signed-off-by:
Cheah Kok Cheong <thrust73@gmail.com> Acked-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-