- Oct 04, 2023
-
-
Andreas Gruenbacher authored
Add a kthread_stop_put() helper that stops a thread and puts its task struct. Use it to replace the various instances of kthread_stop() followed by put_task_struct(). Remove the kthread_stop_put() macro in usbip that is similar but doesn't return the result of kthread_stop().
[agruenba@redhat.com: fix kerneldoc comment]
Link: https://lkml.kernel.org/r/20230911111730.2565537-1-agruenba@redhat.com
[akpm@linux-foundation.org: document kthread_stop_put()'s argument]
Link: https://lkml.kernel.org/r/20230907234048.2499820-1-agruenba@redhat.com
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
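A minimal sketch of what the helper folds together, assuming it is simply kthread_stop() followed by put_task_struct() with the stop result passed through:

    /* Hedged sketch of kthread_stop_put(): stop the thread, drop the
     * caller's task_struct reference, return what kthread_stop() returned. */
    int kthread_stop_put(struct task_struct *k)
    {
        int ret;

        ret = kthread_stop(k);
        put_task_struct(k);
        return ret;
    }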
-
Mateusz Guzik authored
The feature got retired in f1a79412 ("mm: convert mm's rss stats into percpu_counter"), but the patch failed to fully clean it up.
Link: https://lkml.kernel.org/r/20230823170556.2281747-1-mjguzik@gmail.com
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
- Aug 18, 2023
-
-
Greg Kroah-Hartman authored
There are no in-kernel users of __kthread_should_park() so mark it as static and do not export it.
Link: https://lkml.kernel.org/r/2023080450-handcuff-stump-1d6e@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: John Stultz <jstultz@google.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: "Arve Hjønnevåg" <arve@android.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Christian Brauner (Microsoft)" <brauner@kernel.org>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Zqiang <qiang1.zhang@intel.com>
Cc: Prathu Baronia <quic_pbaronia@quicinc.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
- Jun 16, 2023
-
-
Arve Hjønnevåg authored
kthread_park and wait_woken have a similar race that kthread_stop and wait_woken used to have before it was fixed in commit cb6538e7 ("sched/wait: Fix a kthread race with wait_woken()"). Extend that fix to also cover kthread_park.
[jstultz: Made changes suggested by Peter to optimize memory loads]
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Link: https://lore.kernel.org/r/20230602212350.535358-1-jstultz@google.com
-
- Jun 10, 2023
-
-
Prathu Baronia authored
- `If present` -> `If present,`
- `reuturn` -> `return`
- `function exit safely` -> `function to exit safely`
Link: https://lkml.kernel.org/r/20230502090242.3037194-1-quic_pbaronia@quicinc.com
Signed-off-by: Prathu Baronia <quic_pbaronia@quicinc.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
- Mar 28, 2023
-
-
Nicholas Piggin authored
Add explicit _lazy_tlb annotated functions for lazy tlb mm refcounting. This makes the lazy tlb mm references more obvious, and allows the refcounting scheme to be modified in later changes. There is no functional change with this patch.
Link: https://lkml.kernel.org/r/20230203071837.1136453-3-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
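A minimal sketch of the annotation wrappers this introduces, assuming they start out as plain pass-throughs to mmgrab()/mmdrop(); the names follow the series, and the trivial bodies are the point, since later patches change the scheme behind them:

    /* Hedged sketch: explicit lazy-tlb reference helpers that initially
     * just forward to the ordinary mm refcount operations. */
    static inline void mmgrab_lazy_tlb(struct mm_struct *mm)
    {
        mmgrab(mm);    /* take a reference for use as a lazy tlb mm */
    }

    static inline void mmdrop_lazy_tlb(struct mm_struct *mm)
    {
        mmdrop(mm);    /* drop a lazy tlb reference */
    }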
-
Nicholas Piggin authored
Patch series "shoot lazy tlbs (lazy tlb refcount scalability improvement)", v7.

This series improves scalability of context switching between user and kernel threads on large systems with a threaded process spread across a lot of CPUs. Discussion of v6 here: https://lore.kernel.org/linux-mm/20230118080011.2258375-1-npiggin@gmail.com/

This patch (of 5): Remove the special case avoiding refcounting when the mm to be used is the same as the kernel thread's active (lazy tlb) mm. kthread_use_mm() should not be such a performance critical path that this matters much. This simplifies a later change to lazy tlb mm refcounting.
Link: https://lkml.kernel.org/r/20230203071837.1136453-1-npiggin@gmail.com
Link: https://lkml.kernel.org/r/20230203071837.1136453-2-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
- Mar 12, 2023
-
-
Mike Christie authored
This has us pass in the thread's name during creation in kernel_thread.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
-
Mike Christie authored
This patch allows kernel users to pass in the thread name so it can be set during creation instead of having to use set_task_comm after the thread is created.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
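A before/after sketch of the change described in these two entries. The thread function, its argument, and the "my-worker" name are hypothetical, and the exact new kernel_thread() signature below is an assumption for illustration; only the shift from create-then-rename to naming at creation time is the point.

    /* Old pattern (sketch): create first, rename afterwards. */
    pid = kernel_thread(threadfn, data, CLONE_FS | CLONE_FILES);
    /* ... look up the task_struct for pid ... */
    set_task_comm(tsk, "my-worker");

    /* New pattern (sketch, assumed signature): the name travels with creation. */
    pid = kernel_thread(threadfn, data, "my-worker", CLONE_FS | CLONE_FILES);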
-
- Feb 03, 2023
-
-
Zqiang authored
When destroying a kthread worker warn if there are still some pending delayed works. This indicates that the caller should clear all pending delayed works before destroying the kthread worker.
Link: https://lkml.kernel.org/r/20230104144230.938521-1-qiang1.zhang@intel.com
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
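A hedged sketch of where the new warning sits: kthread_destroy_worker() already complains about ordinary pending work, and the added check extends that to the delayed-work list. The surrounding code is assumed from context, not quoted from the patch.

    /* Sketch of kthread_destroy_worker() with the added delayed-work check. */
    void kthread_destroy_worker(struct kthread_worker *worker)
    {
        struct task_struct *task = worker->task;

        if (WARN_ON(!task))
            return;

        kthread_flush_worker(worker);
        kthread_stop(task);
        WARN_ON(!list_empty(&worker->delayed_work_list));    /* new check */
        WARN_ON(!list_empty(&worker->work_list));
        kfree(worker);
    }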
-
- Sep 26, 2022
-
-
Sami Tolvanen authored
CONFIG_CFI_CLANG no longer breaks cross-module function address equality, which makes WARN_ON_FUNCTION_MISMATCH unnecessary. Remove the definition and switch back to WARN_ON_ONCE.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-15-samitolvanen@google.com
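The check in question as it reads after the switch back, a one-line sketch of the relevant spot in kthread's delayed-work queueing path:

    /* With cross-module address equality intact under CFI, a plain
     * function pointer comparison is sufficient again. */
    WARN_ON_ONCE(timer->function != kthread_delayed_work_timer_fn);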
-
- Jul 18, 2022
-
-
Jason A. Donenfeld authored
I was recently surprised to learn that msleep_interruptible(), wait_for_completion_interruptible_timeout(), and related functions simply hung when I called kthread_stop() on kthreads using them. The simpler fix for the msleep_interruptible() case was to move to schedule_timeout_interruptible(). Why? The reason is that msleep_interruptible(), and many functions just like it, has a loop like this:

    while (timeout && !signal_pending(current))
        timeout = schedule_timeout_interruptible(timeout);

The call to kthread_stop() woke up the thread, so schedule_timeout_interruptible() returned early, but because signal_pending() returned false, it went back into another timeout, which was never woken up. This wait loop pattern is common to various pieces of code, and I suspect that the subtle misuse in a kthread that caused a deadlock in the code I looked at last week is also found elsewhere.

So this commit causes signal_pending() to return true when kthread_stop() is called, by setting TIF_NOTIFY_SIGNAL. The same also probably applies to the similar kthread_park() functionality, but that can be addressed later, as its semantics are slightly different.
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
v1: https://lkml.kernel.org/r/20220627120020.608117-1-Jason@zx2c4.com
v2: https://lkml.kernel.org/r/20220627145716.641185-1-Jason@zx2c4.com
v3: https://lkml.kernel.org/r/20220628161441.892925-1-Jason@zx2c4.com
v4: https://lkml.kernel.org/r/20220711202136.64458-1-Jason@zx2c4.com
v5: https://lkml.kernel.org/r/20220711232123.136330-1-Jason@zx2c4.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
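A hedged sketch of the resulting kthread_stop() behaviour: the stop bit is set as before, and in addition TIF_NOTIFY_SIGNAL is flagged so that signal_pending()-gated wait loops like the one above fall through. The exact ordering inside kthread_stop() is assumed here.

    /* Sketch: the portion of kthread_stop() relevant to the fix. */
    set_bit(KTHREAD_SHOULD_STOP, &kthread->flags);
    kthread_unpark(k);
    set_tsk_thread_flag(k, TIF_NOTIFY_SIGNAL);  /* makes signal_pending() true */
    wake_up_process(k);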
-
- Jun 17, 2022
-
-
Petr Mladek authored
The comments in kernel/kthread.c create a feeling that only SIGKILL is able to terminate the creation of kernel kthreads by kthread_create()/_on_node()/_on_cpu() APIs.

In reality, wait_for_completion_killable() might be killed by any fatal signal that does not have a custom handler:

    (!siginmask(signr, SIG_KERNEL_IGNORE_MASK|SIG_KERNEL_STOP_MASK) && \
     (t)->sighand->action[(signr)-1].sa.sa_handler == SIG_DFL)

    static inline void signal_wake_up(struct task_struct *t, bool resume)
    {
        signal_wake_up_state(t, resume ? TASK_WAKEKILL : 0);
    }

    static void complete_signal(int sig, struct task_struct *p, enum pid_type type)
    {
        [...]
        /*
         * Found a killable thread.  If the signal will be fatal,
         * then start taking the whole group down immediately.
         */
        if (sig_fatal(p, sig) ...) {
            if (!sig_kernel_coredump(sig)) {
                [...]
                do {
                    task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
                    sigaddset(&t->pending.signal, SIGKILL);
                    signal_wake_up(t, 1);
                } while_each_thread(p, t);
                return;
            }
        }
    }

Update the comments in kernel/kthread.c to make this more obvious.

The motivation for this change was debugging why a module initialization failed. The module was being loaded from initrd. It "magically" failed when systemd was switching to the real root. The clean up operations sent SIGTERM to various pending processes that were started from initrd.
Link: https://lkml.kernel.org/r/20220315102444.2380-1-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
- May 02, 2022
-
-
Christoph Hellwig authored
kthread_blkcg is only used by the built-in blk-cgroup code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220420042723.1010598-16-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- Feb 25, 2022
-
-
Arnd Bergmann authored
There are no remaining callers of set_fs(), so CONFIG_SET_FS can be removed globally, along with the thread_info field and any references to it. This turns access_ok() into a cheaper check against TASK_SIZE_MAX. As CONFIG_SET_FS is now gone, drop all remaining references to set_fs()/get_fs(), mm_segment_t, user_addr_max() and uaccess_kernel().
Acked-by: Sam Ravnborg <sam@ravnborg.org> # for sparc32 changes
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Tested-by: Sergey Matyukevich <sergey.matyukevich@synopsys.com> # for arc changes
Acked-by: Stafford Horne <shorne@gmail.com> # [openrisc, asm-generic]
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
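With a single fixed user/kernel split, the generic check reduces to a range compare; a hedged sketch of that shape, simplified and ignoring the no-MMU and alternate-address-space special cases:

    /* Sketch of the cheaper access_ok() once set_fs() is gone: a pure
     * range check against the fixed TASK_SIZE_MAX limit. */
    static inline int __access_ok(const void __user *ptr, unsigned long size)
    {
        unsigned long limit = TASK_SIZE_MAX;
        unsigned long addr = (unsigned long)ptr;

        return (size <= limit) && (addr <= (limit - size));
    }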
-
- Feb 16, 2022
-
-
Frederic Weisbecker authored
Refer to housekeeping APIs using single feature types instead of flags. This prevents passing multiple isolation features at once to housekeeping interfaces, which soon won't be possible anymore as each isolation feature will have its own cpumask.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Reviewed-by: Phil Auld <pauld@redhat.com>
Link: https://lore.kernel.org/r/20220207155910.527133-5-frederic@kernel.org
-
- Jan 20, 2022
-
-
Yafang Shao authored
When I was implementing a new per-cpu kthread cfs_migration, I found the comm of it "cfs_migration/%u" is truncated due to the limitation of TASK_COMM_LEN. For example, the comm of the percpu thread on CPU10~19 all have the same name "cfs_migration/1", which will confuse the user. This issue is not critical, because we can get the corresponding CPU from the task's Cpus_allowed. But for kthreads corresponding to other hardware devices, it is not easy to get the detailed device info from task comm, for example:

    jbd2/nvme0n1p2-
    xfs-reclaim/sdf

Currently there are so many truncated kthreads:

    rcu_tasks_kthre
    rcu_tasks_rude_
    rcu_tasks_trace
    poll_mpt3sas0_s
    ext4-rsv-conver
    xfs-reclaim/sd{a, b, c, ...}
    xfs-blockgc/sd{a, b, c, ...}
    xfs-inodegc/sd{a, b, c, ...}
    audit_send_repl
    ecryptfs-kthrea
    vfio-irqfd-clea
    jbd2/nvme0n1p2-
    ...

We can shorten these names to work around this problem, but it may be not applied to all of the truncated kthreads. Take 'jbd2/nvme0n1p2-' for example, it is a nice name, and it is not a good idea to shorten it.

One possible way to fix this issue is extending the task comm size, but as task->comm is used in lots of places, that may cause some potential buffer overflows. Another more conservative approach is introducing a new pointer to store kthread's full name if it is truncated, which won't introduce too much overhead as it is in the non-critical path. Finally we make a decision to use the second approach. See also the discussions in this thread: https://lore.kernel.org/lkml/20211101060419.4682-1-laoar.shao@gmail.com/

After this change, the full name of these truncated kthreads will be displayed via /proc/[pid]/comm:

    rcu_tasks_kthread
    rcu_tasks_rude_kthread
    rcu_tasks_trace_kthread
    poll_mpt3sas0_statu
    ext4-rsv-conversion
    xfs-reclaim/sdf1
    xfs-blockgc/sdf1
    xfs-inodegc/sdf1
    audit_send_reply
    ecryptfs-kthread
    vfio-irqfd-cleanup
    jbd2/nvme0n1p2-8

Link: https://lkml.kernel.org/r/20211120112850.46047-1-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
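A minimal sketch of the chosen approach: the truncated name stays in task->comm, and the kthread side additionally keeps the untruncated string for /proc/[pid]/comm. The helper and field names below are assumptions for illustration only.

    /* Hedged sketch: remember the full name only when comm would truncate it. */
    static int kthread_save_full_name(struct kthread *kthread, const char *name)
    {
        if (strlen(name) < TASK_COMM_LEN)
            return 0;        /* fits in task->comm, nothing to store */

        kthread->full_name = kstrdup(name, GFP_KERNEL);  /* freed with the kthread */
        return kthread->full_name ? 0 : -ENOMEM;
    }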
-
- Jan 15, 2022
-
-
Cai Huoqing authored
Add a new helper function kthread_run_on_cpu(), which includes kthread_create_on_cpu/wake_up_process(). In some cases, use kthread_run_on_cpu() directly instead of kthread_create_on_node/kthread_bind/wake_up_process() or kthread_create_on_cpu/wake_up_process() or kthreadd_create/kthread_bind/wake_up_process() to simplify the code.
[akpm@linux-foundation.org: export kthread_create_on_cpu to modules]
Link: https://lkml.kernel.org/r/20211022025711.3673-2-caihuoqing@baidu.com
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Cc: Bernard Metzler <bmt@zurich.ibm.com>
Cc: Cai Huoqing <caihuoqing@baidu.com>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
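A sketch of what the helper wraps, assuming it mirrors the existing kthread_run() convention of create-then-wake:

    /* Hedged sketch of kthread_run_on_cpu(): create a CPU-bound kthread
     * and wake it immediately, mirroring kthread_run(). */
    static inline struct task_struct *
    kthread_run_on_cpu(int (*threadfn)(void *data), void *data,
                       unsigned int cpu, const char *namefmt)
    {
        struct task_struct *p;

        p = kthread_create_on_cpu(threadfn, data, cpu, namefmt);
        if (!IS_ERR(p))
            wake_up_process(p);

        return p;
    }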
-
- Jan 08, 2022
-
-
Eric W. Biederman authored
The point of using set_child_tid to hold the kthread pointer was that it already did what is necessary. There are now restrictions on when set_child_tid can be initialized and when set_child_tid can be used in schedule_tail. Which indicates that continuing to use set_child_tid to hold the kthread pointer is a bad idea.

Instead of continuing to use the set_child_tid field of task_struct, generalize the pf_io_worker field of task_struct and use it to hold the kthread pointer. Rename pf_io_worker (which is a void * pointer) to worker_private so it can be used to store the kthread's struct kthread pointer. Update the kthread code to store the kthread pointer in the worker_private field. Remove the places where set_child_tid had to be dealt with carefully because kthreads also used it.
Link: https://lkml.kernel.org/r/CAHk-=wgtFAA9SbVYg0gR1tqPMC17-NYcs0GQkaYg1bGhh1uJQQ@mail.gmail.com
Link: https://lkml.kernel.org/r/87a6grvqy8.fsf_-_@email.froward.int.ebiederm.org
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
-
- Dec 14, 2021
-
-
Eric W. Biederman authored
I misspelled kthread_complete_and_exit in the kernel doc comment; fix it.
Link: https://lkml.kernel.org/r/202112141329.KBkyJ5ql-lkp@intel.com
Link: https://lkml.kernel.org/r/202112141422.Cykr6YUS-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Fixes: cead1855 ("exit: Rename complete_and_exit to kthread_complete_and_exit")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
-
- Dec 13, 2021
-
-
Eric W. Biederman authored
The exit code of kernel threads has different semantics than the exit_code of userspace tasks. To avoid confusion and allow the userspace implementation to change as needed, move the kernel thread exit code into struct kthread.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
-
Eric W. Biederman authored
Today the rules are a bit iffy and arbitrary about which kernel threads have struct kthread present. Both idle threads and threads started with create_kthread want struct kthread present, so that is effectively all kernel threads. Make the rule that if PF_KTHREAD and the task is running then struct kthread is present. This will allow the kernel thread code to use tsk->exit_code with different semantics from ordinary processes.

To ensure that struct kthread is present for all kernel threads, move its allocation into copy_process. Add a deallocation of struct kthread in exec for processes that were kernel threads. Move the allocation of struct kthread for the initial thread earlier so that it is not repeated for each additional idle thread. Move the initialization of struct kthread into set_kthread_struct so that the structure is always and reliably initialized.

Clear set_child_tid in free_kthread_struct to ensure the kthread struct is reliably freed during exec. The function free_kthread_struct does not need to clear vfork_done during exec as exec_mm_release called from exec_mmap has already cleared vfork_done.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
-
Eric W. Biederman authored
Update complete_and_exit to call kthread_exit instead of do_exit. Change the name to reflect this change in functionality. All of the users of complete_and_exit are causing the current kthread to exit, so this change makes it clear what is happening.

Move the implementation of kthread_complete_and_exit from kernel/exit.c to kernel/kthread.c. As this function is kthread specific it makes most sense to live with the kthread functions. There is no functional change.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
-
Eric W. Biederman authored
The way the per task_struct exit_code is used by kernel threads is not quite compatible with how it is used by userspace applications. The low byte of the userspace exit_code value encodes the exit signal, while kthreads just use the value as an int holding ordinary kernel function exit status like -EPERM. Add kthread_exit to clearly separate the two kinds of uses.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
-
- Oct 29, 2021
-
-
Eric W. Biederman authored
In 2009 Oleg reworked[1] the kernel threads so that it is not necessary to call do_exit if you are not using kthread_stop(). Remove the explicit calls of do_exit and complete_and_exit (with a NULL completion) that were previously necessary.

[1] 63706172 ("kthreads: rework kthread_stop()")
Link: https://lkml.kernel.org/r/20211020174406.17889-12-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
-
- Oct 05, 2021
-
-
Sebastian Andrzej Siewior authored
With enabled threaded interrupts the nouveau driver reported the following:

    | Chain exists of:
    |     &mm->mmap_lock#2 --> &device->mutex --> &cpuset_rwsem
    |
    | Possible unsafe locking scenario:
    |
    |        CPU0                    CPU1
    |        ----                    ----
    |   lock(&cpuset_rwsem);
    |                                lock(&device->mutex);
    |                                lock(&cpuset_rwsem);
    |   lock(&mm->mmap_lock#2);

The device->mutex is nvkm_device::mutex. Unblocking the lockchain at `cpuset_rwsem' is probably the easiest thing to do. Move the priority reset to the start of the newly created thread.
Fixes: 710da3c8 ("sched/core: Prevent race condition between cpuset and __sched_setscheduler()")
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/a23a826af7c108ea5651e73b8fbae5e653f16e86.camel@gmx.de
-
- Jun 29, 2021
-
-
Petr Mladek authored
kthread_worker: fix return value when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()

kthread_mod_delayed_work() might race with kthread_cancel_delayed_work_sync() or another kthread_mod_delayed_work() call. The function lets the other operation win when it sees work->canceling counter set. And it returns @false. But it should return @true as it is done by the related workqueue API, see mod_delayed_work_on().

The reason is that the return value might be used for reference counting. It has to distinguish the case when the number of queued works has changed or stayed the same. The change is safe. kthread_mod_delayed_work() return value is not checked anywhere at the moment.
Link: https://lore.kernel.org/r/20210521163526.GA17916@redhat.com
Link: https://lkml.kernel.org/r/20210610133051.15337-4-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Minchan Kim <minchan@google.com>
Cc: <jenhaochen@google.com>
Cc: Martin Liu <liumartin@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- Jun 25, 2021
-
-
Petr Mladek authored
kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()

The system might hang with the following backtrace:

    schedule+0x80/0x100
    schedule_timeout+0x48/0x138
    wait_for_common+0xa4/0x134
    wait_for_completion+0x1c/0x2c
    kthread_flush_work+0x114/0x1cc
    kthread_cancel_work_sync.llvm.16514401384283632983+0xe8/0x144
    kthread_cancel_delayed_work_sync+0x18/0x2c
    xxxx_pm_notify+0xb0/0xd8
    blocking_notifier_call_chain_robust+0x80/0x194
    pm_notifier_call_chain_robust+0x28/0x4c
    suspend_prepare+0x40/0x260
    enter_state+0x80/0x3f4
    pm_suspend+0x60/0xdc
    state_store+0x108/0x144
    kobj_attr_store+0x38/0x88
    sysfs_kf_write+0x64/0xc0
    kernfs_fop_write_iter+0x108/0x1d0
    vfs_write+0x2f4/0x368
    ksys_write+0x7c/0xec

It is caused by the following race between kthread_mod_delayed_work() and kthread_cancel_delayed_work_sync():

    CPU0                                    CPU1

    Context: Thread A                       Context: Thread B

    kthread_mod_delayed_work()
      spin_lock()
      __kthread_cancel_work()
        spin_unlock()
        del_timer_sync()
                                            kthread_cancel_delayed_work_sync()
                                              spin_lock()
                                              __kthread_cancel_work()
                                                spin_unlock()
                                                del_timer_sync()
                                                spin_lock()

                                              work->canceling++
                                              spin_unlock
      spin_lock()
      queue_delayed_work()
        // dwork is put into the worker->delayed_work_list

      spin_unlock()

    kthread_flush_work()
      // flush_work is put at the tail of the dwork

      wait_for_completion()

    Context: IRQ

      kthread_delayed_work_timer_fn()
        spin_lock()
        list_del_init(&work->node);
        spin_unlock()

    BANG: flush_work is no longer linked and will never get processed.

The problem is that kthread_mod_delayed_work() checks work->canceling flag before canceling the timer.

A simple solution is to (re)check work->canceling after __kthread_cancel_work(). But then it is not clear what should be returned when __kthread_cancel_work() removed the work from the queue (list) and it can't queue it again with the new @delay.

The return value might be used for reference counting. The caller has to know whether a new work has been queued or an existing one was replaced. The proper solution is that kthread_mod_delayed_work() will remove the work from the queue (list) _only_ when work->canceling is not set. The flag must be checked after the timer is stopped and the remaining operations can be done under worker->lock.

Note that kthread_mod_delayed_work() could remove the timer and then bail out. It is fine. The other canceling caller needs to cancel the timer as well. The important thing is that the queue (list) manipulation is done atomically under worker->lock.
Link: https://lkml.kernel.org/r/20210610133051.15337-3-pmladek@suse.com
Fixes: 9a6b06c8 ("kthread: allow to modify delayed kthread work")
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reported-by: Martin Liu <liumartin@google.com>
Cc: <jenhaochen@google.com>
Cc: Minchan Kim <minchan@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Petr Mladek authored
Patch series "kthread_worker: Fix race between kthread_mod_delayed_work() and kthread_cancel_delayed_work_sync()".

This patchset fixes the race between kthread_mod_delayed_work() and kthread_cancel_delayed_work_sync() including proper return value handling.

This patch (of 2): Simple code refactoring as a preparation step for fixing a race between kthread_mod_delayed_work() and kthread_cancel_delayed_work_sync(). It does not modify the existing behavior.
Link: https://lkml.kernel.org/r/20210610133051.15337-2-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: <jenhaochen@google.com>
Cc: Martin Liu <liumartin@google.com>
Cc: Minchan Kim <minchan@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- Jun 18, 2021
-
-
Peter Zijlstra authored
Change the type and name of task_struct::state. Drop the volatile and shrink it to an 'unsigned int'. Rename it in order to find all uses such that we can use READ_ONCE/WRITE_ONCE as appropriate.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20210611082838.550736351@infradead.org
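The access pattern this rename enforces, as a small sketch: the field becomes task_struct::__state and is read through READ_ONCE rather than relying on a plain load of a formerly volatile member.

    /* Sketch: reading a task's state after the rename. */
    unsigned int state = READ_ONCE(p->__state);

    if (state == TASK_RUNNING)
        pr_debug("%s is runnable\n", p->comm);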
-
- May 18, 2021
-
-
Valentin Schneider authored
For all intents and purposes, the idle task is a per-CPU kthread. It isn't created via the same route as other pcpu kthreads however, and as a result it is missing a few bells and whistles: it fails kthread_is_per_cpu() and it doesn't have PF_NO_SETAFFINITY set. Fix the former by giving the idle task a kthread struct along with the KTHREAD_IS_PER_CPU flag. This requires some extra iffery as init_idle() can be called more than once on the same idle task.
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210510151024.2448573-2-valentin.schneider@arm.com
-
- Apr 21, 2021
-
-
Peter Zijlstra authored
The kthread_is_per_cpu() construct relies on only being called on PF_KTHREAD tasks (per the WARN in to_kthread). This gives rise to the following usage pattern:

    if ((p->flags & PF_KTHREAD) && kthread_is_per_cpu(p))

However, as reported by syzcaller, this is broken. The scenario is:

    CPU0                            CPU1 (running p)

    (p->flags & PF_KTHREAD) // true

                                    begin_new_exec()
                                      me->flags &= ~(PF_KTHREAD|...);

    kthread_is_per_cpu(p)
      to_kthread(p)
        WARN(!(p->flags & PF_KTHREAD) <-- *SPLAT*

Introduce __to_kthread() that omits the WARN and is sure to check both values. Use this to remove the problematic pattern for kthread_is_per_cpu() and fix a number of other kthread_*() functions that have similar issues but are currently not used in ways that would expose the problem. Notably kthread_func() is only ever called on 'current', while kthread_probe_data() is only used for PF_WQ_WORKER, which implies the task is from kthread_create*().
Fixes: ac687e6e ("kthread: Extract KTHREAD_IS_PER_CPU")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <Valentin.Schneider@arm.com>
Link: https://lkml.kernel.org/r/YH6WJc825C4P0FCK@hirez.programming.kicks-ass.net
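A sketch of __to_kthread() as described: no WARN, and both the flag and the stored pointer are checked. At the time of this change the kthread pointer still lived in set_child_tid; treat the exact field as an assumption of this sketch.

    /* Hedged sketch: NULL when the task is not (or no longer) a kthread. */
    static inline struct kthread *__to_kthread(struct task_struct *p)
    {
        void *kthread = (__force void *)p->set_child_tid;

        if (kthread && !(p->flags & PF_KTHREAD))
            kthread = NULL;    /* exec raced us; do not trust the pointer */

        return kthread;
    }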
-
- Apr 08, 2021
-
-
Sami Tolvanen authored
With CONFIG_CFI_CLANG, a callback function passed to __kthread_queue_delayed_work from a module points to a jump table entry defined in the module instead of the one used in the core kernel, which breaks function address equality in this check:

    WARN_ON_ONCE(timer->function != kthread_delayed_work_timer_fn);

Use WARN_ON_FUNCTION_MISMATCH() instead to disable the warning when CFI and modules are both enabled.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210408182843.1754385-7-samitolvanen@google.com
-
- Jan 22, 2021
-
-
Peter Zijlstra authored
There is a need to distinguish genuine per-cpu kthreads from kthreads that happen to have a single CPU affinity.

Genuine per-cpu kthreads are kthreads that are CPU affine for correctness; these will obviously have PF_KTHREAD set, but must also have PF_NO_SETAFFINITY set, lest userspace modify their affinity and ruin things. However, these two things are not sufficient: PF_NO_SETAFFINITY is also set on other tasks that have their affinities controlled through other means, like for instance workqueues.

Therefore another bit is needed; it turns out kthread_create_on_cpu() already has such a bit: KTHREAD_IS_PER_CPU, which is used to make kthread_park()/kthread_unpark() work correctly. Expose this flag and remove the implicit setting of it from kthread_create_on_cpu(); the io_uring usage of it seems dubious at best.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210121103506.557620262@infradead.org
-
- Jan 11, 2021
-
-
Yanfei Xu authored
The old _do_fork() helper has been removed in favor of kernel_clone(). Here, correct some comments which still contain _do_fork().
Link: https://lore.kernel.org/r/20210111104807.18022-1-yanfei.xu@windriver.com
Cc: christian@brauner.io
Cc: linux-kernel@vger.kernel.org
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
- Dec 15, 2020
-
-
Petr Mladek authored
The kthread worker API is simple. In short, it allows to create, use, and destroy workers. kthread_create_worker_on_cpu() just allows to bind a newly created worker to a given CPU.

It is up to the API user how to handle CPU hotplug. They have to decide how to handle pending work items, prevent queuing new ones, and restore the functionality when the CPU goes off and on. There are a few catches:

    + The CPU affinity gets lost when it is scheduled on an offline CPU.
    + The worker might not exist when the CPU was off when the user created the workers.

A good practice is to implement two CPU hotplug callbacks and destroy/create the worker when CPU goes down/up. Mention this in the function description.
[akpm@linux-foundation.org: grammar tweaks]
Link: https://lore.kernel.org/r/20201028073031.4536-1-qiang.zhang@windriver.com
Link: https://lkml.kernel.org/r/20201102101039.19227-1-pmladek@suse.com
Reported-by: Zhang Qiang <Qiang.Zhang@windriver.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Rob Clark authored
While migrating some code from wq to kthread_worker, I found that I missed the execute_start/end tracepoints. So add similar tracepoints for kthread_work. And for completeness, a queue_work tracepoint (although this one differs slightly from the matching workqueue tracepoint).
Link: https://lkml.kernel.org/r/20201010180323.126634-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Cc: Rob Clark <robdclark@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Phil Auld <pauld@redhat.com>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Thara Gopinath <thara.gopinath@linaro.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Vincent Donnefort <vincent.donnefort@arm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Ilias Stamatis <stamatis.iliass@gmail.com>
Cc: Liang Chen <cl@rock-chips.com>
Cc: Ben Dooks <ben.dooks@codethink.co.uk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "J. Bruce Fields" <bfields@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- Nov 02, 2020
-
-
Zqiang authored
There is a small race window when a delayed work is being canceled and the work still might be queued from the timer_fn:

    CPU0                                        CPU1

    kthread_cancel_delayed_work_sync()
      __kthread_cancel_work_sync()
        __kthread_cancel_work()
          work->canceling++;
                                                kthread_delayed_work_timer_fn()
                                                  kthread_insert_work();

BUG: kthread_insert_work() should not get called when work->canceling is set.
Signed-off-by: Zqiang <qiang.zhang@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201014083030.16895-1-qiang.zhang@windriver.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
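A hedged sketch of the fix as described: the timer callback re-checks the canceling counter under the worker lock before re-queueing the work. The surrounding function is abbreviated and assumed from context.

    /* Sketch of the relevant part of kthread_delayed_work_timer_fn(). */
    raw_spin_lock_irqsave(&worker->lock, flags);
    /* Move the work from worker->delayed_work_list. */
    WARN_ON_ONCE(list_empty(&work->node));
    list_del_init(&work->node);
    if (!work->canceling)    /* the added guard */
        kthread_insert_work(worker, work, &worker->work_list);
    raw_spin_unlock_irqrestore(&worker->lock, flags);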
-
- Oct 29, 2020
-
-
Mathieu Desnoyers authored
Add comments and a memory barrier to kthread_use_mm and kthread_unuse_mm to allow the effect of membarrier(2) to apply to kthreads accessing user-space memory as well. Given that no prior kthread uses this guarantee and that it only affects kthreads, adding this guarantee does not affect user-space ABI.

Refine the check in membarrier_global_expedited to exclude runqueues running the idle thread rather than all kthreads from the IPI cpumask. Now that membarrier_global_expedited can IPI kthreads, the scheduler also needs to update the runqueue's membarrier_state when entering lazy TLB state.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20201020134715.13909-3-mathieu.desnoyers@efficios.com
-
- Oct 16, 2020
-
-
Randy Dunlap authored
Fix multiple occurrences of duplicated words in kernel/. Fix one typo/spello on the same line as a duplicate word. Change one instance of "the the" to "that the". Otherwise just drop one of the repeated words.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/98202fa6-8919-ef63-9efe-c0fad5ca7af1@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-