  1. Sep 11, 2020
  2. Sep 08, 2020
  3. Sep 05, 2020
    • fork: adjust sysctl_max_threads definition to match prototype · b0daa2c7
      Tobias Klauser authored
      
      Commit 32927393 ("sysctl: pass kernel pointers to ->proc_handler")
      changed ctl_table.proc_handler to take a kernel pointer.  Adjust the
      definition of sysctl_max_threads to match its prototype in
      linux/sysctl.h which fixes the following sparse error/warning:
      
        kernel/fork.c:3050:47: warning: incorrect type in argument 3 (different address spaces)
        kernel/fork.c:3050:47:    expected void *
        kernel/fork.c:3050:47:    got void [noderef] __user *buffer
        kernel/fork.c:3036:5: error: symbol 'sysctl_max_threads' redeclared with different type (incompatible argument 3 (different address spaces)):
        kernel/fork.c:3036:5:    int extern [addressable] [signed] [toplevel] sysctl_max_threads( ... )
        kernel/fork.c: note: in included file (through include/linux/key.h, include/linux/cred.h, include/linux/sched/signal.h, include/linux/sched/cputime.h):
        include/linux/sysctl.h:242:5: note: previously declared as:
        include/linux/sysctl.h:242:5:    int extern [addressable] [signed] [toplevel] sysctl_max_threads( ... )
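
A minimal userspace sketch of the rule behind that warning: a function's definition has to use exactly the parameter types of its prototype. All names ending in `_sketch` are hypothetical stand-ins, not the kernel's declarations, and the `__user` address-space annotation that sparse checks is not modelled here.

```c
#include <stddef.h>

/* Hypothetical stand-in; the real table type lives in linux/sysctl.h. */
struct ctl_table_sketch { const char *procname; };

/* Handler type taking a plain (kernel) pointer for the buffer. */
typedef int proc_handler_sketch(struct ctl_table_sketch *table, int write,
                                void *buffer, size_t *lenp, long long *ppos);

/* Prototype, as a header would declare it. */
proc_handler_sketch max_threads_handler_sketch;

/* Definition: the parameter list must match the prototype exactly. */
int max_threads_handler_sketch(struct ctl_table_sketch *table, int write,
                               void *buffer, size_t *lenp, long long *ppos)
{
    (void)table;
    (void)buffer;
    (void)ppos;
    if (!write && lenp)
        *lenp = 0;      /* this mock has nothing to report on a read */
    return 0;
}
```

Declaring the prototype through the handler typedef means a later change to the handler type turns any stale definition into a compile error rather than a silent mismatch.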
      
      Fixes: 32927393 ("sysctl: pass kernel pointers to ->proc_handler")
      Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Link: https://lkml.kernel.org/r/20200825093647.24263-1-tklauser@distanz.ch
      
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. Sep 04, 2020
  5. Aug 30, 2020
    • genirq/matrix: Deal with the sillyness of for_each_cpu() on UP · 784a0830
      Thomas Gleixner authored
      
      Most of the CPU mask operations behave the same way, but for_each_cpu() and
      its variants ignore the cpumask argument and claim that CPU0 is always in
      the mask. This is historical, inconsistent and annoying behaviour.
      
      The matrix allocator uses for_each_cpu() and can be called on UP with an
      empty cpumask. The calling code does not expect that this succeeds but
      until commit e027ffff ("x86/irq: Unbreak interrupt affinity setting")
      this went unnoticed. That commit added a WARN_ON() to catch cases which
      move an interrupt from one vector to another on the same CPU. The warning
      triggers on UP.
      
      Add a check for the cpumask being empty to prevent this.
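
A userspace sketch of the guard described above, assuming a one-bit mask type and an iteration macro that mimics the UP quirk. Every `_sketch` name is hypothetical; this is not the matrix allocator's code.

```c
#include <stdbool.h>

/* Hypothetical one-bit "cpumask" for a uniprocessor build. */
struct cpumask_sketch { unsigned long bits; };

/* Mimics the UP quirk: visits CPU 0 once and never looks at the mask. */
#define for_each_cpu_up_sketch(cpu, mask) \
    for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)(mask))

static bool cpumask_empty_sketch(const struct cpumask_sketch *m)
{
    return m->bits == 0;
}

/* Returns the chosen CPU, or -1 if the mask offers none. */
static int pick_cpu_sketch(const struct cpumask_sketch *m)
{
    int cpu;

    if (cpumask_empty_sketch(m))    /* the explicit emptiness check */
        return -1;

    for_each_cpu_up_sketch(cpu, m)
        return cpu;
    return -1;
}
```

Without the emptiness check, the loop would report CPU 0 even for an empty mask, which is the unexpected success the commit message describes.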
      
      Fixes: 2f75d9e1 ("genirq: Implement bitmap matrix allocator")
      Reported-by: kernel test robot <rong.a.chen@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
  6. Aug 27, 2020
  7. Aug 26, 2020
  8. Aug 25, 2020
  9. Aug 23, 2020
  10. Aug 21, 2020
  11. Aug 19, 2020
    • bpf: Avoid visit same object multiple times · e60572b8
      Yonghong Song authored
      
      Currently when traversing all tasks, the next tid
      is always increased by one. This may result in
      visiting the same task multiple times in a
      pid namespace.
      
      This patch fixes the issue by setting the next
      tid to pid_nr_ns(pid, ns) + 1, similar to
      function next_tgid().
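
The effect of advancing the cursor to "found id + 1" can be sketched in userspace with a sorted array standing in for the pid lookup. `find_ge_sketch()` and `count_visits_sketch()` are hypothetical helpers, not kernel functions.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the pid lookup: returns the smallest id in the
 * sorted array that is >= tid, or -1 when none is left.
 */
static int find_ge_sketch(const int *ids, size_t n, int tid)
{
    for (size_t i = 0; i < n; i++)
        if (ids[i] >= tid)
            return ids[i];
    return -1;
}

/* Counts how many times an id is visited during a full traversal. */
static int count_visits_sketch(const int *ids, size_t n)
{
    int tid = 0, visits = 0, found;

    while ((found = find_ge_sketch(ids, n, tid)) >= 0) {
        visits++;
        tid = found + 1;    /* jump past the id just visited */
    }
    return visits;
}
```

If the cursor were only incremented by one, a lookup starting at 1 would find id 3 again whenever the first live id is 3, so the same task would be reported repeatedly.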
      
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Cc: Rik van Riel <riel@surriel.com>
      Link: https://lore.kernel.org/bpf/20200818222310.2181500-1-yhs@fb.com
    • bpf: Fix a rcu_sched stall issue with bpf task/task_file iterator · e679654a
      Yonghong Song authored
      
      In our production system, we observed rcu stalls when
      `bpftool prog` is running.
        rcu: INFO: rcu_sched self-detected stall on CPU
        rcu: 	7-....: (20999 ticks this GP) idle=302/1/0x4000000000000000 softirq=1508852/1508852 fqs=4913
        	(t=21031 jiffies g=2534773 q=179750)
        NMI backtrace for cpu 7
        CPU: 7 PID: 184195 Comm: bpftool Kdump: loaded Tainted: G        W         5.8.0-00004-g68bfc7f8c1b4 #6
        Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP, BIOS F09_3A17 05/03/2019
        Call Trace:
        <IRQ>
        dump_stack+0x57/0x70
        nmi_cpu_backtrace.cold+0x14/0x53
        ? lapic_can_unplug_cpu.cold+0x39/0x39
        nmi_trigger_cpumask_backtrace+0xb7/0xc7
        rcu_dump_cpu_stacks+0xa2/0xd0
        rcu_sched_clock_irq.cold+0x1ff/0x3d9
        ? tick_nohz_handler+0x100/0x100
        update_process_times+0x5b/0x90
        tick_sched_timer+0x5e/0xf0
        __hrtimer_run_queues+0x12a/0x2a0
        hrtimer_interrupt+0x10e/0x280
        __sysvec_apic_timer_interrupt+0x51/0xe0
        asm_call_on_stack+0xf/0x20
        </IRQ>
        sysvec_apic_timer_interrupt+0x6f/0x80
        asm_sysvec_apic_timer_interrupt+0x12/0x20
        RIP: 0010:task_file_seq_get_next+0x71/0x220
        Code: 00 00 8b 53 1c 49 8b 7d 00 89 d6 48 8b 47 20 44 8b 18 41 39 d3 76 75 48 8b 4f 20 8b 01 39 d0 76 61 41 89 d1 49 39 c1 48 19 c0 <48> 8b 49 08 21 d0 48 8d 04 c1 4c 8b 08 4d 85 c9 74 46 49 8b 41 38
        RSP: 0018:ffffc90006223e10 EFLAGS: 00000297
        RAX: ffffffffffffffff RBX: ffff888f0d172388 RCX: ffff888c8c07c1c0
        RDX: 00000000000f017b RSI: 00000000000f017b RDI: ffff888c254702c0
        RBP: ffffc90006223e68 R08: ffff888be2a1c140 R09: 00000000000f017b
        R10: 0000000000000002 R11: 0000000000100000 R12: ffff888f23c24118
        R13: ffffc90006223e60 R14: ffffffff828509a0 R15: 00000000ffffffff
        task_file_seq_next+0x52/0xa0
        bpf_seq_read+0xb9/0x320
        vfs_read+0x9d/0x180
        ksys_read+0x5f/0xe0
        do_syscall_64+0x38/0x60
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
        RIP: 0033:0x7f8815f4f76e
        Code: c0 e9 f6 fe ff ff 55 48 8d 3d 76 70 0a 00 48 89 e5 e8 36 06 02 00 66 0f 1f 44 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 52 c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5
        RSP: 002b:00007fff8f9df578 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
        RAX: ffffffffffffffda RBX: 000000000170b9c0 RCX: 00007f8815f4f76e
        RDX: 0000000000001000 RSI: 00007fff8f9df5b0 RDI: 0000000000000007
        RBP: 00007fff8f9e05f0 R08: 0000000000000049 R09: 0000000000000010
        R10: 00007f881601fa40 R11: 0000000000000246 R12: 00007fff8f9e05a8
        R13: 00007fff8f9e05a8 R14: 0000000001917f90 R15: 000000000000e22e
      
      Note that `bpftool prog` actually calls a task_file bpf iterator
      program to establish an association between prog/map/link/btf anon
      files and processes.
      
      In the case where the above rcu stall occurred, we had a process
      having 1587 tasks and each task having roughly 81305 files.
      This implied 129 million bpf prog invocations. Unfortunately none of
      these files are prog/map/link/btf files, so the bpf iterator/prog needs
      to traverse all these files and is not able to return to user space
      since there is no seq_file buffer overflow.
      
      This patch fixes the issue in bpf_seq_read() by limiting the number
      of visited objects. If the maximum number of visited objects is
      reached, no more objects will be visited in the current syscall.
      If nothing has been written to the seq_file buffer, -EAGAIN is
      returned to the user so the user can try again.
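
A simplified userspace sketch of the bounded-visit idea, assuming an integer array as the object list and a tiny budget for illustration. `read_sketch()` and `MAX_VISITS_SKETCH` are hypothetical; unlike the description above, this sketch returns -EAGAIN only while unvisited objects remain.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_VISITS_SKETCH 4   /* tiny budget, chosen only for the demo */

/*
 * Visits objects starting at *pos. An object with a nonzero value "emits"
 * one unit of output. Returns the amount emitted, or -EAGAIN when the
 * budget ran out with nothing emitted and objects still remaining.
 */
static long read_sketch(const int *objs, size_t n, size_t *pos)
{
    long emitted = 0;
    size_t visited = 0;

    while (*pos < n && visited < MAX_VISITS_SKETCH) {
        if (objs[*pos])
            emitted++;
        (*pos)++;
        visited++;
    }
    if (emitted == 0 && *pos < n)
        return -EAGAIN;     /* caller should simply call again */
    return emitted;
}
```

Because the position is kept across calls, a caller that retries on -EAGAIN makes forward progress on every call instead of spinning inside one long syscall.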
      
      The maximum number of visited objects is set at 1 million.
      On our Intel Xeon D-2191 2.3GHz 18-core server, bpf_seq_read()
      visiting 1 million files takes around 0.18 seconds.
      
      We did not use cond_resched() since for some iterators, e.g., the
      netlink iterator, the rcu read_lock critical section spans
      consecutive seq_ops->next() calls, which makes it impossible to call
      cond_resched() in the key while loop of bpf_seq_read().
      
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Link: https://lore.kernel.org/bpf/20200818222309.2181348-1-yhs@fb.com
  12. Aug 17, 2020
    • bpf: Use get_file_rcu() instead of get_file() for task_file iterator · cf28f3bb
      Yonghong Song authored
      
      With the latest `bpftool prog` command, we observed the following kernel
      panic.
          BUG: kernel NULL pointer dereference, address: 0000000000000000
          #PF: supervisor instruction fetch in kernel mode
          #PF: error_code(0x0010) - not-present page
          PGD dfe894067 P4D dfe894067 PUD deb663067 PMD 0
          Oops: 0010 [#1] SMP
          CPU: 9 PID: 6023 ...
          RIP: 0010:0x0
          Code: Bad RIP value.
          RSP: 0000:ffffc900002b8f18 EFLAGS: 00010286
          RAX: ffff8883a405f400 RBX: ffff888e46a6bf00 RCX: 000000008020000c
          RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8883a405f400
          RBP: ffff888e46a6bf50 R08: 0000000000000000 R09: ffffffff81129600
          R10: ffff8883a405f300 R11: 0000160000000000 R12: 0000000000002710
          R13: 000000e9494b690c R14: 0000000000000202 R15: 0000000000000009
          FS:  00007fd9187fe700(0000) GS:ffff888e46a40000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: ffffffffffffffd6 CR3: 0000000de5d33002 CR4: 0000000000360ee0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
           <IRQ>
           rcu_core+0x1a4/0x440
           __do_softirq+0xd3/0x2c8
           irq_exit+0x9d/0xa0
           smp_apic_timer_interrupt+0x68/0x120
           apic_timer_interrupt+0xf/0x20
           </IRQ>
          RIP: 0033:0x47ce80
          Code: Bad RIP value.
          RSP: 002b:00007fd9187fba40 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
          RAX: 0000000000000002 RBX: 00007fd931789160 RCX: 000000000000010c
          RDX: 00007fd9308cdfb4 RSI: 00007fd9308cdfb4 RDI: 00007ffedd1ea0a8
          RBP: 00007fd9187fbab0 R08: 000000000000000e R09: 000000000000002a
          R10: 0000000000480210 R11: 00007fd9187fc570 R12: 00007fd9316cc400
          R13: 0000000000000118 R14: 00007fd9308cdfb4 R15: 00007fd9317a9380
      
      After further analysis, the bug was found to be triggered by
      commit eaaacd23 ("bpf: Add task and task/file iterator targets"),
      which introduced the task_file bpf iterator that traverses all open file
      descriptors for all tasks in the current namespace.
      The latest `bpftool prog` calls a task_file bpf program to traverse
      all files in the system in order to associate processes with progs/maps, etc.
      When traversing files for a given task, rcu read_lock is taken to
      access all files in a files_struct. But it used get_file() to grab
      a file, which is not right. It is possible that file->f_count is 0 and
      get_file() will unconditionally increase it.
      A later fput() may then cause all kinds of issues, with the above
      being one of the symptoms.
      
      The failure can be reproduced with the following steps in a few seconds:
          $ cat t.c
          #include <stdio.h>
          #include <sys/types.h>
          #include <sys/stat.h>
          #include <fcntl.h>
          #include <unistd.h>
      
          #define N 10000
          int fd[N];
          int main() {
            int i;
      
            for (i = 0; i < N; i++) {
              fd[i] = open("./note.txt", 'r');
              if (fd[i] < 0) {
                 fprintf(stderr, "failed\n");
                 return -1;
              }
            }
            for (i = 0; i < N; i++)
              close(fd[i]);
      
            return 0;
          }
          $ gcc -O2 t.c
          $ cat run.sh
          #!/bin/bash
          for i in {1..100}
          do
            while true; do ./a.out; done &
          done
          $ ./run.sh
          $ while true; do bpftool prog >& /dev/null; done
      
      This patch used get_file_rcu() which only grabs a file if the
      file->f_count is not zero. This is to ensure the file pointer
      is always valid. The above reproducer did not fail for more
      than 30 minutes.
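
The "take a reference only if the count is not already zero" pattern can be sketched in userspace with C11 atomics. `struct file_sketch` and `get_ref_unless_zero_sketch()` are hypothetical names for illustration; the kernel's own helper is not reproduced here.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object. */
struct file_sketch { atomic_long count; };

/* Takes a reference only if the object is still live (count != 0). */
static bool get_ref_unless_zero_sketch(struct file_sketch *f)
{
    long old = atomic_load(&f->count);

    while (old != 0) {
        /* On failure, old is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&f->count, &old, old + 1))
            return true;
    }
    return false;   /* count hit zero: the object is being torn down */
}
```

An unconditional increment would move the count from 0 back to 1 and resurrect an object whose release is already in progress, which matches the failure mode described above.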
      
      Fixes: eaaacd23 ("bpf: Add task and task/file iterator targets")
      Suggested-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Link: https://lore.kernel.org/bpf/20200817174214.252601-1-yhs@fb.com
    • watch_queue: Limit the number of watches a user can hold · 29e44f45
      David Howells authored
      
      Impose a limit on the number of watches that a user can hold so that
      they can't use this mechanism to fill up all the available memory.
      
      This is done by putting a counter in user_struct that's incremented when
      a watch is allocated and decreased when it is released.  If the number
      exceeds the RLIMIT_NOFILE limit, the watch is rejected with EAGAIN.
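
A userspace sketch of the per-user accounting described above, assuming the limit is passed in as a plain integer. `struct user_sketch`, `watch_charge_sketch()` and `watch_uncharge_sketch()` are hypothetical names.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical per-user accounting record. */
struct user_sketch { atomic_int nr_watches; };

/* Charges one watch to the user; backs out and fails if over the limit. */
static int watch_charge_sketch(struct user_sketch *u, int limit)
{
    if (atomic_fetch_add(&u->nr_watches, 1) + 1 > limit) {
        atomic_fetch_sub(&u->nr_watches, 1);
        return -EAGAIN;
    }
    return 0;
}

/* Releases one previously charged watch. */
static void watch_uncharge_sketch(struct user_sketch *u)
{
    atomic_fetch_sub(&u->nr_watches, 1);
}
```

Incrementing first and backing out on failure keeps the check and the charge in one atomic step, so two concurrent callers cannot both slip under the limit.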
      
      This can be tested by the following means:
      
       (1) Create a watch queue and attach it to fd 5 in the program given - in
           this case, bash:
      
      	keyctl watch_session /tmp/nlog /tmp/gclog 5 bash
      
       (2) In the shell, set the maximum number of files to, say, 99:
      
      	ulimit -n 99
      
       (3) Add 200 keyrings:
      
      	for ((i=0; i<200; i++)); do keyctl newring a$i @s || break; done
      
       (4) Try to watch all of the keyrings:
      
      	for ((i=0; i<200; i++)); do echo $i; keyctl watch_add 5 %:a$i || break; done
      
           This should fail when the number of watches belonging to the user hits
           99.
      
       (5) Remove all the keyrings and all of those watches should go away:
      
      	for ((i=0; i<200; i++)); do keyctl unlink %:a$i; done
      
       (6) Kill off the watch queue by exiting the shell spawned by
           watch_session.
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. Aug 15, 2020
    • all arch: remove system call sys_sysctl · 88db0aa2
      Xiaoming Ni authored
      Since commit 61a47c1a ("sysctl: Remove the sysctl system call"),
      sys_sysctl is actually unavailable: any input can only return an error.
      
      We have been warning about people using the sysctl system call for years
      and believe there are no more users.  Even if there are users of this
      interface if they have not complained or fixed their code by now they
      probably are not going to, so there is no point in warning them any
      longer.
      
      So completely remove sys_sysctl on all architectures.
      
      [nixiaoming@huawei.com: s390: fix build error for sys_call_table_emu]
       Link: http://lkml.kernel.org/r/20200618141426.16884-1-nixiaoming@huawei.com
      
      
      
      Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Will Deacon <will@kernel.org>		[arm/arm64]
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Aleksa Sarai <cyphar@cyphar.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Bin Meng <bin.meng@windriver.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: chenzefeng <chenzefeng2@huawei.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christian Brauner <christian@brauner.io>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David Howells <dhowells@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Diego Elio Pettenò <flameeyes@flameeyes.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Iurii Zaikin <yzaikin@google.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kars de Jong <jongk@linux-m68k.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Krzysztof Kozlowski <krzk@kernel.org>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Marco Elver <elver@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Miklos Szeredi <mszeredi@redhat.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Nick Piggin <npiggin@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Paul Burton <paulburton@kernel.org>
      Cc: "Paul E. McKenney" <paulmck@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sami Tolvanen <samitolvanen@google.com>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Sven Schnelle <svens@stackframe.org>
      Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Zhou Yanjie <zhouyanjie@wanyeetech.com>
      Link: http://lkml.kernel.org/r/20200616030734.87257-1-nixiaoming@huawei.com
      
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • dma-mapping: consolidate the NO_DMA definition in kernel/dma/Kconfig · 846f9e1f
      Christoph Hellwig authored
      
      Have a single definition that architectures can select.
      
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Rich Felker <dalias@libc.org>
  14. Aug 14, 2020
    • dma-debug: remove debug_dma_assert_idle() function · 5848dc5b
      Linus Torvalds authored
      
      This removes the code from the COW path to call debug_dma_assert_idle(),
      which was added many years ago.
      
      Google shows that it hasn't caught anything in the 6+ years we've had it
      apart from a false positive, and Hugh just noticed how it had a very
      unfortunate spinlock serialization in the COW path.
      
      He fixed that issue in the previous commit (a85ffd59: "dma-debug: fix
      debug_dma_assert_idle(), use rcu_read_lock()"), but let's see if anybody
      even notices when we remove this function entirely.
      
      NOTE! We keep the dma tracking infrastructure that was added by the
      commit that introduced it.  Partly to make it easier to resurrect this
      debug code if we ever decide to, and partly because that tracking by pfn
      and offset looks quite reasonable.
      
      The problem with this debug code was simply that it was expensive and
      didn't seem worth it, not that it was wrong per se.
      
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • dma-debug: fix debug_dma_assert_idle(), use rcu_read_lock() · a85ffd59
      Hugh Dickins authored
      
      Since commit 2a9127fc ("mm: rewrite wait_on_page_bit_common()
      logic") improved unlock_page(), it has become more noticeable how
      cow_user_page() in a kernel with CONFIG_DMA_API_DEBUG=y can create and
      suffer from heavy contention on DMA debug's radix_lock in
      debug_dma_assert_idle().
      
      It is only doing a lookup: use rcu_read_lock() and rcu_read_unlock()
      instead; though that does require the static ents[] to be moved
      onstack...
      
      ...but, hold on, isn't that radix_tree_gang_lookup() and loop doing
      quite the wrong thing: searching CACHELINES_PER_PAGE entries for an
      exact match with the first cacheline of the page in question?
      radix_tree_gang_lookup() is the right tool for the job, but we need
      nothing more than to check the first entry it can find, reporting if
      that falls anywhere within the page.
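
The "check only the first entry found at or after the page start" logic can be sketched in userspace, with a sorted array standing in for the radix tree. The function name and the value of `CACHELINES_PER_PAGE_SKETCH` are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHELINES_PER_PAGE_SKETCH 64UL   /* assumed value for the demo */

/*
 * entries: sorted cacheline numbers of active mappings (stand-in for the
 * radix tree). first_cln: number of the first cacheline of the page.
 */
static bool page_has_active_cacheline_sketch(const unsigned long *entries,
                                             size_t n,
                                             unsigned long first_cln)
{
    for (size_t i = 0; i < n; i++) {
        if (entries[i] < first_cln)
            continue;
        /* First entry at or after the page start: only this one matters. */
        return entries[i] < first_cln + CACHELINES_PER_PAGE_SKETCH;
    }
    return false;
}
```

Since the entries are ordered, if the first one at or after the page start already lies beyond the page, no later entry can fall inside it.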
      
      (Is RCU safe here? As safe as using the spinlock was. The entries are
      never freed, so don't need to be freed by RCU. They may be reused, and
      there is a faint chance of a race, with an offending entry reused while
      printing its error info; but the spinlock did not prevent that either,
      and I agree that it's not worth worrying about.)
      
      [ Side note: this patch is a clear improvement to the status quo, but the
        next patch will be removing this debug function entirely.
      
        But just in case we decide we want to resurrect the debugging code
        some day, I'm first applying this improvement patch so that it doesn't
        get lost    - Linus ]
      
      Fixes: 3b7a6418 ("dma debug: account for cachelines and read-only mappings in overlap tracking")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • dma-pool: Only allocate from CMA when in same memory zone · d7e673ec
      Nicolas Saenz Julienne authored
      
      There is no guarantee about CMA's placement, so allocating a zone-specific
      atomic pool from CMA might return memory from a completely different
      memory zone. To get around this, double-check CMA's placement before
      allocating from it.
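
A placement check of this kind can be sketched in userspace as a comparison of a region's last address against the highest address of a zone. `struct region_sketch` and `region_in_zone_sketch()` are hypothetical, and describing a zone by a bit width is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a contiguous memory region. */
struct region_sketch { uint64_t base; uint64_t size; };

/* True if the region lies entirely within the first 2^zone_bits bytes. */
static bool region_in_zone_sketch(const struct region_sketch *r,
                                  unsigned zone_bits)
{
    uint64_t zone_max = zone_bits >= 64 ? UINT64_MAX
                                        : (UINT64_C(1) << zone_bits) - 1;
    uint64_t end;

    if (r->size == 0)
        return false;
    end = r->base + r->size - 1;    /* last byte of the region */
    return end <= zone_max;
}
```

A caller would fall back to another allocation path whenever the check fails for the zone it needs.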
      
      Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • dma-pool: fix coherent pool allocations for IOMMU mappings · 9420139f
      Christoph Hellwig authored
      
      When allocating coherent pool memory for an IOMMU mapping we don't care
      about the DMA mask.  Move the guess for the initial GFP mask into
      dma_direct_alloc_pages and pass dma_coherent_ok as a function pointer
      argument so that it doesn't get applied to the IOMMU case.
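
Passing the address check as an optional function pointer can be sketched in userspace as follows. All names are hypothetical, and treating a NULL predicate as "accept any block" is an assumption of this sketch, not a statement about the kernel interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical predicate type: is this block acceptable to the caller? */
typedef bool (*addr_ok_fn_sketch)(uint64_t phys, size_t size, void *ctx);

/* Returns the index of the first acceptable block, or -1 if none fits. */
static int pool_pick_sketch(const uint64_t *blocks, size_t n, size_t size,
                            addr_ok_fn_sketch ok, void *ctx)
{
    for (size_t i = 0; i < n; i++)
        if (!ok || ok(blocks[i], size, ctx))   /* NULL: no restriction */
            return (int)i;
    return -1;
}

/* Example predicate: the whole block must be addressable under a mask. */
static bool below_mask_sketch(uint64_t phys, size_t size, void *ctx)
{
    uint64_t mask = *(const uint64_t *)ctx;

    return phys + size - 1 <= mask;
}
```

A caller with an addressing restriction passes a predicate, while a caller that remaps addresses anyway passes none, so the mask check never applies to it.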
      
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Tested-by: Amit Pundir <amit.pundir@linaro.org>
    • sched/debug: Fix the alignment of the show-state debug output · cc172ff3
      Libing Zhou authored and Ingo Molnar committed
      
      The field names in the current sysrq(t) output are not aligned with
      the actual task field values, e.g.:
      
      	kernel: sysrq: Show State
      	kernel:  task                        PC stack   pid father
      	kernel: systemd         S12456     1      0 0x00000000
      	kernel: Call Trace:
      	kernel: ? __schedule+0x240/0x740
      
      To make it more readable, print each field name together with its
      value on the same line, with fixed width:
      
      	kernel: sysrq: Show State
      	kernel: task:systemd         state:S stack:12920 pid:    1 ppid:     0 flags:0x00000000
      	kernel: Call Trace:
      	kernel: __schedule+0x282/0x620
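
A userspace sketch of fixed-width "name:value" formatting. The field widths were chosen to reproduce the sample line above; the format string actually used by the patch is not shown in this log, and `format_task_line_sketch()` is a hypothetical helper.

```c
#include <stdio.h>

/* Formats one task line with each value next to its label, fixed width. */
static int format_task_line_sketch(char *buf, size_t len, const char *comm,
                                   char state, unsigned long free_stack,
                                   int pid, int ppid, unsigned long flags)
{
    return snprintf(buf, len,
                    "task:%-15.15s state:%c stack:%5lu pid:%5d ppid:%6d flags:0x%08lx",
                    comm, state, free_stack, pid, ppid, flags);
}
```

Left-justifying and truncating the name to a fixed width keeps every following label in the same column regardless of how long the task name is.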
      
      Signed-off-by: Libing Zhou <libing.zhou@nokia-sbell.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20200814030236.37835-1-libing.zhou@nokia-sbell.com
  15. Aug 13, 2020
  16. Aug 12, 2020