  1. Jun 27, 2021
    • Linus Torvalds's avatar
      Revert "signal: Allow tasks to cache one sigqueue struct" · b4b27b9e
      Linus Torvalds authored
      
      This reverts commits 4bad58eb (and
      399f8dd9, which tried to fix it).
      
      I do not believe these are correct, and I'm about to release 5.13, so am
      reverting them out of an abundance of caution.
      
      The locking is odd, and appears broken.
      
      On the allocation side (in __sigqueue_alloc()), the locking is somewhat
      straightforward: it depends on sighand->siglock.  Since one caller
      doesn't hold that lock, it further then tests 'sigqueue_flags' to avoid
      the case with no locks held.
      
      On the freeing side (in sigqueue_cache_or_free()), there is no locking
      at all, and the logic instead depends on 'current' being a single
      thread, and not able to race with itself.
      
      To make things more exciting, there's also the data race between freeing
      a signal and allocating one, which is handled by using WRITE_ONCE() and
      READ_ONCE(), and being mutually exclusive wrt the initial state (ie
      freeing will only free if the old state was NULL, while allocating will
      obviously only use the value if it was non-NULL, so only one or the
      other will actually act on the value).
      
      However, while the free->alloc paths do seem mutually exclusive thanks
      to just the data value dependency, it's not clear what the memory
      ordering constraints are on it.  Could writes from the previous
      allocation possibly be delayed and seen by the new allocation later,
      causing logical inconsistencies?
      
      So it's all very exciting and unusual.
      
      And in particular, it seems that the freeing side is incorrect in
      depending on "current" being single-threaded.  Yes, 'current' is a
single thread, but in the presence of asynchronous events even a single
      thread can have data races.
      
      And such asynchronous events can and do happen, with interrupts causing
      signals to be flushed and thus free'd (for example - sending a
      SIGCONT/SIGSTOP can happen from interrupt context, and can flush
      previously queued process control signals).
      
      So regardless of all the other questions about the memory ordering and
      locking for this new cached allocation, the sigqueue_cache_or_free()
      assumptions seem to be fundamentally incorrect.
      
      It may be that people will show me the errors of my ways, and tell me
      why this is all safe after all.  We can reinstate it if so.  But my
      current belief is that the WRITE_ONCE() that sets the cached entry needs
      to be a smp_store_release(), and the READ_ONCE() that finds a cached
      entry needs to be a smp_load_acquire() to handle memory ordering
      correctly.
      
      And the sequence in sigqueue_cache_or_free() would need to either use a
      lock or at least be interrupt-safe some way (perhaps by using something
      like the percpu 'cmpxchg': it doesn't need to be SMP-safe, but like the
      percpu operations it needs to be interrupt-safe).
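
As a rough illustration of that suggestion, here is a minimal sketch (the
sigqueue_cache field follows the reverted patch; the function names and the
use of full xchg()/cmpxchg() are assumptions for illustration, not the
actual kernel code):

  /* Illustrative sketch only. */
  static struct sigqueue *sigqueue_cache_take(struct task_struct *t)
  {
          /* xchg() is fully ordered, so it also provides the acquire
           * semantics that pair with the ordered store on the free side. */
          return xchg(&t->sigqueue_cache, NULL);
  }

  static void sigqueue_cache_put(struct task_struct *t, struct sigqueue *q)
  {
          /* cmpxchg() makes the "is the slot empty?" test and the store a
           * single step, so a free done from interrupt context cannot race
           * with it.  It is heavier than the percpu-style operation
           * suggested above, but it is both interrupt-safe and ordered. */
          if (cmpxchg(&t->sigqueue_cache, NULL, q) != NULL)
                  kmem_cache_free(sigqueue_cachep, q);
  }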
      
      Fixes: 399f8dd9 ("signal: Prevent sigqueue caching after task got released")
      Fixes: 4bad58eb ("signal: Allow tasks to cache one sigqueue struct")
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b4b27b9e
  2. Jun 25, 2021
    • Hugh Dickins's avatar
      mm, futex: fix shared futex pgoff on shmem huge page · fe19bd3d
      Hugh Dickins authored
      If more than one futex is placed on a shmem huge page, it can happen
      that waking the second wakes the first instead, and leaves the second
      waiting: the key's shared.pgoff is wrong.
      
When 3.11 commit 13d60f4b ("futex: Take hugepages into account when
generating futex_key") was added, the only shared huge pages came from
hugetlbfs, and the code added to deal with its exceptional page->index was
put into hugetlb source.  Then that was missed when 4.8 added shmem huge
pages.
      
      page_to_pgoff() is what others use for this nowadays: except that, as
      currently written, it gives the right answer on hugetlbfs head, but
      nonsense on hugetlbfs tails.  Fix that by calling hugetlbfs-specific
      hugetlb_basepage_index() on PageHuge tails as well as on head.
      
      Yes, it's unconventional to declare hugetlb_basepage_index() there in
      pagemap.h, rather than in hugetlb.h; but I do not expect anything but
      page_to_pgoff() ever to need it.
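
For illustration, the resulting helper looks roughly like this (an
approximation of the described change, not the literal patch):

  static inline pgoff_t page_to_pgoff(struct page *page)
  {
          /* PageHuge() is true for hugetlbfs heads AND tails, so both go
           * through the hugetlbfs-specific index calculation. */
          if (unlikely(PageHuge(page)))
                  return hugetlb_basepage_index(page);

          /* Everything else keeps the normal compound-page arithmetic. */
          return page_to_index(page);
  }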
      
      [akpm@linux-foundation.org: give hugetlb_basepage_index() prototype the correct scope]
      
      Link: https://lkml.kernel.org/r/b17d946b-d09-326e-b42a-52884c36df32@google.com
      
      
      Fixes: 800d8c63 ("shmem: add huge pages support")
Reported-by: Neel Natu <neelnatu@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Zhang Yi <wetpzy@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fe19bd3d
    • Petr Mladek's avatar
      kthread: prevent deadlock when kthread_mod_delayed_work() races with... · 5fa54346
      Petr Mladek authored
      kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()
      
      The system might hang with the following backtrace:
      
      	schedule+0x80/0x100
      	schedule_timeout+0x48/0x138
      	wait_for_common+0xa4/0x134
      	wait_for_completion+0x1c/0x2c
      	kthread_flush_work+0x114/0x1cc
      	kthread_cancel_work_sync.llvm.16514401384283632983+0xe8/0x144
      	kthread_cancel_delayed_work_sync+0x18/0x2c
      	xxxx_pm_notify+0xb0/0xd8
      	blocking_notifier_call_chain_robust+0x80/0x194
      	pm_notifier_call_chain_robust+0x28/0x4c
      	suspend_prepare+0x40/0x260
      	enter_state+0x80/0x3f4
      	pm_suspend+0x60/0xdc
      	state_store+0x108/0x144
      	kobj_attr_store+0x38/0x88
      	sysfs_kf_write+0x64/0xc0
      	kernfs_fop_write_iter+0x108/0x1d0
      	vfs_write+0x2f4/0x368
      	ksys_write+0x7c/0xec
      
      It is caused by the following race between kthread_mod_delayed_work()
      and kthread_cancel_delayed_work_sync():
      
      CPU0				CPU1
      
      Context: Thread A		Context: Thread B
      
      kthread_mod_delayed_work()
        spin_lock()
        __kthread_cancel_work()
           spin_unlock()
           del_timer_sync()
      				kthread_cancel_delayed_work_sync()
      				  spin_lock()
      				  __kthread_cancel_work()
      				    spin_unlock()
      				    del_timer_sync()
      				    spin_lock()
      
      				  work->canceling++
      				  spin_unlock
           spin_lock()
         queue_delayed_work()
           // dwork is put into the worker->delayed_work_list
      
         spin_unlock()
      
      				  kthread_flush_work()
           // flush_work is put at the tail of the dwork
      
      				    wait_for_completion()
      
      Context: IRQ
      
        kthread_delayed_work_timer_fn()
          spin_lock()
          list_del_init(&work->node);
          spin_unlock()
      
BANG: flush_work is no longer linked and will never get processed.
      
The problem is that kthread_mod_delayed_work() checks the work->canceling
flag before canceling the timer.
      
      A simple solution is to (re)check work->canceling after
      __kthread_cancel_work().  But then it is not clear what should be
      returned when __kthread_cancel_work() removed the work from the queue
      (list) and it can't queue it again with the new @delay.
      
      The return value might be used for reference counting.  The caller has
      to know whether a new work has been queued or an existing one was
      replaced.
      
      The proper solution is that kthread_mod_delayed_work() will remove the
      work from the queue (list) _only_ when work->canceling is not set.  The
      flag must be checked after the timer is stopped and the remaining
      operations can be done under worker->lock.
      
      Note that kthread_mod_delayed_work() could remove the timer and then
      bail out.  It is fine.  The other canceling caller needs to cancel the
      timer as well.  The important thing is that the queue (list)
      manipulation is done atomically under worker->lock.
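
A simplified sketch of the resulting kthread_mod_delayed_work() flow,
paraphrasing the description above (the timer-canceling helper name follows
the preparatory patch below; the never-queued fast path and sanity checks
are omitted):

  bool kthread_mod_delayed_work(struct kthread_worker *worker,
                                struct kthread_delayed_work *dwork,
                                unsigned long delay)
  {
          struct kthread_work *work = &dwork->work;
          unsigned long flags;
          bool ret;

          raw_spin_lock_irqsave(&worker->lock, flags);

          /* Stop the timer first; this may drop and re-take worker->lock. */
          kthread_cancel_delayed_work_timer(work, &flags);

          /* Re-check under worker->lock: if another caller is canceling,
           * do not touch the queue (list) at all and let it win. */
          if (work->canceling) {
                  ret = true;
                  goto out;
          }

          ret = __kthread_cancel_work(work);
          __kthread_queue_delayed_work(worker, dwork, delay);
  out:
          raw_spin_unlock_irqrestore(&worker->lock, flags);
          return ret;
  }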
      
      Link: https://lkml.kernel.org/r/20210610133051.15337-3-pmladek@suse.com
      
      
      Fixes: 9a6b06c8 ("kthread: allow to modify delayed kthread work")
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reported-by: Martin Liu <liumartin@google.com>
      Cc: <jenhaochen@google.com>
      Cc: Minchan Kim <minchan@google.com>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5fa54346
    • Petr Mladek's avatar
      kthread_worker: split code for canceling the delayed work timer · 34b3d534
      Petr Mladek authored
      Patch series "kthread_worker: Fix race between kthread_mod_delayed_work()
      and kthread_cancel_delayed_work_sync()".
      
      This patchset fixes the race between kthread_mod_delayed_work() and
      kthread_cancel_delayed_work_sync() including proper return value
      handling.
      
      This patch (of 2):
      
      Simple code refactoring as a preparation step for fixing a race between
      kthread_mod_delayed_work() and kthread_cancel_delayed_work_sync().
      
      It does not modify the existing behavior.
      
      Link: https://lkml.kernel.org/r/20210610133051.15337-2-pmladek@suse.com
      
      
Signed-off-by: Petr Mladek <pmladek@suse.com>
      Cc: <jenhaochen@google.com>
      Cc: Martin Liu <liumartin@google.com>
      Cc: Minchan Kim <minchan@google.com>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      34b3d534
  3. Jun 22, 2021
  4. Jun 21, 2021
    • Bumyong Lee's avatar
      swiotlb: manipulate orig_addr when tlb_addr has offset · 5f89468e
      Bumyong Lee authored
      
In case a driver wants to sync part of a range with an offset,
swiotlb_tbl_sync_single() copies from the orig_addr base to tlb_addr with
offset and ends up with a data mismatch.

The offset handling was removed by
"swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single",
but said logic has to be added back in.
      
From Linus's email:
"That commit removed the offset calculation entirely, because the old

        (unsigned long)tlb_addr & (IO_TLB_SIZE - 1)

was wrong, but instead of removing it, I think it should have just
been fixed to be

        (tlb_addr - mem->start) & (IO_TLB_SIZE - 1);

instead. That way the slot offset always matches the slot index calculation."

(Unfortunately that broke NVMe).
      
The use-case that drivers are hitting is as follows:
      
      1. Get dma_addr_t from dma_map_single()
      
      dma_addr_t tlb_addr = dma_map_single(dev, vaddr, vsize, DMA_TO_DEVICE);
      
          |<---------------vsize------------->|
          +-----------------------------------+
          |                                   | original buffer
          +-----------------------------------+
        vaddr
      
       swiotlb_align_offset
           |<----->|<---------------vsize------------->|
           +-------+-----------------------------------+
           |       |                                   | swiotlb buffer
           +-------+-----------------------------------+
                tlb_addr
      
      2. Do something
      3. Sync dma_addr_t through dma_sync_single_for_device(..)
      
      dma_sync_single_for_device(dev, tlb_addr + offset, size, DMA_TO_DEVICE);
      
  Error case.
      Copy data to the original buffer, but it is copied from the base addr
    (instead of base addr + offset) in the original buffer:
      
       swiotlb_align_offset
           |<----->|<- offset ->|<- size ->|
           +-------+-----------------------------------+
           |       |            |##########|           | swiotlb buffer
           +-------+-----------------------------------+
                tlb_addr
      
          |<- size ->|
          +-----------------------------------+
          |##########|                        | original buffer
          +-----------------------------------+
        vaddr
      
      The fix is to copy the data to the original buffer and take into
      account the offset, like so:
      
       swiotlb_align_offset
           |<----->|<- offset ->|<- size ->|
           +-------+-----------------------------------+
           |       |            |##########|           | swiotlb buffer
           +-------+-----------------------------------+
                tlb_addr
      
          |<- offset ->|<- size ->|
          +-----------------------------------+
          |            |##########|           | original buffer
          +-----------------------------------+
        vaddr
      
[One fix, which was Linus's and made more sense as it created a
symmetry, would break NVMe.  The reason for that is that:
 unsigned int offset = (tlb_addr - mem->start) & (IO_TLB_SIZE - 1);

would come up with the proper offset, but it would lose the
alignment (which this patch preserves).]
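
A short sketch of the offset computation being described (approximate, based
on the formulas quoted above rather than the exact patch):

  /* How far into the bounce slot does the caller's tlb_addr point? */
  unsigned int tlb_offset  = tlb_addr & (IO_TLB_SIZE - 1);
  /* The min-align offset the slot was created with (kept for NVMe). */
  unsigned int orig_offset = swiotlb_align_offset(dev, orig_addr);

  if (tlb_offset < orig_offset)
          return;                 /* inconsistent address from the caller */

  tlb_offset -= orig_offset;      /* offset inside the mapped buffer */
  orig_addr  += tlb_offset;       /* both sides now point at the same data */
  /* ...then memcpy() between orig_addr and tlb_addr as before... */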
      
      Fixes: 16fc3cef ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single")
Signed-off-by: Bumyong Lee <bumyong.lee@samsung.com>
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reported-by: Dominique MARTINET <dominique.martinet@atmark-techno.com>
Reported-by: Horia Geantă <horia.geanta@nxp.com>
Tested-by: Horia Geantă <horia.geanta@nxp.com>
CC: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      5f89468e
  5. Jun 18, 2021
    • Steven Rostedt (VMware)'s avatar
tracing: Do not increment trace_clock_global() by one · 89529d8b
      Steven Rostedt (VMware) authored
      
trace_clock_global() tries to make sure the events between CPUs are
somewhat in order. A global value is used and updated by the latest read
of a clock. If one CPU is ahead by a little, and is read by another CPU, a
lock is taken, and if the timestamp of the other CPU is behind, it will
simply use the other CPU's timestamp.
      
      The lock is also only taken with a "trylock" due to tracing, and strange
      recursions can happen. The lock is not taken at all in NMI context.
      
In the case where the lock cannot be taken, the non-synced
timestamp is returned. But it will not be less than the saved global
      timestamp.
      
      The problem arises because when the time goes "backwards" the time
returned is the saved timestamp plus 1. If the lock is not taken, and this
plus-one timestamp is returned, there's a small race that can cause
      the time to go backwards!
      
      	CPU0				CPU1
      	----				----
      				trace_clock_global() {
      				    ts = clock() [ 1000 ]
      				    trylock(clock_lock) [ success ]
      				    global_ts = ts; [ 1000 ]
      
      				    <interrupted by NMI>
       trace_clock_global() {
          ts = clock() [ 999 ]
          if (ts < global_ts)
      	ts = global_ts + 1 [ 1001 ]
      
          trylock(clock_lock) [ fail ]
      
          return ts [ 1001]
       }
      				    unlock(clock_lock);
      				    return ts; [ 1000 ]
      				}
      
       trace_clock_global() {
          ts = clock() [ 1000 ]
          if (ts < global_ts) [ false 1000 == 1000 ]
      
          trylock(clock_lock) [ success ]
          global_ts = ts; [ 1000 ]
          unlock(clock_lock)
      
          return ts; [ 1000 ]
       }
      
The above case shows two reads of trace_clock_global() on the same CPU, but
the second read returns one less than the first read. That is, time went
backwards, and this is not what is allowed by trace_clock_global().
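
Conceptually, the fix is to clamp the unlocked path to the saved global
timestamp instead of incrementing past it; a sketch (names approximate):

  u64 prev_time = READ_ONCE(trace_clock_struct.prev_time);
  u64 now = sched_clock_cpu(raw_smp_processor_id());

  /* Make sure "now" is never behind the last globally returned value. */
  if ((s64)(now - prev_time) < 0)
          now = prev_time;        /* was: prev_time + 1 */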
      
      This was triggered by heavy tracing and the ring buffer checker that tests
      for the clock going backwards:
      
       Ring buffer clock went backwards: 20613921464 -> 20613921463
       ------------[ cut here ]------------
       WARNING: CPU: 2 PID: 0 at kernel/trace/ring_buffer.c:3412 check_buffer+0x1b9/0x1c0
       Modules linked in:
       [..]
       [CPU: 2]TIME DOES NOT MATCH expected:20620711698 actual:20620711697 delta:6790234 before:20613921463 after:20613921463
         [20613915818] PAGE TIME STAMP
         [20613915818] delta:0
         [20613915819] delta:1
         [20613916035] delta:216
         [20613916465] delta:430
         [20613916575] delta:110
         [20613916749] delta:174
         [20613917248] delta:499
         [20613917333] delta:85
         [20613917775] delta:442
         [20613917921] delta:146
         [20613918321] delta:400
         [20613918568] delta:247
         [20613918768] delta:200
         [20613919306] delta:538
         [20613919353] delta:47
         [20613919980] delta:627
         [20613920296] delta:316
         [20613920571] delta:275
         [20613920862] delta:291
         [20613921152] delta:290
         [20613921464] delta:312
         [20613921464] delta:0 TIME EXTEND
         [20613921464] delta:0
      
This happened more than once, and always for an off-by-one result. It also
      started happening after commit aafe104a was added.
      
      Cc: stable@vger.kernel.org
      Fixes: aafe104a ("tracing: Restructure trace_clock_global() to never block")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      89529d8b
    • Steven Rostedt (VMware)'s avatar
      tracing: Do not stop recording comms if the trace file is being read · 4fdd595e
      Steven Rostedt (VMware) authored
      
      A while ago, when the "trace" file was opened, tracing was stopped, and
      code was added to stop recording the comms to saved_cmdlines, for mapping
      of the pids to the task name.
      
Since then, code has been added that only records the comm if a trace event
occurred, so there's no reason not to record it when the trace file is open.
      
      Cc: stable@vger.kernel.org
      Fixes: 7ffbd48d ("tracing: Cache comms only after an event occurred")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      4fdd595e
    • Steven Rostedt (VMware)'s avatar
      tracing: Do not stop recording cmdlines when tracing is off · 85550c83
      Steven Rostedt (VMware) authored
      
      The saved_cmdlines is used to map pids to the task name, such that the
      output of the tracing does not just show pids, but also gives a human
      readable name for the task.
      
      If the name is not mapped, the output looks like this:
      
          <...>-1316          [005] ...2   132.044039: ...
      
      Instead of this:
      
          gnome-shell-1316    [005] ...2   132.044039: ...
      
The names are updated when tracing is running, but are skipped if tracing
is stopped. Unfortunately, this stops the recording of the names when the
top level tracer is stopped, even if other tracers are active.

The recording of a name only happens when a new event is written into a
ring buffer, so there is no need to test whether tracing is on: if
tracing is off, no event is written and there is nothing to record
anyway.
      
      Remove the check, as it hides the names of tasks for events in the
      instance buffers.
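
As a rough sketch of the shape of the change (approximate, not the literal
diff), the skip test for recording task info simply drops its dependence on
the global "tracing on/off" state:

  static bool tracing_record_taskinfo_skip(int flags)
  {
          if (unlikely(!(flags & (TRACE_RECORD_CMDLINE | TRACE_RECORD_TGID))))
                  return true;
          /* was: if (... || !tracing_is_on()) return true;  -- removed */
          if (!__this_cpu_read(trace_taskinfo_save))
                  return true;
          return false;
  }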
      
      Cc: stable@vger.kernel.org
      Fixes: 7ffbd48d ("tracing: Cache comms only after an event occurred")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      85550c83
  6. Jun 16, 2021
  7. Jun 14, 2021
    • Daniel Borkmann's avatar
      bpf: Fix leakage under speculation on mispredicted branches · 9183671a
      Daniel Borkmann authored
      
      The verifier only enumerates valid control-flow paths and skips paths that
      are unreachable in the non-speculative domain. And so it can miss issues
      under speculative execution on mispredicted branches.
      
      For example, a type confusion has been demonstrated with the following
      crafted program:
      
        // r0 = pointer to a map array entry
        // r6 = pointer to readable stack slot
        // r9 = scalar controlled by attacker
        1: r0 = *(u64 *)(r0) // cache miss
        2: if r0 != 0x0 goto line 4
        3: r6 = r9
        4: if r0 != 0x1 goto line 6
        5: r9 = *(u8 *)(r6)
        6: // leak r9
      
      Since line 3 runs iff r0 == 0 and line 5 runs iff r0 == 1, the verifier
      concludes that the pointer dereference on line 5 is safe. But: if the
      attacker trains both the branches to fall-through, such that the following
      is speculatively executed ...
      
        r6 = r9
        r9 = *(u8 *)(r6)
        // leak r9
      
      ... then the program will dereference an attacker-controlled value and could
      leak its content under speculative execution via side-channel. This requires
      to mistrain the branch predictor, which can be rather tricky, because the
      branches are mutually exclusive. However such training can be done at
      congruent addresses in user space using different branches that are not
      mutually exclusive. That is, by training branches in user space ...
      
        A:  if r0 != 0x0 goto line C
        B:  ...
        C:  if r0 != 0x0 goto line D
        D:  ...
      
      ... such that addresses A and C collide to the same CPU branch prediction
      entries in the PHT (pattern history table) as those of the BPF program's
      lines 2 and 4, respectively. A non-privileged attacker could simply brute
      force such collisions in the PHT until observing the attack succeeding.
      
      Alternative methods to mistrain the branch predictor are also possible that
      avoid brute forcing the collisions in the PHT. A reliable attack has been
      demonstrated, for example, using the following crafted program:
      
        // r0 = pointer to a [control] map array entry
        // r7 = *(u64 *)(r0 + 0), training/attack phase
        // r8 = *(u64 *)(r0 + 8), oob address
        // [...]
        // r0 = pointer to a [data] map array entry
        1: if r7 == 0x3 goto line 3
        2: r8 = r0
        // crafted sequence of conditional jumps to separate the conditional
        // branch in line 193 from the current execution flow
        3: if r0 != 0x0 goto line 5
        4: if r0 == 0x0 goto exit
        5: if r0 != 0x0 goto line 7
        6: if r0 == 0x0 goto exit
        [...]
        187: if r0 != 0x0 goto line 189
        188: if r0 == 0x0 goto exit
        // load any slowly-loaded value (due to cache miss in phase 3) ...
        189: r3 = *(u64 *)(r0 + 0x1200)
        // ... and turn it into known zero for verifier, while preserving slowly-
        // loaded dependency when executing:
        190: r3 &= 1
        191: r3 &= 2
        // speculatively bypassed phase dependency
        192: r7 += r3
        193: if r7 == 0x3 goto exit
        194: r4 = *(u8 *)(r8 + 0)
        // leak r4
      
      As can be seen, in training phase (phase != 0x3), the condition in line 1
      turns into false and therefore r8 with the oob address is overridden with
      the valid map value address, which in line 194 we can read out without
      issues. However, in attack phase, line 2 is skipped, and due to the cache
      miss in line 189 where the map value is (zeroed and later) added to the
      phase register, the condition in line 193 takes the fall-through path due
      to prior branch predictor training, where under speculation, it'll load the
      byte at oob address r8 (unknown scalar type at that point) which could then
      be leaked via side-channel.
      
      One way to mitigate these is to 'branch off' an unreachable path, meaning,
      the current verification path keeps following the is_branch_taken() path
      and we push the other branch to the verification stack. Given this is
      unreachable from the non-speculative domain, this branch's vstate is
      explicitly marked as speculative. This is needed for two reasons: i) if
      this path is solely seen from speculative execution, then we later on still
      want the dead code elimination to kick in in order to sanitize these
      instructions with jmp-1s, and ii) to ensure that paths walked in the
      non-speculative domain are not pruned from earlier walks of paths walked in
      the speculative domain. Additionally, for robustness, we mark the registers
      which have been part of the conditional as unknown in the speculative path
      given there should be no assumptions made on their content.
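
In verifier terms, the mitigation looks roughly like the following sketch (a
paraphrase of the approach; the helper name push_speculative_path is chosen
for illustration, not taken from the patch):

  static struct bpf_verifier_state *
  push_speculative_path(struct bpf_verifier_env *env,
                        const struct bpf_insn *insn,
                        u32 next_idx, u32 curr_idx)
  {
          struct bpf_verifier_state *branch;
          struct bpf_reg_state *regs;

          /* Last argument: this state only exists under speculation. */
          branch = push_stack(env, next_idx, curr_idx, true);
          if (branch && insn) {
                  /* Forget what we knew about the registers that were part
                   * of the (mispredictable) condition. */
                  regs = branch->frame[branch->curframe]->regs;
                  mark_reg_unknown(env, regs, insn->dst_reg);
                  if (BPF_SRC(insn->code) == BPF_X)
                          mark_reg_unknown(env, regs, insn->src_reg);
          }
          return branch;
  }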
      
      The fix in here mitigates type confusion attacks described earlier due to
i) all code paths in the BPF program being explored and ii) existing
verifier logic already ensuring that a given memory access instruction
references one specific data structure.
      
      An alternative to this fix that has also been looked at in this scope was to
      mark aux->alu_state at the jump instruction with a BPF_JMP_TAKEN state as
      well as direction encoding (always-goto, always-fallthrough, unknown), such
      that mixing of different always-* directions themselves as well as mixing of
      always-* with unknown directions would cause a program rejection by the
      verifier, e.g. programs with constructs like 'if ([...]) { x = 0; } else
      { x = 1; }' with subsequent 'if (x == 1) { [...] }'. For unprivileged, this
      would result in only single direction always-* taken paths, and unknown taken
      paths being allowed, such that the former could be patched from a conditional
      jump to an unconditional jump (ja). Compared to this approach here, it would
      have two downsides: i) valid programs that otherwise are not performing any
      pointer arithmetic, etc, would potentially be rejected/broken, and ii) we are
      required to turn off path pruning for unprivileged, where both can be avoided
      in this work through pushing the invalid branch to the verification stack.
      
      The issue was originally discovered by Adam and Ofek, and later independently
      discovered and reported as a result of Benedict and Piotr's research work.
      
      Fixes: b2157399 ("bpf: prevent out-of-bounds speculation")
Reported-by: Adam Morrison <mad@cs.tau.ac.il>
Reported-by: Ofek Kirzner <ofekkir@gmail.com>
Reported-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reported-by: Piotr Krysiuk <piotras@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
      9183671a
    • Daniel Borkmann's avatar
      bpf: Do not mark insn as seen under speculative path verification · fe9a5ca7
      Daniel Borkmann authored
      
      ... in such circumstances, we do not want to mark the instruction as seen given
      the goal is still to jmp-1 rewrite/sanitize dead code, if it is not reachable
      from the non-speculative path verification. We do however want to verify it for
      safety regardless.
      
      With the patch as-is all the insns that have been marked as seen before the
      patch will also be marked as seen after the patch (just with a potentially
      different non-zero count). An upcoming patch will also verify paths that are
      unreachable in the non-speculative domain, hence this extension is needed.
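
A short sketch of how that can look in the verifier (approximate; the helper
name follows the description rather than the exact patch):

  static void sanitize_mark_insn_seen(struct bpf_verifier_env *env)
  {
          struct bpf_verifier_state *cur = env->cur_state;

          /* Only mark the insn as seen when it was reached outside of a
           * speculative-only state, so dead code that is reachable purely
           * under speculation still gets the jmp-1 rewrite later. */
          if (!cur->speculative)
                  env->insn_aux_data[env->insn_idx].seen = env->pass_cnt;
  }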
      
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
      fe9a5ca7
    • Daniel Borkmann's avatar
      bpf: Inherit expanded/patched seen count from old aux data · d203b0fd
      Daniel Borkmann authored
      
      Instead of relying on current env->pass_cnt, use the seen count from the
      old aux data in adjust_insn_aux_data(), and expand it to the new range of
      patched instructions. This change is valid given we always expand 1:n
      with n>=1, so what applies to the old/original instruction needs to apply
      for the replacement as well.
      
      Not relying on env->pass_cnt is a prerequisite for a later change where we
      want to avoid marking an instruction seen when verified under speculative
      execution path.
      
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
      d203b0fd
    • Odin Ugedal's avatar
      sched/fair: Correctly insert cfs_rq's to list on unthrottle · a7b359fc
      Odin Ugedal authored
      
      Fix an issue where fairness is decreased since cfs_rq's can end up not
      being decayed properly. For two sibling control groups with the same
      priority, this can often lead to a load ratio of 99/1 (!!).
      
This happens because when a cfs_rq is throttled, all the descendant
cfs_rq's will be removed from the leaf list. When the initial cfs_rq
is unthrottled, it will currently only re-add descendant cfs_rq's if
they have one or more entities enqueued. This is not a perfect
heuristic.

Instead, insert all cfs_rq's that contain one or more enqueued
entities, or whose load is not completely decayed.
      
This can often lead to situations like the following for equally weighted
control groups:
      
        $ ps u -C stress
        USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
        root       10009 88.8  0.0   3676   100 pts/1    R+   11:04   0:13 stress --cpu 1
        root       10023  3.0  0.0   3676   104 pts/1    R+   11:04   0:00 stress --cpu 1
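
A sketch of the unthrottle-time condition described above (illustrative; the
cfs_rq_is_decayed() helper is assumed from the surrounding scheduler code):

  /* Put the cfs_rq back on the leaf list if it still has runnable
   * entities OR if its load has not fully decayed yet. */
  if (!cfs_rq_is_decayed(cfs_rq) || cfs_rq->nr_running)
          list_add_leaf_cfs_rq(cfs_rq);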
      
      Fixes: 31bc6aea ("sched/fair: Optimize update_blocked_averages()")
      [vingo: !SMP build fix]
Signed-off-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
      Link: https://lore.kernel.org/r/20210612112815.61678-1-odin@uged.al
      a7b359fc
    • Viresh Kumar's avatar
      Revert "cpufreq: CPPC: Add support for frequency invariance" · 771fac5e
      Viresh Kumar authored
      
      This reverts commit 4c38f2df.
      
There are a few races in the frequency invariance support for the CPPC
driver, namely the driver doesn't stop the kthread_work and irq_work on
policy exit during suspend/resume or CPU hotplug.

A proper fix won't be possible for the 5.13-rc, as it requires a lot of
changes. Let's revert the patch instead for now.
      
      Fixes: 4c38f2df ("cpufreq: CPPC: Add support for frequency invariance")
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      771fac5e
  8. Jun 10, 2021
  9. Jun 08, 2021
    • Liangyan's avatar
      tracing: Correct the length check which causes memory corruption · 3e08a9f9
      Liangyan authored
We've suffered from severe kernel crashes due to memory corruption in
our production environment, such as:
      
      Call Trace:
      [1640542.554277] general protection fault: 0000 [#1] SMP PTI
      [1640542.554856] CPU: 17 PID: 26996 Comm: python Kdump: loaded Tainted:G
      [1640542.556629] RIP: 0010:kmem_cache_alloc+0x90/0x190
      [1640542.559074] RSP: 0018:ffffb16faa597df8 EFLAGS: 00010286
      [1640542.559587] RAX: 0000000000000000 RBX: 0000000000400200 RCX:
      0000000006e931bf
      [1640542.560323] RDX: 0000000006e931be RSI: 0000000000400200 RDI:
      ffff9a45ff004300
      [1640542.560996] RBP: 0000000000400200 R08: 0000000000023420 R09:
      0000000000000000
      [1640542.561670] R10: 0000000000000000 R11: 0000000000000000 R12:
      ffffffff9a20608d
      [1640542.562366] R13: ffff9a45ff004300 R14: ffff9a45ff004300 R15:
      696c662f65636976
      [1640542.563128] FS:  00007f45d7c6f740(0000) GS:ffff9a45ff840000(0000)
      knlGS:0000000000000000
      [1640542.563937] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [1640542.564557] CR2: 00007f45d71311a0 CR3: 000000189d63e004 CR4:
      00000000003606e0
      [1640542.565279] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
      0000000000000000
      [1640542.566069] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
      0000000000000400
      [1640542.566742] Call Trace:
      [1640542.567009]  anon_vma_clone+0x5d/0x170
      [1640542.567417]  __split_vma+0x91/0x1a0
      [1640542.567777]  do_munmap+0x2c6/0x320
      [1640542.568128]  vm_munmap+0x54/0x70
      [1640542.569990]  __x64_sys_munmap+0x22/0x30
      [1640542.572005]  do_syscall_64+0x5b/0x1b0
      [1640542.573724]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [1640542.575642] RIP: 0033:0x7f45d6e61e27
      
      James Wang has reproduced it stably on the latest 4.19 LTS.
After some debugging, we finally proved that it's due to an ftrace
buffer out-of-bounds access, using a debug tool, as follows:
      [   86.775200] BUG: Out-of-bounds write at addr 0xffff88aefe8b7000
      [   86.780806]  no_context+0xdf/0x3c0
      [   86.784327]  __do_page_fault+0x252/0x470
      [   86.788367]  do_page_fault+0x32/0x140
      [   86.792145]  page_fault+0x1e/0x30
      [   86.795576]  strncpy_from_unsafe+0x66/0xb0
      [   86.799789]  fetch_memory_string+0x25/0x40
      [   86.804002]  fetch_deref_string+0x51/0x60
      [   86.808134]  kprobe_trace_func+0x32d/0x3a0
      [   86.812347]  kprobe_dispatcher+0x45/0x50
      [   86.816385]  kprobe_ftrace_handler+0x90/0xf0
      [   86.820779]  ftrace_ops_assist_func+0xa1/0x140
      [   86.825340]  0xffffffffc00750bf
      [   86.828603]  do_sys_open+0x5/0x1f0
      [   86.832124]  do_syscall_64+0x5b/0x1b0
      [   86.835900]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
Commit b220c049 ("tracing: Check length before giving out
the filter buffer") added a length check to protect against the trace data
overflow introduced in 0fc1b09f.  It seems that this fix can't prevent
overflow entirely: the length check should also take the size of
entry->array[0] into account, since array[0] is filled with the length of
the trace data, occupies additional space, and risks overflow.
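
In other words, the reservation in the temporary filter buffer has to leave
room for the extra length word as well; a hedged sketch of the check (the
surrounding code is elided, names approximate):

  /* entry->array[0] stores the payload length, so the space it occupies
   * must also be subtracted from the page-sized per-CPU buffer. */
  max_len = PAGE_SIZE - struct_size(entry, array, 1);
  if (len <= max_len) {
          entry->array[0] = len;  /* the length word uses the extra space */
          /* the payload is then written right after the length word */
  }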
      
      Link: https://lkml.kernel.org/r/20210607125734.1770447-1-liangyan.peng@linux.alibaba.com
      
      
      
      Cc: stable@vger.kernel.org
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Xunlei Pang <xlpang@linux.alibaba.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Fixes: b220c049 ("tracing: Check length before giving out the filter buffer")
Reviewed-by: Xunlei Pang <xlpang@linux.alibaba.com>
Reviewed-by: yinbinbin <yinbinbin@alibabacloud.com>
Reviewed-by: Wetp Zhang <wetp.zy@linux.alibaba.com>
Tested-by: James Wang <jnwang@linux.alibaba.com>
Signed-off-by: Liangyan <liangyan.peng@linux.alibaba.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      3e08a9f9
    • Steven Rostedt (VMware)'s avatar
      ftrace: Do not blindly read the ip address in ftrace_bug() · 6c14133d
      Steven Rostedt (VMware) authored
      It was reported that a bug on arm64 caused a bad ip address to be used for
      updating into a nop in ftrace_init(), but the error path (rightfully)
      returned -EINVAL and not -EFAULT, as the bug caused more than one error to
      occur. But because -EINVAL was returned, the ftrace_bug() tried to report
      what was at the location of the ip address, and read it directly. This
      caused the machine to panic, as the ip was not pointing to a valid memory
      address.
      
      Instead, read the ip address with copy_from_kernel_nofault() to safely
      access the memory, and if it faults, report that the address faulted,
      otherwise report what was in that location.
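
A sketch of that safe read (approximate; the exact messages and surrounding
error handling are illustrative):

  unsigned char ins[MCOUNT_INSN_SIZE];
  int i;

  /* Never dereference the ip directly in the error path; probe it. */
  if (copy_from_kernel_nofault(ins, (void *)ip, MCOUNT_INSN_SIZE)) {
          pr_cont(": failed to read ip %lx", ip);
  } else {
          pr_cont(": actual:");
          for (i = 0; i < MCOUNT_INSN_SIZE; i++)
                  pr_cont(" %02x", ins[i]);
  }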
      
      Link: https://lore.kernel.org/lkml/20210607032329.28671-1-mark-pk.tsai@mediatek.com/
      
      
      
      Cc: stable@vger.kernel.org
      Fixes: 05736a42 ("ftrace: warn on failure to disable mcount callers")
Reported-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      6c14133d
  10. Jun 03, 2021
  11. Jun 02, 2021
    • Daniel Borkmann's avatar
      bpf, lockdown, audit: Fix buggy SELinux lockdown permission checks · ff40e510
      Daniel Borkmann authored
Commit 59438b46 ("security,lockdown,selinux: implement SELinux lockdown")
added an implementation of the locked_down LSM hook to SELinux, with the aim
of restricting which domains are allowed to perform operations that would
breach lockdown. This indirectly also gets the audit subsystem involved in
reporting events. The latter is problematic, as reported by Ondrej and Serhei,
since it can bring down the whole system via audit:
      
        1) The audit events that are triggered due to calls to security_locked_down()
           can OOM kill a machine, see below details [0].
      
        2) It also seems to be causing a deadlock via avc_has_perm()/slow_avc_audit()
           when trying to wake up kauditd, for example, when using trace_sched_switch()
           tracepoint, see details in [1]. Triggering this was not via some hypothetical
           corner case, but with existing tools like runqlat & runqslower from bcc, for
           example, which make use of this tracepoint. Rough call sequence goes like:
      
           rq_lock(rq) -> -------------------------+
             trace_sched_switch() ->               |
               bpf_prog_xyz() ->                   +-> deadlock
                 selinux_lockdown() ->             |
                   audit_log_end() ->              |
                     wake_up_interruptible() ->    |
                       try_to_wake_up() ->         |
                         rq_lock(rq) --------------+
      
What's worse is that the intention of 59438b46 to further restrict lockdown
settings for specific applications with respect to the global lockdown policy
is completely broken for BPF. The SELinux policy rule for the current lockdown check
      looks something like this:
      
        allow <who> <who> : lockdown { <reason> };
      
However, this doesn't match the 'current' task where security_locked_down()
is executed. Example: httpd does a syscall. There is a tracing program attached
      to the syscall which triggers a BPF program to run, which ends up doing a
      bpf_probe_read_kernel{,_str}() helper call. The selinux_lockdown() hook does
      the permission check against 'current', that is, httpd in this example. httpd
      has literally zero relation to this tracing program, and it would be nonsensical
      having to write an SELinux policy rule against httpd to let the tracing helper
      pass. The policy in this case needs to be against the entity that is installing
      the BPF program. For example, if bpftrace would generate a histogram of syscall
      counts by user space application:
      
        bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'
      
      bpftrace would then go and generate a BPF program from this internally. One way
      of doing it [for the sake of the example] could be to call bpf_get_current_task()
      helper and then access current->comm via one of bpf_probe_read_kernel{,_str}()
      helpers. So the program itself has nothing to do with httpd or any other random
      app doing a syscall here. The BPF program _explicitly initiated_ the lockdown
      check. The allow/deny policy belongs in the context of bpftrace: meaning, you
      want to grant bpftrace access to use these helpers, but other tracers on the
      system like my_random_tracer _not_.
      
      Therefore fix all three issues at the same time by taking a completely different
      approach for the security_locked_down() hook, that is, move the check into the
      program verification phase where we actually retrieve the BPF func proto. This
      also reliably gets the task (current) that is trying to install the BPF tracing
      program, e.g. bpftrace/bcc/perf/systemtap/etc, and it also fixes the OOM since
      we're moving this out of the BPF helper's fast-path which can be called several
      millions of times per second.
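
Concretely, the check moves to where the tracing helper's func proto is
handed out, roughly like this sketch (close to, but not guaranteed to match,
the exact diff; shown as cases inside the tracing func-proto switch):

  case BPF_FUNC_probe_read_kernel:
          /* Lockdown is now checked once, at load/verification time,
           * in the context of the task installing the program. */
          return security_locked_down(LOCKDOWN_BPF_READ) < 0 ?
                 NULL : &bpf_probe_read_kernel_proto;
  case BPF_FUNC_probe_read_kernel_str:
          return security_locked_down(LOCKDOWN_BPF_READ) < 0 ?
                 NULL : &bpf_probe_read_kernel_str_proto;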
      
      The check is then also in line with other security_locked_down() hooks in the
      system where the enforcement is performed at open/load time, for example,
      open_kcore() for /proc/kcore access or module_sig_check() for module signatures
      just to pick few random ones. What's out of scope in the fix as well as in
      other security_locked_down() hook locations /outside/ of BPF subsystem is that
      if the lockdown policy changes on the fly there is no retrospective action.
      This requires a different discussion, potentially complex infrastructure, and
      it's also not clear whether this can be solved generically. Either way, it is
      out of scope for a suitable stable fix which this one is targeting. Note that
      the breakage is specifically on 59438b46 where it started to rely on 'current'
      as UAPI behavior, and _not_ earlier infrastructure such as 9d1f8be5 ("bpf:
      Restrict bpf when kernel lockdown is in confidentiality mode").
      
      [0] https://bugzilla.redhat.com/show_bug.cgi?id=1955585, Jakub Hrozek says:
      
        I starting seeing this with F-34. When I run a container that is traced with
        BPF to record the syscalls it is doing, auditd is flooded with messages like:
      
        type=AVC msg=audit(1619784520.593:282387): avc:  denied  { confidentiality }
          for pid=476 comm="auditd" lockdown_reason="use of bpf to read kernel RAM"
            scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0
              tclass=lockdown permissive=0
      
        This seems to be leading to auditd running out of space in the backlog buffer
        and eventually OOMs the machine.
      
        [...]
        auditd running at 99% CPU presumably processing all the messages, eventually I get:
        Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded
        Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded
        Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152579 > audit_backlog_limit=64
        Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152626 > audit_backlog_limit=64
        Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152694 > audit_backlog_limit=64
        Apr 30 12:20:42 fedora kernel: audit: audit_lost=6878426 audit_rate_limit=0 audit_backlog_limit=64
        Apr 30 12:20:45 fedora kernel: oci-seccomp-bpf invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-1000
        Apr 30 12:20:45 fedora kernel: CPU: 0 PID: 13284 Comm: oci-seccomp-bpf Not tainted 5.11.12-300.fc34.x86_64 #1
        Apr 30 12:20:45 fedora kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
        [...]
      
[1] https://lore.kernel.org/linux-audit/CANYvDQN7H5tVp47fbYcRasv4XF07eUbsDwT_eDCHXJUj43J7jQ@mail.gmail.com/,
    Serhei Makarov says:
      
        Upstream kernel 5.11.0-rc7 and later was found to deadlock during a
        bpf_probe_read_compat() call within a sched_switch tracepoint. The problem
        is reproducible with the reg_alloc3 testcase from SystemTap's BPF backend
        testsuite on x86_64 as well as the runqlat, runqslower tools from bcc on
        ppc64le. Example stack trace:
      
        [...]
        [  730.868702] stack backtrace:
        [  730.869590] CPU: 1 PID: 701 Comm: in:imjournal Not tainted, 5.12.0-0.rc2.20210309git144c79ef3353.166.fc35.x86_64 #1
        [  730.871605] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
        [  730.873278] Call Trace:
        [  730.873770]  dump_stack+0x7f/0xa1
        [  730.874433]  check_noncircular+0xdf/0x100
        [  730.875232]  __lock_acquire+0x1202/0x1e10
        [  730.876031]  ? __lock_acquire+0xfc0/0x1e10
        [  730.876844]  lock_acquire+0xc2/0x3a0
        [  730.877551]  ? __wake_up_common_lock+0x52/0x90
        [  730.878434]  ? lock_acquire+0xc2/0x3a0
        [  730.879186]  ? lock_is_held_type+0xa7/0x120
        [  730.880044]  ? skb_queue_tail+0x1b/0x50
        [  730.880800]  _raw_spin_lock_irqsave+0x4d/0x90
        [  730.881656]  ? __wake_up_common_lock+0x52/0x90
        [  730.882532]  __wake_up_common_lock+0x52/0x90
        [  730.883375]  audit_log_end+0x5b/0x100
        [  730.884104]  slow_avc_audit+0x69/0x90
        [  730.884836]  avc_has_perm+0x8b/0xb0
        [  730.885532]  selinux_lockdown+0xa5/0xd0
        [  730.886297]  security_locked_down+0x20/0x40
        [  730.887133]  bpf_probe_read_compat+0x66/0xd0
        [  730.887983]  bpf_prog_250599c5469ac7b5+0x10f/0x820
        [  730.888917]  trace_call_bpf+0xe9/0x240
        [  730.889672]  perf_trace_run_bpf_submit+0x4d/0xc0
        [  730.890579]  perf_trace_sched_switch+0x142/0x180
        [  730.891485]  ? __schedule+0x6d8/0xb20
        [  730.892209]  __schedule+0x6d8/0xb20
        [  730.892899]  schedule+0x5b/0xc0
        [  730.893522]  exit_to_user_mode_prepare+0x11d/0x240
        [  730.894457]  syscall_exit_to_user_mode+0x27/0x70
        [  730.895361]  entry_SYSCALL_64_after_hwframe+0x44/0xae
        [...]
      
      Fixes: 59438b46 ("security,lockdown,selinux: implement SELinux lockdown")
Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
Reported-by: Jakub Hrozek <jhrozek@redhat.com>
Reported-by: Serhei Makarov <smakarov@redhat.com>
Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: James Morris <jamorris@linux.microsoft.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Frank Eigler <fche@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: https://lore.kernel.org/bpf/01135120-8bf7-df2e-cff0-1d73f1f841c3@iogearbox.net
      ff40e510
  12. May 31, 2021
  13. May 29, 2021
  14. May 25, 2021
  15. May 24, 2021
  16. May 23, 2021
    • Petr Mladek's avatar
      watchdog: reliable handling of timestamps · 0f90b88d
      Petr Mladek authored
Commit 9bf3bc94 ("watchdog: cleanup handling of false positives")
tried to handle a virtual host stopped by the host in a more
straightforward and cleaner way.
      
      But it introduced a risk of false softlockup reports.  The virtual host
      might be stopped at any time, for example between
      kvm_check_and_clear_guest_paused() and is_softlockup().  As a result,
      is_softlockup() might read the updated jiffies and detects a softlockup.
      
      A solution might be to put back kvm_check_and_clear_guest_paused() after
      is_softlockup() and detect it.  But it would put back the cycle that
      complicates the logic.
      
      In fact, the handling of all the timestamps is not reliable.  The code
      does not guarantee when and how many times the timestamps are read.  For
      example, "period_ts" might be touched anytime also from NMI and re-read in
      is_softlockup().  It works just by chance.
      
      Fix all the problems by making the code even more explicit.
      
      1. Make sure that "now" and "period_ts" timestamps are read only once.
         They might be changed at anytime by NMI or when the virtual guest is
         stopped by the host.  Note that "now" timestamp does this implicitly
         because "jiffies" is marked volatile.
      
      2. "now" time must be read first.  The state of "period_ts" will
         decide whether it will be used or the period will get restarted.
      
      3. kvm_check_and_clear_guest_paused() must be called before reading
         "period_ts".  It touches the variable when the guest was stopped.
      
      As a result, "now" timestamp is used only when the watchdog was not
      touched and the guest not stopped in the meantime.  "period_ts" is
      restarted in all other situations.
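
A condensed sketch of that ordering inside the watchdog timer function
(names follow the watchdog code loosely; treat them as illustrative):

  unsigned long now, period_ts;

  /* 1. Read "now" first; it can be invalidated at any time by an NMI
   *    touching the watchdog or by the host stopping the guest. */
  now = get_timestamp();

  /* 2. Let KVM clear a "guest was paused" flag before period_ts is read;
   *    a paused guest must restart the period instead of reporting. */
  kvm_check_and_clear_guest_paused();

  /* 3. Read period_ts exactly once and decide from this single value
   *    whether "now" is used or the period gets restarted. */
  period_ts = READ_ONCE(*this_cpu_ptr(&watchdog_report_ts));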
      
      Link: https://lkml.kernel.org/r/YKT55gw+RZfyoFf7@alley
      
      
      Fixes: 9bf3bc94 ("watchdog: cleanup handling of false positives")
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0f90b88d
  17. May 20, 2021