  1. Dec 02, 2022
• debugobjects: Print object pointer in debug_print_object() · c4db2d3b
      Stephen Boyd authored
      
      Delayed kobject debugging (CONFIG_DEBUG_KOBJECT_RELEASE) prints the kobject
      pointer that's being released in kobject_release() before scheduling a
      randomly delayed work to do the actual release work.
      
      If the caller of kobject_put() frees the kobject upon return then this will
      typically emit a debugobject warning about freeing an active timer.
      
      Usually the release function is the function that does the kfree() of the
      struct containing the kobject.
      
      For example the following print is seen
      
       kobject: 'queue' (ffff888114236190): kobject_release, parent 0000000000000000 (delayed 1000)
       ------------[ cut here ]------------
       ODEBUG: free active (active state 0) object type: timer_list hint: kobject_delayed_cleanup+0x0/0x390
      
but the kobject printk cannot be matched with the debug object printk,
because any number of kobjects could have been released around that
time. The random delay of the cleanup work doesn't help either.
      
Print the address of the tracked object to help figure out which kobject
is the problem. Note that %p rather than %px is used here, to match the
other %p usage in debugobjects debugging; because of the %p usage, pointer
hashing has to be disabled to correlate the two pointer printks.
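
As a rough sketch (simplified from the actual function; the exact format
string in the patch may differ), the warning ends up carrying the tracked
pointer alongside the existing fields:

  /* simplified sketch of debug_print_object(), not the verbatim patch */
  static void debug_print_object(struct debug_obj *obj, char *msg)
  {
          const struct debug_obj_descr *descr = obj->descr;

          pr_warn("ODEBUG: %s %s (active state %u) object: %p object type: %s\n",
                  msg, obj_states[obj->state], obj->astate,
                  obj->object, descr->name);
  }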
      
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20220519202201.2348343-1-swboyd@chromium.org
  2. Nov 15, 2022
  3. May 13, 2022
  4. Aug 13, 2021
  5. Oct 01, 2020
  6. Sep 24, 2020
  7. Jul 17, 2020
  8. Jan 17, 2020
• debugobjects: Fix various data races · 35fd7a63
      Marco Elver authored
      
The counters obj_pool_free and obj_nr_tofree, and the flag obj_freeing,
are read locklessly outside the pool_lock critical sections. If read with
plain accesses, this would result in data races.
      
      This is addressed as follows:
      
       * reads outside critical sections become READ_ONCE()s (pairing with
         WRITE_ONCE()s added);
      
       * writes become WRITE_ONCE()s (pairing with READ_ONCE()s added); since
         writes happen inside critical sections, only the write and not the read
         of RMWs needs to be atomic, thus WRITE_ONCE(var, var +/- X) is
         sufficient.
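
Schematically (variable names as used in lib/debugobjects.c; this is an
illustrative pattern, not the full diff):

  /* writer side: runs under pool_lock, so only the store is marked */
  raw_spin_lock_irqsave(&pool_lock, flags);
  WRITE_ONCE(obj_pool_free, obj_pool_free - 1);
  raw_spin_unlock_irqrestore(&pool_lock, flags);

  /* reader side: lockless check outside the critical section */
  if (READ_ONCE(obj_pool_free) < debug_objects_pool_min_level)
          fill_pool();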
      
      The data races were reported by KCSAN:
      
        BUG: KCSAN: data-race in __free_object / fill_pool
      
        write to 0xffffffff8beb04f8 of 4 bytes by interrupt on cpu 1:
         __free_object+0x1ee/0x8e0 lib/debugobjects.c:404
         __debug_check_no_obj_freed+0x199/0x330 lib/debugobjects.c:969
         debug_check_no_obj_freed+0x3c/0x44 lib/debugobjects.c:994
         slab_free_hook mm/slub.c:1422 [inline]
      
        read to 0xffffffff8beb04f8 of 4 bytes by task 1 on cpu 2:
         fill_pool+0x3d/0x520 lib/debugobjects.c:135
         __debug_object_init+0x3c/0x810 lib/debugobjects.c:536
         debug_object_init lib/debugobjects.c:591 [inline]
         debug_object_activate+0x228/0x320 lib/debugobjects.c:677
         debug_rcu_head_queue kernel/rcu/rcu.h:176 [inline]
      
        BUG: KCSAN: data-race in __debug_object_init / fill_pool
      
        read to 0xffffffff8beb04f8 of 4 bytes by task 10 on cpu 6:
         fill_pool+0x3d/0x520 lib/debugobjects.c:135
         __debug_object_init+0x3c/0x810 lib/debugobjects.c:536
         debug_object_init_on_stack+0x39/0x50 lib/debugobjects.c:606
         init_timer_on_stack_key kernel/time/timer.c:742 [inline]
      
        write to 0xffffffff8beb04f8 of 4 bytes by task 1 on cpu 3:
         alloc_object lib/debugobjects.c:258 [inline]
         __debug_object_init+0x717/0x810 lib/debugobjects.c:544
         debug_object_init lib/debugobjects.c:591 [inline]
         debug_object_activate+0x228/0x320 lib/debugobjects.c:677
         debug_rcu_head_queue kernel/rcu/rcu.h:176 [inline]
      
        BUG: KCSAN: data-race in free_obj_work / free_object
      
        read to 0xffffffff9140c190 of 4 bytes by task 10 on cpu 6:
         free_object+0x4b/0xd0 lib/debugobjects.c:426
         debug_object_free+0x190/0x210 lib/debugobjects.c:824
         destroy_timer_on_stack kernel/time/timer.c:749 [inline]
      
        write to 0xffffffff9140c190 of 4 bytes by task 93 on cpu 1:
         free_obj_work+0x24f/0x480 lib/debugobjects.c:313
         process_one_work+0x454/0x8d0 kernel/workqueue.c:2264
         worker_thread+0x9a/0x780 kernel/workqueue.c:2410
      
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200116185529.11026-1-elver@google.com
  9. Jun 14, 2019
• debugobjects: Move printk out of db->lock critical sections · d5f34153
      Waiman Long authored
      
db->lock is a raw spinlock, so the lock hold time is supposed to be
short. That is not the case when printk() is invoked inside one of the
critical sections. To avoid long hold times when messages need to be
printed, move the debug_object_is_on_stack() and debug_print_object()
calls out of those critical sections.
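
The pattern of the change is to record the decision under the lock and do
the printing after the lock is dropped, roughly (a schematic sketch, not
the actual diff):

  bool print_object = false;
  unsigned long flags;

  raw_spin_lock_irqsave(&db->lock, flags);
  obj = lookup_object(addr, db);
  if (!obj)
          print_object = true;            /* only record the decision */
  raw_spin_unlock_irqrestore(&db->lock, flags);

  if (print_object) {
          /* printing happens after the raw spinlock is released */
          struct debug_obj o = { .object = addr,
                                 .state  = ODEBUG_STATE_NOTAVAILABLE,
                                 .descr  = descr };
          debug_print_object(&o, "free");
  }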
      
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Yang Shi <yang.shi@linux.alibaba.com>
      Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
      Cc: Qian Cai <cai@gmx.us>
      Cc: Zhong Jiang <zhongjiang@huawei.com>
      Link: https://lkml.kernel.org/r/20190520141450.7575-6-longman@redhat.com
• debugobjects: Less aggressive freeing of excess debug objects · a7344a68
      Waiman Long authored
      
      After a system bootup and 3 parallel kernel builds, a partial output
      of the debug objects stats file was:
      
      pool_free     :5101
      pool_pcp_free :4181
      pool_min_free :220
      pool_used     :104172
      pool_max_used :171920
      on_free_list  :0
      objs_allocated:39268280
      objs_freed    :39160031
      
More than 39 million debug objects had been allocated and then freed since
boot, while pool_max_used was only about 172k. That is a lot of extra
overhead from freeing and allocating objects through the slab allocator,
and it may also cause the slabs to become more fragmented and harder to
reclaim.
      
      Make the freeing of excess debug objects less aggressive by freeing them at
      a maximum frequency of 10Hz and about 1k objects at each round of freeing.
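
In outline (constant and symbol names here are illustrative; the patch may
spell them differently), the free worker becomes a delayed work item that
frees a bounded batch per round and re-arms itself:

  #define ODEBUG_FREE_WORK_MAX    1024                    /* ~1k objects per round */
  #define ODEBUG_FREE_WORK_DELAY  DIV_ROUND_UP(HZ, 10)    /* at most 10Hz */

  static void free_obj_work(struct work_struct *work)
  {
          /* ... move at most ODEBUG_FREE_WORK_MAX objects off the global
           * free list and kmem_cache_free() them ... */

          /* more objects left? come back later instead of draining now */
          if (obj_freeing)
                  schedule_delayed_work(&debug_obj_work, ODEBUG_FREE_WORK_DELAY);
  }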
      
      With that change applied, the partial output of the debug objects stats
      file after similar actions became:
      
      pool_free     :5901
      pool_pcp_free :3742
      pool_min_free :1022
      pool_used     :104805
      pool_max_used :168081
      on_free_list  :0
      objs_allocated:5796864
      objs_freed    :5687182
      
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Yang Shi <yang.shi@linux.alibaba.com>
      Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
      Cc: Qian Cai <cai@gmx.us>
      Cc: Zhong Jiang <zhongjiang@huawei.com>
      Link: https://lkml.kernel.org/r/20190520141450.7575-5-longman@redhat.com
• debugobjects: Reduce number of pool_lock acquisitions in fill_pool() · d26bf505
      Waiman Long authored
      
      In fill_pool(), the pool_lock is acquired and then released once per debug
      object. If many objects are to be filled, the constant lock and unlock
      operations are extra overhead.
      
      To reduce the overhead, batch them up and do an allocation of 4 objects per
      lock/unlock sequence.
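
A sketch of the batched refill (ODEBUG_BATCH_SIZE and the surrounding
names follow the changelog; details of the real patch may differ):

  #define ODEBUG_BATCH_SIZE 4

  static void fill_pool(void)
  {
          gfp_t gfp = GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN;
          struct debug_obj *new[ODEBUG_BATCH_SIZE], *obj;
          unsigned long flags;

          while (obj_pool_free < debug_objects_pool_min_level) {
                  int cnt;

                  /* allocate a whole batch without holding pool_lock */
                  for (cnt = 0; cnt < ODEBUG_BATCH_SIZE; cnt++) {
                          new[cnt] = kmem_cache_zalloc(obj_cache, gfp);
                          if (!new[cnt])
                                  break;
                  }
                  if (!cnt)
                          return;

                  /* one lock/unlock sequence per batch of four objects */
                  raw_spin_lock_irqsave(&pool_lock, flags);
                  while (cnt) {
                          obj = new[--cnt];
                          hlist_add_head(&obj->node, &obj_pool);
                          obj_pool_free++;
                  }
                  raw_spin_unlock_irqrestore(&pool_lock, flags);
          }
  }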
      
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Yang Shi <yang.shi@linux.alibaba.com>
      Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
      Cc: Qian Cai <cai@gmx.us>
      Cc: Zhong Jiang <zhongjiang@huawei.com>
      Link: https://lkml.kernel.org/r/20190520141450.7575-4-longman@redhat.com
• debugobjects: Percpu pool lookahead freeing/allocation · 634d61f4
      Waiman Long authored
      
Most workloads allocate a bunch of memory objects, work on them and then
free all or most of them. So just having a percpu free pool may not reduce
pool_lock contention significantly if a large number of objects are in
use.
      
      To help those situations, we are now doing lookahead allocation and
      freeing of the debug objects into and out of the percpu free pool. This
      will hopefully reduce the number of times the pool_lock needs to be
      taken and hence its contention level.
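
A sketch of the allocation-side lookahead (field and constant names follow
the percpu-pool patch below; illustrative, not the exact diff):

  /* in alloc_object(): if the per-CPU pool is empty, refill a whole
   * batch from the global pool under a single pool_lock hold */
  percpu_pool = this_cpu_ptr(&percpu_obj_pool);
  if (obj_cache && !percpu_pool->obj_free) {
          raw_spin_lock(&pool_lock);
          for (i = 0; i < ODEBUG_BATCH_SIZE && obj_pool_free; i++) {
                  obj = hlist_entry(obj_pool.first, typeof(*obj), node);
                  hlist_del(&obj->node);
                  obj_pool_free--;
                  hlist_add_head(&obj->node, &percpu_pool->free_objs);
                  percpu_pool->obj_free++;
          }
          raw_spin_unlock(&pool_lock);
  }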
      
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Yang Shi <yang.shi@linux.alibaba.com>
      Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
      Cc: Qian Cai <cai@gmx.us>
      Cc: Zhong Jiang <zhongjiang@huawei.com>
      Link: https://lkml.kernel.org/r/20190520141450.7575-3-longman@redhat.com
• debugobjects: Add percpu free pools · d86998b1
      Waiman Long authored
      
      When a multi-threaded workload does a lot of small memory object
      allocations and deallocations, it may cause the allocation and freeing of
      many debug objects. This will make the global pool_lock a bottleneck in the
      performance of the workload.  Since interrupts are disabled when acquiring
      the pool_lock, it may even cause hard lockups to happen.
      
      To reduce contention of the global pool_lock, add a percpu debug object
      free pool that can be used to buffer some of the debug object allocation
      and freeing requests without acquiring the pool_lock.  Each CPU will now
      have a percpu free pool that can hold up to a maximum of 64 debug
      objects. Allocation and freeing requests will go to the percpu free pool
      first. If that fails, the pool_lock will be taken and the global free pool
      will be used.
      
      The presence or absence of obj_cache is used as a marker to see if the
      percpu cache should be used.
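
Roughly, the new per-CPU pool and its fast path look like this (a sketch;
the helper name pcpu_alloc() and the exact checks are illustrative):

  struct debug_percpu_free {
          struct hlist_head free_objs;
          int               obj_free;
  };

  #define ODEBUG_POOL_PERCPU_SIZE 64      /* max objects held per CPU */

  static DEFINE_PER_CPU(struct debug_percpu_free, percpu_obj_pool);

  /* fast path used by alloc_object(): obj_cache being set up marks the
   * percpu pool as usable; no global pool_lock is taken on this path */
  static struct debug_obj *pcpu_alloc(void)
  {
          struct debug_percpu_free *percpu_pool = this_cpu_ptr(&percpu_obj_pool);
          struct debug_obj *obj;

          if (!obj_cache || !percpu_pool->obj_free)
                  return NULL;

          obj = hlist_entry(percpu_pool->free_objs.first, typeof(*obj), node);
          hlist_del(&obj->node);
          percpu_pool->obj_free--;
          return obj;
  }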
      
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Yang Shi <yang.shi@linux.alibaba.com>
      Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
      Cc: Qian Cai <cai@gmx.us>
      Cc: Zhong Jiang <zhongjiang@huawei.com>
      Link: https://lkml.kernel.org/r/20190520141450.7575-2-longman@redhat.com
• debugobjects: No need to check return value of debugfs_create() · fecb0d95
      Greg Kroah-Hartman authored
      
      When calling debugfs functions, there is no need to ever check the
      return value.  The function can work or not, but the code logic should
      never do something different based on this.
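
After the cleanup, the init code simply calls into debugfs and moves on,
along these lines (a sketch based on the post-patch shape of
debug_objects_init_debugfs()):

  static int __init debug_objects_init_debugfs(void)
  {
          struct dentry *dbgdir;

          if (!debug_objects_enabled)
                  return 0;

          /* no error checks: if debugfs is unavailable these are no-ops */
          dbgdir = debugfs_create_dir("debug_objects", NULL);
          debugfs_create_file("stats", 0444, dbgdir, NULL, &debug_stats_fops);

          return 0;
  }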
      
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Qian Cai <cai@gmx.us>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Waiman Long <longman@redhat.com>
      Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
      Cc: Zhong Jiang <zhongjiang@huawei.com>
      Link: https://lkml.kernel.org/r/20190612153513.GA21082@kroah.com
  10. Dec 28, 2018
• debugobjects: call debug_objects_mem_init eariler · a9ee3a63
      Qian Cai authored
The current value of the early boot static pool size, 1024, is not big
enough for systems with a large number of CPUs when timer and/or workqueue
objects are selected. As a result, systems with 60+ CPUs and both timer
and workqueue objects enabled can trigger "ODEBUG: Out of memory.
ODEBUG disabled".

Some debug objects are allocated during early boot. Enabling options like
timer or workqueue objects may increase the size required significantly on
systems with a large number of CPUs. For example,
      
      CONFIG_DEBUG_OBJECTS_TIMERS:
      No. CPUs x 2 (worker pool) objects:
      start_kernel
        workqueue_init_early
          init_worker_pool
            init_timer_key
              debug_object_init
      
      plus No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
      sched_init
        hrtick_rq_init
          hrtimer_init
      
      CONFIG_DEBUG_OBJECTS_WORK:
      No. CPUs objects:
      vmalloc_init
        __init_work
      
      plus No. CPUs x 6 (workqueue) objects:
      workqueue_init_early
        alloc_workqueue
          __alloc_workqueue_key
            alloc_and_link_pwqs
              init_pwq
      
      Also, plus No. CPUs objects:
      perf_event_init
        __init_srcu_struct
          init_srcu_struct_fields
            init_srcu_struct_nodes
              __init_work
      
However, none of these objects are actually used or required before
debug_objects_mem_init() is invoked, so just move the call to right before
vmalloc_init().
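
Per the changelog, the resulting ordering in init/main.c is simply
(illustrative excerpt):

  /* ... earlier early-boot setup ... */
  debug_objects_mem_init();   /* switch from the static pool to the slab cache */
  vmalloc_init();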
      
      According to tglx, "the reason why the call is at this place in
      start_kernel() is historical.  It's because back in the days when
      debugobjects were added the memory allocator was enabled way later than
      today."
      
      Link: http://lkml.kernel.org/r/20181126102407.1836-1-cai@gmx.us
      
      
Signed-off-by: Qian Cai <cai@gmx.us>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Waiman Long <longman@redhat.com>
      Cc: Yang Shi <yang.shi@linux.alibaba.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. Nov 30, 2018
  12. Aug 02, 2018
  13. Jul 30, 2018
  14. Mar 14, 2018
  15. Feb 22, 2018
  16. Feb 13, 2018
• debugobjects: Use global free list in __debug_check_no_obj_freed() · 1ea9b98b
      Yang Shi authored
      
__debug_check_no_obj_freed() iterates over the to-be-freed memory region
in chunks and walks the corresponding hash bucket list for each chunk.
This can add up to hundreds of thousands of checked objects. In the worst
case this can trigger the soft lockup detector:
      
      NMI watchdog: BUG: soft lockup - CPU#15 stuck for 22s!
      CPU: 15 PID: 110342 Comm: stress-ng-getde
      Call Trace:
        [<ffffffff8141177e>] debug_check_no_obj_freed+0x13e/0x220
        [<ffffffff811f8751>] __free_pages_ok+0x1f1/0x5c0
        [<ffffffff811fa785>] __free_pages+0x25/0x40
        [<ffffffff812638db>] __free_slab+0x19b/0x270
        [<ffffffff812639e9>] discard_slab+0x39/0x50
        [<ffffffff812679f7>] __slab_free+0x207/0x270
        [<ffffffff81269966>] ___cache_free+0xa6/0xb0
        [<ffffffff8126c267>] qlist_free_all+0x47/0x80
        [<ffffffff8126c5a9>] quarantine_reduce+0x159/0x190
        [<ffffffff8126b3bf>] kasan_kmalloc+0xaf/0xc0
        [<ffffffff8126b8a2>] kasan_slab_alloc+0x12/0x20
        [<ffffffff81265e8a>] kmem_cache_alloc+0xfa/0x360
        [<ffffffff812abc8f>] ? getname_flags+0x4f/0x1f0
        [<ffffffff812abc8f>] getname_flags+0x4f/0x1f0
        [<ffffffff812abe42>] getname+0x12/0x20
        [<ffffffff81298da9>] do_sys_open+0xf9/0x210
        [<ffffffff81298ede>] SyS_open+0x1e/0x20
        [<ffffffff817d6e01>] entry_SYSCALL_64_fastpath+0x1f/0xc2
      
      The code path might be called in either atomic or non-atomic context, but
      in_atomic() can't tell if the current context is atomic or not on a
      PREEMPT=n kernel, so cond_resched() can't be used to prevent the
      softlockup.
      
      Utilize the global free list to shorten the loop execution time.
      
      [ tglx: Massaged changelog ]
      
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: longman@redhat.com
      Link: https://lkml.kernel.org/r/1517872708-24207-5-git-send-email-yang.shi@linux.alibaba.com
• debugobjects: Use global free list in free_object() · 636e1970
      Yang Shi authored
      
The newly added global free list makes it possible to avoid lengthy pool
list iterations in free_obj_work() by putting objects either on the pool
list, when the fill level of the pool is below the maximum, or otherwise
directly on the global free list.

As the pool is now guaranteed to never exceed the maximum fill level, the
batch removal from the pool list in free_obj_work() can be removed.
      
      Split free_object() into two parts, so the actual queueing function can be
      reused without invoking schedule_work() on every invocation.
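
The resulting split looks roughly like this (a condensed sketch of the
queueing helper and its wrapper; the real function also adjusts usage
counters):

  static bool __free_object(struct debug_obj *obj)
  {
          unsigned long flags;
          bool work;

          raw_spin_lock_irqsave(&pool_lock, flags);
          work = (obj_pool_free > debug_objects_pool_size) && obj_cache;
          if (work) {
                  /* pool is full: queue on the global free list */
                  hlist_add_head(&obj->node, &obj_to_free);
                  obj_nr_tofree++;
          } else {
                  /* pool has room: put the object back into the pool */
                  hlist_add_head(&obj->node, &obj_pool);
                  obj_pool_free++;
          }
          raw_spin_unlock_irqrestore(&pool_lock, flags);
          return work;
  }

  static void free_object(struct debug_obj *obj)
  {
          if (__free_object(obj))
                  schedule_work(&debug_obj_work);
  }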
      
      [ tglx: Remove the batch removal from pool list and massage changelog ]
      
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: longman@redhat.com
      Link: https://lkml.kernel.org/r/1517872708-24207-4-git-send-email-yang.shi@linux.alibaba.com
• debugobjects: Add global free list and the counter · 36c4ead6
      Yang Shi authored
      
      free_object() adds objects to the pool list and schedules work when the
      pool list is larger than the pool size.  The worker handles the actual
      kfree() of the object by iterating the pool list until the pool size is
      below the maximum pool size again.
      
To iterate the pool list, pool_lock has to be held and the objects which
should be freed need to be put into temporary storage so pool_lock can be
dropped for the actual kmem_cache_free() invocation. That's a pointless
and expensive exercise if there is a large number of objects to free.

In such a case it's better to evaluate the fill level of the pool in
free_object() and queue the object to be freed either on the pool list or,
if the pool is full, on a separate global free list.
      
      The worker can then do the following simpler operation:
      
- Move objects back from the global free list to the pool list if the
  pool list is no longer full.
      
        - Remove the remaining objects in a single list move operation from the
          global free list and do the kmem_cache_free() operation lockless from
          the temporary list head.
      
      In fill_pool() the global free list is checked as well to avoid real
      allocations from the kmem cache.
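
The worker then boils down to something like the following (a sketch; the
pool-refill step is elided):

  static void free_obj_work(struct work_struct *work)
  {
          struct hlist_node *tmp;
          struct debug_obj *obj;
          unsigned long flags;
          HLIST_HEAD(tofree);

          if (!raw_spin_trylock_irqsave(&pool_lock, flags))
                  return;

          /* (1) refill obj_pool from obj_to_free while below the maximum ... */

          /* (2) move the rest off the global free list in one operation and
           *     free it without holding pool_lock */
          if (obj_nr_tofree) {
                  hlist_move_list(&obj_to_free, &tofree);
                  obj_nr_tofree = 0;
          }
          raw_spin_unlock_irqrestore(&pool_lock, flags);

          hlist_for_each_entry_safe(obj, tmp, &tofree, node) {
                  hlist_del(&obj->node);
                  kmem_cache_free(obj_cache, obj);
          }
  }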
      
Add the necessary list head and a counter for the number of objects on the
global free list and export that counter via debugfs:
      
      max_chain     :79
      max_loops     :8147
      warnings      :0
      fixups        :0
      pool_free     :1697
      pool_min_free :346
      pool_used     :15356
      pool_max_used :23933
      on_free_list  :39
      objs_allocated:32617
      objs_freed    :16588
      
      Nothing queues objects on the global free list yet. This happens in a
      follow up change.
      
      [ tglx: Simplified implementation and massaged changelog ]
      
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: longman@redhat.com
      Link: https://lkml.kernel.org/r/1517872708-24207-3-git-send-email-yang.shi@linux.alibaba.com
• debugobjects: Export max loops counter · bd9dcd04
      Yang Shi authored
      
__debug_check_no_obj_freed() can be an expensive operation depending on
the size of the memory freed. The maximum chain walk length is already
exported via debugfs, but that only records the maximum for a single
memory chunk.

There is, however, no information about the total number of objects
inspected during a __debug_check_no_obj_freed() operation, which can be
significantly larger when a huge memory region is freed.
      
Aggregate the number of objects inspected for a single invocation of
__debug_check_no_obj_freed() and export it via debugfs.
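
In the stats file this amounts to one extra counter next to the existing
max_chain (names per the changelog after the rename to max_checked):

  /* debug_stats_show(): one aggregated counter per __debug_check_no_obj_freed() call */
  seq_printf(m, "max_chain     :%d\n", debug_objects_maxchain);
  seq_printf(m, "max_checked   :%d\n", debug_objects_maxchecked);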
      
      The resulting output of /sys/kernel/debug/debug_objects/stats looks like:
      
      max_chain     :121
      max_checked   :543267
      warnings      :0
      fixups        :0
      pool_free     :1764
      pool_min_free :341
      pool_used     :86438
      pool_max_used :268887
      objs_allocated:6068254
      objs_freed    :5981076
      
      [ tglx: Renamed the variable to max_checked and adjusted changelog ]
      
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: longman@redhat.com
      Link: https://lkml.kernel.org/r/1517872708-24207-2-git-send-email-yang.shi@linux.alibaba.com
  17. Aug 14, 2017
• debugobjects: Make kmemleak ignore debug objects · caba4cbb
      Waiman Long authored
      
The allocated debug objects are either on the free list or in the hashed
bucket lists, so they won't get lost. However, if both debug objects and
kmemleak are enabled and a kmemleak scan runs while some of the debug
objects are transitioning from one list to the other, false positive
memory leak reports may be emitted for those objects. For example:
      
      [38687.275678] kmemleak: 12 new suspected memory leaks (see
      /sys/kernel/debug/kmemleak)
      unreferenced object 0xffff92e98aabeb68 (size 40):
        comm "ksmtuned", pid 4344, jiffies 4298403600 (age 906.430s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 d0 bc db 92 e9 92 ff ff  ................
          01 00 00 00 00 00 00 00 38 36 8a 61 e9 92 ff ff  ........86.a....
        backtrace:
          [<ffffffff8fa5378a>] kmemleak_alloc+0x4a/0xa0
          [<ffffffff8f47c019>] kmem_cache_alloc+0xe9/0x320
          [<ffffffff8f62ed96>] __debug_object_init+0x3e6/0x400
          [<ffffffff8f62ef01>] debug_object_activate+0x131/0x210
          [<ffffffff8f330d9f>] __call_rcu+0x3f/0x400
          [<ffffffff8f33117d>] call_rcu_sched+0x1d/0x20
          [<ffffffff8f4a183c>] put_object+0x2c/0x40
          [<ffffffff8f4a188c>] __delete_object+0x3c/0x50
          [<ffffffff8f4a18bd>] delete_object_full+0x1d/0x20
          [<ffffffff8fa535c2>] kmemleak_free+0x32/0x80
          [<ffffffff8f47af07>] kmem_cache_free+0x77/0x350
          [<ffffffff8f453912>] unlink_anon_vmas+0x82/0x1e0
          [<ffffffff8f440341>] free_pgtables+0xa1/0x110
          [<ffffffff8f44af91>] exit_mmap+0xc1/0x170
          [<ffffffff8f29db60>] mmput+0x80/0x150
          [<ffffffff8f2a7609>] do_exit+0x2a9/0xd20
      
The references held in the debug objects may also hide a real memory leak.

As there is no point in having kmemleak track debug object allocations,
kmemleak checking is now disabled for debug objects.
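
Concretely, the fix amounts to marking each freshly allocated debug object
as ignored by kmemleak (kmemleak_ignore() from <linux/kmemleak.h>),
roughly:

  /* in fill_pool(), right after the slab allocation */
  obj = kmem_cache_zalloc(obj_cache, gfp);
  if (!obj)
          return;
  /* debug objects live on internal lists only; kmemleak need not track them */
  kmemleak_ignore(obj);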
      
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1502718733-8527-1-git-send-email-longman@redhat.com
  18. Mar 02, 2017
  19. Feb 10, 2017
  20. Feb 05, 2017
• debugobjects: Reduce contention on the global pool_lock · 858274b6
      Waiman Long authored
      
      On a large SMP system with many CPUs, the global pool_lock may become
      a performance bottleneck as all the CPUs that need to allocate or
      free debug objects have to take the lock. That can sometimes cause
      soft lockups like:
      
       NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [rcuos/1:21]
       ...
       RIP: 0010:[<ffffffff817c216b>]  [<ffffffff817c216b>]
      	_raw_spin_unlock_irqrestore+0x3b/0x60
       ...
       Call Trace:
        [<ffffffff813f40d1>] free_object+0x81/0xb0
        [<ffffffff813f4f33>] debug_check_no_obj_freed+0x193/0x220
        [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
        [<ffffffff81284996>] ? file_free_rcu+0x36/0x60
        [<ffffffff81251712>] kmem_cache_free+0xd2/0x380
        [<ffffffff81284960>] ? fput+0x90/0x90
        [<ffffffff81284996>] file_free_rcu+0x36/0x60
        [<ffffffff81124c23>] rcu_nocb_kthread+0x1b3/0x550
        [<ffffffff81124b71>] ? rcu_nocb_kthread+0x101/0x550
        [<ffffffff81124a70>] ? sync_exp_work_done.constprop.63+0x50/0x50
        [<ffffffff810c59d1>] kthread+0x101/0x120
        [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
        [<ffffffff817c2d32>] ret_from_fork+0x22/0x50
      
To reduce contention on the pool_lock, the actual kmem_cache_free() of the
debug objects is delayed if the pool_lock is busy. This temporarily
increases the number of free objects available in the free pool while the
system is busy. As a result, the number of kmem_cache allocations and
frees is reduced.

To further reduce the lock operations, free the debug objects in batches
of four.
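
A sketch of the resulting worker (batch size and structure per the
changelog; details of the actual patch may differ):

  #define ODEBUG_FREE_BATCH 4

  static void free_obj_work(struct work_struct *work)
  {
          struct debug_obj *objs[ODEBUG_FREE_BATCH];
          unsigned long flags;
          int i;

          /* pool_lock busy? skip this round, the objects stay in the pool */
          if (!raw_spin_trylock_irqsave(&pool_lock, flags))
                  return;

          while (obj_pool_free >= debug_objects_pool_size + ODEBUG_FREE_BATCH) {
                  for (i = 0; i < ODEBUG_FREE_BATCH; i++) {
                          objs[i] = hlist_entry(obj_pool.first, typeof(*objs[0]), node);
                          hlist_del(&objs[i]->node);
                  }
                  obj_pool_free -= ODEBUG_FREE_BATCH;
                  /* drop the lock while calling into the slab allocator */
                  raw_spin_unlock_irqrestore(&pool_lock, flags);
                  for (i = 0; i < ODEBUG_FREE_BATCH; i++)
                          kmem_cache_free(obj_cache, objs[i]);
                  if (!raw_spin_trylock_irqsave(&pool_lock, flags))
                          return;
          }
          raw_spin_unlock_irqrestore(&pool_lock, flags);
  }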
      
Signed-off-by: Waiman Long <longman@redhat.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: "Du Changbin" <changbin.du@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jan Stancek <jstancek@redhat.com>
      Link: http://lkml.kernel.org/r/1483647425-4135-4-git-send-email-longman@redhat.com
      
      
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  21. Feb 04, 2017
  22. Dec 01, 2016
  23. Sep 17, 2016
  24. May 20, 2016
• debugobjects: insulate non-fixup logic related to static obj from fixup callbacks · b9fdac7f
      Changbin Du authored
When activating a static object we need to make sure that the object is
tracked in the object tracker. If it is a non-static object, then the
activation is illegal.

In the previous implementation, each subsystem had to take care of this in
its fixup callbacks. This can be done in the debugobjects core instead,
which removes duplicated code and leaves *pure* fixup callbacks.

To achieve this, a new callback "is_static_object" is introduced to let
the type-specific code decide whether an object is static or not. If it
is, the object is taken into the object tracker; otherwise a warning is
emitted and the fixup callback is invoked.
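
For illustration, a descriptor then looks something like this (modeled on
the workqueue debug descriptor; treat names outside the changelog as
examples):

  static bool work_is_static_object(void *addr)
  {
          struct work_struct *work = addr;

          return test_bit(WORK_STRUCT_STATIC_BIT, work_data_bits(work));
  }

  static struct debug_obj_descr work_debug_descr = {
          .name             = "work_struct",
          .is_static_object = work_is_static_object,
          .fixup_init       = work_fixup_init,
          .fixup_free       = work_fixup_free,
  };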
      
This change has passed the debugobjects selftest, and I also did some
testing with all debugobjects support enabled.

Finally, a remaining concern about the fixups: can they modify an object
that is in an incorrect state on fixup? Because 'addr' may not point to
any valid object if a non-static object is not tracked, changing such an
object can overwrite someone else's memory and cause unexpected behaviour.
For example, timer_fixup_activate binds the timer to the function
stub_timer.
      
      Link: http://lkml.kernel.org/r/1462576157-14539-1-git-send-email-changbin.du@intel.com
      [changbin.du@intel.com: improve code comments where invoke the new is_static_object callback]
        Link: http://lkml.kernel.org/r/1462777431-8171-1-git-send-email-changbin.du@intel.com
      
      
Signed-off-by: Du, Changbin <changbin.du@intel.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Triplett <josh@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
• debugobjects: correct the usage of fixup call results · e7a8e78b
      Changbin Du authored
      
debug_object_fixup() should see a non-zero return value when the problem
has been fixed. But the code got it backwards: it takes 0 to mean the
fixup was successful. Fix it.
      
Signed-off-by: Du, Changbin <changbin.du@intel.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Triplett <josh@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
• debugobjects: make fixup functions return bool instead of int · b1e4d9d8
      Changbin Du authored
      
I am going to introduce the debugobjects infrastructure to the USB
subsystem. But before doing that, I found that the debugobjects code could
be improved. This patchset makes the fixup functions return bool instead
of int, because a fixup only needs to report success or failure; boolean
is the 'real' type.

This patch (of 7):

The object debugging infrastructure core provides some fixup callbacks for
the subsystems that use it. These callbacks are called from the debug code
whenever a problem in debug_object_init is detected. The debugobjects core
expects them to return 1 when the fixup was successful, otherwise 0, so
the return type is effectively boolean.

Confusingly, debug_object_fixup() uses the return value in an arithmetic
operation, which obscures what the real return type is.

Reading over the whole code, some places use the return value incorrectly
(see the next patch). So why not use bool instead?
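
For example, a fixup callback with the new return type reads like this
(modeled on the timer fixup; illustrative, not part of this patch):

  static bool timer_fixup_init(void *addr, enum debug_obj_state state)
  {
          struct timer_list *timer = addr;

          switch (state) {
          case ODEBUG_STATE_ACTIVE:
                  del_timer_sync(timer);
                  debug_object_init(timer, &timer_debug_descr);
                  return true;    /* problem was fixed */
          default:
                  return false;   /* nothing to fix */
          }
  }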
      
Signed-off-by: Du, Changbin <changbin.du@intel.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Triplett <josh@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  25. Jan 27, 2016
  26. Jun 04, 2014
  27. Nov 13, 2013