1. 30 Nov, 2018 1 commit
  2. 17 Oct, 2018 1 commit
    • tracepoint: Fix tracepoint array element size mismatch · 9c0be3f6
      Mathieu Desnoyers authored
      commit 46e0c9be ("kernel: tracepoints: add support for relative
      references") changes the layout of the __tracepoint_ptrs section on
      architectures supporting relative references. However, it does so
      without turning struct tracepoint * const into const int elsewhere in
      the tracepoint code, which has the following side-effect:
      
      Setting mod->num_tracepoints is done by module.c:
      
          mod->tracepoints_ptrs = section_objs(info, "__tracepoints_ptrs",
                                               sizeof(*mod->tracepoints_ptrs),
                                               &mod->num_tracepoints);
      
      Basically, since sizeof(*mod->tracepoints_ptrs) is a pointer size
      (rather than sizeof(int)), num_tracepoints is erroneously set to half of
      what it should be on a 64-bit arch. So a module with an odd number of
      tracepoints misses the last tracepoint due to the effect of integer
      division.
      
      So in the module going notifier:
      
              for_each_tracepoint_range(mod->tracepoints_ptrs,
                      mod->tracepoints_ptrs + mod->num_tracepoints,
                      tp_module_going_check_quiescent, NULL);
      
      the expression (mod->tracepoints_ptrs + mod->num_tracepoints) actually
      evaluates to something within the bounds of the array, but misses the
      last tracepoint if the number of tracepoints is odd on a 64-bit arch.
      
      Fix this by introducing a new typedef: tracepoint_ptr_t, which
      is either "const int" on architectures that have PREL32 relocations,
      or "struct tracepoint * const" on architectures that does not have
      this feature.
      
      Also provide a new tracepoint_ptr_deref() static inline to
      encapsulate dereferencing this type rather than duplicating code and
      ugly ifdefs within the for_each_tracepoint_range() implementation.
      
      This issue appears in 4.19-rc kernels, and should ideally be fixed
      before the end of the rc cycle.
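
      For reference, a trimmed sketch of the resulting typedef and accessor
      (shape as described above; details abridged):

          #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
          typedef const int tracepoint_ptr_t;             /* 4-byte PREL32 offset */
          #else
          typedef struct tracepoint * const tracepoint_ptr_t;  /* full pointer */
          #endif

          static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
          {
          #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
                  return offset_to_ptr(p);        /* resolve the relative reference */
          #else
                  return *p;
          #endif
          }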
      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Jessica Yu <jeyu@kernel.org>
      Link: http://lkml.kernel.org/r/20181013191050.22389-1-mathieu.desnoyers@efficios.com
      Link: http://lkml.kernel.org/r/20180704083651.24360-7-ard.biesheuvel@linaro.org
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morris <james.morris@microsoft.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Nicolas Pitre <nico@linaro.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: "Serge E. Hallyn" <serge@hallyn.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      9c0be3f6
  3. 05 Sep, 2018 1 commit
    • tracing: Add back in rcu_irq_enter/exit_irqson() for rcuidle tracepoints · 865e63b0
      Steven Rostedt (VMware) authored
      Borislav reported the following splat:
      
       =============================
       WARNING: suspicious RCU usage
       4.19.0-rc1+ #1 Not tainted
       -----------------------------
       ./include/linux/rcupdate.h:631 rcu_read_lock() used illegally while idle!
       other info that might help us debug this:
      
       RCU used illegally from idle CPU!
       rcu_scheduler_active = 2, debug_locks = 1
       RCU used illegally from extended quiescent state!
       1 lock held by swapper/0/0:
        #0: 000000004557ee0e (rcu_read_lock){....}, at: perf_event_output_forward+0x0/0x130
      
       stack backtrace:
       CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc1+ #1
       Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
       Call Trace:
        dump_stack+0x85/0xcb
        perf_event_output_forward+0xf6/0x130
        __perf_event_overflow+0x52/0xe0
        perf_swevent_overflow+0x91/0xb0
        perf_tp_event+0x11a/0x350
        ? find_held_lock+0x2d/0x90
        ? __lock_acquire+0x2ce/0x1350
        ? __lock_acquire+0x2ce/0x1350
        ? retint_kernel+0x2d/0x2d
        ? find_held_lock+0x2d/0x90
        ? tick_nohz_get_sleep_length+0x83/0xb0
        ? perf_trace_cpu+0xbb/0xd0
        ? perf_trace_buf_alloc+0x5a/0xa0
        perf_trace_cpu+0xbb/0xd0
        cpuidle_enter_state+0x185/0x340
        do_idle+0x1eb/0x260
        cpu_startup_entry+0x5f/0x70
        start_kernel+0x49b/0x4a6
        secondary_startup_64+0xa4/0xb0
      
      This is due to the tracepoints moving to SRCU usage which does not require
      RCU to be "watching". But perf uses these tracepoints with RCU and expects
      it to be. Hence, we still need to add in the rcu_irq_enter/exit_irqson()
      calls for "rcuidle" tracepoints. This is a temporary fix until we have SRCU
      working in NMI context, and then perf can be converted to use that instead
      of normal RCU.
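
      A sketch of the restored pairing inside __DO_TRACE() for the rcuidle
      case (trimmed to the relevant lines):

          if (rcuidle) {
                  /* SRCU still protects the probe array... */
                  __idx = srcu_read_lock_notrace(&tracepoint_srcu);
                  /* ...and this makes RCU "watch" again, so probes such as
                   * perf's may legally use rcu_read_lock(). */
                  rcu_irq_enter_irqson();
          }

          /* ... iterate the probe array and call the probes ... */

          if (rcuidle) {
                  rcu_irq_exit_irqson();
                  srcu_read_unlock_notrace(&tracepoint_srcu, __idx);
          }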
      
      Link: http://lkml.kernel.org/r/20180904162611.6a120068@gandalf.local.home
      
      Cc: x86-ml <x86@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Reported-by: Borislav Petkov <bp@alien8.de>
      Tested-by: Borislav Petkov <bp@alien8.de>
      Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Fixes: e6753f23 ("tracepoint: Make rcuidle tracepoint callers use SRCU")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      865e63b0
  4. 22 Aug, 2018 1 commit
  5. 30 Jul, 2018 1 commit
    • tracepoint: Make rcuidle tracepoint callers use SRCU · e6753f23
      Joel Fernandes (Google) authored
      In recent tests with IRQ on/off tracepoints, a large performance
      overhead of ~10% was noticed when running hackbench. This was
      root-caused to calls to rcu_irq_enter_irqson and rcu_irq_exit_irqson
      from the tracepoint code. Following a long discussion on the list [1]
      about this, we concluded that SRCU is a better alternative for use
      during rcu idle. Although it does involve extra barriers, it's lighter
      than the sched-rcu version, which has to make additional RCU calls to
      notify RCU idle about entry into RCU sections.
      
      In this patch, we change the underlying implementation of the
      trace_*_rcuidle API to use SRCU. This has been shown to improve
      performance a lot for the high-frequency irq enable/disable tracepoints.
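
      A trimmed sketch of the SRCU-based __DO_TRACE() path (probe iteration
      elided):

          /* SRCU cannot (yet) be used from NMI context. */
          WARN_ON_ONCE(rcuidle && in_nmi());

          /* Keep SRCU and sched-RCU usage consistent. */
          preempt_disable_notrace();

          /* For rcuidle callers, use SRCU, since sched-RCU
           * does not work from the idle path. */
          if (rcuidle)
                  idx = srcu_read_lock_notrace(&tracepoint_srcu);

          /* ... call the attached probes ... */

          if (rcuidle)
                  srcu_read_unlock_notrace(&tracepoint_srcu, idx);

          preempt_enable_notrace();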
      
      Test: Tested idle and preempt/irq tracepoints.
      
      Here are some performance numbers:
      
      With a run of the following 30 times on a single core x86 Qemu instance
      with 1GB memory:
      hackbench -g 4 -f 2 -l 3000
      
      Completion times in seconds. CONFIG_PROVE_LOCKING=y.
      
      No patches (without this series)
      Mean: 3.048
      Median: 3.025
      Std Dev: 0.064
      
      With Lockdep using irq tracepoints with RCU implementation:
      Mean: 3.451   (-11.66 %)
      Median: 3.447 (-12.22%)
      Std Dev: 0.049
      
      With Lockdep using irq tracepoints with SRCU implementation (this series):
      Mean: 3.020   (I would consider the improvement against the "without
      	       this series" case as just noise).
      Median: 3.013
      Std Dev: 0.033
      
      [1] https://patchwork.kernel.org/patch/10344297/
      
      [remove rcu_read_lock_sched_notrace as its the equivalent of
      preempt_disable_notrace and is unnecessary to call in tracepoint code]
      Link: http://lkml.kernel.org/r/20180730222423.196630-3-joel@joelfernandes.org
      Cleaned-up-by: Peter Zijlstra <peterz@infradead.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      [ Simplified WARN_ON_ONCE() ]
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      e6753f23
  6. 15 Jun, 2018 1 commit
  7. 27 Nov, 2017 1 commit
  8. 13 Jun, 2017 2 commits
  9. 10 Apr, 2017 1 commit
  10. 09 Dec, 2016 1 commit
    • tracing: Have the reg function allow to fail · 8cf868af
      Steven Rostedt (Red Hat) authored
      Some tracepoints have a registration function that gets called when the
      tracepoint is enabled. There may be cases where the registration function
      must fail (for example, it can't allocate enough memory). In this case,
      the tracepoint should also fail to register, otherwise the user would not
      know why the tracepoint is not working.
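
      A hypothetical registration callback under the new contract (the buffer
      and its size are illustrative, not from this commit):

          static void *my_tp_buf;

          /* Called when the tracepoint is enabled; returning an error now
           * makes enabling the tracepoint fail as well. */
          static int my_tp_regfunc(void)
          {
                  /* MY_TP_BUF_SIZE is a hypothetical constant */
                  my_tp_buf = kzalloc(MY_TP_BUF_SIZE, GFP_KERNEL);
                  if (!my_tp_buf)
                          return -ENOMEM;
                  return 0;
          }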
      
      Cc: David Howells <dhowells@redhat.com>
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      8cf868af
  11. 09 Mar, 2016 1 commit
    • tracing: Fix check for cpu online when event is disabled · dc17147d
      Steven Rostedt (Red Hat) authored
      Commit f3775549 ("tracepoints: Do not trace when cpu is offline") added
      a check to make sure that tracepoints only get called when the cpu is
      online, as it uses rcu_read_lock_sched() for protection.
      
      Commit 3a630178 ("tracing: generate RCU warnings even when tracepoints
      are disabled") added lockdep checks (including rcu checks) for events that
      are not enabled to catch possible RCU issues that would only be triggered if
      a trace event was enabled. Commit f3775549 only stopped the warnings
      when the trace event was enabled but did not prevent warnings if the trace
      event was called when disabled.
      
      To fix this, the cpu online check is moved to where the condition is added
      to the trace event. This will place the cpu online check in all places that
      it may be used now and in the future.
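
      Roughly, the online test becomes (part of) the tracepoint condition, so
      every consumer of the condition, including the lockdep-only path for
      disabled events, inherits it (a sketch, not the exact upstream diff):

          #define DECLARE_TRACE(name, proto, args)                      \
                  __DECLARE_TRACE(name, PARAMS(proto), PARAMS(args),    \
                          cpu_online(raw_smp_processor_id()),           \
                          PARAMS(void *__data, proto),                  \
                          PARAMS(__data, args))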
      
      Cc: stable@vger.kernel.org # v3.18+
      Fixes: f3775549 ("tracepoints: Do not trace when cpu is offline")
      Fixes: 3a630178 ("tracing: generate RCU warnings even when tracepoints are disabled")
      Reported-by: Sudeep Holla <sudeep.holla@arm.com>
      Tested-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      dc17147d
  12. 15 Feb, 2016 1 commit
    • tracepoints: Do not trace when cpu is offline · f3775549
      Steven Rostedt (Red Hat) authored
      The tracepoint infrastructure uses RCU sched protection to enable and
      disable tracepoints safely. There are some instances where tracepoints are
      used in infrastructure code (like kfree()) that get called after a CPU is
      going offline, and perhaps when it is coming back online but hasn't been
      registered yet.
      
      This can produce the following warning:
      
       [ INFO: suspicious RCU usage. ]
       4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S
       -------------------------------
       include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage!
      
       other info that might help us debug this:
      
       RCU used illegally from offline CPU!  rcu_scheduler_active = 1, debug_locks = 1
       no locks held by swapper/8/0.
      
       stack backtrace:
        CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S              4.4.0-00006-g0fe53e8-dirty #34
        Call Trace:
        [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable)
        [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170
        [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440
        [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100
        [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150
        [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140
        [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310
        [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60
        [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40
        [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560
        [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360
        [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14
      
      This warning is not a false positive either. RCU is not protecting code that
      is being executed while the CPU is offline.
      
      Instead of playing "whack-a-mole(TM)" and adding conditional statements to
      the tracepoints we find that are used in this instance, simply add a
      cpu_online() test to the tracepoint code where the tracepoint will be
      ignored if the CPU is offline.
      
      Use of raw_smp_processor_id() is fine, as there should never be a case where
      the tracepoint code goes from running on a CPU that is online and suddenly
      gets migrated to a CPU that is offline.
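
      The added guard in __DO_TRACE() amounts to an early bail-out before any
      RCU-protected dereference (sketch):

          /* Skip the tracepoint entirely on an offline CPU. */
          if (!cpu_online(raw_smp_processor_id()))
                  return;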
      
      Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.org
      Reported-by: Denis Kirjanov <kda@linux-powerpc.org>
      Fixes: 97e1c18e ("tracing: Kernel Tracepoints")
      Cc: stable@vger.kernel.org # v2.6.28+
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      f3775549
  13. 23 Dec, 2015 1 commit
  14. 08 Dec, 2015 1 commit
    • rcu: Don't redundantly disable irqs in rcu_irq_{enter,exit}() · 7c9906ca
      Paul E. McKenney authored
      This commit replaces a local_irq_save()/local_irq_restore() pair with
      a lockdep assertion that interrupts are already disabled.  This should
      remove the corresponding overhead from the interrupt entry/exit fastpaths.
      
      This change was inspired by the fact that Iftekhar Ahmed's mutation
      testing showed that removing rcu_irq_enter()'s call to local_irq_restore()
      had no effect, which might indicate that interrupts were always enabled
      anyway.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      7c9906ca
  15. 06 Dec, 2015 1 commit
  16. 02 Nov, 2015 1 commit
  17. 26 Oct, 2015 1 commit
    • tracepoint: Give priority to probes of tracepoints · 7904b5c4
      Steven Rostedt (Red Hat) authored
      In order to guarantee that a probe will be called before other probes that
      are attached to a tracepoint, there needs to be a mechanism to provide
      priority of one probe over the others.
      
      Add a prio field to struct tracepoint_func, which lets the probes be
      sorted by the priority set in the structure. If no priority is specified,
      a priority of 10 is given (this is a macro, and may be changed in the
      future).
      
      Now probes may be added to affect other probes that are attached to a
      tracepoint with a guaranteed order.
      
      One use case would be to allow tracing of tracepoints to filter by pid.
      A special (higher priority) probe may be added to the sched_switch
      tracepoint to set the necessary flags of the other tracepoints, notifying
      them whether they should be traced or not. If a trace event is also
      enabled on the sched_switch tracepoint, the order of the two is then
      guaranteed rather than random.
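
      Registration with a priority then looks roughly like this (tp, my_probe
      and my_data are illustrative; higher prio values run earlier, and
      TRACEPOINT_DEFAULT_PRIO is 10):

          /* A prio above the default makes this probe run before all
           * default-priority probes on the same tracepoint. */
          ret = tracepoint_probe_register_prio(tp, my_probe, my_data,
                                               TRACEPOINT_DEFAULT_PRIO + 1);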
      
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      7904b5c4
  18. 21 Oct, 2015 1 commit
  19. 08 Apr, 2015 1 commit
    • tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values · 0c564a53
      Steven Rostedt (Red Hat) authored
      Several tracepoints use the helper functions __print_symbolic() or
      __print_flags() and pass in enums that do the mapping between the
      binary data stored and the value to print. This works well for reading
      the ASCII trace files, but when the data is read via userspace tools
      such as perf and trace-cmd, the conversion of the binary value to a
      human string format is lost if an enum is used, as userspace does not
      have access to what the ENUM is.
      
      For example, the tracepoint trace_tlb_flush() has:
      
       __print_symbolic(REC->reason,
          { TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
          { TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
          { TLB_LOCAL_SHOOTDOWN, "local shootdown" },
          { TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })
      
      Which maps the enum values to the strings they represent. But perf and
      trace-cmd do not know what value TLB_LOCAL_MM_SHOOTDOWN is, and would
      not be able to map it.
      
      With TRACE_DEFINE_ENUM(), developers can place these in the event header
      files and ftrace will convert the enums to their values:
      
      By adding:
      
       TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
       TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
       TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
       TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);
      
       $ cat /sys/kernel/debug/tracing/events/tlb/tlb_flush/format
      [...]
       __print_symbolic(REC->reason,
          { 0, "flush on task switch" },
          { 1, "remote shootdown" },
          { 2, "local shootdown" },
          { 3, "local mm shootdown" })
      
      The above is what userspace expects to see, and tools do not need to
      be modified to parse them.
      
      Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org
      
      Cc: Guilherme Cox <cox@computer.org>
      Cc: Tony Luck <tony.luck@gmail.com>
      Cc: Xie XiuQi <xiexiuqi@huawei.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      0c564a53
  20. 08 Feb, 2015 1 commit
  21. 10 Sep, 2014 1 commit
  22. 08 Aug, 2014 1 commit
  23. 07 May, 2014 1 commit
    • tracing: Add trace_<tracepoint>_enabled() function · 7c65bbc7
      Steven Rostedt (Red Hat) authored
      There are some code paths in the kernel that need to do some preparations
      before they call a tracepoint. As that code is worthless overhead when
      the tracepoint is not enabled, it would be prudent to have that code
      only run when the tracepoint is active. To accomplish this, all tracepoints
      now get a static inline function called "trace_<tracepoint-name>_enabled()"
      which returns true when the tracepoint is enabled and false otherwise.
      
      As an added bonus, that function uses the static_key of the tracepoint
      such that no branch is needed.
      
        if (trace_mytracepoint_enabled()) {
      	arg = process_tp_arg();
      	trace_mytracepoint(arg);
        }
      
      Will keep the "process_tp_arg()" (which may be expensive to run) from
      being executed when the tracepoint isn't enabled.
      
      It's best to encapsulate the tracepoint itself in the if statement
      just to avoid races. For example, if you had:
      
        if (trace_mytracepoint_enabled())
      	arg = process_tp_arg();
        trace_mytracepoint(arg);
      
      There's a chance that the tracepoint could be enabled just after the
      if statement, and arg will be undefined when calling the tracepoint.
      
      Link: http://lkml.kernel.org/r/20140506094407.507b6435@gandalf.local.home
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      7c65bbc7
  24. 09 Apr, 2014 3 commits
  25. 21 Mar, 2014 1 commit
  26. 04 Mar, 2014 1 commit
  27. 02 Dec, 2013 1 commit
  28. 19 Nov, 2013 1 commit
  29. 21 Jun, 2013 1 commit
    • tracing: Add DEFINE_EVENT_FN() macro · f5abaa1b
      Steven Rostedt authored
      Each TRACE_EVENT() adds several helper functions. If two or more trace events
      share the same structure and print format, they can also share most of these
      helper functions and save a lot of space from duplicate code. This is why the
      DECLARE_EVENT_CLASS() and DEFINE_EVENT() were created.
      
      Some events require a trigger to be called at registering and unregistering of
      the event and to do so they use TRACE_EVENT_FN().
      
      If multiple events require a trigger, they currently have no choice but to use
      TRACE_EVENT_FN() as there's no DEFINE_EVENT_FN() available. This unfortunately
      causes a lot of wasted, duplicated code.
      
      By adding a DEFINE_EVENT_FN(), these events can still use a
      DECLARE_EVENT_CLASS() and then define their own triggers.
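
      A hypothetical pair of events sharing one class while keeping their own
      reg/unreg triggers (my_class, my_reg and my_unreg are illustrative):

          DECLARE_EVENT_CLASS(my_class,
                  TP_PROTO(int val), TP_ARGS(val),
                  TP_STRUCT__entry(__field(int, val)),
                  TP_fast_assign(__entry->val = val;),
                  TP_printk("val=%d", __entry->val));

          /* Both events reuse my_class's helpers but register their
           * own trigger callbacks at enable/disable time. */
          DEFINE_EVENT_FN(my_class, my_event_a,
                  TP_PROTO(int val), TP_ARGS(val), my_reg, my_unreg);
          DEFINE_EVENT_FN(my_class, my_event_b,
                  TP_PROTO(int val), TP_ARGS(val), my_reg, my_unreg);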
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/51C3236C.8030508@hds.com
      Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      f5abaa1b
  30. 10 Jun, 2013 1 commit
    • trace: Allow idle-safe tracepoints to be called from irq · d6284099
      Paul E. McKenney authored
      __DECLARE_TRACE_RCU() currently creates an _rcuidle() tracepoint which
      may safely be invoked from what RCU considers to be an idle CPU.
      However, these _rcuidle() tracepoints may -not- be invoked from the
      handler of an irq taken from idle, because rcu_idle_enter() zeroes
      RCU's nesting-level counter, so that the rcu_irq_exit() returning to
      idle will trigger a WARN_ON_ONCE().
      
      This commit therefore substitutes rcu_irq_enter() for rcu_idle_exit()
      and rcu_irq_exit() for rcu_idle_enter() in order to make the _rcuidle()
      tracepoints usable from irq handlers as well as from process context.
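
      A trimmed sketch of the resulting _rcuidle() wrapper (the last two
      __DO_TRACE() arguments are the pre/post hooks this commit swaps):

          #define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args) \
                  static inline void trace_##name##_rcuidle(proto)        \
                  {                                                       \
                          if (static_key_false(&__tracepoint_##name.key)) \
                                  __DO_TRACE(&__tracepoint_##name,        \
                                          TP_PROTO(data_proto),           \
                                          TP_ARGS(data_args),             \
                                          TP_CONDITION(cond),             \
                                          rcu_irq_enter(), /* was rcu_idle_exit() */  \
                                          rcu_irq_exit()); /* was rcu_idle_enter() */ \
                  }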
      Reported-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      d6284099
  31. 12 Sep, 2012 1 commit
  32. 06 Jul, 2012 1 commit
  33. 24 Feb, 2012 1 commit
    • static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc|dec]() · c5905afb
      Ingo Molnar authored
      
      So here's a boot-tested patch on top of Jason's series that does
      all the cleanups I talked about and turns jump labels into a
      more intuitive facility to use. It should also address the
      various misconceptions and confusions that surround jump labels.
      
      Typical usage scenarios:
      
              #include <linux/static_key.h>
      
              struct static_key key = STATIC_KEY_INIT_TRUE;
      
              if (static_key_false(&key))
                      do unlikely code
              else
                      do likely code
      
      Or:
      
              if (static_key_true(&key))
                      do likely code
              else
                      do unlikely code
      
      The static key is modified via:
      
              static_key_slow_inc(&key);
              ...
              static_key_slow_dec(&key);
      
      The 'slow' prefix makes it abundantly clear that this is an
      expensive operation.
      
      I've updated all in-kernel code to use this everywhere. Note
      that I (intentionally) have not pushed the rename blindly
      through to the lowest levels: the actual jump-label
      patching arch facility should be named like that, so we want to
      decouple jump labels from the static-key facility a bit.
      
      On non-jump-label enabled architectures static keys default to
      likely()/unlikely() branches.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Jason Baron <jbaron@redhat.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: a.p.zijlstra@chello.nl
      Cc: mathieu.desnoyers@efficios.com
      Cc: davem@davemloft.net
      Cc: ddaney.cavm@gmail.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20120222085809.GA26397@elte.hu
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      c5905afb
  34. 13 Feb, 2012 1 commit
    • tracing/rcu: Add trace_##name##__rcuidle() static tracepoint for inside rcu_idle_exit() sections · 2fbb90db
      Steven Rostedt authored
      Added a new static inline function that lets *any* tracepoint be used
      inside an rcu_idle_exit() section. This also solves the problem where
      the same tracepoint may be used inside an rcu_idle_exit() section as
      well as outside of one.
      
      I added a new tracepoint function with a "_rcuidle" extension. All
      tracepoints can be used with either the normal "trace_foobar()"
      function, or the "trace_foobar_rcuidle()" function when inside a
      rcu_idle_exit() section.
      
      All tracepoints defined by TRACE_EVENT() or any of the derivatives
      will have a "_rcuidle()" function also defined. When a tracepoint is
      used within an rcu_idle_exit() section, the "_rcuidle()" version must
      be used. This denotes that the tracepoint is within rcu_idle_exit()
      and it allows the rcu read locks within the tracepoint to still
      be valid, as this version takes us out of rcu_idle_exit().
      
      Another nice aspect about this patch is that "static inline"s are not
      compiled into text when not used. So only the tracepoints that actually
      use the _rcuidle() version will have them defined in the actual text
      that is booted.
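
      Call sites then look like this, reusing the message's own example name:

          trace_foobar(arg);              /* normal context */
          trace_foobar_rcuidle(arg);      /* within an rcu_idle_exit() section */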
      
      Link: http://lkml.kernel.org/r/1328563113.2200.39.camel@gandalf.stny.rr.com
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      2fbb90db
  35. 11 Aug, 2011 1 commit
    • Tracepoint: Dissociate from module mutex · b75ef8b4
      Mathieu Desnoyers authored
      Copy the information needed from struct module into a local module list
      held within tracepoint.c from within the module coming/going notifier.
      
      This vastly simplifies locking of tracepoint registration /
      unregistration, because we don't have to take the module mutex to
      register and unregister tracepoints anymore. Steven Rostedt ran into
      dependency problems related to the modules mutex vs kprobes mutex vs ftrace
      mutex vs tracepoint mutex that seem to be hard to fix without removing
      this dependency between the tracepoint and module mutex. (Note: it should be
      investigated whether kprobes could benefit from being dissociated from the
      modules mutex too.)
      
      This also fixes module handling of tracepoint list iterators, because it
      was expecting the list to be sorted by pointer address. Given we have
      control on our own list now, it's OK to sort this list which has
      tracepoints as its only purpose. The reason why this sorting is required
      is to handle the fact that seq files (and any read() operation from
      user-space) cannot hold the tracepoint mutex across multiple calls, so
      list entries may vanish between calls. With sorting, the tracepoint
      iterator becomes usable even if the list doesn't contain the exact item
      pointed to by the iterator anymore.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Jason Baron <jbaron@redhat.com>
      CC: Ingo Molnar <mingo@elte.hu>
      CC: Lai Jiangshan <laijs@cn.fujitsu.com>
      CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Link: http://lkml.kernel.org/r/20110810191839.GC8525@Krystal
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      b75ef8b4
  36. 04 Apr, 2011 1 commit
    • jump label: Introduce static_branch() interface · d430d3d7
      Jason Baron authored
      Introduce:
      
      static __always_inline bool static_branch(struct jump_label_key *key);
      
      instead of the old JUMP_LABEL(key, label) macro.
      
      In this way, jump labels become really easy to use:
      
      Define:
      
              struct jump_label_key jump_key;
      
      Can be used as:
      
              if (static_branch(&jump_key))
                      do unlikely code
      
      enable/disable via:
      
              jump_label_inc(&jump_key);
              jump_label_dec(&jump_key);
      
      that's it!
      
      For the jump labels disabled case, the static_branch() becomes an
      atomic_read(), and jump_label_inc()/dec() are simply atomic_inc(),
      atomic_dec() operations. We show testing results for this change below.
      
      Thanks to H. Peter Anvin for suggesting the 'static_branch()' construct.
      
      Since we now require a 'struct jump_label_key *key', we can store a pointer into
      the jump table addresses. In this way, we can enable/disable jump labels, in
      basically constant time. This change allows us to completely remove the previous
      hashtable scheme. Thanks to Peter Zijlstra for this re-write.
      
      Testing:
      
      I ran a series of 'tbench 20' runs 5 times (with reboots) for 3
      configurations, where tracepoints were disabled.
      
      jump label configured in
      avg: 815.6
      
      jump label *not* configured in (using atomic reads)
      avg: 800.1
      
      jump label *not* configured in (regular reads)
      avg: 803.4
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20110316212947.GA8792@redhat.com>
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Suggested-by: H. Peter Anvin <hpa@linux.intel.com>
      Tested-by: David Daney <ddaney@caviumnetworks.com>
      Acked-by: Ralf Baechle <ralf@linux-mips.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      d430d3d7
  37. 03 Feb, 2011 1 commit
    • tracepoints: Fix section alignment using pointer array · 65498646
      Mathieu Desnoyers authored
      Make the tracepoints more robust, making them solid enough to handle compiler
      changes by not relying on anything based on compiler-specific behavior with
      respect to structure alignment. Implement an approach proposed by David Miller:
      use an array of const pointers to refer to the individual structures, and export
      this pointer array through the linker script rather than the structures per se.
      It will consume 32 extra bytes per tracepoint (24 for structure padding and 8
      for the pointers), but is less likely to break due to compiler changes.
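
      A trimmed sketch of the emitted objects (initializer elided): the
      structure itself stays in __tracepoints, while a pointer to it lands in
      the __tracepoints_ptrs section that tracepoint.c iterates:

          struct tracepoint __tracepoint_##name                           \
                  __attribute__((section("__tracepoints"))) =             \
                  { /* name, state, probe callbacks ... */ };             \
          static struct tracepoint * const __tracepoint_ptr_##name __used \
                  __attribute__((section("__tracepoints_ptrs"))) =        \
                  &__tracepoint_##name;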
      
      History:
      
      commit 7e066fb8 tracepoints: add DECLARE_TRACE() and DEFINE_TRACE()
      added the aligned(32) type and variable attribute to the tracepoint structures
      to deal with gcc happily aligning statically defined structures on 32-byte
      multiples.
      
      One attempt was to use an 8-byte alignment for tracepoint structures by applying
      both the variable and type attribute to tracepoint structure definitions and
      declarations. It worked fine with gcc 4.5.1, but broke with gcc 4.4.4 and 4.4.5.
      
      The reason is that the "aligned" attribute only specifies the _minimum_
      alignment for a structure, leaving both the compiler and the linker free to
      align on larger multiples. Because tracepoint.c expects the structures to be
      placed as an array within each section, up-alignment causes NULL-pointer
      exceptions due to the extra unexpected padding.
      
      (this patch applies on top of -tip)
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      LKML-Reference: <20110126222622.GA10794@Krystal>
      CC: Frederic Weisbecker <fweisbec@gmail.com>
      CC: Ingo Molnar <mingo@elte.hu>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: Peter Zijlstra <peterz@infradead.org>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      65498646