    locking/rwsem: Scan the wait_list for readers only once · 70800c3c
    Davidlohr Bueso authored and Ingo Molnar committed
    
    
    When waking up readers, __rwsem_mark_wakeup() currently iterates the
    wait_list twice to wake up the first N queued reader tasks. While
    this can be quite inefficient, it was done so that an awoken reader
    would first and foremost be acknowledged by the lock counter.
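
    The two-pass scheme described above can be sketched in user space
    roughly as follows. This is only an illustrative model, not the
    kernel code: struct model_sem, struct waiter and
    wake_readers_two_pass() are hypothetical names, the "counter" is a
    plain long, and a wakeup is modelled by recording the counter value
    the waiter would observe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum waiter_type { WAITING_FOR_READ, WAITING_FOR_WRITE };

struct waiter {
    enum waiter_type type;
    long count_seen_at_wakeup;  /* counter value observed when "woken" */
    struct waiter *next;        /* singly linked FIFO wait list */
};

struct model_sem {
    long active_readers;        /* stand-in for the reader part of the count */
    struct waiter *wait_list;
};

/* Wake the readers queued at the head of the list, walking it twice. */
size_t wake_readers_two_pass(struct model_sem *sem)
{
    size_t woken = 0;

    /* Pass 1: only count the readers at the head of the list. */
    for (struct waiter *w = sem->wait_list;
         w && w->type == WAITING_FOR_READ; w = w->next)
        woken++;

    /* Account for all of them in the counter before any wakeup. */
    sem->active_readers += (long)woken;

    /* Pass 2: walk the same entries again, dequeue and "wake" each one. */
    for (size_t i = 0; i < woken; i++) {
        struct waiter *w = sem->wait_list;

        sem->wait_list = w->next;
        w->next = NULL;
        w->count_seen_at_wakeup = sem->active_readers;
    }
    return woken;
}
```

    In this model every woken reader already sees itself accounted for
    in the counter, at the price of visiting each reader entry twice.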
    
    Keeping the same logic, we can further benefit from the use of
    wake_qs and entirely avoid the first wait_list iteration, which only
    sets the counter: wake_up_process() isn't going to occur right away,
    so we still maintain the counter->list order of going about things.
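
    As a rough illustration of why a deferred wake queue makes a single
    pass sufficient, here is a minimal user-space sketch. It is not the
    patch itself: struct wake_queue, mark_readers_for_wakeup() and
    flush_wakeups() are hypothetical stand-ins for the wake_q machinery,
    and a "wakeup" is modelled by setting a flag on the waiter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum waiter_type { WAITING_FOR_READ, WAITING_FOR_WRITE };

struct waiter {
    enum waiter_type type;
    int running;            /* set to 1 only once the wakeup is delivered */
    struct waiter *next;    /* singly linked FIFO wait list */
};

#define MAX_WAKE 64

struct wake_queue {
    struct waiter *tasks[MAX_WAKE]; /* waiters marked for a later wakeup */
    size_t len;
};

struct model_sem {
    long active_readers;    /* stand-in for the reader part of the count */
    struct waiter *wait_list;
};

/*
 * Single pass: dequeue the readers at the head of the list and queue them
 * for a deferred wakeup, then update the counter once. Nobody has been
 * woken yet when this returns, so the counter is still updated first.
 */
size_t mark_readers_for_wakeup(struct model_sem *sem, struct wake_queue *wq)
{
    size_t woken = 0;

    while (sem->wait_list && sem->wait_list->type == WAITING_FOR_READ &&
           wq->len < MAX_WAKE) {
        struct waiter *w = sem->wait_list;

        sem->wait_list = w->next;
        w->next = NULL;
        wq->tasks[wq->len++] = w;
        woken++;
    }
    sem->active_readers += (long)woken;
    return woken;
}

/* Deferred step, run afterwards: actually deliver the queued wakeups. */
void flush_wakeups(struct wake_queue *wq)
{
    for (size_t i = 0; i < wq->len; i++)
        wq->tasks[i]->running = 1;
    wq->len = 0;
}
```

    A caller would run mark_readers_for_wakeup() and only then
    flush_wakeups(); because the wakeups are deferred, the counter
    reflects every reader before any of them starts running, even
    though the list was walked only once.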
    
    Other than saving cycles on the O(n) "scanning", this change also
    nicely cleans up a good chunk of __rwsem_mark_wakeup(), making it
    both visually cleaner and less tedious to read.
    
    For example, the following improvements were seen on some
    will-it-scale microbenchmarks, on a 48-core Haswell:
    
                                           v4.7              v4.7-rwsem-v1
      Hmean    signal1-processes-8    5792691.42 (  0.00%)  5771971.04 ( -0.36%)
      Hmean    signal1-processes-12   6081199.96 (  0.00%)  6072174.38 ( -0.15%)
      Hmean    signal1-processes-21   3071137.71 (  0.00%)  3041336.72 ( -0.97%)
      Hmean    signal1-processes-48   3712039.98 (  0.00%)  3708113.59 ( -0.11%)
      Hmean    signal1-processes-79   4464573.45 (  0.00%)  4682798.66 (  4.89%)
      Hmean    signal1-processes-110  4486842.01 (  0.00%)  4633781.71 (  3.27%)
      Hmean    signal1-processes-141  4611816.83 (  0.00%)  4692725.38 (  1.75%)
      Hmean    signal1-processes-172  4638157.05 (  0.00%)  4714387.86 (  1.64%)
      Hmean    signal1-processes-203  4465077.80 (  0.00%)  4690348.07 (  5.05%)
      Hmean    signal1-processes-224  4410433.74 (  0.00%)  4687534.43 (  6.28%)
    
      Stddev   signal1-processes-8       6360.47 (  0.00%)     8455.31 ( 32.94%)
      Stddev   signal1-processes-12      4004.98 (  0.00%)     9156.13 (128.62%)
      Stddev   signal1-processes-21      3273.14 (  0.00%)     5016.80 ( 53.27%)
      Stddev   signal1-processes-48     28420.25 (  0.00%)    26576.22 ( -6.49%)
      Stddev   signal1-processes-79     22038.34 (  0.00%)    18992.70 (-13.82%)
      Stddev   signal1-processes-110    23226.93 (  0.00%)    17245.79 (-25.75%)
      Stddev   signal1-processes-141     6358.98 (  0.00%)     7636.14 ( 20.08%)
      Stddev   signal1-processes-172     9523.70 (  0.00%)     4824.75 (-49.34%)
      Stddev   signal1-processes-203    13915.33 (  0.00%)     9326.33 (-32.98%)
      Stddev   signal1-processes-224    15573.94 (  0.00%)    10613.82 (-31.85%)
    
    Other runs that saw improvements include context_switch and pipe; and
    as expected, this is particularly highlighted on larger thread counts
    as it becomes more expensive to walk the list twice.
    
    No change in wakeup ordering or semantics.
    
    Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Waiman.Long@hp.com
    Cc: dave@stgolabs.net
    Cc: jason.low2@hpe.com
    Cc: wanpeng.li@hotmail.com
    Link: http://lkml.kernel.org/r/1470384285-32163-4-git-send-email-dave@stgolabs.net
    
    
    Signed-off-by: Ingo Molnar <mingo@kernel.org>