  1. Oct 18, 2021
  2. Jun 21, 2021
  3. Jun 16, 2021
      block: fix race between adding/removing rq qos and normal IO · 2cafe29a
      Ming Lei authored
      
      Yi reported several kernel panics on:
      
      [16687.001777] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
      ...
      [16687.163549] pc : __rq_qos_track+0x38/0x60
      
      or
      
      [  997.690455] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
      ...
      [  997.850347] pc : __rq_qos_done+0x2c/0x50
      
      It turns out this is caused by a race between adding rq qos (wbt) and
      normal IO, because rq_qos_add() can run while IO is being submitted. Fix
      the issue by freezing the queue before adding rq qos to, or deleting it
      from, the queue.
      
      rq_qos_exit() needn't freeze the queue because it is called after the
      queue has been frozen.
      
      iolatency calls rq_qos_add() while allocating the queue, so freezing won't
      add delay because the queue usage refcount works in atomic mode at that
      time.
      
      iocost calls rq_qos_add() when writing the cgroup attribute file. It is
      fine to freeze the queue at that time since we usually freeze the queue
      when storing to a queue sysfs attribute, and iocost only exists on the
      root cgroup.
      
      wbt_init() calls it from blk_register_queue() and from the queue sysfs
      attribute store (queue_wb_lat_store(), when it is written for the first
      time in the case of !BLK_WBT_MQ); the following patch will speed up the
      queue freezing in wbt_init().
      
      Reported-by: Yi Zhang <yi.zhang@redhat.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Tested-by: Yi Zhang <yi.zhang@redhat.com>
      Link: https://lore.kernel.org/r/20210609015822.103433-2-ming.lei@redhat.com
      
      
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      2cafe29a
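The fix described in this commit, freezing the queue so that no IO can walk the rq qos chain while it is being modified, can be sketched in a small userspace model. Everything below (`model_queue`, `model_qos`, `model_try_freeze()`, `model_qos_add()`) is hypothetical and only illustrates the ordering; it is not the kernel code. In particular, the kernel sleeps until in-flight IO drains, whereas this single-threaded model simply reports failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these are the kernel's definitions. */
struct model_qos {
    const char *name;
    struct model_qos *next;
};

struct model_queue {
    struct model_qos *qos_head; /* chain the IO path walks */
    int freeze_depth;           /* > 0: new IO may not enter */
    int io_in_flight;           /* IO currently inside the queue */
};

/* IO path: refuses to enter while the queue is frozen. */
static bool model_queue_enter(struct model_queue *q)
{
    if (q->freeze_depth > 0)
        return false;
    q->io_in_flight++;
    return true;
}

static void model_queue_exit(struct model_queue *q)
{
    assert(q->io_in_flight > 0);
    q->io_in_flight--;
}

/*
 * The kernel would sleep until in-flight IO drains. This single-threaded
 * model cannot sleep, so it reports failure instead (an assumption of the
 * sketch, not kernel behaviour).
 */
static bool model_try_freeze(struct model_queue *q)
{
    if (q->io_in_flight > 0)
        return false;
    q->freeze_depth++;
    return true;
}

static void model_unfreeze(struct model_queue *q)
{
    assert(q->freeze_depth > 0);
    q->freeze_depth--;
}

/* Link a policy into the chain only while no IO can observe the update. */
static int model_qos_add(struct model_queue *q, struct model_qos *qos)
{
    if (!model_try_freeze(q))
        return -1;
    qos->next = q->qos_head;
    q->qos_head = qos;
    model_unfreeze(q);
    return 0;
}
```

In this model, an add attempted while IO is in flight is rejected, and the chain only changes once the queue is idle and frozen, which is the property the commit relies on.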
  4. Oct 15, 2019
  5. Oct 06, 2019
  6. Aug 29, 2019
      blkcg: implement blk-iocost · 7caa4715
      Tejun Heo authored
      
      This patchset implements an IO cost model based work-conserving
      proportional controller.
      
      While io.latency provides the capability to comprehensively prioritize
      and protect IOs depending on the cgroups, its protection is binary -
      the lowest latency target cgroup which is suffering is protected at
      the cost of all others.  In many use cases including stacking multiple
      workload containers in a single system, it's necessary to distribute
      IO capacity with better granularity.
      
      One challenge of controlling IO resources is the lack of a trivially
      observable cost metric.  The most common metrics - bandwidth and iops
      - can be off by orders of magnitude depending on the device type and
      IO pattern.  However, the cost isn't a complete mystery.  Given
      several key attributes, we can make fairly reliable predictions on how
      expensive a given stream of IOs would be, at least compared to other
      IO patterns.
      
      The function which determines the cost of a given IO is the IO cost
      model for the device.  This controller distributes IO capacity based
      on the costs estimated by such a model.  The more accurate the cost
      model the better but the controller adapts based on IO completion
      latency and as long as the relative costs across different IO
      patterns are consistent and sensible, it'll adapt to the actual
      performance of the device.
      
      Currently, the only implemented cost model is a simple linear one with
      a few sets of default parameters for different classes of device.
      This covers most common devices reasonably well.  All the
      infrastructure to tune and add different cost models is already in
      place and a later patch will also allow using bpf progs for cost
      models.
      
      Please see the top comment in blk-iocost.c and documentation for
      more details.
      
      v2: Rebased on top of RQ_ALLOC_TIME changes and folded in Rik's fix
          for a divide-by-zero bug in current_hweight() triggered by zero
          inuse_sum.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Andy Newell <newella@fb.com>
      Cc: Josef Bacik <jbacik@fb.com>
      Cc: Rik van Riel <riel@surriel.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      7caa4715
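As a rough illustration of what a "simple linear" cost model and a proportional split could look like, here is a minimal sketch. The parameter names (`seq_base`, `rand_base`, `per_page`), the functions, and the parts-per-10000 scale are all assumptions made for this example; the actual model and its defaults live in blk-iocost.c. The guard against a zero weight sum is in the spirit of the divide-by-zero fix mentioned in the v2 note, not a copy of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameters for one device class; values are up to the caller. */
struct linear_cost_params {
    uint64_t seq_base;  /* fixed cost of a sequential IO */
    uint64_t rand_base; /* fixed cost of a random IO */
    uint64_t per_page;  /* additional cost per page transferred */
};

/* Linear model: a fixed per-IO cost plus a size-proportional term. */
static uint64_t linear_io_cost(const struct linear_cost_params *p,
                               bool sequential, uint64_t nr_pages)
{
    assert(p != NULL);
    uint64_t base = sequential ? p->seq_base : p->rand_base;
    return base + nr_pages * p->per_page;
}

/*
 * Share of capacity in parts per 10000 for one weight out of a sum.
 * Returns 0 when the sum is zero instead of dividing by zero.
 */
static uint64_t proportional_share(uint64_t weight, uint64_t weight_sum)
{
    if (weight_sum == 0)
        return 0;
    return weight * 10000 / weight_sum;
}
```

With such a model, only the relative costs across IO patterns need to be sensible; the commit message notes that the controller adapts to the device's actual performance based on completion latency.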
      blkcg: s/RQ_QOS_CGROUP/RQ_QOS_LATENCY/ · beab17fc
      Tejun Heo authored
      
      io.weight is gonna be another rq_qos cgroup mechanism.  Let's rename
      RQ_QOS_CGROUP which is being used by io.latency to RQ_QOS_LATENCY in
      preparation.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      beab17fc
      block/rq_qos: implement rq_qos_ops->queue_depth_changed() · 9677a3e0
      Tejun Heo authored
      
      wbt already gets queue depth changed notification through
      wbt_set_queue_depth().  Generalize it into
      rq_qos_ops->queue_depth_changed() so that other rq_qos policies can
      easily hook into the events too.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      9677a3e0
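The generalization described here, turning a wbt-specific notification into an optional per-policy hook, follows a common ops-table pattern. The sketch below is a hypothetical userspace model: the `model_` types and the explicit `depth` argument are simplifications introduced for the example and do not reproduce the kernel's struct layout or hook signature.

```c
#include <assert.h>
#include <stddef.h>

struct model_rq_qos;

/* Optional hooks; a policy leaves a hook NULL if it does not care. */
struct model_rq_qos_ops {
    void (*queue_depth_changed)(struct model_rq_qos *rqos, unsigned int depth);
};

struct model_rq_qos {
    const struct model_rq_qos_ops *ops;
    struct model_rq_qos *next;
    unsigned int seen_depth; /* last depth this policy was told about */
};

/* Walk the chain and notify every policy that implements the hook. */
static void model_queue_depth_changed(struct model_rq_qos *head,
                                      unsigned int depth)
{
    for (struct model_rq_qos *rqos = head; rqos != NULL; rqos = rqos->next) {
        assert(rqos->ops != NULL);
        if (rqos->ops->queue_depth_changed)
            rqos->ops->queue_depth_changed(rqos, depth);
    }
}

/* Example policy callback: just remember the new depth. */
static void record_depth(struct model_rq_qos *rqos, unsigned int depth)
{
    rqos->seen_depth = depth;
}

static const struct model_rq_qos_ops depth_aware_ops = {
    .queue_depth_changed = record_depth,
};

static const struct model_rq_qos_ops depth_blind_ops = {
    .queue_depth_changed = NULL,
};
```

A policy that does not set the hook is skipped, so adding a new hook of this kind does not require touching existing policies.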
      block/rq_qos: add rq_qos_merge() · d3e65fff
      Tejun Heo authored
      
      Add a merge hook for rq_qos.  This will be used by io.weight.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      d3e65fff
  7. Apr 30, 2019
  8. Dec 17, 2018
      block: fix blk-iolatency accounting underflow · 13369816
      Dennis Zhou authored
      The blk-iolatency controller measures the time from rq_qos_throttle() to
      rq_qos_done_bio() and attributes this time to the first bio that needs
      to create the request. This means if a bio is plug-mergeable or
      bio-mergeable, it gets to bypass the blk-iolatency controller.
      
      The recent series [1] to tag all bios w/ blkgs undermined how iolatency
      was determining which bios it was charging and should process in
      rq_qos_done_bio(). Because all bios are being tagged, this caused the
      atomic_t for the struct rq_wait inflight count to underflow and result
      in a stall.
      
      This patch adds a new flag BIO_TRACKED to let controllers know that a
      bio is going through the rq_qos path. blk-iolatency now checks if this
      flag is set to see if it should process the bio in rq_qos_done_bio().
      
      Overloading BLK_QUEUE_ENTERED works, but makes the flag rules confusing.
      BIO_THROTTLED was another candidate, but the flag is set for all bios
      that have gone through blk-throttle code. Overloading a flag comes with
      the burden of making sure that when either implementation changes, a
      change in setting rules for one doesn't cause a bug in the other. So
      here, we unfortunately opt for adding a new flag.
      
      [1] https://lore.kernel.org/lkml/20181205171039.73066-1-dennis@kernel.org/
      
      
      
      Fixes: 5cdf2e3f ("blkcg: associate blkg when associating a device")
      Signed-off-by: Dennis Zhou <dennis@kernel.org>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      13369816
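The underflow and its fix can be illustrated with a small model: only bios that went through the throttle path bump the inflight count, so only those may decrement it on completion. The names below (`MODEL_BIO_TRACKED`, `model_bio`, `model_rq_wait`, `model_throttle()`, `model_done_bio()`) are hypothetical stand-ins for this sketch, and a plain `int` replaces the kernel's `atomic_t`.

```c
#include <assert.h>

#define MODEL_BIO_TRACKED (1u << 0) /* hypothetical flag bit for this model */

struct model_bio {
    unsigned int flags;
};

struct model_rq_wait {
    int inflight; /* plain int here; the commit refers to an atomic_t */
};

/* Throttle path: charge the bio and mark it as tracked. */
static void model_throttle(struct model_rq_wait *rqw, struct model_bio *bio)
{
    rqw->inflight++;
    bio->flags |= MODEL_BIO_TRACKED;
}

/*
 * Completion path: bios that bypassed the throttle path (for example
 * merged bios) carry no flag and must not touch the inflight count.
 */
static void model_done_bio(struct model_rq_wait *rqw, struct model_bio *bio)
{
    if (!(bio->flags & MODEL_BIO_TRACKED))
        return;
    assert(rqw->inflight > 0);
    rqw->inflight--;
}
```

Without the flag check, completing an untracked bio would decrement a count it never incremented, which is the underflow the commit describes.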
      blk-mq-debugfs: support rq_qos · cc56694f
      Ming Lei authored
      
      blk-mq-debugfs has proved very helpful for debugging some
      tough issues, such as IO hangs.
      
      We have seen blk-wbt related IO hangs several times; there is even
      such a report in Red Hat BZ that is not solved yet, so this patch
      adds debugfs support for rq_qos.
      
      Cc: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: Omar Sandoval <osandov@fb.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      cc56694f
  9. Dec 08, 2018
      block: add rq_qos_wait to rq_qos · 84f60324
      Josef Bacik authored
      
      Originally when I split out the common code from blk-wbt into rq_qos I
      left the wbt_wait() where it was and simply copied and modified it
      slightly to work for io-latency.  However they are both basically the
      same thing, and as time has gone on wbt_wait() has ended up much smarter
      and kinder than it was when I copied it into io-latency, which means
      io-latency has lost out on these improvements.
      
      Since they are the same thing essentially except for a few minor things,
      create rq_qos_wait() that replicates what wbt_wait() currently does with
      callbacks that can be passed in for the snowflakes to do their own thing
      as appropriate.
      
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      84f60324
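The idea of one shared wait helper with policy-specific callbacks can be sketched as follows. This is a hypothetical, single-threaded model: `model_qos_wait()`, the callback types, and the bounded retry loop are assumptions for illustration, and the role given to the cleanup callback here (running when the wait gives up) is a simplification rather than a description of the kernel helper, which sleeps on a wait queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_rq_wait {
    int inflight;
};

typedef bool (*model_acquire_cb)(struct model_rq_wait *rqw, void *private_data);
typedef void (*model_cleanup_cb)(struct model_rq_wait *rqw, void *private_data);

/*
 * Shared wait logic: the policy decides, via its acquire callback, whether
 * a slot may be taken. A real implementation would sleep between attempts;
 * this model retries a bounded number of times and then gives up.
 */
static bool model_qos_wait(struct model_rq_wait *rqw, void *private_data,
                           model_acquire_cb acquire, model_cleanup_cb cleanup,
                           unsigned int max_attempts)
{
    assert(acquire != NULL);
    for (unsigned int i = 0; i < max_attempts; i++) {
        if (acquire(rqw, private_data))
            return true;
    }
    if (cleanup)
        cleanup(rqw, private_data);
    return false;
}

/* Example policy: allow at most max_inflight concurrent IOs. */
struct depth_limit {
    int max_inflight;
    int gave_up; /* how many waits ended without a slot */
};

static bool depth_acquire(struct model_rq_wait *rqw, void *private_data)
{
    struct depth_limit *limit = private_data;

    if (rqw->inflight >= limit->max_inflight)
        return false;
    rqw->inflight++;
    return true;
}

static void depth_cleanup(struct model_rq_wait *rqw, void *private_data)
{
    struct depth_limit *limit = private_data;

    (void)rqw; /* unused in this example policy */
    limit->gave_up++;
}
```

Each policy supplies only its own acquire and cleanup logic, so improvements to the shared wait loop reach every policy that uses it.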
  10. Nov 16, 2018
  11. Nov 15, 2018
  12. Jul 22, 2018
  13. Jul 09, 2018