1. 12 Oct, 2009 1 commit
  2. 01 Oct, 2009 4 commits
  3. 14 Sep, 2009 1 commit
  4. 01 Aug, 2009 4 commits
  5. 28 Jul, 2009 1 commit
  6. 19 Jun, 2009 1 commit
  7. 18 Jun, 2009 1 commit
  8. 16 Jun, 2009 2 commits
  9. 12 Jun, 2009 1 commit
  10. 09 Jun, 2009 2 commits
  11. 03 Jun, 2009 1 commit
  12. 28 May, 2009 1 commit
  13. 22 May, 2009 4 commits
  14. 22 Apr, 2009 1 commit
    • block: fix queue bounce limit setting · cd0aca2d
      Tejun Heo authored

      Impact: don't set GFP_DMA in q->bounce_gfp unnecessarily
      
      All DMA address limits are expressed in terms of the last addressable
      unit (byte or page) instead of one plus that.  However, when
      determining bounce_gfp on 64-bit machines, blk_queue_bounce_limit()
      compares the specified limit against 0x100000000UL to decide whether
      it is below 4G, so a limit of exactly 4G - 1 is treated as below 4G
      and GFP_DMA is falsely set in q->bounce_gfp.
      
      As the DMA zone is very small on x86_64, this makes larger SG_IO
      transfers very likely to trigger the OOM killer.  Fix it.  While at it,
      rename the parameter to @dma_mask for clarity and convert the comment
      to proper winged style.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
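The off-by-one described in this commit can be illustrated with a small userspace sketch. It is not the kernel code: the function names and `PAGE_SHIFT_MODEL` are hypothetical, and a 4 KiB page size is assumed. It only models comparing a DMA mask (the last addressable byte) against 4G versus 4G - 1.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_MODEL 12 /* assumption: 4 KiB pages */

/*
 * A DMA mask names the LAST addressable byte, e.g. 0xffffffff for a
 * device that can reach the whole first 4G.
 */

/* Buggy model: compares against one past the end (4G). */
static int needs_dma_zone_buggy(uint64_t dma_mask)
{
    uint64_t b_pfn = dma_mask >> PAGE_SHIFT_MODEL;

    return b_pfn < (UINT64_C(0x100000000) >> PAGE_SHIFT_MODEL);
}

/* Fixed model: compares against the last addressable byte below 4G. */
static int needs_dma_zone_fixed(uint64_t dma_mask)
{
    uint64_t b_pfn = dma_mask >> PAGE_SHIFT_MODEL;

    return b_pfn < (UINT64_C(0xffffffff) >> PAGE_SHIFT_MODEL);
}
```

In this model a full 32-bit mask (0xffffffff) is flagged by the buggy check but not by the fixed one, while a 24-bit mask (0x00ffffff) is flagged by both.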
  15. 07 Apr, 2009 1 commit
  16. 29 Dec, 2008 1 commit
  17. 03 Dec, 2008 1 commit
    • block: fix setting of max_segment_size and seg_boundary mask · 0e435ac2
      Milan Broz authored
      Fix setting of max_segment_size and seg_boundary mask for stacked md/dm
      devices.
      
      When stacking devices (LVM over MD over SCSI), some of the request queue
      parameters are not set up correctly by default in some cases, namely
      max_segment_size and seg_boundary_mask.
      
      If you create an MD device over SCSI, these attributes are zeroed.
      
      The problem arises when another device-mapper mapping is stacked on top
      of this one.  The queue attributes in DM are then set this way:
      
      request_queue   max_segment_size  seg_boundary_mask
      SCSI                65536             0xffffffff
      MD RAID1                0                      0
      LVM                 65536                 -1 (64bit)
      
      Unfortunately, bio_add_page (resp. bio_phys_segments) calculates the
      number of physical segments according to these parameters.
      
      During generic_make_request(), the segment count is recalculated and can
      increase bio->bi_phys_segments over the allowed limit (after
      bio_clone() in a stacked operation).
      
      This is especially a problem in the CCISS driver, where it produces an oops here:
      
          BUG_ON(creq->nr_phys_segments > MAXSGENTRIES);
      
      (MAXSGENTRIES is 31 by default.)
      
      Sometimes even this command is enough to cause an oops:
      
        dd iflag=direct if=/dev/<vg>/<lv> of=/dev/null bs=128000 count=10
      
      This command generates bios with 250 sectors, allocated in 32 4k-pages
      (the last page uses only 1024 bytes).
      
      At the LVM layer, the bio is allocated with 31 segments (still OK for
      CCISS).  Unfortunately, on the lower layer it is recalculated to 32
      segments, which violates the CCISS restriction and triggers the BUG_ON().
      
      The patch tries to fix it by:
      
       * initializing the attributes above in the request queue constructor
         blk_queue_make_request()
      
       * making sure that blk_queue_stack_limits() inherits these settings
      
       (DM uses its own function to set the limits because
       blk_queue_stack_limits() was introduced later.  It should probably switch
       to the generic stack limit function too.)
      
       * setting the default seg_boundary value in one place (blkdev.h)
      
       * using this mask as the default in DM (instead of -1, which differs on 64-bit)
      
      Bugs related to this:
      https://bugzilla.redhat.com/show_bug.cgi?id=471639
      http://bugzilla.kernel.org/show_bug.cgi?id=8672
      
      Signed-off-by: Milan Broz <mbroz@redhat.com>
      Reviewed-by: Alasdair G Kergon <agk@redhat.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Mike Miller <mike.miller@hp.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
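The 31-versus-32 segment discrepancy in this commit can be reproduced with a simplified userspace model of segment counting. Everything here is an assumption made for illustration: the `*_model` names are hypothetical, the physical addresses are invented, and the layout assumes that exactly two of the 32 pages happen to be physically contiguous. It is a sketch of the merging rules described in the message, not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two queue limits discussed in the commit. */
struct seg_limits_model {
    uint32_t max_segment_size;
    uint64_t seg_boundary_mask;
};

/* One page-sized piece of a bio (invented layout, for illustration). */
struct chunk_model {
    uint64_t phys; /* physical start address */
    uint32_t len;  /* length in bytes */
};

/*
 * Count physical segments: a chunk joins the current segment only if it
 * is physically contiguous, the merged size stays within
 * max_segment_size, and the segment does not cross a boundary.
 */
static unsigned count_segments_model(const struct chunk_model *c, size_t n,
                                     struct seg_limits_model lim)
{
    unsigned nsegs = 0;
    uint64_t seg_start = 0, seg_size = 0, prev_end = 0;

    assert(c != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        uint64_t last = c[i].phys + c[i].len - 1;
        int mergeable = nsegs > 0 &&
            c[i].phys == prev_end &&
            seg_size + c[i].len <= lim.max_segment_size &&
            (seg_start | lim.seg_boundary_mask) ==
                (last | lim.seg_boundary_mask);

        if (mergeable) {
            seg_size += c[i].len;
        } else {
            nsegs++;
            seg_start = c[i].phys;
            seg_size = c[i].len;
        }
        prev_end = c[i].phys + c[i].len;
    }
    return nsegs;
}

/*
 * 128000 bytes in 32 4k-pages, last page holding 1024 bytes.
 * Assumption: pages 0 and 1 are contiguous, all others are not.
 */
static void fill_dd_example_model(struct chunk_model c[32])
{
    for (size_t i = 0; i < 32; i++) {
        c[i].phys = (i < 2) ? 0x100000 + i * 0x1000
                            : 0x100000 + i * 0x2000;
        c[i].len = (i == 31) ? 1024 : 4096;
    }
}
```

With the LVM row of the table (65536, 0xffffffff) this model counts 31 segments, because the two contiguous pages merge. With the zeroed MD limits (0, 0) nothing can merge, so the same data is counted as 32 segments.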
  18. 17 Oct, 2008 1 commit
  19. 09 Oct, 2008 6 commits
    • block: add lld busy state exporting interface · ef9e3fac
      Kiyoshi Ueda authored

      This patch adds a new interface, blk_lld_busy(), to check the lld's
      busy state from the block layer.
      blk_lld_busy() calls down into low-level drivers for the check
      if the drivers set q->lld_busy_fn() using blk_queue_lld_busy().
      
      This resolves the performance problem on request stacking devices
      described below.
      
      Some drivers, like the SCSI mid layer, stop dispatching requests when
      they detect a busy state on their low-level device (host/target/device).
      This allows other requests to stay in the I/O scheduler's queue
      for a chance of merging.
      
      Request stacking drivers like request-based dm should follow
      the same logic.
      However, there is no generic interface for the stacked device
      to check if the underlying device(s) are busy.
      If the request stacking driver dispatches and submits requests to
      a busy underlying device, the requests will stay in
      the underlying device's queue without a chance of merging.
      This causes a performance problem under bursty I/O load.
      
      With this patch, the busy state of the underlying device is exported
      via q->lld_busy_fn(), so the request stacking driver can check it
      and stop dispatching requests if the device is busy.
      
      The underlying device driver must return the busy state appropriately:
          1: when the device driver can't process requests immediately.
          0: when the device driver can process requests immediately,
             including abnormal situations where the device driver needs
             to kill all requests.
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
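The callback arrangement described in this commit can be sketched in userspace as follows. The `*_model` names, the `in_flight` counters, and the example driver callback are hypothetical stand-ins; only the idea (an optional per-queue busy callback that a stacking driver consults before dispatching) comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

struct queue_model;
typedef int (lld_busy_fn_model)(struct queue_model *q);

struct queue_model {
    lld_busy_fn_model *lld_busy_fn; /* optional driver callback */
    int in_flight;                  /* invented bookkeeping for the demo */
    int max_in_flight;
};

/* Models blk_queue_lld_busy(): the driver registers its busy check. */
static void queue_set_lld_busy_model(struct queue_model *q,
                                     lld_busy_fn_model *fn)
{
    assert(q != NULL);
    q->lld_busy_fn = fn;
}

/* Models blk_lld_busy(): ask the driver if it registered a callback. */
static int lld_busy_model(struct queue_model *q)
{
    assert(q != NULL);
    return q->lld_busy_fn ? q->lld_busy_fn(q) : 0;
}

/* Example driver callback: 1 when it cannot take requests immediately. */
static int example_driver_busy(struct queue_model *q)
{
    return q->in_flight >= q->max_in_flight;
}

/*
 * Stacking driver: dispatch only if the lower device is not busy.
 * Returns 1 if dispatched, 0 if the request was held back so that it
 * can still be merged in the upper queue.
 */
static int stacked_try_dispatch_model(struct queue_model *lower)
{
    if (lld_busy_model(lower))
        return 0;
    lower->in_flight++;
    return 1;
}
```

A queue without a registered callback is always reported as not busy in this model, which keeps drivers that never call the setter working unchanged.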
    • block: unify request timeout handling · 242f9dcb
      Jens Axboe authored

      Right now SCSI and others do their own command timeout handling.
      Move those bits to the block layer.
      
      Instead of having a timer per command, we try to be a bit more clever
      and simply have one per queue. This avoids the overhead of having to
      tear down and set up a timer for each command, so it will result in a
      lot less timer fiddling.
      Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
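The one-timer-per-queue idea can be sketched as a small userspace model. All names, the fixed-size slot array, and the eager re-arm on completion are assumptions chosen for brevity; the kernel's actual bookkeeping may differ. The sketch only shows a single expiry value tracking the earliest deadline among in-flight requests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INFLIGHT_MODEL 8
#define NO_DEADLINE_MODEL UINT64_MAX

struct timeout_queue_model {
    uint64_t deadline[MAX_INFLIGHT_MODEL]; /* NO_DEADLINE_MODEL = free */
    uint64_t timer_expires;                /* the single per-queue timer */
};

static void tq_init_model(struct timeout_queue_model *q)
{
    assert(q != NULL);
    for (size_t i = 0; i < MAX_INFLIGHT_MODEL; i++)
        q->deadline[i] = NO_DEADLINE_MODEL;
    q->timer_expires = NO_DEADLINE_MODEL;
}

/* Point the one timer at the earliest remaining deadline. */
static void tq_rearm_model(struct timeout_queue_model *q)
{
    q->timer_expires = NO_DEADLINE_MODEL;
    for (size_t i = 0; i < MAX_INFLIGHT_MODEL; i++)
        if (q->deadline[i] < q->timer_expires)
            q->timer_expires = q->deadline[i];
}

/* Start a request; returns its slot, or -1 if the queue is full. */
static int tq_add_model(struct timeout_queue_model *q, uint64_t now,
                        uint64_t timeout)
{
    for (size_t i = 0; i < MAX_INFLIGHT_MODEL; i++) {
        if (q->deadline[i] != NO_DEADLINE_MODEL)
            continue;
        q->deadline[i] = now + timeout;
        /* Touch the timer only if this request expires sooner. */
        if (q->deadline[i] < q->timer_expires)
            q->timer_expires = q->deadline[i];
        return (int)i;
    }
    return -1;
}

/* Request finished in time: free its slot, no per-command timer to kill. */
static void tq_complete_model(struct timeout_queue_model *q, int slot)
{
    assert(slot >= 0 && slot < MAX_INFLIGHT_MODEL);
    q->deadline[slot] = NO_DEADLINE_MODEL;
    tq_rearm_model(q);
}

/* Timer fired: expire every overdue request, return how many. */
static unsigned tq_timer_fired_model(struct timeout_queue_model *q,
                                     uint64_t now)
{
    unsigned expired = 0;

    for (size_t i = 0; i < MAX_INFLIGHT_MODEL; i++) {
        if (q->deadline[i] <= now) {
            q->deadline[i] = NO_DEADLINE_MODEL;
            expired++;
        }
    }
    tq_rearm_model(q);
    return expired;
}
```

Adding a request only touches the timer when the new deadline is earlier than the current expiry, which is where the savings over a timer per command come from in this model.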
    • block: kmalloc args reversed, small function definition fixes · aeb3d3a8
      Harvey Harrison authored

      Noticed by sparse:
      block/blk-softirq.c:156:12: warning: symbol 'blk_softirq_init' was not declared. Should it be static?
      block/genhd.c:583:28: warning: function 'bdget_disk' with external linkage has definition
      block/genhd.c:659:17: warning: incorrect type in argument 1 (different base types)
      block/genhd.c:659:17:    expected unsigned int [unsigned] [usertype] size
      block/genhd.c:659:17:    got restricted gfp_t
      block/genhd.c:659:29: warning: incorrect type in argument 2 (different base types)
      block/genhd.c:659:29:    expected restricted gfp_t [usertype] flags
      block/genhd.c:659:29:    got unsigned int
      block: kmalloc args reversed
      Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: add support for IO CPU affinity · c7c22e4d
      Jens Axboe authored

      This patch adds support for controlling the IO completion CPU of
      either all requests on a queue, or on a per-request basis. We export
      a sysfs variable (rq_affinity) which, if set, migrates completions
      of requests to the CPU that originally submitted them. A bio helper
      (bio_set_completion_cpu()) is also added, so that queuers can ask
      for completion on a specific CPU.
      
      In testing, this has been shown to cut the system time by as much
      as 20-40% on synthetic workloads where CPU affinity is desired.
      
      This requires a little help from the architecture, so it'll only
      work as designed for archs that are using the new generic smp
      helper infrastructure.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • Add some block/ source files to the kernel-api docbook. Fix kernel-doc... · 710027a4
      Randy Dunlap authored
      
      Add some block/ source files to the kernel-api docbook. Fix kernel-doc notation in them as needed. Fix changed function parameter names. Fix typos/spellos. In comments, change REQ_SPECIAL to REQ_TYPE_SPECIAL and REQ_BLOCK_PC to REQ_TYPE_BLOCK_PC.
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • Add 'discard' request handling · fb2dce86
      David Woodhouse authored

      Some block devices benefit from a hint that they can forget the contents
      of certain sectors. Add basic support for this to the block core, along
      with a 'blkdev_issue_discard()' helper function which issues such
      requests.
      
      The caller doesn't get to provide an end_io function, since
      blkdev_issue_discard() will automatically split the request up into
      multiple bios if appropriate. Neither does the function wait for
      completion -- it's expected that callers won't care about when, or even
      _if_, the request completes. It's only a hint to the device anyway. By
      definition, the file system doesn't _care_ about these sectors any more.
      
      [With feedback from OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> and
      Jens Axboe <jens.axboe@oracle.com>]
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
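The automatic splitting mentioned in this commit can be sketched as a range splitter. The function and struct names are hypothetical, and the per-piece limit `max_sects` is an assumed parameter: the commit message does not say what limit blkdev_issue_discard() applies. The sketch only shows one large sector range being cut into consecutive pieces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct discard_piece_model {
    uint64_t sector;   /* first sector of the piece */
    uint64_t nr_sects; /* number of sectors in the piece */
};

/*
 * Split [sector, sector + nr_sects) into pieces of at most max_sects.
 * Writes up to out_cap pieces into out and returns the total number of
 * pieces the range needs (which may exceed out_cap).
 */
static size_t split_discard_model(uint64_t sector, uint64_t nr_sects,
                                  uint64_t max_sects,
                                  struct discard_piece_model *out,
                                  size_t out_cap)
{
    size_t n = 0;

    assert(max_sects > 0);
    while (nr_sects > 0) {
        uint64_t len = nr_sects < max_sects ? nr_sects : max_sects;

        if (n < out_cap) {
            out[n].sector = sector;
            out[n].nr_sects = len;
        }
        n++;
        sector += len;
        nr_sects -= len;
    }
    return n;
}
```

Because each piece would complete independently, a single caller-supplied completion callback has no obvious meaning here, which matches the commit's point that callers do not provide an end_io function.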
  20. 04 Jul, 2008 1 commit
  21. 15 May, 2008 1 commit
    • Remove blkdev warning triggered by using md · e7e72bf6
      Neil Brown authored

      As setting and clearing queue flags now requires that we hold a spinlock
      on the queue, and as blk_queue_stack_limits is called without that lock,
      get the lock inside blk_queue_stack_limits.
      
      For blk_queue_stack_limits to be able to find the right lock, each md
      personality needs to set q->queue_lock to point to the appropriate lock.
      Those personalities which didn't previously use a spin_lock use
      q->__queue_lock.  So always initialise that lock when allocated.
      
      With this in place, setting/clearing of the QUEUE_FLAG_PLUGGED bit will no
      longer cause warnings as it will be clear that the proper lock is held.
      
      Thanks to Dan Williams for review and fixing the silly bugs.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Alistair John Strachan <alistair@devzero.co.uk>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Jacek Luczak <difrost.kernel@gmail.com>
      Cc: Prakash Punnoor <prakash@punnoor.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  22. 01 May, 2008 1 commit
  23. 29 Apr, 2008 2 commits