1. 30 Nov, 2018 1 commit
  2. 19 Oct, 2018 2 commits
  3. 17 Oct, 2018 1 commit
  4. 24 Jul, 2018 6 commits
  5. 23 Jul, 2018 2 commits
  6. 28 Jun, 2018 1 commit
  7. 20 Jun, 2018 4 commits
  8. 15 Jun, 2018 1 commit
  9. 11 Jun, 2018 1 commit
  10. 08 Jun, 2018 1 commit
  11. 29 May, 2018 1 commit
  12. 25 May, 2018 1 commit
  13. 12 Apr, 2018 1 commit
    • nvme: expand nvmf_check_if_ready checks · bb06ec31
      James Smart authored
      The nvmf_check_if_ready() checks that were added are very simplistic.
      As such, the routine allows many cases to fail ios during windows
      of reset or re-connection. Where no multipath options are present,
      the error goes back to the caller - the filesystem or application.
      Not good.
      
      The common routine was rewritten and calling syntax slightly expanded
      so that per-transport is_ready routines don't need to be present.
      The transports now call the routine directly. The routine is now a
      fabrics routine rather than an inline function.
      
      The routine now looks at controller state to decide the action to
      take. Some states mandate io failure. Others define the condition where
      a command can be accepted.  When the decision is unclear, a generic
      queue-or-reject check is made to look for failfast or multipath ios and
      only fails the io if it is so marked. Otherwise, the io will be queued
      and wait for the controller state to resolve.
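
The decision flow described above can be sketched as a small userspace model in C. This is not the kernel implementation: the enum values, struct io_req, and check_if_ready_model() are hypothetical names, and the choice of which states accept or fail outright is an assumption made only for illustration.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel's controller states. */
enum ctrl_state {
    CTRL_NEW,
    CTRL_LIVE,
    CTRL_RESETTING,
    CTRL_CONNECTING,
    CTRL_DELETING,
    CTRL_DEAD
};

/* Possible outcomes for a newly submitted io. */
enum verdict { IO_ACCEPT, IO_QUEUE, IO_FAIL };

/* Only the two request properties the commit message mentions. */
struct io_req {
    bool failfast;   /* request is marked failfast */
    bool multipath;  /* request came through a multipath device */
};

static enum verdict check_if_ready_model(enum ctrl_state state,
                                         const struct io_req *rq)
{
    switch (state) {
    case CTRL_LIVE:
        /* Assumed: a live controller accepts the command. */
        return IO_ACCEPT;
    case CTRL_DELETING:
    case CTRL_DEAD:
        /* Assumed: these states mandate io failure. */
        return IO_FAIL;
    default:
        break;
    }

    /* Unclear state: generic queue-or-reject check. */
    if (rq->failfast || rq->multipath)
        return IO_FAIL;

    /* Otherwise queue and wait for the controller state to resolve. */
    return IO_QUEUE;
}
```

In this model an ordinary request submitted during a reset is queued, while the same request marked failfast or multipath is failed immediately so another path can be tried.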
      
      Admin commands issued via ioctl share a live admin queue with commands
      from the transport for controller init. The ioctls could be intermixed
      with the initialization commands. It's possible for the ioctl cmd to
      be issued prior to the controller being enabled. To block this, the
      ioctl admin commands need to be distinguished from admin commands used
      for controller init. Added a USERCMD nvme_req(req)->rq_flags bit to
      reflect this division and set it on ioctl requests.  As the
      nvmf_check_if_ready() routine is called prior to nvme_setup_cmd(),
      ensure that commands allocated by the ioctl path (actually anything
      in core.c) prep the nvme_req(req) before starting the io. This will
      preserve the USERCMD flag during execution and/or retry.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  14. 26 Mar, 2018 4 commits
  15. 22 Feb, 2018 1 commit
  16. 14 Feb, 2018 1 commit
    • nvme-rdma: fix sysfs invoked reset_ctrl error flow · 8000d1fd
      Nitzan Carmi authored
      When reset_controller that is invoked by sysfs fails,
      it enters an error flow which practically removes the
      nvme ctrl entirely (similar to delete_ctrl flow). It
      causes the system to hang, since a sysfs attribute cannot
      be unregistered by one of its own methods.
      
      This can be fixed by calling delete_ctrl as a work rather
      than sequential code. In addition, it should give the ctrl
      a chance to recover using the reconnection mechanism (consistent
      with the FC reset_ctrl error flow). Also, while we're here, return
      a suitable errno in case the reset ended with a non-live ctrl.
      Signed-off-by: Nitzan Carmi <nitzanc@mellanox.com>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
  17. 08 Feb, 2018 2 commits
  18. 25 Jan, 2018 1 commit
  19. 15 Jan, 2018 1 commit
    • nvme: host delete_work and reset_work on separate workqueues · b227c59b
      Roy Shterman authored
      We need to ensure that delete_work will be hosted on a different
      workqueue than all the works we flush or cancel from it.
      Otherwise we may hit a circular dependency warning [1].
      
      Also, given that delete_work flushes reset_work, host reset_work
      on nvme_reset_wq and delete_work on nvme_delete_wq. In addition,
      fix the flushing in the individual drivers to flush nvme_delete_wq
      when draining queued deletes.
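
The dependency described above can be modelled in a few lines of userspace C. This is only a sketch of the condition that triggers the warning, not kernel code: struct wq_model, struct work_model, and flush_is_safe() are hypothetical names, and the check is a deliberate simplification of what lockdep tracks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's workqueue types. */
struct wq_model {
    const char *name;
};

struct work_model {
    const struct wq_model *queue;  /* workqueue this work item is hosted on */
};

/*
 * Returns true when a work item executing on `running_on` can flush
 * `target` without waiting on its own workqueue. Flushing a work item
 * hosted on the queue we are currently running on is the self-dependency
 * that the warning in [1] reports.
 */
static bool flush_is_safe(const struct wq_model *running_on,
                          const struct work_model *target)
{
    return target->queue != running_on;
}
```

With a single shared queue, delete_work flushing reset_work fails this check. Hosting the two on separate queues, as the commit does with nvme_reset_wq and nvme_delete_wq, makes it pass.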
      
      [1]:
      [  178.491942] =============================================
      [  178.492718] [ INFO: possible recursive locking detected ]
      [  178.493495] 4.9.0-rc4-c844263313a8-lb #3 Tainted: G           OE
      [  178.494382] ---------------------------------------------
      [  178.495160] kworker/5:1/135 is trying to acquire lock:
      [  178.495894]  (
      [  178.496120] "nvme-wq"
      [  178.496471] ){++++.+}
      [  178.496599] , at:
      [  178.496921] [<ffffffffa70ac206>] flush_work+0x1a6/0x2d0
      [  178.497670]
                     but task is already holding lock:
      [  178.498499]  (
      [  178.498724] "nvme-wq"
      [  178.499074] ){++++.+}
      [  178.499202] , at:
      [  178.499520] [<ffffffffa70ad6c2>] process_one_work+0x162/0x6a0
      [  178.500343]
                     other info that might help us debug this:
      [  178.501269]  Possible unsafe locking scenario:
      
      [  178.502113]        CPU0
      [  178.502472]        ----
      [  178.502829]   lock(
      [  178.503115] "nvme-wq"
      [  178.503467] );
      [  178.503716]   lock(
      [  178.504001] "nvme-wq"
      [  178.504353] );
      [  178.504601]
                      *** DEADLOCK ***
      
      [  178.505441]  May be due to missing lock nesting notation
      
      [  178.506453] 2 locks held by kworker/5:1/135:
      [  178.507068]  #0:
      [  178.507330]  (
      [  178.507598] "nvme-wq"
      [  178.507726] ){++++.+}
      [  178.508079] , at:
      [  178.508173] [<ffffffffa70ad6c2>] process_one_work+0x162/0x6a0
      [  178.509004]  #1:
      [  178.509265]  (
      [  178.509532] (&ctrl->delete_work)
      [  178.509795] ){+.+.+.}
      [  178.510145] , at:
      [  178.510239] [<ffffffffa70ad6c2>] process_one_work+0x162/0x6a0
      [  178.511070]
                     stack backtrace:
      :
      [  178.511693] CPU: 5 PID: 135 Comm: kworker/5:1 Tainted: G           OE   4.9.0-rc4-c844263313a8-lb #3
      [  178.512974] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
      [  178.514247] Workqueue: nvme-wq nvme_del_ctrl_work [nvme_tcp]
      [  178.515071]  ffffc2668175bae0 ffffffffa7450823 ffffffffa88abd80 ffffffffa88abd80
      [  178.516195]  ffffc2668175bb98 ffffffffa70eb012 ffffffffa8d8d90d ffff9c472e9ea700
      [  178.517318]  ffff9c472e9ea700 ffff9c4700000000 ffff9c4700007200 ab83be61bec0d50e
      [  178.518443] Call Trace:
      [  178.518807]  [<ffffffffa7450823>] dump_stack+0x85/0xc2
      [  178.519542]  [<ffffffffa70eb012>] __lock_acquire+0x17d2/0x18f0
      [  178.520377]  [<ffffffffa75839a7>] ? serial8250_console_putchar+0x27/0x30
      [  178.521330]  [<ffffffffa7583980>] ? wait_for_xmitr+0xa0/0xa0
      [  178.522174]  [<ffffffffa70ac1eb>] ? flush_work+0x18b/0x2d0
      [  178.522975]  [<ffffffffa70eb7cb>] lock_acquire+0x11b/0x220
      [  178.523753]  [<ffffffffa70ac206>] ? flush_work+0x1a6/0x2d0
      [  178.524535]  [<ffffffffa70ac229>] flush_work+0x1c9/0x2d0
      [  178.525291]  [<ffffffffa70ac206>] ? flush_work+0x1a6/0x2d0
      [  178.526077]  [<ffffffffa70a9cf0>] ? flush_workqueue_prep_pwqs+0x220/0x220
      [  178.527040]  [<ffffffffa70ae7cf>] __cancel_work_timer+0x10f/0x1d0
      [  178.527907]  [<ffffffffa70fecb9>] ? vprintk_default+0x29/0x40
      [  178.528726]  [<ffffffffa71cb507>] ? printk+0x48/0x50
      [  178.529434]  [<ffffffffa70ae8c3>] cancel_delayed_work_sync+0x13/0x20
      [  178.530381]  [<ffffffffc042100b>] nvme_stop_ctrl+0x5b/0x70 [nvme_core]
      [  178.531314]  [<ffffffffc0403dcc>] nvme_del_ctrl_work+0x2c/0x50 [nvme_tcp]
      [  178.532271]  [<ffffffffa70ad741>] process_one_work+0x1e1/0x6a0
      [  178.533101]  [<ffffffffa70ad6c2>] ? process_one_work+0x162/0x6a0
      [  178.533954]  [<ffffffffa70adc4e>] worker_thread+0x4e/0x490
      [  178.534735]  [<ffffffffa70adc00>] ? process_one_work+0x6a0/0x6a0
      [  178.535588]  [<ffffffffa70adc00>] ? process_one_work+0x6a0/0x6a0
      [  178.536441]  [<ffffffffa70b48cf>] kthread+0xff/0x120
      [  178.537149]  [<ffffffffa70b47d0>] ? kthread_park+0x60/0x60
      [  178.538094]  [<ffffffffa70b47d0>] ? kthread_park+0x60/0x60
      [  178.538900]  [<ffffffffa78e332a>] ret_from_fork+0x2a/0x40
      Signed-off-by: Roy Shterman <roys@lightbitslabs.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  20. 08 Jan, 2018 1 commit
  21. 29 Dec, 2017 1 commit
    • nvme-rdma: fix concurrent reset and reconnect · d5bf4b7f
      Sagi Grimberg authored
      The ctrl state machine now allows a transition from RESETTING to
      RECONNECTING.  In nvme-rdma, when we receive an rdma cm DISCONNECTED
      event, we trigger nvme_rdma_error_recovery. This also happens when we
      execute a controller reset, issue a cm disconnect request and receive
      a cm disconnect reply. As a result, the reset work and the error
      recovery work can run concurrently.
      
      Until now the state machine prevented the error recovery work from
      running as a result of a controller reset (RESETTING -> RECONNECTING was
      not allowed).
      
      To fix this, we adopt the FC state machine approach: we always transition
      from LIVE to RESETTING and only then to RECONNECTING.  We do this both
      for the error recovery work and the controller reset work:
      
       1. transition to RESETTING
       2. teardown the controller association
       3. transition to RECONNECTING
      
      This will restore the protection against the reset work and the error
      recovery work running concurrently.
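
The three steps above can be sketched as a small userspace state machine in C. The names (enum state_model, change_state(), begin_recovery()) and the reduced transition table are assumptions for illustration, not the kernel's actual state machine. The model is also single-threaded; a real implementation would have to serialize state changes.

```c
#include <stdbool.h>

/* Reduced, hypothetical set of controller states. */
enum state_model { ST_LIVE, ST_RESETTING, ST_RECONNECTING };

struct ctrl_model {
    enum state_model state;
};

/* Apply a transition only if the (assumed) table allows it. */
static bool change_state(struct ctrl_model *c, enum state_model new_state)
{
    bool ok = false;

    switch (new_state) {
    case ST_RESETTING:
        ok = (c->state == ST_LIVE);
        break;
    case ST_RECONNECTING:
        ok = (c->state == ST_RESETTING);
        break;
    case ST_LIVE:
        ok = (c->state == ST_RECONNECTING);
        break;
    }
    if (ok)
        c->state = new_state;
    return ok;
}

/* Shared entry sequence for both the reset work and error recovery. */
static bool begin_recovery(struct ctrl_model *c)
{
    /* 1. transition to RESETTING; fails if another path already did */
    if (!change_state(c, ST_RESETTING))
        return false;

    /* 2. teardown of the controller association would happen here */

    /* 3. transition to RECONNECTING */
    return change_state(c, ST_RECONNECTING);
}
```

Because both paths must first win the LIVE to RESETTING transition, whichever one arrives second sees a state other than LIVE and backs off.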
      
      Fixes: 3cec7f9d ("nvme: allow controller RESETTING to RECONNECTING transition")
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  22. 28 Nov, 2017 1 commit
    • nvme-rdma: fix memory leak during queue allocation · eb1bd249
      Max Gurtovoy authored
      In case the nvme_rdma_wait_for_cm timeout expires before we get
      an established or rejected event (rdma_connect succeeded) from
      rdma_cm, we end up leaking the ib transport resources for the
      dedicated queue. This scenario can easily be reproduced using a
      traffic test during port toggling.
      Also, in order to protect against parallel ib queue destruction, which
      may be invoked from different contexts, introduce a new flag that
      stands for transport readiness. While we're here, also protect against
      a situation where we receive rdma_cm events during ib queue destruction.
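
One common way to implement such a readiness flag is an atomic test-and-clear, sketched below in userspace C11. This is an assumption about the general technique, not a copy of the driver code: struct queue_model, its fields, and both functions are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical queue with a "transport ready" flag. */
struct queue_model {
    atomic_bool tr_ready;   /* set once transport resources exist */
    int destroy_count;      /* how many times resources were freed */
};

/* Allocate (here: just mark) the transport resources. */
static void queue_create(struct queue_model *q)
{
    q->destroy_count = 0;
    atomic_init(&q->tr_ready, true);
}

/*
 * Free the transport resources at most once. atomic_exchange() clears
 * the flag and returns its previous value, so only the first caller
 * sees `true` and performs the teardown.
 */
static bool queue_destroy(struct queue_model *q)
{
    if (!atomic_exchange(&q->tr_ready, false))
        return false;       /* already destroyed, or never ready */

    q->destroy_count++;     /* stands in for freeing ib resources */
    return true;
}
```

A second destroy call from another context then becomes a no-op instead of a double free.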
      Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  23. 26 Nov, 2017 4 commits