igt@i915_suspend@sysfs-reader - incomplete - is trying to acquire lock at: nvme_dev_disable, but task is already holding lock at: process_one_work
https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8105/shard-tglb7/igt@i915_suspend@sysfs-reader.html
https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8105/shard-tglb7/pstore8-1668532430_Panic_1.txt
<4>[ 374.803813] nvme nvme0: I/O 834 (I/O Cmd) QID 8 timeout, aborting
<4>[ 401.107815] nvme nvme0: I/O 655 QID 6 timeout, reset controller
<4>[ 401.107953]
<4>[ 401.107973] ======================================================
<4>[ 401.108019] WARNING: possible circular locking dependency detected
<4>[ 401.108037] 6.1.0-rc5-CI_DRM_12382-gcb7486469341+ #1 Not tainted
<4>[ 401.108056] ------------------------------------------------------
<4>[ 401.108074] kworker/5:1H/191 is trying to acquire lock:
<4>[ 401.108090] ffff888104312340 (&dev->shutdown_lock){+.+.}-{3:3}, at: nvme_dev_disable+0x32/0x570
<4>[ 401.108134]
<4>[ 401.108134] but task is already holding lock:
<4>[ 401.108152] ffffc900003c7e78 ((work_completion)(&q->timeout_work)){+.+.}-{0:0}, at: process_one_work+0x1eb/0x5b0
<4>[ 401.108189]
<4>[ 401.108189] which lock already depends on the new lock.
<4>[ 401.108189]
<4>[ 401.108214]
<4>[ 401.108214] the existing dependency chain (in reverse order) is:
<4>[ 401.108235]
<4>[ 401.108235] -> #2 ((work_completion)(&q->timeout_work)){+.+.}-{0:0}:
<4>[ 401.108261] lock_acquire+0xd3/0x310
<4>[ 401.108278] __flush_work+0x77/0x4e0
<4>[ 401.108294] __cancel_work_timer+0x14e/0x1f0
<4>[ 401.108311] nvme_sync_io_queues+0x2f/0x50
<4>[ 401.108332] nvme_sync_queues+0x9/0x30
<4>[ 401.108349] nvme_reset_work+0x63/0x10f0
<4>[ 401.108366] process_one_work+0x272/0x5b0
<4>[ 401.108387] worker_thread+0x37/0x370
<4>[ 401.108403] kthread+0xed/0x120
<4>[ 401.108418] ret_from_fork+0x1f/0x30
<4>[ 401.108438]
<4>[ 401.108438] -> #1 (&ctrl->namespaces_rwsem){++++}-{3:3}:
<4>[ 401.108465] lock_acquire+0xd3/0x310
<4>[ 401.108483] down_read+0x39/0x140
<4>[ 401.108503] nvme_start_freeze+0x1d/0x50
<4>[ 401.108523] nvme_dev_disable+0x451/0x570
<4>[ 401.108543] nvme_suspend+0x4c/0x160
<4>[ 401.108561] pci_pm_suspend+0x6b/0x150
<4>[ 401.108582] dpm_run_callback+0x5d/0x250
<4>[ 401.108605] __device_suspend+0x143/0x590
<4>[ 401.108622] async_suspend+0x15/0x90
<4>[ 401.108646] async_run_entry_fn+0x28/0x130
<4>[ 401.108666] process_one_work+0x272/0x5b0
<4>[ 401.108685] worker_thread+0x37/0x370
<4>[ 401.108733] kthread+0xed/0x120
<4>[ 401.108748] ret_from_fork+0x1f/0x30
<4>[ 401.108766]
<4>[ 401.108766] -> #0 (&dev->shutdown_lock){+.+.}-{3:3}:
<4>[ 401.108792] validate_chain+0xb3d/0x2000
<4>[ 401.108811] __lock_acquire+0x5a4/0xb70
<4>[ 401.108829] lock_acquire+0xd3/0x310
<4>[ 401.108845] __mutex_lock+0x97/0xf10
<4>[ 401.108864] nvme_dev_disable+0x32/0x570
<4>[ 401.108884] nvme_timeout.cold.78+0xe8/0x1d5
<4>[ 401.108909] blk_mq_check_expired+0x5a/0x90
<4>[ 401.108931] bt_iter+0x7e/0x90
<4>[ 401.108951] blk_mq_queue_tag_busy_iter+0x3d6/0x650
<4>[ 401.108973] blk_mq_timeout_work+0xd5/0x250
<4>[ 401.108992] process_one_work+0x272/0x5b0
<4>[ 401.109012] worker_thread+0x37/0x370
<4>[ 401.109030] kthread+0xed/0x120
<4>[ 401.109046] ret_from_fork+0x1f/0x30
<4>[ 401.109064]
<4>[ 401.109064] other info that might help us debug this:
<4>[ 401.109064]
<4>[ 401.110930] Chain exists of:
<4>[ 401.110930] &dev->shutdown_lock --> &ctrl->namespaces_rwsem --> (work_completion)(&q->timeout_work)
<4>[ 401.110930]
<4>[ 401.113420] Possible unsafe locking scenario:
<4>[ 401.113420]
<4>[ 401.114578]        CPU0                    CPU1
<4>[ 401.115153]        ----                    ----
<4>[ 401.115717]   lock((work_completion)(&q->timeout_work));
<4>[ 401.116278]                                lock(&ctrl->namespaces_rwsem);
<4>[ 401.116844]                                lock((work_completion)(&q->timeout_work));
<4>[ 401.117424]   lock(&dev->shutdown_lock);
<4>[ 401.117992]
<4>[ 401.117992] *** DEADLOCK ***
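
The cycle lockdep reports can be read from the chain above: the suspend/reset path establishes &dev->shutdown_lock -> &ctrl->namespaces_rwsem -> (work_completion)(&q->timeout_work), while the block timeout worker already "holds" the running work item and then calls nvme_dev_disable(), which wants shutdown_lock, closing the loop. As a reading aid only, here is a minimal user-space sketch of that inversion using plain pthreads; the mutex names are hypothetical stand-ins for the kernel objects, and flush_work()/the running work item are modelled as a mutex. This is not the driver code, just the shape of the cycle.

/* lockdep-cycle sketch: compile with `cc -pthread sketch.c` */
#include <pthread.h>

static pthread_mutex_t shutdown_lock   = PTHREAD_MUTEX_INITIALIZER; /* stands in for &dev->shutdown_lock */
static pthread_mutex_t namespaces_lock = PTHREAD_MUTEX_INITIALIZER; /* stands in for &ctrl->namespaces_rwsem */
static pthread_mutex_t timeout_work    = PTHREAD_MUTEX_INITIALIZER; /* stands in for (work_completion)(&q->timeout_work) */

/*
 * Path 1 (suspend/reset side): shutdown_lock is taken first, the
 * namespaces lock under it, and the timeout work is then waited on
 * (modelled here as taking its mutex), giving the ordering
 * shutdown_lock -> namespaces -> work.
 */
static void *suspend_path(void *arg)
{
	pthread_mutex_lock(&shutdown_lock);
	pthread_mutex_lock(&namespaces_lock);
	pthread_mutex_lock(&timeout_work);      /* models flush/cancel of the timeout work */
	pthread_mutex_unlock(&timeout_work);
	pthread_mutex_unlock(&namespaces_lock);
	pthread_mutex_unlock(&shutdown_lock);
	return NULL;
}

/*
 * Path 2 (timeout side): the timeout work is already running (work item
 * "held"), and the timeout handler then wants shutdown_lock, i.e.
 * work -> shutdown_lock, which closes the cycle.
 */
static void *timeout_path(void *arg)
{
	pthread_mutex_lock(&timeout_work);      /* models the running timeout work item */
	pthread_mutex_lock(&shutdown_lock);
	pthread_mutex_unlock(&shutdown_lock);
	pthread_mutex_unlock(&timeout_work);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	/* Running both paths concurrently can deadlock, which is what lockdep warns about. */
	pthread_create(&a, NULL, suspend_path, NULL);
	pthread_create(&b, NULL, timeout_path, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}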