[CI][SHARDS] aliasing-ppgtt vs userptr - dmesg-warn - WARNING: possible circular locking dependency detected
@l4kshmi
Submitted by LAKSHMINARAYANA VUDUM. Assigned to Intel GFX Bugs mailing list.
Link to original bug (#111891)
Description
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6988/shard-snb6/igt@gem_exec_basic@gtt-rcs0.html
<4> [45.725639] ======================================================
<4> [45.725642] WARNING: possible circular locking dependency detected
<4> [45.725647] 5.4.0-rc1-CI-CI_DRM_6988+ #1 Tainted: G     U
<4> [45.725652] ------------------------------------------------------
<4> [45.725657] kworker/u16:6/200 is trying to acquire lock:
<4> [45.725669] ffff888205bd7958 (&mapping->i_mmap_rwsem){++++}, at: unmap_mapping_pages+0x48/0x130
<4> [45.725680] but task is already holding lock:
<4> [45.725685] ffff88820d2d93a0 (&vm->mutex){+.+.}, at: i915_vma_unbind+0xe6/0x4a0 [i915]
<4> [45.725764] which lock already depends on the new lock.
<4> [45.725769] the existing dependency chain (in reverse order) is:
<4> [45.725774] -> #2 (&vm->mutex){+.+.}:
<4> [45.725782] __mutex_lock+0x9a/0x9d0
<4> [45.725843] i915_vma_remove+0x53/0x250 [i915]
<4> [45.725904] i915_vma_unbind+0x19c/0x4a0 [i915]
<4> [45.725965] i915_gem_object_unbind+0x153/0x1c0 [i915]
<4> [45.726025] userptr_mn_invalidate_range_start+0x9f/0x200 [i915]
<4> [45.726033] __mmu_notifier_invalidate_range_start+0xa3/0x180
<4> [45.726039] unmap_vmas+0x143/0x150
<4> [45.726044] unmap_region+0xa3/0x100
<4> [45.726049] __do_munmap+0x25d/0x490
<4> [45.726053] __vm_munmap+0x6e/0xc0
<4> [45.726058] __x64_sys_munmap+0x12/0x20
<4> [45.726063] do_syscall_64+0x4f/0x210
<4> [45.726069] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [45.726073] -> #1 (mmu_notifier_invalidate_range_start){+.+.}:
<4> [45.726082] page_mkclean_one+0xda/0x210
<4> [45.726087] rmap_walk_file+0xff/0x260
<4> [45.726092] page_mkclean+0x9f/0xb0
<4> [45.726097] clear_page_dirty_for_io+0xa2/0x300
<4> [45.726103] mpage_submit_page+0x1a/0x70
<4> [45.726108] mpage_process_page_bufs+0xe7/0x110
<4> [45.726113] mpage_prepare_extent_to_map+0x1d2/0x2b0
<4> [45.726119] ext4_writepages+0x592/0x1230
<4> [45.726124] do_writepages+0x46/0xe0
<4> [45.726130] __filemap_fdatawrite_range+0xc6/0x100
<4> [45.726135] file_write_and_wait_range+0x3c/0x90
<4> [45.726140] ext4_sync_file+0x154/0x500
<4> [45.726146] do_fsync+0x33/0x60
<4> [45.726150] __x64_sys_fsync+0xb/0x10
<4> [45.726155] do_syscall_64+0x4f/0x210
<4> [45.726160] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [45.726164] -> #0 (&mapping->i_mmap_rwsem){++++}:
<4> [45.726173] __lock_acquire+0x1328/0x15d0
<4> [45.726178] lock_acquire+0xa7/0x1c0
<4> [45.726183] down_write+0x33/0x70
<4> [45.726188] unmap_mapping_pages+0x48/0x130
<4> [45.726250] i915_vma_revoke_mmap+0x81/0x1b0 [i915]
<4> [45.726312] i915_vma_unbind+0xee/0x4a0 [i915]
<4> [45.726374] i915_vma_destroy+0x31/0x2f0 [i915]
<4> [45.726431] __i915_gem_free_objects+0xb8/0x4b0 [i915]
<4> [45.726438] process_one_work+0x26a/0x620
<4> [45.726442] worker_thread+0x37/0x380
<4> [45.726448] kthread+0x119/0x130
<4> [45.726452] ret_from_fork+0x3a/0x50
<4> [45.726456] other info that might help us debug this:
<4> [45.726463] Chain exists of:
                  &mapping->i_mmap_rwsem --> mmu_notifier_invalidate_range_start --> &vm->mutex
<4> [45.726474] Possible unsafe locking scenario:
<4> [45.726479]        CPU0                    CPU1
<4> [45.726483]        ----                    ----
<4> [45.726487]   lock(&vm->mutex);
<4> [45.726498]                                lock(mmu_notifier_invalidate_range_start);
<4> [45.726505]                                lock(&vm->mutex);
<4> [45.726510]   lock(&mapping->i_mmap_rwsem);
<4> [45.726514]  *** DEADLOCK ***
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6992/shard-snb1/igt@gem_mmap_gtt@basic-small-copy.html
<4> [33.386129] ======================================================
<4> [33.386132] WARNING: possible circular locking dependency detected
<4> [33.386135] 5.4.0-rc1-CI-CI_DRM_6992+ #1 Tainted: G     U
<4> [33.386138] ------------------------------------------------------
<4> [33.386141] kworker/u16:3/197 is trying to acquire lock:
<4> [33.386143] ffff8882034802d8 (&mapping->i_mmap_rwsem){++++}, at: unmap_mapping_pages+0x48/0x130
<4> [33.386153] but task is already holding lock:
<4> [33.386155] ffff8882155793a0 (&vm->mutex){+.+.}, at: i915_vma_unbind+0xe6/0x4a0 [i915]
<4> [33.386214] which lock already depends on the new lock.
<4> [33.386217] the existing dependency chain (in reverse order) is:
<4> [33.386220] -> #2 (&vm->mutex){+.+.}:
<4> [33.386225] __mutex_lock+0x9a/0x9d0
<4> [33.386266] i915_vma_remove+0x53/0x250 [i915]
<4> [33.386306] i915_vma_unbind+0x19c/0x4a0 [i915]
<4> [33.386346] i915_gem_object_unbind+0x153/0x1c0 [i915]
<4> [33.386383] userptr_mn_invalidate_range_start+0x9f/0x200 [i915]
<4> [33.386388] __mmu_notifier_invalidate_range_start+0xa3/0x180
<4> [33.386391] unmap_vmas+0x143/0x150
<4> [33.386394] unmap_region+0xa3/0x100
<4> [33.386397] __do_munmap+0x25d/0x490
<4> [33.386399] __vm_munmap+0x6e/0xc0
<4> [33.386402] __x64_sys_munmap+0x12/0x20
<4> [33.386405] do_syscall_64+0x4f/0x210
<4> [33.386409] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [33.386411] -> #1 (mmu_notifier_invalidate_range_start){+.+.}:
<4> [33.386416] page_mkclean_one+0xda/0x210
<4> [33.386419] rmap_walk_file+0xff/0x260
<4> [33.386422] page_mkclean+0x9f/0xb0
<4> [33.386425] clear_page_dirty_for_io+0xa2/0x300
<4> [33.386429] mpage_submit_page+0x1a/0x70
<4> [33.386432] mpage_process_page_bufs+0xe7/0x110
<4> [33.386435] mpage_prepare_extent_to_map+0x1d2/0x2b0
<4> [33.386438] ext4_writepages+0x592/0x1230
<4> [33.386441] do_writepages+0x46/0xe0
<4> [33.386444] __filemap_fdatawrite_range+0xc6/0x100
<4> [33.386448] file_write_and_wait_range+0x3c/0x90
<4> [33.386450] ext4_sync_file+0x154/0x500
<4> [33.386454] do_fsync+0x33/0x60
<4> [33.386457] __x64_sys_fsync+0xb/0x10
<4> [33.386459] do_syscall_64+0x4f/0x210
<4> [33.386462] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [33.386465] -> #0 (&mapping->i_mmap_rwsem){++++}:
<4> [33.386470] __lock_acquire+0x1328/0x15d0
<4> [33.386473] lock_acquire+0xa7/0x1c0
<4> [33.386476] down_write+0x33/0x70
<4> [33.386479] unmap_mapping_pages+0x48/0x130
<4> [33.386518] i915_vma_revoke_mmap+0x81/0x1b0 [i915]
<4> [33.386558] i915_vma_unbind+0xee/0x4a0 [i915]
<4> [33.386597] i915_vma_destroy+0x31/0x2f0 [i915]
<4> [33.386633] __i915_gem_free_objects+0xb8/0x4b0 [i915]
<4> [33.386637] process_one_work+0x26a/0x620
<4> [33.386639] worker_thread+0x37/0x380
<4> [33.386642] kthread+0x119/0x130
<4> [33.386645] ret_from_fork+0x3a/0x50
<4> [33.386647] other info that might help us debug this:
<4> [33.386651] Chain exists of:
                  &mapping->i_mmap_rwsem --> mmu_notifier_invalidate_range_start --> &vm->mutex
<4> [33.386657] Possible unsafe locking scenario:
<4> [33.386660]        CPU0                    CPU1
<4> [33.386662]        ----                    ----
<4> [33.386664]   lock(&vm->mutex);
<4> [33.386666]                                lock(mmu_notifier_invalidate_range_start);
<4> [33.386671]                                lock(&vm->mutex);
<4> [33.386674]   lock(&mapping->i_mmap_rwsem);
<4> [33.386676]  *** DEADLOCK ***