Another lockdep issue
Starting to think the dma-fence signaling annotations are more trouble than they are worth, as they are incredibly fragile.
Anyway, here is the splat:
[ 801.816455] ======================================================
[ 801.822637] WARNING: possible circular locking dependency detected
[ 801.828820] 5.19.0-xe+ #3284 Not tainted
[ 801.832752] ------------------------------------------------------
[ 801.838937] kworker/u16:2/1189 is trying to acquire lock:
[ 801.844338] ffffffff82e09a80 (dma_fence_map){++++}-{0:0}, at: xe_guc_ct_send+0x28/0x90 [xe]
[ 801.852698]
but task is already holding lock:
[ 801.858543] ffff88810e8647d8 (&guc->submission_state.lock){+.+.}-{3:3}, at: xe_guc_submit_start+0x4a/0x250 [xe]
[ 801.868630]
which lock already depends on the new lock.
[ 801.876809]
the existing dependency chain (in reverse order) is:
[ 801.884290]
-> #1 (&guc->submission_state.lock){+.+.}-{3:3}:
[ 801.891428] __mutex_lock+0x97/0xf30
[ 801.895540] xe_guc_submit_init+0x10f/0x190 [xe]
[ 801.900699] xe_uc_init+0x60/0x70 [xe]
[ 801.904989] xe_gt_init+0x2ab/0x490 [xe]
[ 801.909460] xe_device_probe+0x1d8/0x210 [xe]
[ 801.914361] xe_pci_probe+0x216/0x3d0 [xe]
[ 801.919006] pci_device_probe+0xa2/0x150
[ 801.923461] really_probe+0x178/0x360
[ 801.927654] __driver_probe_device+0xfa/0x170
[ 801.932542] driver_probe_device+0x1a/0x90
[ 801.937165] __driver_attach+0x9b/0x180
[ 801.941528] bus_for_each_dev+0x72/0xc0
[ 801.945899] bus_add_driver+0x162/0x220
[ 801.950262] driver_register+0x66/0xc0
[ 801.954538] do_one_initcall+0x53/0x2f0
[ 801.958907] do_init_module+0x45/0x1c0
[ 801.963190] load_module+0x1c3a/0x1e10
[ 801.967469] __do_sys_finit_module+0xaf/0x120
[ 801.972351] do_syscall_64+0x37/0x90
[ 801.976463] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 801.982041]
-> #0 (dma_fence_map){++++}-{0:0}:
[ 801.987963] __lock_acquire+0x15ae/0x2940
[ 801.992500] lock_acquire+0xd3/0x310
[ 801.996609] dma_fence_begin_signalling+0x50/0x60
[ 802.001843] xe_guc_ct_send+0x28/0x90 [xe]
[ 802.006486] __register_engine+0x64/0x90 [xe]
[ 802.011382] guc_engine_run_job+0xa64/0xe20 [xe]
[ 802.016544] drm_sched_resubmit_jobs_ext+0x71/0x200
[ 802.021953] xe_guc_submit_start+0x109/0x250 [xe]
[ 802.027203] xe_guc_start+0xa/0x30 [xe]
[ 802.031580] gt_reset_worker.cold.6+0x18a/0x27c [xe]
[ 802.037087] process_one_work+0x272/0x5c0
[ 802.041629] worker_thread+0x37/0x370
[ 802.045819] kthread+0xed/0x120
[ 802.049493] ret_from_fork+0x1f/0x30
[ 802.053597]
other info that might help us debug this:
[ 802.061593] Possible unsafe locking scenario:
[ 802.067518] CPU0 CPU1
[ 802.072055] ---- ----
[ 802.076590] lock(&guc->submission_state.lock);
[ 802.081215] lock(dma_fence_map);
[ 802.087137] lock(&guc->submission_state.lock);
[ 802.094270] lock(dma_fence_map);
[ 802.097688]
*** DEADLOCK ***