igt@kms_pm_rpm@ subtests - abort - prometheus-node/.* is trying to acquire lock:, at: (__kmalloc|acpi_device_wakeup_disable).*, but task is already holding lock:, at: hwm_energy
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_14378/bat-dg2-9/igt@kms_pm_rpm@basic-pci-d3-state.html#dmesg-warnings426
<4> [192.005514]
<4> [192.005519] ======================================================
<4> [192.005521] WARNING: possible circular locking dependency detected
<4> [192.005522] 6.8.0-rc6-CI_DRM_14378-g5f60548dd58e+ #1 Not tainted
<4> [192.005525] ------------------------------------------------------
<4> [192.005526] prometheus-node/5439 is trying to acquire lock:
<4> [192.005528] ffffffff82764d80 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc+0x9a/0x350
<4> [192.005538]
but task is already holding lock:
<4> [192.005540] ffff888154108640 (&hwmon->hwmon_lock){+.+.}-{3:3}, at: hwm_energy+0x4b/0x100 [i915]
<4> [192.005838]
which lock already depends on the new lock.
<4> [192.005839]
the existing dependency chain (in reverse order) is:
<4> [192.005841]
-> #2 (&hwmon->hwmon_lock){+.+.}-{3:3}:
<4> [192.005845] lock_acquire+0xd8/0x2d0
<4> [192.005850] __mutex_lock+0x95/0xcd0
<4> [192.005855] i915_hwmon_power_max_disable+0x43/0xb0 [i915]
<4> [192.006118] __uc_init_hw+0x4ff/0x1020 [i915]
<4> [192.006378] intel_gt_init_hw+0xe6/0x270 [i915]
<4> [192.006573] intel_gt_reset+0x383/0x480 [i915]
<4> [192.006778] intel_gt_reset_global+0xeb/0x160 [i915]
<4> [192.006981] intel_gt_handle_error+0x372/0x420 [i915]
<4> [192.007183] intel_gt_debugfs_reset_store+0x5c/0xc0 [i915]
<4> [192.007374] i915_wedged_set+0x1d/0x40 [i915]
<4> [192.007549] simple_attr_write_xsigned.constprop.0+0xb4/0x110
<4> [192.007553] full_proxy_write+0x58/0x80
<4> [192.007557] vfs_write+0xcb/0x560
<4> [192.007560] ksys_write+0x64/0xe0
<4> [192.007562] do_syscall_64+0x6f/0x140
<4> [192.007565] entry_SYSCALL_64_after_hwframe+0x6e/0x76
<4> [192.007570]
-> #1 (&gt->reset.mutex){+.+.}-{3:3}:
<4> [192.007575] lock_acquire+0xd8/0x2d0
<4> [192.007578] i915_gem_shrinker_taints_mutex+0x31/0x50 [i915]
<4> [192.007805] intel_gt_init_reset+0x65/0x80 [i915]
<4> [192.008010] intel_gt_common_init_early+0xd9/0x120 [i915]
<4> [192.008201] intel_root_gt_init_early+0x5e/0x70 [i915]
<4> [192.008389] i915_driver_probe+0x1e6/0xd40 [i915]
<4> [192.008580] i915_pci_probe+0xd5/0x200 [i915]
<4> [192.008765] pci_device_probe+0x95/0x120
<4> [192.008769] really_probe+0x164/0x3c0
<4> [192.008774] __driver_probe_device+0x73/0x160
<4> [192.008778] driver_probe_device+0x19/0xa0
<4> [192.008782] __driver_attach+0xb6/0x180
<4> [192.008786] bus_for_each_dev+0x77/0xd0
<4> [192.008790] bus_add_driver+0x114/0x210
<4> [192.008793] driver_register+0x5b/0x110
<4> [192.008796] 0xffffffffa03ae033
<4> [192.008808] do_one_initcall+0x57/0x270
<4> [192.008812] do_init_module+0x5f/0x210
<4> [192.008815] load_module+0x1d1a/0x1f80
<4> [192.008818] init_module_from_file+0x86/0xd0
<4> [192.008821] idempotent_init_module+0x17c/0x230
<4> [192.008824] __x64_sys_finit_module+0x56/0xb0
<4> [192.008827] do_syscall_64+0x6f/0x140
<4> [192.008829] entry_SYSCALL_64_after_hwframe+0x6e/0x76
<4> [192.008834]
-> #0 (fs_reclaim){+.+.}-{0:0}:
<4> [192.008839] check_prev_add+0xe9/0xce0
<4> [192.008842] __lock_acquire+0x179f/0x2300
<4> [192.008845] lock_acquire+0xd8/0x2d0
<4> [192.008848] fs_reclaim_acquire+0xa1/0xd0
<4> [192.008853] __kmalloc+0x9a/0x350
<4> [192.008856] acpi_ns_internalize_name.part.0+0x4a/0xb0
<4> [192.008860] acpi_ns_get_node_unlocked+0x60/0xf0
<4> [192.008863] acpi_ns_get_node+0x3b/0x60
<4> [192.008866] acpi_get_handle+0x57/0xb0
<4> [192.008870] acpi_has_method+0x20/0x50
<4> [192.008873] acpi_pci_set_power_state+0x43/0x120
<4> [192.008877] pci_power_up+0x24/0x1c0
<4> [192.008880] pci_pm_default_resume_early+0x9/0x30
<4> [192.008885] pci_pm_runtime_resume+0x2d/0x90
<4> [192.008888] __rpm_callback+0x3c/0x110
<4> [192.008892] rpm_callback+0x58/0x70
<4> [192.008896] rpm_resume+0x51e/0x730
<4> [192.008900] rpm_resume+0x267/0x730
<4> [192.008904] rpm_resume+0x267/0x730
<4> [192.008908] rpm_resume+0x267/0x730
<4> [192.008911] __pm_runtime_resume+0x49/0x90
<4> [192.008915] __intel_runtime_pm_get+0x19/0xa0 [i915]
<4> [192.009105] hwm_energy+0x55/0x100 [i915]
<4> [192.009396] hwm_read+0x9a/0x310 [i915]
<4> [192.009686] hwmon_attr_show+0x36/0x120
<4> [192.009691] dev_attr_show+0x15/0x60
<4> [192.009694] sysfs_kf_seq_show+0xb5/0x100
<4> [192.009700] seq_read_iter+0x111/0x450
<4> [192.009704] vfs_read+0x206/0x340
<4> [192.009707] ksys_read+0x64/0xe0
<4> [192.009710] do_syscall_64+0x6f/0x140
<4> [192.009713] entry_SYSCALL_64_after_hwframe+0x6e/0x76
<4> [192.009717]
other info that might help us debug this:
<4> [192.009719] Chain exists of:
fs_reclaim --> &gt->reset.mutex --> &hwmon->hwmon_lock
<4> [192.009725] Possible unsafe locking scenario:
<4> [192.009726]        CPU0                    CPU1
<4> [192.009728]        ----                    ----
<4> [192.009729]   lock(&hwmon->hwmon_lock);
<4> [192.009732]                                lock(&gt->reset.mutex);
<4> [192.009735]                                lock(&hwmon->hwmon_lock);
<4> [192.009738]   lock(fs_reclaim);
<4> [192.009740]
*** DEADLOCK ***
<4> [192.009742] 5 locks held by prometheus-node/5439:
<4> [192.009745] #0: ffff88810a4cc888 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x3e/0x60
<4> [192.009755] #1: ffff88810474d3e0 (&p->lock){+.+.}-{3:3}, at: seq_read_iter+0x54/0x450
<4> [192.009763] #2: ffff8881251d8e88 (&of->mutex){+.+.}-{3:3}, at: kernfs_seq_start+0x24/0xa0
<4> [192.009770] #3: ffff88815428bdf8 (kn->active#204){.+.+}-{0:0}, at: kernfs_seq_start+0x2c/0xa0
<4> [192.009779] #4: ffff888154108640 (&hwmon->hwmon_lock){+.+.}-{3:3}, at: hwm_energy+0x4b/0x100 [i915]
<4> [192.010072]
stack backtrace:
<4> [192.010074] CPU: 3 PID: 5439 Comm: prometheus-node Not tainted 6.8.0-rc6-CI_DRM_14378-g5f60548dd58e+ #1
<4> [192.010079] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X220.B00.2103302221 03/30/2021
<4> [192.010082] Call Trace:
<4> [192.010084] <TASK>
<4> [192.010086] dump_stack_lvl+0x64/0xb0
<4> [192.010093] check_noncircular+0x15e/0x180
<4> [192.010097] ? kernel_text_address+0x5b/0xc0
<4> [192.010102] ? arch_stack_walk+0x9d/0xf0
<4> [192.010106] check_prev_add+0xe9/0xce0
<4> [192.010110] __lock_acquire+0x179f/0x2300
<4> [192.010115] lock_acquire+0xd8/0x2d0
<4> [192.010119] ? __kmalloc+0x9a/0x350
<4> [192.010124] ? acpi_ns_internalize_name.part.0+0x4a/0xb0
<4> [192.010127] fs_reclaim_acquire+0xa1/0xd0
<4> [192.010132] ? __kmalloc+0x9a/0x350
<4> [192.010135] __kmalloc+0x9a/0x350
<4> [192.010139] ? acpi_ns_internalize_name.part.0+0x4a/0xb0
<4> [192.010142] acpi_ns_internalize_name.part.0+0x4a/0xb0
<4> [192.010146] acpi_ns_get_node_unlocked+0x60/0xf0
<4> [192.010150] ? _raw_spin_unlock_irqrestore+0x58/0x70
<4> [192.010154] ? lockdep_hardirqs_on+0xc3/0x140
<4> [192.010159] ? _raw_spin_unlock_irqrestore+0x41/0x70
<4> [192.010162] ? down_timeout+0x4b/0x70
<4> [192.010167] ? acpi_ns_get_node+0x3b/0x60
<4> [192.010170] acpi_ns_get_node+0x3b/0x60
<4> [192.010174] acpi_get_handle+0x57/0xb0
<4> [192.010178] acpi_has_method+0x20/0x50
<4> [192.010182] acpi_pci_set_power_state+0x43/0x120
<4> [192.010186] pci_power_up+0x24/0x1c0
<4> [192.010189] pci_pm_default_resume_early+0x9/0x30
<4> [192.010194] pci_pm_runtime_resume+0x2d/0x90
<4> [192.010197] ? __pfx_pci_pm_runtime_resume+0x10/0x10
<4> [192.010200] __rpm_callback+0x3c/0x110
<4> [192.010205] ? __pfx_pci_pm_runtime_resume+0x10/0x10
<4> [192.010207] rpm_callback+0x58/0x70
<4> [192.010212] rpm_resume+0x51e/0x730
<4> [192.010216] ? __pfx_autoremove_wake_function+0x10/0x10
<4> [192.010221] rpm_resume+0x267/0x730
<4> [192.010225] ? __pfx_autoremove_wake_function+0x10/0x10
<4> [192.010230] rpm_resume+0x267/0x730
<4> [192.010234] ? __pfx_autoremove_wake_function+0x10/0x10
<4> [192.010238] rpm_resume+0x267/0x730
<4> [192.010243] __pm_runtime_resume+0x49/0x90
<4> [192.010248] __intel_runtime_pm_get+0x19/0xa0 [i915]
<4> [192.010442] hwm_energy+0x55/0x100 [i915]
<4> [192.010734] hwm_read+0x9a/0x310 [i915]
<4> [192.011022] hwmon_attr_show+0x36/0x120
<4> [192.011027] dev_attr_show+0x15/0x60
<4> [192.011030] sysfs_kf_seq_show+0xb5/0x100
<4> [192.011036] seq_read_iter+0x111/0x450
<4> [192.011040] ? rcu_is_watching+0x11/0x50
<4> [192.011044] vfs_read+0x206/0x340
<4> [192.011049] ksys_read+0x64/0xe0
<4> [192.011053] do_syscall_64+0x6f/0x140
<4> [192.011056] entry_SYSCALL_64_after_hwframe+0x6e/0x76
<4> [192.011061] RIP: 0033:0x4b10f0
<4> [192.011064] Code: 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 49 c7 c2 00 00 00 00 49 c7 c0 00 00 00 00 49 c7 c1 00 00 00 00 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
<4> [192.011068] RSP: 002b:000000c00029b790 EFLAGS: 00000206 ORIG_RAX: 0000000000000000
<4> [192.011072] RAX: ffffffffffffffda RBX: 000000c000030000 RCX: 00000000004b10f0
<4> [192.011075] RDX: 0000000000000080 RSI: 000000c0004f7800 RDI: 0000000000000006
<4> [192.011077] RBP: 000000c00029b7e0 R08: 0000000000000000 R09: 0000000000000000
<4> [192.011079] R10: 0000000000000000 R11: 0000000000000206 R12: ffffffffffffffff
<4> [192.011081] R13: 0000000000000031 R14: 0000000000000030 R15: 0000000000000040