bad unlock balance detected in i915
Steps to replicate problem:
Performing a reboot cycle. The kernel built with lockdebug flag and it has produced the following log.
Problematic scenario:
- GEN6_READ_HEADER is called with params.mmio_debug = 0
- unclaimed_reg_debug(before = true) is called
- unclaimed_reg_debug return without taking a lock because params.mmio_debug == 0
- other thread modifies params.mmio_debug to 1
- GEN6_READ_FOOTER is called with params.mmio_debug != 0
- unclaimed_reg_debug tries to assert non-taken lock (first WARN)
- unclaimed_reg_debug tries to release non-taken lock (second WARN)
[ 31.462111] ------------[ cut here ]------------
[ 31.467733] WARN_ON(debug_locks && !(lock_is_held(&(&uncore->debug->lock)->dep_map) != 0))
[ 31.467748] WARNING: CPU: 5 PID: 0 at drivers/gpu/drm/i915/intel_uncore.c:537 check_for_unclaimed_mmio+0xcc/0x146
[ 31.488470] Modules linked in: 8021q ccm rfcomm cmac algif_hash algif_skcipher af_alg uinput veth snd_soc_sof_rt5682 snd_soc_hdac_hdmi snd_soc_intel_hda_dsp_common snd_soc_intel_sof_realtek_common snd_soc_intel_sof_maxim_common snd_soc_rt5682 iio_trig_hrtimer snd_soc_rl6231 industrialio_sw_trigger snd_sof_probes industrialio_configfs snd_soc_dmic xt_cgroup snd_hda_codec_hdmi btusb btrtl btintel btmtk btbcm snd_sof_pci_intel_tgl snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_xtensa_dsp intel_ipu6_psys snd_soc_acpi_intel_match snd_soc_acpi snd_hda_ext_core soundwire_bus snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core intel_ipu6_isys videobuf2_dma_contig videobuf2_v4l2 snd_sof_pci snd_sof videobuf2_common videobuf2_memops snd_sof_utils ledtrig_audio xt_MASQUERADE hi556 mei_hdcp v4l2_fwnode at24 cros_ec_typec acpi_als snd_soc_rt5682s intel_pmc_mux roles snd_soc_max98390 typec intel_ipu6
[ 31.488587] ip6table_nat fuse iwlmvm iwl7000_mac80211 iio_trig_sysfs iwlwifi cros_ec_light_prox cros_ec_sensors cros_ec_lid_angle cros_ec_sensors_core bluetooth industrialio_triggered_buffer kfifo_buf industrialio ecdh_generic ecc cfg80211 cros_ec_sensorhub lzo_rle lzo_compress zram joydev
[ 31.615412] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G U
[ 31.640332] RIP: 0010:check_for_unclaimed_mmio+0xcc/0x146
[ 31.646389] Code: 0f 95 c0 41 08 c7 45 84 ff 0f 95 c0 5b 41 5e 41 5f 5d c3 cc cc cc cc 48 c7 c7 fe ef 64 9f 48 c7 c6 fa 2e 6a 9f e8 3a 8c c7 ff <0f> 0b e9 64 ff ff ff 41 83 ff ff 74 57 48 8b 03 b9 00 00 00 80 89
[ 31.667399] RSP: 0018:ffff9be4c0214df0 EFLAGS: 00010046
[ 31.673260] RAX: 0283455448a8df00 RBX: ffff95aec5920fe8 RCX: ffffffff9f87f190
[ 31.681247] RDX: 0000000000000001 RSI: ffff95b61ee1b9a8 RDI: ffffffff9f87f100
[ 31.689237] RBP: ffff9be4c0214e08 R08: 0000000000000000 R09: 0000000000000000
[ 31.697226] R10: 00000000ffffffff R11: 3fffffffffffffff R12: 0000000000000000
[ 31.705223] R13: 0000000000000082 R14: 00000000000c4030 R15: 0000000000000001
[ 31.713212] FS: 0000000000000000(0000) GS:ffff95b61ee00000(0000) knlGS:0000000000000000
[ 31.722271] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 31.728706] CR2: 000059589b09047c CR3: 00000005c4c2a006 CR4: 00000000007f0ee0
[ 31.736693] PKRU: 55555554
[ 31.739730] Call Trace:
[ 31.742477] <IRQ>
[ 31.744739] __unclaimed_reg_debug+0x21/0x8d
[ 31.749532] gen12_fwtable_read32+0x233/0x34d
[ 31.754422] gen8_de_irq_handler+0x84e/0xe60
[ 31.759216] ? gen11_gt_irq_handler+0x250/0x277
[ 31.764302] gen11_display_irq_handler+0x36/0x56
[ 31.769480] gen11_irq_handler+0x5b/0xd2
[ 31.773886] __handle_irq_event_percpu+0x9a/0x296
[ 31.779165] handle_irq_event+0x51/0xa8
[ 31.783472] handle_edge_irq+0xbd/0x222
[ 31.787780] asm_call_irq_on_stack+0xf/0x20
[ 31.792475] </IRQ>
[ 31.794834] common_interrupt+0x158/0x1da
[ 31.799334] asm_common_interrupt+0x1e/0x40
[ 31.804020] RIP: 0010:cpuidle_enter_state+0x122/0x330
[ 31.809679] Code: c1 01 73 0d e8 32 77 23 00 84 c0 0f 84 60 01 00 00 31 ff e8 fa 02 00 00 45 84 ed 0f 85 11 01 00 00 e8 22 81 27 00 fb 45 85 e4 <0f> 88 e8 00 00 00 44 89 e0 48 6b d0 68 49 8b 4c 16 48 4c 2b 7d b8
[ 31.830684] RSP: 0018:ffff9be4c00cfe78 EFLAGS: 00000202
[ 31.836548] RAX: 0283455448a8df00 RBX: ffffbbe4bf022800 RCX: 0000000753442737
[ 31.844539] RDX: 0000000000000005 RSI: 0000000000000006 RDI: ffffffff9de83969
[ 31.852528] RBP: ffff9be4c00cfec0 R08: 0000000000000000 R09: 0000000000000000
[ 31.860520] R10: 0000000000000000 R11: ffffffff9eb5c0a2 R12: 0000000000000004
[ 31.868510] R13: 0000000000000000 R14: ffffffff9f9e4138 R15: 0000000753442737
[ 31.876509] ? acpi_processor_ffh_cstate_enter+0xb9/0xb9
[ 31.882467] ? cpuidle_enter_state+0x11e/0x330
[ 31.887448] ? cpuidle_enter_state+0x11e/0x330
[ 31.892435] cpuidle_enter+0x2c/0x3f
[ 31.896452] do_idle+0x137/0x299
[ 31.900081] cpu_startup_entry+0x1d/0x1f
[ 31.904486] secondary_startup_64_no_verify+0xb1/0xbb
[ 31.910153] irq event stamp: 3607682
[ 31.914168] hardirqs last enabled at (3607679): [<ffffffff9de83969>] cpuidle_enter_state+0x11e/0x330
[ 31.924491] hardirqs last disabled at (3607680): [<ffffffff9eb529a5>] common_interrupt+0x19/0x1da
[ 31.934429] softirqs last enabled at (3607682): [<ffffffff9eb529ad>] common_interrupt+0x21/0x1da
[ 31.944361] softirqs last disabled at (3607681): [<ffffffff9e09e61d>] local_bh_disable+0x4/0x16
[ 31.954105] ---[ end trace 881227f57d07ac8c ]---
[ 31.960942] =====================================
[ 31.966198] WARNING: bad unlock balance detected!
[ 31.980210] -------------------------------------
[ 31.985462] swapper/5/0 is trying to release lock (&mmio_debug->lock) at:
[ 31.993063] [<ffffffff9e4b780c>] gen12_fwtable_read32+0x23f/0x34d
[ 31.999872] but there are no more locks to release!
[ 32.005320]
[ 32.005320] other info that might help us debug this:
[ 32.012621] 1 lock held by swapper/5/0:
[ 32.016905] #0: ffff95aec5921018 (&uncore->lock){-.-.}-{2:2}, at: gen12_fwtable_read32+0x68/0x34d
[ 32.026942]
[ 32.026942] stack backtrace:
[ 32.031816] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G U W
[ 32.056698] Call Trace:
[ 32.059429] <IRQ>
[ 32.061673] dump_stack+0x8d/0xe7
[ 32.065374] ? print_unlock_imbalance_bug+0xd6/0xe1
[ 32.070824] lock_release+0x142/0x3c3
[ 32.074915] _raw_spin_unlock+0x1a/0x39
[ 32.079198] gen12_fwtable_read32+0x23f/0x34d
[ 32.084064] gen8_de_irq_handler+0x84e/0xe60
[ 32.088832] ? gen11_gt_irq_handler+0x250/0x277
[ 32.093891] gen11_display_irq_handler+0x36/0x56
[ 32.099049] gen11_irq_handler+0x5b/0xd2
[ 32.103429] __handle_irq_event_percpu+0x9a/0x296
[ 32.108685] handle_irq_event+0x51/0xa8
[ 32.112970] handle_edge_irq+0xbd/0x222
[ 32.117252] asm_call_irq_on_stack+0xf/0x20
[ 32.121924] </IRQ>
[ 32.124263] common_interrupt+0x158/0x1da
[ 32.128740] asm_common_interrupt+0x1e/0x40
[ 32.133414] RIP: 0010:cpuidle_enter_state+0x122/0x330
[ 32.139057] Code: c1 01 73 0d e8 32 77 23 00 84 c0 0f 84 60 01 00 00 31 ff e8 fa 02 00 00 45 84 ed 0f 85 11 01 00 00 e8 22 81 27 00 fb 45 85 e4 <0f> 88 e8 00 00 00 44 89 e0 48 6b d0 68 49 8b 4c 16 48 4c 2b 7d b8
[ 32.160047] RSP: 0018:ffff9be4c00cfe78 EFLAGS: 00000202
[ 32.165885] RAX: 0283455448a8df00 RBX: ffffbbe4bf022800 RCX: 0000000753442737
[ 32.173868] RDX: 0000000000000005 RSI: 0000000000000006 RDI: ffffffff9de83969
[ 32.181849] RBP: ffff9be4c00cfec0 R08: 0000000000000000 R09: 0000000000000000
[ 32.189830] R10: 0000000000000000 R11: ffffffff9eb5c0a2 R12: 0000000000000004
[ 32.197801] R13: 0000000000000000 R14: ffffffff9f9e4138 R15: 0000000753442737
[ 32.205773] ? acpi_processor_ffh_cstate_enter+0xb9/0xb9
[ 32.211707] ? cpuidle_enter_state+0x11e/0x330
[ 32.216668] ? cpuidle_enter_state+0x11e/0x330
[ 32.221632] cpuidle_enter+0x2c/0x3f
[ 32.225624] do_idle+0x137/0x299
[ 32.229228] cpu_startup_entry+0x1d/0x1f
[ 32.233609] secondary_startup_64_no_verify+0xb1/0xbb
[ 32.239881] BUG: scheduling while atomic: swapper/5/0/0x00000000
[ 32.246624] INFO: lockdep is turned off.
[ 32.251021] Modules linked in: 8021q ccm rfcomm cmac algif_hash algif_skcipher af_alg uinput veth snd_soc_sof_rt5682 snd_soc_hdac_hdmi snd_soc_intel_hda_dsp_common snd_soc_intel_sof_realtek_common snd_soc_intel_sof_maxim_common snd_soc_rt5682 iio_trig_hrtimer snd_soc_rl6231 industrialio_sw_trigger snd_sof_probes industrialio_configfs snd_soc_dmic xt_cgroup snd_hda_codec_hdmi btusb btrtl btintel btmtk btbcm snd_sof_pci_intel_tgl snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_xtensa_dsp intel_ipu6_psys snd_soc_acpi_intel_match snd_soc_acpi snd_hda_ext_core soundwire_bus snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core intel_ipu6_isys videobuf2_dma_contig videobuf2_v4l2 snd_sof_pci snd_sof videobuf2_common videobuf2_memops snd_sof_utils ledtrig_audio xt_MASQUERADE hi556 mei_hdcp v4l2_fwnode at24 cros_ec_typec acpi_als snd_soc_rt5682s intel_pmc_mux roles snd_soc_max98390 typec intel_ipu6
[ 32.251059] ip6table_nat fuse iwlmvm iwl7000_mac80211 iio_trig_sysfs iwlwifi cros_ec_light_prox cros_ec_sensors cros_ec_lid_angle cros_ec_sensors_core bluetooth industrialio_triggered_buffer kfifo_buf industrialio ecdh_generic ecc cfg80211 cros_ec_sensorhub lzo_rle lzo_compress zram joydev
[ 32.377914] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G U W
[ 32.402827] Call Trace:
[ 32.405563] dump_stack+0x8d/0xe7
[ 32.409280] __schedule_bug+0xa0/0xb6
[ 32.413385] __schedule+0x6c1/0x13a9
[ 32.417378] ? __tick_nohz_idle_restart_tick+0x78/0xbe
[ 32.423119] ? _raw_spin_unlock_irqrestore+0x56/0x58
[ 32.428667] schedule_idle+0x20/0x34
[ 32.432660] do_idle+0x19a/0x299
[ 32.436264] cpu_startup_entry+0x1d/0x1f
[ 32.440646] secondary_startup_64_no_verify+0xb1/0xbb
[ 32.821098] reboot: Restarting system
[ 32.825215] reboot: machine restart
[ 32.832830] ACPI MEMORY or I/O RESET_REG.