bat-adlp-4: igt@i915_selftest@live@reset - incomplete - pstore logs - NMI watchdog: Watchdog detected hard LOCKUP on cpu 16, RIP: 0010:qi_submit_sync, Call Trace: qi_flush_iotlb, intel_flush_iotlb_all, fq_flush_iotlb, fq_flush_timeout
https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7930/bat-adlp-4/igt@i915_selftest@live@reset.html
https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7930/bat-adlp-4/pstore0-1665181839_Panic_1.txt
<0>[ 380.627800] NMI watchdog: Watchdog detected hard LOCKUP on cpu 16
<4>[ 380.627803] Modules linked in: i915(+) drm_display_helper drm_kms_helper vgem drm_shmem_helper drm_ttm_helper gpu_sched snd_hda_codec_hdmi snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm prime_numbers ttm drm_buddy syscopyarea sysfillrect sysimgblt fb_sys_fops fuse x86_pkg_temp_thermal coretemp wmi_bmof r8153_ecm cdc_ether kvm_intel usbnet r8152 mii kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e mei_me ptp i2c_i801 video pps_core i2c_smbus mei intel_lpss_pci wmi [last unloaded: i915]
<4>[ 380.627827] irq event stamp: 2571025
<4>[ 380.627828] hardirqs last enabled at (2571024): [<ffffffff81b5792f>] _raw_spin_unlock_irq+0x1f/0x50
<4>[ 380.627835] hardirqs last disabled at (2571025): [<ffffffff81b5775b>] _raw_spin_lock_irqsave+0x4b/0x50
<4>[ 380.627837] softirqs last enabled at (2570996): [<ffffffff81e00323>] __do_softirq+0x323/0x48e
<4>[ 380.627838] softirqs last disabled at (2571021): [<ffffffff810c16b8>] irq_exit_rcu+0xb8/0xe0
<4>[ 380.627843] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G U 6.0.0-rc7-CI_DRM_12226-gbcc9e3eb1e7b+ #1
<4>[ 380.627845] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P LP5 RVP, BIOS ADLPFWI1.R00.3135.A00.2203251419 03/25/2022
<4>[ 380.627846] RIP: 0010:qi_submit_sync+0x2e7/0x650
<4>[ 380.627851] Code: 48 b8 00 00 00 00 00 08 00 00 49 85 47 20 0f 95 c3 49 8b 44 24 48 83 c3 04 42 83 3c 28 03 0f 84 35 02 00 00 49 8b 07 8b 68 34 <40> f6 c5 70 0f 85 9f 01 00 00 40 f6 c5 10 74 19 49 8b 07 8b 80 80
<4>[ 380.627852] RSP: 0018:ffffc900004e8d40 EFLAGS: 00000093
<4>[ 380.627854] RAX: ffffc9000007b000 RBX: 0000000000000004 RCX: ffff888100182800
<4>[ 380.627855] RDX: ffff888100182800 RSI: 0000000000000000 RDI: ffff888100086fc8
<4>[ 380.627855] RBP: 0000000000000000 R08: 0000000000000001 R09: 00000000fffffffe
<4>[ 380.627856] R10: 00000000f679c848 R11: 00000000a022472f R12: ffff888100086fc8
<4>[ 380.627857] R13: 00000000000000e4 R14: ffff888100086fc8 R15: ffff888100075600
<4>[ 380.627858] FS: 0000000000000000(0000) GS:ffff8882a7600000(0000) knlGS:0000000000000000
<4>[ 380.627858] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 380.627859] CR2: 00005608a73d9030 CR3: 0000000006612004 CR4: 0000000000770ee0
<4>[ 380.627860] PKRU: 55555554
<4>[ 380.627861] Call Trace:
<4>[ 380.627862] <IRQ>
<4>[ 380.627864] ? iommu_dma_map_sg+0x400/0x400
<4>[ 380.627866] qi_flush_iotlb+0x7c/0xa0
<4>[ 380.627868] intel_flush_iotlb_all+0x75/0x110
<4>[ 380.627871] fq_flush_iotlb+0x1d/0x30
<4>[ 380.627873] fq_flush_timeout+0x28/0xc0
<4>[ 380.627874] ? iommu_dma_map_sg+0x400/0x400
<4>[ 380.627875] ? iommu_dma_map_sg+0x400/0x400
<4>[ 380.627876] call_timer_fn+0x9c/0x2c0
<4>[ 380.627879] run_timer_softirq+0x548/0x570
<4>[ 380.627882] __do_softirq+0xda/0x48e
<4>[ 380.627884] irq_exit_rcu+0xb8/0xe0
<4>[ 380.627885] sysvec_apic_timer_interrupt+0x9e/0xc0
<4>[ 380.627888] </IRQ>
<4>[ 380.627888] <TASK>
<4>[ 380.627889] asm_sysvec_apic_timer_interrupt+0x16/0x20
<4>[ 380.627890] RIP: 0010:cpuidle_enter_state+0x104/0x5c0
<4>[ 380.627894] Code: 02 00 00 31 ff e8 fc f0 83 ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 31 04 00 00 31 ff e8 c5 e0 8a ff e8 f0 09 8f ff fb 45 85 f6 <0f> 88 b9 01 00 00 49 63 d6 4c 2b 24 24 48 8d 04 52 48 8d 04 82 49
<4>[ 380.627895] RSP: 0018:ffffc9000020be88 EFLAGS: 00000206
<4>[ 380.627896] RAX: 0000000000000010 RBX: 0000000000000003 RCX: 0000000000000000
<4>[ 380.627896] RDX: 0000000000000000 RSI: ffffffff823a8cc0 RDI: ffffffff823487d7
<4>[ 380.627897] RBP: ffffe8ffffc37e30 R08: 0000000000000001 R09: 0000000000000001
<4>[ 380.627897] R10: 0000000000000001 R11: ffff8882a763a344 R12: 00000055cd71ac22
<4>[ 380.627898] R13: ffffffff827a85a0 R14: 0000000000000003 R15: 0000000000000000
<4>[ 380.627900] ? cpuidle_enter_state+0x100/0x5c0
<4>[ 380.627902] cpuidle_enter+0x24/0x40
<4>[ 380.627904] do_idle+0x253/0x2a0
<4>[ 380.627906] cpu_startup_entry+0x14/0x20
<4>[ 380.627908] start_secondary+0x10f/0x130
<4>[ 380.627910] secondary_startup_64_no_verify+0xce/0xdb
<4>[ 380.627914] </TASK>
<0>[ 380.627915] Kernel panic - not syncing: Hard LOCKUP
<0>[ 381.415326] NMI watchdog: Watchdog detected hard LOCKUP on cpu 4
<4>[ 381.415327] Modules linked in: i915(+) drm_display_helper drm_kms_helper vgem drm_shmem_helper drm_ttm_helper gpu_sched snd_hda_codec_hdmi snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm prime_numbers ttm drm_buddy syscopyarea sysfillrect sysimgblt fb_sys_fops fuse x86_pkg_temp_thermal coretemp wmi_bmof r8153_ecm cdc_ether kvm_intel usbnet r8152 mii kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e mei_me ptp i2c_i801 video pps_core i2c_smbus mei intel_lpss_pci wmi [last unloaded: i915]
<4>[ 381.415342] irq event stamp: 658054
<4>[ 381.415342] hardirqs last enabled at (658053): [<ffffffff8128fd44>] __slab_free+0x794/0x7c0
<4>[ 381.415347] hardirqs last disabled at (658054): [<ffffffff81b5775b>] _raw_spin_lock_irqsave+0x4b/0x50
<4>[ 381.415349] softirqs last enabled at (652946): [<ffffffff81e00323>] __do_softirq+0x323/0x48e
<4>[ 381.415350] softirqs last disabled at (652941): [<ffffffff810c16b8>] irq_exit_rcu+0xb8/0xe0
<4>[ 381.415352] CPU: 4 PID: 419 Comm: irqbalance Tainted: G U 6.0.0-rc7-CI_DRM_12226-gbcc9e3eb1e7b+ #1
<4>[ 381.415354] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P LP5 RVP, BIOS ADLPFWI1.R00.3135.A00.2203251419 03/25/2022
<4>[ 381.415355] RIP: 0010:queued_spin_lock_slowpath+0x42/0x390
<4>[ 381.415357] Code: 4b f0 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 25 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 b8 01 00 00 00 66 89 07 5b 5d 41 5c 41 5d c3 cc
<4>[ 381.415358] RSP: 0018:ffffc9000198fbd8 EFLAGS: 00000002
<4>[ 381.415360] RAX: 0000000000000101 RBX: ffff888100086fc8 RCX: 000000006aa2d6bf
<4>[ 381.415360] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff888100086fc8
<4>[ 381.415361] RBP: 0000000000000000 R08: 0000000000000001 R09: 00000000fffffffe
<4>[ 381.415362] R10: 0000000089ca1274 R11: 00000000b3ca3dd8 R12: ffff888100086fc8
<4>[ 381.415362] R13: 00000000000000ec R14: ffff888100086fc8 R15: ffff888100075600
<4>[ 381.415363] FS: 00007fe20c3b2c40(0000) GS:ffff8882a7000000(0000) knlGS:0000000000000000
<4>[ 381.415364] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 381.415365] CR2: 00005608a73f34b0 CR3: 000000012254e002 CR4: 0000000000770ee0
<4>[ 381.415365] PKRU: 55555554
<4>[ 381.415366] Call Trace:
<4>[ 381.415366] <TASK>
<4>[ 381.415367] do_raw_spin_lock+0xb0/0xc0
<4>[ 381.415369] qi_submit_sync+0x335/0x650
<4>[ 381.415373] qi_flush_iec+0x63/0x90
<4>[ 381.415375] modify_irte.isra.13+0xb6/0x100
<4>[ 381.415377] intel_ir_set_affinity+0x46/0x60
<4>[ 381.415379] msi_domain_set_affinity+0x44/0xb0
<4>[ 381.415382] irq_do_set_affinity+0x196/0x1b0
<4>[ 381.415384] irq_set_affinity_locked+0xfd/0x1a0
<4>[ 381.415386] __irq_set_affinity+0x3c/0x60
<4>[ 381.415388] write_irq_affinity.isra.11+0xbe/0xe0
<4>[ 381.415390] proc_reg_write+0x33/0x80
<4>[ 381.415393] vfs_write+0xe3/0x4e0
<4>[ 381.415397] ksys_write+0x57/0xd0
<4>[ 381.415399] do_syscall_64+0x37/0x90
<4>[ 381.415400] entry_SYSCALL_64_after_hwframe+0x63/0xcd
<4>[ 381.415401] RIP: 0033:0x7fe20c6f10af
<4>[ 381.415403] Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 49 65 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2d 44 89 c7 48 89 44 24 08 e8 7c 65 f8 ff 48
<4>[ 381.415403] RSP: 002b:00007ffc4e010e30 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
<4>[ 381.415404] RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007fe20c6f10af
<4>[ 381.415405] RDX: 0000000000000008 RSI: 000055db5c561530 RDI: 0000000000000006
<4>[ 381.415405] RBP: 000055db5c561530 R08: 0000000000000000 R09: 000055db5bffa020
<4>[ 381.415406] R10: 000055db5bffa029 R11: 0000000000000293 R12: 0000000000000008
<4>[ 381.415407] R13: 000055db5c574220 R14: 00007fe20c7cc4a0 R15: 00007fe20c7cb8a0
<4>[ 381.415409] </TASK>
<4>[ 381.648643]
<4>[ 381.648643] ================================
<4>[ 381.648644] WARNING: inconsistent lock state
<4>[ 381.648644] 6.0.0-rc7-CI_DRM_12226-gbcc9e3eb1e7b+ #1 Tainted: G U
<4>[ 381.648645] --------------------------------
<4>[ 381.648645] inconsistent {INITIAL USE} -> {IN-NMI} usage.
<4>[ 381.648646] swapper/16/0 [HC1[1]:SC1[1]:HE0:SE0] takes:
<4>[ 381.648647] ffffffff8263fe58 (&nmi_desc[0].lock){....}-{2:2}, at: __register_nmi_handler+0x49/0x130
<4>[ 381.648653] {INITIAL USE} state was registered at:
<4>[ 381.648653] lock_acquire+0xd3/0x310
<4>[ 381.648654] _raw_spin_lock_irqsave+0x33/0x50
<4>[ 381.648655] __register_nmi_handler+0x49/0x130
<4>[ 381.648656] init_hw_perf_events+0x148/0x630
<4>[ 381.648660] do_one_initcall+0x53/0x2f0
<4>[ 381.648661] kernel_init_freeable+0xa6/0x1e1
<4>[ 381.648662] kernel_init+0x11/0x120
<4>[ 381.648663] ret_from_fork+0x1f/0x30
<4>[ 381.648664] irq event stamp: 2571025
<4>[ 381.648664] hardirqs last enabled at (2571024): [<ffffffff81b5792f>] _raw_spin_unlock_irq+0x1f/0x50
<4>[ 381.648666] hardirqs last disabled at (2571025): [<ffffffff81b5775b>] _raw_spin_lock_irqsave+0x4b/0x50
<4>[ 381.648667] softirqs last enabled at (2570996): [<ffffffff81e00323>] __do_softirq+0x323/0x48e
<4>[ 381.648668] softirqs last disabled at (2571021): [<ffffffff810c16b8>] irq_exit_rcu+0xb8/0xe0
<4>[ 381.648670]
<4>[ 381.648670] other info that might help us debug this:
<4>[ 381.648670] Possible unsafe locking scenario:
<4>[ 381.648670]
<4>[ 381.648671] CPU0
<4>[ 381.648671] ----
<4>[ 381.648671] lock(&nmi_desc[0].lock);
<4>[ 381.648672] <Interrupt>
<4>[ 381.648672] lock(&nmi_desc[0].lock);
<4>[ 381.648672]
<4>[ 381.648672] *** DEADLOCK ***