general protection fault, probably for non-canonical address 0x18e7000000010a: 0000 [#1] PREEMPT SMP NOPTI
Brief summary of the problem:
I enabled some ACPI- and PM-related kernel arguments to help debug suspend problems I was seeing in #2539 (closed). Afterwards I hit this new bug: the whole system locks up, and the following errors are printed many times in the journal. The error messages point to Logseq, an Electron-based app I installed via Flatpak, and superproductivity, another Electron app installed via AppImage.
In the attached kernel log file the error starts at 15:49:47
May 04 15:50:42 kernel: watchdog: BUG: soft lockup - CPU#13 stuck for 26s! [Logseq:shlo0:11834]
May 04 15:50:42 kernel: Modules linked in: michael_mic uinput snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr_mhi sunrpc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc binfmt_misc vfat fat qrtr ath11k_pci snd_soc_acp6x_mach snd_acp6x_pdm_dma snd_soc_dmic snd_sof_amd_rembrandt ath11k snd_sof_amd_renoir snd_sof_amd_acp qmi_helpers snd_sof_pci snd_sof_xtensa_dsp mac80211 snd_sof intel_rapl_msr intel_rapl_common snd_ctl_led snd_hda_codec_realtek edac_mce_amd snd_sof_utils snd_hda_codec_generic snd_hda_codec_hdmi snd_soc_core kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_compress snd_hda_codec libarc4 ac97_bus snd_pcm_dmaengine kvm snd_pci_ps snd_hda_core snd_rpl_pci_acp6x snd_pci_acp6x snd_hwdep snd_seq snd_seq_device
May 04 15:50:42 kernel: irqbypass think_lmi rapl pcspkr firmware_attributes_class cfg80211 snd_pcm snd_pci_acp5x joydev wmi_bmof snd_rn_pci_acp3x thunderbolt snd_acp_config snd_soc_acpi i2c_piix4 mhi k10temp snd_timer snd_pci_acp3x amd_pmc acpi_tad loop zram dm_crypt amdgpu thinkpad_acpi snd iommu_v2 drm_buddy soundcore gpu_sched ledtrig_audio platform_profile drm_display_helper nvme nvme_core cec crct10dif_pclmul crc32_pclmul crc32c_intel drm_ttm_helper polyval_clmulni rfkill hid_multitouch ucsi_acpi polyval_generic ghash_clmulni_intel sha512_ssse3 typec_ucsi ccp video serio_raw ttm sp5100_tco nvme_common typec wmi i2c_hid_acpi i2c_hid scsi_dh_rdac scsi_dh_emc scsi_dh_alua ip6_tables ip_tables dm_multipath fuse
May 04 15:50:42 kernel: CPU: 13 PID: 11834 Comm: Logseq:shlo0 Tainted: G D W L 6.2.14-200.fc37.x86_64 #1
May 04 15:50:42 kernel: Hardware name: LENOVO 21CRS0DG00/21CRS0DG00, BIOS R22ET60W (1.30 ) 02/09/2023
May 04 15:50:42 kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x6b/0x2b0
May 04 15:50:42 kernel: Code: 75 f0 0f ba 2b 08 0f 92 c2 8b 03 0f b6 d2 c1 e2 08 30 e4 09 d0 3d ff 00 00 00 77 51 85 c0 74 0e 8b 03 84 c0 74 08 f3 90 8b 03 <84> c0 75 f8 b8 01 00 00 00 66 89 03 5b 5d 41 5c 41 5d c3 cc cc cc
May 04 15:50:42 kernel: RSP: 0018:ffffa88389747b60 EFLAGS: 00000202
May 04 15:50:42 kernel: RAX: 0000000000000101 RBX: ffff93b18eceaa2c RCX: 0000000000000000
May 04 15:50:42 kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff93b18eceaa2c
May 04 15:50:42 kernel: RBP: ffff93b18eceaa2c R08: 0000000000000020 R09: 0000000000000000
May 04 15:50:42 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff93b18eceaa28
May 04 15:50:42 kernel: R13: ffff93b460360158 R14: 0000000000008013 R15: ffff93b460360000
May 04 15:50:42 kernel: FS: 0000000000000000(0000) GS:ffff93b4af140000(0000) knlGS:0000000000000000
May 04 15:50:42 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 04 15:50:42 kernel: CR2: 000023e500097000 CR3: 0000000291010000 CR4: 0000000000750ee0
May 04 15:50:42 kernel: PKRU: 55555554
May 04 15:50:42 kernel: Call Trace:
May 04 15:50:42 kernel: <TASK>
May 04 15:50:42 kernel: queued_write_lock_slowpath+0x11e/0x124
May 04 15:50:42 kernel: drm_vma_offset_remove+0x14/0x70
May 04 15:50:42 kernel: ttm_bo_release+0x2e3/0x350 [ttm]
May 04 15:50:42 kernel: amdgpu_bo_unref+0x1a/0x30 [amdgpu]
May 04 15:50:42 kernel: amdgpu_driver_postclose_kms+0x169/0x2f0 [amdgpu]
May 04 15:50:42 kernel: drm_file_free.part.0+0x207/0x250
May 04 15:50:42 kernel: drm_release+0x64/0xd0
May 04 15:50:42 kernel: __fput+0x91/0x250
May 04 15:50:42 kernel: task_work_run+0x59/0x90
May 04 15:50:42 kernel: do_exit+0x33e/0xb20
May 04 15:50:42 kernel: ? _raw_spin_unlock+0x15/0x30
May 04 15:50:42 kernel: ? futex_unqueue+0x38/0x60
May 04 15:50:42 kernel: do_group_exit+0x2d/0x80
May 04 15:50:42 kernel: get_signal+0x9b4/0x9f0
May 04 15:50:42 kernel: arch_do_signal_or_restart+0x3a/0x280
May 04 15:50:42 kernel: exit_to_user_mode_prepare+0x18d/0x1f0
May 04 15:50:42 kernel: syscall_exit_to_user_mode+0x17/0x40
May 04 15:50:42 kernel: do_syscall_64+0x67/0x80
May 04 15:50:42 kernel: ? exc_page_fault+0x70/0x170
May 04 15:50:42 kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc
May 04 15:50:42 kernel: RIP: 0033:0x7f7f36a8bbb4
May 04 15:50:42 kernel: Code: Unable to access opcode bytes at 0x7f7f36a8bb8a.
May 04 15:50:42 kernel: RSP: 002b:00007f7f2adfb890 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
May 04 15:50:42 kernel: RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00007f7f36a8bbb4
May 04 15:50:42 kernel: RDX: 0000000000000000 RSI: 0000000000000189 RDI: 00000a9c00284848
May 04 15:50:42 kernel: RBP: 00007f7f2adfb8c0 R08: 0000000000000000 R09: 00000000ffffffff
May 04 15:50:42 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
May 04 15:50:42 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 00000a9c00284848
May 04 15:50:42 kernel: </TASK>
May 04 15:50:56 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... 13-.... } 21139 jiffies s: 2665 root: 0x2001/.
May 04 15:50:56 kernel: rcu: blocking rcu_node structures (internal RCU debug):
May 04 15:50:56 kernel: Sending NMI from CPU 8 to CPUs 0:
May 04 15:50:56 kernel: NMI backtrace for cpu 0
May 04 15:50:56 kernel: CPU: 0 PID: 12088 Comm: superproductivi Tainted: G D W L 6.2.14-200.fc37.x86_64 #1
May 04 15:50:56 kernel: Hardware name: LENOVO 21CRS0DG00/21CRS0DG00, BIOS R22ET60W (1.30 ) 02/09/2023
May 04 15:50:56 kernel: RIP: 0010:queued_write_lock_slowpath+0x66/0x124
May 04 15:50:56 kernel: Code: 0f 1f 44 00 00 5b 5d c3 cc cc cc cc f0 81 0b 00 01 00 00 ba ff 00 00 00 b9 00 01 00 00 8b 03 3d 00 01 00 00 74 0b f3 90 8b 03 <3d> 00 01 00 00 75 f5 89 c8 f0 0f b1 13 74 be eb e2 65 8b 05 ee eb
May 04 15:50:56 kernel: RSP: 0018:ffffa883878f3b40 EFLAGS: 00000206
May 04 15:50:56 kernel: RAX: 00000000000001ff RBX: ffff93b18eceaa28 RCX: 0000000000000100
May 04 15:50:56 kernel: RDX: 00000000000000ff RSI: ffff93b183d95c70 RDI: ffff93b18eceaa28
May 04 15:50:56 kernel: RBP: ffff93b18eceaa2c R08: ffff93b1935e5598 R09: ffff93b1a777bc60
May 04 15:50:56 kernel: R10: 0000000000000000 R11: ffff93b1935e5550 R12: 0000000000000000
May 04 15:50:56 kernel: R13: 0000000000000404 R14: ffff93b183d95c38 R15: ffffa883878f3be8
May 04 15:50:56 kernel: FS: 00007fcb5fc71e40(0000) GS:ffff93b4aee00000(0000) knlGS:0000000000000000
May 04 15:50:56 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 04 15:50:56 kernel: CR2: 00007fcb38fa7040 CR3: 0000000020dbe000 CR4: 0000000000750ef0
May 04 15:50:56 kernel: PKRU: 55555554
May 04 15:50:56 kernel: Call Trace:
May 04 15:50:56 kernel: <TASK>
May 04 15:50:56 kernel: drm_vma_offset_add+0x1c/0x60
May 04 15:50:56 kernel: ttm_bo_init_reserved+0x116/0x1d0 [ttm]
May 04 15:50:56 kernel: amdgpu_bo_create+0x1d0/0x4b0 [amdgpu]
May 04 15:50:56 kernel: ? __pfx_amdgpu_bo_user_destroy+0x10/0x10 [amdgpu]
May 04 15:50:56 kernel: amdgpu_bo_create_user+0x3c/0x70 [amdgpu]
May 04 15:50:56 kernel: amdgpu_gem_create_ioctl+0x148/0x3c0 [amdgpu]
May 04 15:50:56 kernel: ? __pfx_amdgpu_bo_user_destroy+0x10/0x10 [amdgpu]
May 04 15:50:56 kernel: ? __pfx_amdgpu_gem_create_ioctl+0x10/0x10 [amdgpu]
May 04 15:50:56 kernel: drm_ioctl_kernel+0xc9/0x170
May 04 15:50:56 kernel: drm_ioctl+0x22f/0x410
May 04 15:50:56 kernel: ? __pfx_amdgpu_gem_create_ioctl+0x10/0x10 [amdgpu]
May 04 15:50:56 kernel: amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
May 04 15:50:56 kernel: __x64_sys_ioctl+0x90/0xd0
May 04 15:50:56 kernel: do_syscall_64+0x5b/0x80
May 04 15:50:56 kernel: ? syscall_exit_to_user_mode+0x17/0x40
May 04 15:50:56 kernel: ? do_syscall_64+0x67/0x80
May 04 15:50:56 kernel: ? up_read+0x37/0x70
May 04 15:50:56 kernel: ? do_user_addr_fault+0x1ef/0x710
May 04 15:50:56 kernel: ? __x64_sys_gettimeofday+0xb8/0xd0
May 04 15:50:56 kernel: ? exc_page_fault+0x70/0x170
May 04 15:50:56 kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc
May 04 15:50:56 kernel: RIP: 0033:0x7fcb61e22d6f
May 04 15:50:56 kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
May 04 15:50:56 kernel: RSP: 002b:00007ffe367bd530 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
May 04 15:50:56 kernel: RAX: ffffffffffffffda RBX: 000022f400bf6760 RCX: 00007fcb61e22d6f
May 04 15:50:56 kernel: RDX: 00007ffe367bd5d0 RSI: 00000000c0206440 RDI: 0000000000000016
May 04 15:50:56 kernel: RBP: 00007ffe367bd5d0 R08: 00000000000000a0 R09: 0000000000000010
May 04 15:50:56 kernel: R10: 00007ffe367f0080 R11: 0000000000000246 R12: 00000000c0206440
May 04 15:50:56 kernel: R13: 0000000000000016 R14: 000022f400267800 R15: 0000000000000013
May 04 15:50:56 kernel: </TASK>
May 04 15:50:56 kernel: INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 2.215 msecs
May 04 15:50:56 kernel: Sending NMI from CPU 8 to CPUs 13:
May 04 15:50:56 kernel: NMI backtrace for cpu 13
...
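As a side note on reading the second trace: in the user-space register dump for superproductivity, `ORIG_RAX` is 0x10 (the `ioctl` syscall on x86-64) and `RSI` holds the request number 0xc0206440. The sketch below splits that number into its fields, assuming the generic Linux `_IOC` layout (8-bit number, 8-bit type, 14-bit size, 2-bit direction). The helper name `decode_ioctl` is made up for illustration.

```python
def decode_ioctl(request: int) -> dict:
    """Split a Linux ioctl request number into its _IOC fields.

    Assumes the generic layout used on x86-64:
    bits 0-7 nr, bits 8-15 type, bits 16-29 size, bits 30-31 direction.
    """
    return {
        "nr": request & 0xFF,
        "type": chr((request >> 8) & 0xFF),
        "size": (request >> 16) & 0x3FFF,
        "dir": (request >> 30) & 0x3,  # 1 = write, 2 = read, 3 = read/write
    }


# Value of RSI in the superproductivity trace.
print(decode_ioctl(0xC0206440))
```

The decoded type is `'d'` (the DRM ioctl base) and the number is 0x40, which is where driver-specific DRM commands start. That appears consistent with `amdgpu_gem_create_ioctl` in the kernel-side call trace, so the process was most likely creating a buffer object when it got stuck on the lock.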
Hardware description:
- Operating System: Fedora Linux 37
- KDE Plasma Version: 5.27.4
- KDE Frameworks Version: 5.105.0
- Qt Version: 5.15.9
- Kernel Version: 6.2.14-200.fc37.x86_64 (64-bit)
- Graphics Platform: Wayland
- Processors: 16 × AMD Ryzen 7 PRO 6850U with Radeon Graphics
- Memory: 14,4 GiB of RAM
- Graphics Processor: AMD Radeon Graphics 680M
- Manufacturer: LENOVO
- Product Name: 21CRS0DG00
- System Version: ThinkPad T14s Gen 3
- Display(s): 1
- Type of Display Connection: eDP
Kernel args:
rhgb quiet root=UUID=ea0f798c-a587-4568-babd-3b30ac2cd197 rootflags=subvol=@rootacpi.ec_no_wakeup=1 amd_pstate=passive quiet splash rw acpi_mask_gpe=0x0e gpiolib_acpi.ignore_interrupt=AMDI0030:00@18 amd_pmc.dyndbg=+p pm_debug_messages
How to reproduce the issue:
I have seen it only once so far. The system was having trouble suspending when the lid was closed (it kept being woken up by IRQ 7 and 9), then entered this state on its own while not being interacted with.
Attached files:
s2idle_cpu_fail.txt
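
For anyone going through the attached log, here is a minimal sketch for listing the soft-lockup reports in it. It assumes the file uses the same line format as the excerpt above (timestamp, then `kernel:`, with no hostname column). The regular expression and the function name `find_soft_lockups` are illustrative, not part of any existing tool.

```python
import re

# Matches lines shaped like the excerpt in this report, e.g.
#   May 04 15:50:42 kernel: watchdog: BUG: soft lockup - CPU#13 stuck for 26s! [Logseq:shlo0:11834]
LOCKUP_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) kernel: watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! \[(?P<comm>.+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Yield (timestamp, cpu, seconds, comm, pid) for each soft-lockup line."""
    for line in lines:
        m = LOCKUP_RE.match(line.strip())
        if m:
            yield (m["ts"], int(m["cpu"]), int(m["secs"]), m["comm"], int(m["pid"]))


sample = [
    "May 04 15:50:42 kernel: watchdog: BUG: soft lockup - CPU#13 stuck for 26s! [Logseq:shlo0:11834]",
    "May 04 15:50:42 kernel: Call Trace:",
]
for report in find_soft_lockups(sample):
    print(report)
```

To run it against the attachment, pass an open file object instead of `sample`, for example `find_soft_lockups(open("s2idle_cpu_fail.txt"))`. If the file was saved with a hostname column, the pattern would need adjusting.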