AD107: nouveau kernel driver crash after logging in to GNOME
Hi, I'm sorry to report this crash in the nouveau driver:
mar 11 07:47:25 migoG17 kernel: ------------[ cut here ]------------
mar 11 07:47:25 migoG17 kernel: WARNING: CPU: 12 PID: 2235 at drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:112 r535_gsp_msgq_wait+0x1b7/0x1e0 [nouveau]
mar 11 07:47:25 migoG17 kernel: Modules linked in: ccm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt>
mar 11 07:47:25 migoG17 kernel: pps_core videobuf2_v4l2 sha1_ssse3 btmtk snd_pci_acp6x snd_hda_scodec_cs35l41_i2c nouveau aesni_intel snd_hda_scodec_cs35l41 snd_hwdep snd>
mar 11 07:47:25 migoG17 kernel: xhci_pci video cec xhci_pci_renesas nvme_auth wmi
mar 11 07:47:25 migoG17 kernel: CPU: 12 PID: 2235 Comm: fwupd Not tainted 6.7.9-arch1-1 #1 ad54415bbff2f0801422a3b76df850f68e71ecab
mar 11 07:47:25 migoG17 kernel: Hardware name: ASUSTeK COMPUTER INC. ROG Strix G713PV_G713PV/G713PV, BIOS G713PV.329 01/22/2024
mar 11 07:47:25 migoG17 kernel: RIP: 0010:r535_gsp_msgq_wait+0x1b7/0x1e0 [nouveau]
mar 11 07:47:25 migoG17 kernel: Code: 72 36 48 89 da 48 81 c3 ff 0f 00 00 e8 92 ed 26 e4 48 c1 eb 0c 41 01 dd 0f ae f0 49 8b 87 90 08 00 00 44 89 28 e9 8a fe ff ff <0f> 0b>
mar 11 07:47:25 migoG17 kernel: RSP: 0018:ffffb202969cf9a0 EFLAGS: 00010246
mar 11 07:47:25 migoG17 kernel: RAX: 0000000000000000 RBX: 000000000000000b RCX: 000000000000fed0
mar 11 07:47:25 migoG17 kernel: RDX: 0000000000000000 RSI: 0000000055555554 RDI: ffffb202969cf8f8
mar 11 07:47:25 migoG17 kernel: RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
mar 11 07:47:25 migoG17 kernel: R10: 0000000000000001 R11: 0000000000000100 R12: 0000000000000020
mar 11 07:47:25 migoG17 kernel: R13: ffffb202969cf9e8 R14: 0000000000000020 R15: ffff9ae0c104f000
mar 11 07:47:25 migoG17 kernel: FS: 0000780b2719ba40(0000) GS:ffff9ae81db00000(0000) knlGS:0000000000000000
mar 11 07:47:25 migoG17 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
mar 11 07:47:25 migoG17 kernel: CR2: 00007ffc6b29ea68 CR3: 000000010611e000 CR4: 0000000000f50ef0
mar 11 07:47:25 migoG17 kernel: PKRU: 55555554
mar 11 07:47:25 migoG17 kernel: Call Trace:
mar 11 07:47:25 migoG17 kernel: <TASK>
mar 11 07:47:25 migoG17 kernel: ? r535_gsp_msgq_wait+0x1b7/0x1e0 [nouveau 2a235b69a6974d1311bb627132569e1b1646c3a2]
mar 11 07:47:25 migoG17 kernel: ? __warn+0x81/0x130
mar 11 07:47:25 migoG17 kernel: ? r535_gsp_msgq_wait+0x1b7/0x1e0 [nouveau 2a235b69a6974d1311bb627132569e1b1646c3a2]
mar 11 07:47:25 migoG17 kernel: ? report_bug+0x171/0x1a0
mar 11 07:47:25 migoG17 kernel: ? handle_bug+0x3c/0x80
mar 11 07:47:25 migoG17 kernel: ? exc_invalid_op+0x17/0x70
mar 11 07:47:25 migoG17 kernel: ? asm_exc_invalid_op+0x1a/0x20
mar 11 07:47:25 migoG17 kernel: ? r535_gsp_msgq_wait+0x1b7/0x1e0 [nouveau 2a235b69a6974d1311bb627132569e1b1646c3a2]
mar 11 07:47:25 migoG17 kernel: ? srso_alias_return_thunk+0x5/0xfbef5
mar 11 07:47:25 migoG17 kernel: r535_gsp_msg_recv+0x4e/0x230 [nouveau 2a235b69a6974d1311bb627132569e1b1646c3a2]
mar 11 07:47:25 migoG17 kernel: r535_gsp_rpc_send+0x1c6/0x2e0 [nouveau 2a235b69a6974d1311bb627132569e1b1646c3a2]
mar 11 07:47:25 migoG17 kernel: r535_gsp_rpc_push+0x147/0x160 [nouveau 2a235b69a6974d1311bb627132569e1b1646c3a2]
mar 11 07:47:25 migoG17 kernel: r535_gsp_rpc_rm_ctrl_push+0x40/0x130 [nouveau 2a235b69a6974d1311bb627132569e1b1646c3a2]
mar 11 07:47:25 migoG17 kernel: r535_dp_aux_xfer+0x133/0x310 [nouveau 2a235b69a6974d1311bb627132569e1b1646c3a2]
mar 11 07:47:25 migoG17 kernel: nvkm_uoutp_mthd+0x410/0xc00 [nouveau 2a235b69a6974d1311bb627132569e1b1646c3a2]
mar 11 07:47:25 migoG17 kernel: nvkm_ioctl+0x10b/0x250 [nouveau 2a235b69a6974d1311bb627132569e1b1646c3a2]
mar 11 07:47:25 migoG17 kernel: nvif_object_mthd+0xb4/0x200 [nouveau 2a235b69a6974d1311bb627132569e1b1646c3a2]
mar 11 07:47:25 migoG17 kernel: nvif_outp_dp_aux_xfer+0xb6/0x210 [nouveau 2a235b69a6974d1311bb627132569e1b1646c3a2]
mar 11 07:47:25 migoG17 kernel: nouveau_connector_aux_xfer+0xc9/0xf0 [nouveau 2a235b69a6974d1311bb627132569e1b1646c3a2]
mar 11 07:47:25 migoG17 kernel: drm_dp_dpcd_access+0xaa/0x130 [drm_display_helper 617af490bfddeea21de096e4e95ebcec5578ffb1]
mar 11 07:47:25 migoG17 kernel: drm_dp_dpcd_probe+0x43/0xf0 [drm_display_helper 617af490bfddeea21de096e4e95ebcec5578ffb1]
mar 11 07:47:25 migoG17 kernel: drm_dp_dpcd_read+0xc3/0x100 [drm_display_helper 617af490bfddeea21de096e4e95ebcec5578ffb1]
mar 11 07:47:25 migoG17 kernel: auxdev_read_iter+0xa1/0x1b0 [drm_display_helper 617af490bfddeea21de096e4e95ebcec5578ffb1]
mar 11 07:47:25 migoG17 kernel: vfs_read+0x1f3/0x320
mar 11 07:47:25 migoG17 kernel: ksys_read+0x6f/0xf0
mar 11 07:47:25 migoG17 kernel: do_syscall_64+0x61/0xe0
mar 11 07:47:25 migoG17 kernel: ? do_syscall_64+0x70/0xe0
mar 11 07:47:25 migoG17 kernel: ? srso_alias_return_thunk+0x5/0xfbef5
mar 11 07:47:25 migoG17 kernel: ? do_syscall_64+0x70/0xe0
mar 11 07:47:25 migoG17 kernel: ? srso_alias_return_thunk+0x5/0xfbef5
mar 11 07:47:25 migoG17 kernel: ? exc_page_fault+0x7f/0x180
mar 11 07:47:25 migoG17 kernel: entry_SYSCALL_64_after_hwframe+0x6e/0x76
mar 11 07:47:25 migoG17 kernel: RIP: 0033:0x780b28def6bc
mar 11 07:47:25 migoG17 kernel: Code: ec 28 48 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 d9 c0 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 31 c0 0f 05 <48> 3d>
mar 11 07:47:25 migoG17 kernel: RSP: 002b:00007ffd4d86c9d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
mar 11 07:47:25 migoG17 kernel: RAX: ffffffffffffffda RBX: 0000586cd5107cb0 RCX: 0000780b28def6bc
mar 11 07:47:25 migoG17 kernel: RDX: 000000000000000d RSI: 0000586cd508d990 RDI: 000000000000000e
mar 11 07:47:25 migoG17 kernel: RBP: 000000000000000d R08: 0000000000000000 R09: 0000000000000000
mar 11 07:47:25 migoG17 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000586cd5109b70
mar 11 07:47:25 migoG17 kernel: R13: 0000586cd5111380 R14: 0000000000000000 R15: 000000000000000a
mar 11 07:47:25 migoG17 kernel: </TASK>
mar 11 07:47:25 migoG17 kernel: ---[ end trace 0000000000000000 ]---
dmesg attached: dmesg.log
hwinfo attached: hwinfo.log
root@migoG17:~/scripts# uname -a
Linux migoG17 6.7.9-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 08 Mar 2024 01:59:01 +0000 x86_64 GNU/Linux
I can work around this crash by blacklisting the nouveau module (so it is not loaded at boot time) and loading it manually from a console once the GNOME session is running. dmesg attached: dmesg_OK.log
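The workaround can be sketched roughly as follows. This is only a minimal sketch, not the exact commands used for this report: the file name `blacklist-nouveau.conf` and the helper function `write_nouveau_blacklist` are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): write a modprobe.d
# fragment that stops nouveau from being auto-loaded at boot.
# The file name blacklist-nouveau.conf is an assumption; modprobe reads
# any *.conf file in the directory.
write_nouveau_blacklist() {
    # Target directory, defaulting to the system-wide modprobe.d location.
    conf_dir="${1:-/etc/modprobe.d}"
    printf 'blacklist nouveau\n' > "$conf_dir/blacklist-nouveau.conf"
}

# Usage (as root):
#   write_nouveau_blacklist
#   reboot
# Then, from a console once the GNOME session is up:
#   modprobe nouveau
```

A `blacklist` entry only prevents automatic loading, so an explicit `modprobe nouveau` still works. Depending on the initramfs configuration, the initramfs may also need to be regenerated for the blacklist to take effect early in boot.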
Hybrid GFX is working:
`migo@migoG17:~$ DRI_PRIME=1 MESA_LOADER_DRIVER_OVERRIDE=zink glxinfo | grep "OpenGL renderer"
OpenGL renderer string: zink Vulkan 1.3(NVIDIA GeForce RTX 4060 Laptop GPU (MESA_NVK))
migo@migoG17:~$ DRI_PRIME=1 glxinfo | grep "OpenGL renderer"
OpenGL renderer string: NV197
migo@migoG17:~$ DRI_PRIME=1 MESA_LOADER_DRIVER_OVERRIDE=zink glxinfo | grep "OpenGL renderer"
OpenGL renderer string: zink Vulkan 1.3(NVIDIA GeForce RTX 4060 Laptop GPU (MESA_NVK))
migo@migoG17:~$ DRI_PRIME=1 vulkaninfo
VULKANINFO
Vulkan Instance Version: 1.3.279
Instance Extensions: count = 24
VK_EXT_acquire_drm_display : extension revision 1
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_direct_mode_display : extension revision 1
VK_EXT_display_surface_counter : extension revision 1
VK_EXT_headless_surface : extension revision 1
VK_EXT_surface_maintenance1 : extension revision 1
VK_EXT_swapchain_colorspace : extension revision 4
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_surface_protected_capabilities : extension revision 1
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6
VK_LUNARG_direct_driver_loading : extension revision 1
Layers: count = 3
VK_LAYER_INTEL_nullhw (INTEL NULL HW) Vulkan version 1.1.73, layer version 1:
    Layer Extensions: count = 0
    Devices: count = 2
        GPU id = 0 (NVIDIA GeForce RTX 4060 Laptop GPU)
        Layer-Device Extensions: count = 0
        GPU id = 1 (AMD Radeon Graphics (RADV RAPHAEL_MENDOCINO))
        Layer-Device Extensions: count = 0
VK_LAYER_MESA_device_select (Linux device selection layer) Vulkan version 1.3.211, layer version 1:
    Layer Extensions: count = 0
    Devices: count = 2
        GPU id = 0 (NVIDIA GeForce RTX 4060 Laptop GPU)
        Layer-Device Extensions: count = 0
        GPU id = 1 (AMD Radeon Graphics (RADV RAPHAEL_MENDOCINO))
        Layer-Device Extensions: count = 0
VK_LAYER_MESA_overlay (Mesa Overlay layer) Vulkan version 1.3.211, layer version 1:
    Layer Extensions: count = 0
    Devices: count = 2
        GPU id = 0 (NVIDIA GeForce RTX 4060 Laptop GPU)
        Layer-Device Extensions: count = 0
        GPU id = 1 (AMD Radeon Graphics (RADV RAPHAEL_MENDOCINO))
        Layer-Device Extensions: count = 0
`
Power management is working too:
migo@migoG17:~$ cat /sys/class/drm/card*/device/power_state
D3cold
D0
The system freezes when I attempt to hibernate.
Thank you very much for your effort to bring us true open source drivers for NVidia!
:( Sorry for the lame formatting...
Milan