Kernel 6.3.9: crash when starting nouveaudrmfb with eDP laptop display
Since openSUSE Tumbleweed updated from kernel 6.3.7 to 6.3.9, the screen turns black during boot, right at the moment it used to modeset and display the splash screen. (The rest of the system boots normally, and I have SSH access to gather any other useful information; reverting to 6.3.7 makes the problem go away.)
During the unsuccessful boot, this warning shows up in journalctl when nouveaudrmfb starts:
jun 27 09:57:29 argo kernel: fbcon: nouveaudrmfb (fb0) is primary device
jun 27 09:57:29 argo kernel: ------------[ cut here ]------------
jun 27 09:57:29 argo kernel: WARNING: CPU: 1 PID: 90 at drivers/gpu/drm/nouveau/nvkm/engine/disp/dp.c:497 nvkm_dp_acquire+0x50a/0x750 [nouveau]
jun 27 09:57:29 argo kernel: Modules linked in: uas usb_storage nouveau(+) crct10dif_pclmul crc32_pclmul polyval_generic gf128mul ghash_clmulni_intel pcmcia sha512_ssse3 firewire_ohci drm_ttm_helper ehci_pci ttm sdhci_pci i2c_algo_bit mxm_wmi ehci_hcd drm_display_helper aesni_intel cec crypto_simd cqhci firewire_core yenta_socket sdhci pcmcia_rsrc cryptd mmc_core crc_itu_t usbcore rc_core pcmcia_core battery video wmi button serio_raw z3fold lz4hc lz4hc_compress lz4 lz4_compress btrfs blake2b_generic xor raid6_pq libcrc32c crc32c_intel dm_mirror dm_region_hash dm_log v4l2loopback(O) videodev mc sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua ledtrig_timer msr efivarfs
jun 27 09:57:29 argo kernel: CPU: 1 PID: 90 Comm: kworker/u16:4 Tainted: G O 6.3.9-1-default #1 openSUSE Tumbleweed 4b767630dbc263131e96e89ef291fd4fd2951892
jun 27 09:57:29 argo kernel: Hardware name: Dell Inc. Latitude E6510/0N5KHN, BIOS A17 05/12/2017
jun 27 09:57:29 argo kernel: Workqueue: nvkm-disp nv50_disp_super [nouveau]
jun 27 09:57:29 argo kernel: RIP: 0010:nvkm_dp_acquire+0x50a/0x750 [nouveau]
jun 27 09:57:29 argo kernel: Code: 03 0f 85 0a 02 00 00 a8 04 0f 84 02 02 00 00 83 c2 01 39 fa 75 ca 41 8b 85 28 01 00 00 85 c0 0f 85 36 fc ff ff e9 d6 fb ff ff <0f> 0b c1 e8 03 45 88 66 62 44 89 fe 4c 89 ef 48 69 c0 cf 0d d6 26
jun 27 09:57:29 argo kernel: RSP: 0018:ffffa52280417d60 EFLAGS: 00010246
jun 27 09:57:29 argo kernel: RAX: 0000000000041eb0 RBX: 0000000000062b1a RCX: 0000000000041eb0
jun 27 09:57:29 argo kernel: RDX: ffffffffc0b63e00 RSI: 0000000000000002 RDI: ffffa52280417cf0
jun 27 09:57:29 argo kernel: RBP: 00000000ffffffea R08: 0000000000000001 R09: 0000000000005b76
jun 27 09:57:29 argo kernel: R10: 0000000000000009 R11: ffffa52280417de8 R12: 0000000000000001
jun 27 09:57:29 argo kernel: R13: ffff98d28f13b600 R14: ffff98d28218e480 R15: 0000000000000000
jun 27 09:57:29 argo kernel: FS: 0000000000000000(0000) GS:ffff98d3a7a80000(0000) knlGS:0000000000000000
jun 27 09:57:29 argo kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
jun 27 09:57:29 argo kernel: CR2: 00007f709c87fe70 CR3: 000000010ac36006 CR4: 00000000000206e0
jun 27 09:57:29 argo kernel: Call Trace:
jun 27 09:57:29 argo kernel: <TASK>
jun 27 09:57:29 argo kernel: ? nvkm_dp_acquire+0x50a/0x750 [nouveau d83fa5c1e9d0d2d5a178e32fe1324e871c232aae]
jun 27 09:57:29 argo kernel: ? __warn+0x81/0x130
jun 27 09:57:29 argo kernel: ? nvkm_dp_acquire+0x50a/0x750 [nouveau d83fa5c1e9d0d2d5a178e32fe1324e871c232aae]
jun 27 09:57:29 argo kernel: ? report_bug+0x171/0x1a0
jun 27 09:57:29 argo kernel: ? handle_bug+0x3c/0x80
jun 27 09:57:29 argo kernel: ? exc_invalid_op+0x17/0x70
jun 27 09:57:29 argo kernel: ? asm_exc_invalid_op+0x1a/0x20
jun 27 09:57:29 argo kernel: ? __pfx_init_done+0x10/0x10 [nouveau d83fa5c1e9d0d2d5a178e32fe1324e871c232aae]
jun 27 09:57:29 argo kernel: ? nvkm_dp_acquire+0x50a/0x750 [nouveau d83fa5c1e9d0d2d5a178e32fe1324e871c232aae]
jun 27 09:57:29 argo kernel: nv50_disp_super_2_2+0x6d/0x430 [nouveau d83fa5c1e9d0d2d5a178e32fe1324e871c232aae]
jun 27 09:57:29 argo kernel: nv50_disp_super+0x117/0x230 [nouveau d83fa5c1e9d0d2d5a178e32fe1324e871c232aae]
jun 27 09:57:29 argo kernel: process_one_work+0x20a/0x420
jun 27 09:57:29 argo kernel: worker_thread+0x4e/0x3b0
jun 27 09:57:29 argo kernel: ? __pfx_worker_thread+0x10/0x10
jun 27 09:57:29 argo kernel: kthread+0xde/0x110
jun 27 09:57:29 argo kernel: ? __pfx_kthread+0x10/0x10
jun 27 09:57:29 argo kernel: ret_from_fork+0x2c/0x50
jun 27 09:57:29 argo kernel: </TASK>
jun 27 09:57:29 argo kernel: ---[ end trace 0000000000000000 ]---
jun 27 09:57:29 argo kernel: Console: switching to colour frame buffer device 200x56
jun 27 09:57:29 argo kernel: nouveau 0000:01:00.0: [drm] fb0: nouveaudrmfb frame buffer device
whereas with the older kernel (6.3.7), the same point in the boot shows only:
jun 27 10:04:17 argo kernel: fbcon: nouveaudrmfb (fb0) is primary device
jun 27 10:04:17 argo kernel: Console: switching to colour frame buffer device 200x56
jun 27 10:04:17 argo kernel: nouveau 0000:01:00.0: [drm] fb0: nouveaudrmfb frame buffer device
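In case it helps anyone compare their own logs, the warning block can be cut out of a saved kernel log with a small shell helper. This is only a sketch: the function name extract_warning is made up for illustration, and it assumes the log still contains the "[ cut here ]" and "[ end trace" marker lines seen in the failing boot.

```shell
# Hypothetical helper (not part of any existing tool): print every kernel
# warning block found on stdin, from the "[ cut here ]" line through the
# "[ end trace ...]" line, inclusive.
extract_warning() {
    sed -n '/\[ cut here \]/,/\[ end trace /p'
}

# Example use, assuming the failed boot was the previous one:
#   journalctl -k -b -1 | extract_warning
```

On a boot without the problem this should print nothing, which makes it easy to tell the two cases apart over SSH.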
The graphics card is a very old one:
# lspci
…
01:00.0 VGA compatible controller: NVIDIA Corporation GT218M [NVS 3100M] (rev a2)
and from the kernel log:
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: vgaarb: deactivate vga console
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: NVIDIA GT218 (0a8600b1)
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: bios: version 70.18.53.01.07
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: fb: 512 MiB DDR3
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: VRAM: 512 MiB
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: TMDS table version 2.0
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: DCB version 4.0
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: DCB outp 00: 048003b6 0f220014
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: DCB outp 01: 02033300 00000000
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: DCB outp 02: 028113a6 0f220010
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: DCB outp 03: 02011362 00020010
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: DCB outp 04: 088223c6 0f220010
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: DCB outp 05: 08022382 00020010
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: DCB conn 00: 00002047
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: DCB conn 01: 00101146
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: DCB conn 02: 00410246
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: DCB conn 03: 00000300
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
jun 27 09:57:28 argo kernel: nouveau 0000:01:00.0: DRM: Skipping nv_backlight registration
jun 27 09:57:28 argo kernel: [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
Xorg's log shows no error messages, so it seems to be specifically the mode setting that is broken.
What other information would be useful for this bug?
Update: this only happens when trying to initialise the laptop's eDP display. Other displays initialise and work normally, and a desktop user below reports being unable to reproduce it.
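To check which connector is the affected one on another machine, the connector states can be read from sysfs. A rough sketch, assuming the usual /sys/class/drm/cardN-CONNECTOR/status layout; list_connectors is a name invented here, and connector names such as card0-eDP-1 are examples that will differ per machine.

```shell
# Hypothetical helper: print "connector: status" for each DRM connector.
# The optional first argument overrides the sysfs directory (assumed to be
# /sys/class/drm by default).
list_connectors() {
    base="${1:-/sys/class/drm}"
    for f in "$base"/card*-*/status; do
        # Skip the unexpanded glob or unreadable entries.
        [ -r "$f" ] || continue
        name=$(basename "$(dirname "$f")")
        printf '%s: %s\n' "$name" "$(cat "$f")"
    done
}

# Example use:
#   list_connectors
```

A laptop panel would normally show up as an eDP or LVDS connector reported as connected, which is the case that triggers the warning here.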