[regression 6.0.12->6.1.1] Multiple amdgpu crashes (update_mst_stream_alloc_table/drm_dp_atomic_find_time_slots)
Brief summary of the problem:
Booting the system with 2 external monitors attached (via USB adapter) causes the system to freeze. The first external monitor shows a frozen mouse cursor on a black background. I cannot switch to a text console (Ctrl-Alt-F1) but I can log in via SSH.
[ 4.133233] ------------[ cut here ]------------
[ 4.133237] WARNING: CPU: 4 PID: 106 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:3533 update_mst_stream_alloc_table+0x150/0x160 [amdgpu]
[ 4.133608] Modules linked in: bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_logitech_hidpp hid_logitech_dj usbhid amdgpu drm_ttm_helper ttm gpu_sched rtsx_pci_sdmmc serio_raw drm_buddy atkbd mmc_core libps2 vivaldi_fmap drm_display_helper crc32c_intel xhci_pci video cec xhci_pci_renesas i8042 rtsx_pci serio wmi
[ 4.133632] CPU: 4 PID: 106 Comm: kworker/4:1 Not tainted 6.1.1-arch1-1 #1 9bd09188b430be630e611f984454e4f3c489be77
[ 4.133636] Hardware name: HP HP ProBook 445 G6/85D9, BIOS R80 Ver. 01.21.01 07/28/2022
[ 4.133638] Workqueue: events_long drm_dp_mst_link_probe_work [drm_display_helper]
[ 4.133655] RIP: 0010:update_mst_stream_alloc_table+0x150/0x160 [amdgpu]
[ 4.134013] Code: 00 00 75 2d 48 81 c4 98 00 00 00 5b 5d 41 5c e9 ba fb 54 d7 41 0f b7 40 04 4d 89 19 49 89 59 08 66 41 89 41 10 e9 71 ff ff ff <0f> 0b e9 fe fe ff ff e8 24 c6 16 d7 0f 1f 40 00 0f 1f 44 00 00 55
[ 4.134015] RSP: 0018:ffffc1b980553680 EFLAGS: 00010202
[ 4.134018] RAX: 0000000000000002 RBX: 0000000000000000 RCX: 0000000000000000
[ 4.134019] RDX: 0000000000000000 RSI: ffffc1b980553680 RDI: ffffc1b980553710
[ 4.134020] RBP: ffffa10043100aa0 R08: ffffc1b980553740 R09: ffffc1b980553488
[ 4.134022] R10: ffffa1004b8d2c00 R11: ffffa1004ca9d540 R12: 0000000000000002
[ 4.134023] R13: ffffa1004b6b5800 R14: ffffffffc0c1c4c0 R15: 0000000000000000
[ 4.134024] FS: 0000000000000000(0000) GS:ffffa1035ff00000(0000) knlGS:0000000000000000
[ 4.134026] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4.134027] CR2: 00007f5a354cf178 CR3: 00000002c2a10000 CR4: 00000000003506e0
[ 4.134029] Call Trace:
[ 4.134032] <TASK>
[ 4.134036] dc_link_allocate_mst_payload+0x99/0x2a0 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.134393] core_link_enable_stream+0x7d0/0x980 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.134748] ? optc1_set_drr+0x13a/0x1e0 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.135110] dce110_apply_ctx_to_hw+0x67b/0x720 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.135465] ? dm_read_reg_func+0x3b/0xb0 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.135831] dc_commit_state_no_check+0x38c/0xc70 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.136186] dc_commit_state+0x96/0x110 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.136541] amdgpu_dm_atomic_commit_tail+0x4a4/0x2ae0 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.136903] ? ktime_get_raw+0x35/0x90
[ 4.136909] ? __alloc_pages+0xf8/0x250
[ 4.136914] ? allocate_slab+0x25d/0x4a0
[ 4.136919] ? drm_atomic_helper_setup_commit+0x1c0/0x840
[ 4.136924] ? dma_resv_iter_first_unlocked+0x66/0x70
[ 4.136927] ? dma_resv_get_fences+0x61/0x220
[ 4.136931] ? wait_for_completion_timeout+0x13e/0x170
[ 4.136935] ? wait_for_completion_interruptible+0x139/0x1e0
[ 4.136938] commit_tail+0x94/0x130
[ 4.136942] drm_atomic_helper_commit+0x116/0x140
[ 4.136946] drm_atomic_commit+0x7b/0x100
[ 4.136949] ? drm_plane_get_damage_clips.cold+0x1c/0x1c
[ 4.136953] drm_client_modeset_commit_atomic+0x206/0x250
[ 4.136958] drm_client_modeset_commit_locked+0x5a/0x160
[ 4.136962] drm_client_modeset_commit+0x25/0x40
[ 4.136965] drm_fb_helper_set_par+0xa2/0xe0
[ 4.136968] drm_fb_helper_hotplug_event+0xa3/0xf0
[ 4.136971] drm_kms_helper_hotplug_event+0x2a/0x40
[ 4.136974] process_one_work+0x1c7/0x380
[ 4.136979] worker_thread+0x51/0x390
[ 4.136982] ? rescuer_thread+0x3b0/0x3b0
[ 4.136985] kthread+0xde/0x110
[ 4.136988] ? kthread_complete_and_exit+0x20/0x20
[ 4.136991] ret_from_fork+0x22/0x30
[ 4.136997] </TASK>
[ 4.136998] ---[ end trace 0000000000000000 ]---
Hardware description:
- CPU: AMD Ryzen 2500u
- GPU: 04:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] [1002:15dd] (rev c4)
- System Memory: 16 GB DDR4(?)
- Display(s): internal 1080p panel (HP), 2x external 1440p (Lenovo)
- Type of Display Connection: USB-C to dual DisplayPort adapter (Monoprice)
System information:
- Distro name and Version: Arch Linux
- Kernel version: Linux version 6.1.1-arch1-1 (linux@archlinux) (gcc (GCC) 12.2.0, GNU ld (GNU Binutils) 2.39.0) #1 SMP PREEMPT_DYNAMIC Wed, 21 Dec 2022 22:27:55 +0000
- Custom kernel: N/A
- AMD official driver version: N/A
How to reproduce the issue:
Simply boot the system with 2 external monitors attached (via USB adapter). I use an X11 display manager which starts automatically.
Attached files:
Log files (for system lockups / game freezes / crashes)
I will attach these shortly (I need to reboot back into 6.1.1).
- Author
The backtrace varies from one boot to the next. Here is another: (edit: Actually there is a second backtrace later in the log that I didn't notice before.)
[ 4.103331] ------------[ cut here ]------------
[ 4.103336] WARNING: CPU: 7 PID: 108 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:3533 update_mst_stream_alloc_table+0x150/0x160 [amdgpu]
[ 4.104042] Modules linked in: bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_logitech_hidpp hid_logitech_dj usbhid amdgpu rtsx_pci_sdmmc drm_ttm_helper serio_raw atkbd ttm libps2 mmc_core vivaldi_fmap gpu_sched drm_buddy crc32c_intel drm_display_helper xhci_pci cec i8042 xhci_pci_renesas rtsx_pci video serio wmi
[ 4.104087] CPU: 7 PID: 108 Comm: kworker/7:1 Not tainted 6.1.1-arch1-1 #1 9bd09188b430be630e611f984454e4f3c489be77
[ 4.104092] Hardware name: HP HP ProBook 445 G6/85D9, BIOS R80 Ver. 01.21.01 07/28/2022
[ 4.104094] Workqueue: events_long drm_dp_mst_link_probe_work [drm_display_helper]
[ 4.104116] RIP: 0010:update_mst_stream_alloc_table+0x150/0x160 [amdgpu]
[ 4.104528] Code: 00 00 75 2d 48 81 c4 98 00 00 00 5b 5d 41 5c e9 ba 2b 2d e0 41 0f b7 40 04 4d 89 19 49 89 59 08 66 41 89 41 10 e9 71 ff ff ff <0f> 0b e9 fe fe ff ff e8 24 f6 ee df 0f 1f 40 00 0f 1f 44 00 00 55
[ 4.104530] RSP: 0018:ffffc15d40543680 EFLAGS: 00010202
[ 4.104533] RAX: 0000000000000002 RBX: 0000000000000000 RCX: 0000000000000000
[ 4.104534] RDX: 0000000000000000 RSI: ffffc15d40543680 RDI: ffffc15d40543710
[ 4.104535] RBP: ffff9c4349dc0aa0 R08: ffffc15d40543740 R09: ffffc15d40543488
[ 4.104536] R10: ffff9c4340b01400 R11: ffff9c43491f9180 R12: 0000000000000002
[ 4.104538] R13: ffff9c4347b86800 R14: ffffffffc0c994c0 R15: 0000000000000000
[ 4.104539] FS: 0000000000000000(0000) GS:ffff9c465ffc0000(0000) knlGS:0000000000000000
[ 4.104541] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4.104542] CR2: 000056090f39a050 CR3: 000000029b410000 CR4: 00000000003506e0
[ 4.104544] Call Trace:
[ 4.104547] <TASK>
[ 4.104551] dc_link_allocate_mst_payload+0x99/0x2a0 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.104909] core_link_enable_stream+0x7d0/0x980 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.105263] ? optc1_set_drr+0x13a/0x1e0 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.105633] dce110_apply_ctx_to_hw+0x67b/0x720 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.105987] ? dm_read_reg_func+0x3b/0xb0 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.106348] dc_commit_state_no_check+0x38c/0xc70 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.106704] dc_commit_state+0x96/0x110 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.107058] amdgpu_dm_atomic_commit_tail+0x4a4/0x2ae0 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.107418] ? recalibrate_cpu_khz+0x10/0x10
[ 4.107424] ? ktime_get_raw+0x35/0x90
[ 4.107428] ? dcn_validate_bandwidth+0x19b5/0x1f20 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.107794] ? dc_fpu_end+0x97/0xb0 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.108152] ? dcn10_validate_bandwidth+0x47/0x60 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.108514] ? dc_validate_global_state+0x310/0x3e0 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 4.108868] ? dma_resv_iter_first_unlocked+0x66/0x70
[ 4.108872] ? dma_resv_get_fences+0x61/0x220
[ 4.108876] ? wait_for_completion_timeout+0x13e/0x170
[ 4.108880] ? wait_for_completion_interruptible+0x139/0x1e0
[ 4.108884] commit_tail+0x94/0x130
[ 4.108889] drm_atomic_helper_commit+0x116/0x140
[ 4.108892] drm_atomic_commit+0x7b/0x100
[ 4.108896] ? drm_plane_get_damage_clips.cold+0x1c/0x1c
[ 4.108900] drm_client_modeset_commit_atomic+0x206/0x250
[ 4.108905] drm_client_modeset_commit_locked+0x5a/0x160
[ 4.108909] drm_client_modeset_commit+0x25/0x40
[ 4.108912] drm_fb_helper_set_par+0xa2/0xe0
[ 4.108916] drm_fb_helper_hotplug_event+0xa3/0xf0
[ 4.108918] drm_kms_helper_hotplug_event+0x2a/0x40
[ 4.108922] process_one_work+0x1c7/0x380
[ 4.108927] worker_thread+0x51/0x390
[ 4.108933] ? rescuer_thread+0x3b0/0x3b0
[ 4.108936] kthread+0xde/0x110
[ 4.108939] ? kthread_complete_and_exit+0x20/0x20
[ 4.108942] ret_from_fork+0x22/0x30
[ 4.108948] </TASK>
[ 4.108949] ---[ end trace 0000000000000000 ]---
... (messages unrelated to amdgpu) ...
[ 7.305065] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 7.305076] #PF: supervisor read access in kernel mode
[ 7.305080] #PF: error_code(0x0000) - not-present page
[ 7.305084] PGD 0 P4D 0
[ 7.305091] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 7.305097] CPU: 7 PID: 550 Comm: Xorg Tainted: G W OE 6.1.1-arch1-1 #1 9bd09188b430be630e611f984454e4f3c489be77
[ 7.305105] Hardware name: HP HP ProBook 445 G6/85D9, BIOS R80 Ver. 01.21.01 07/28/2022
[ 7.305109] RIP: 0010:drm_dp_atomic_find_time_slots+0x61/0x2a0 [drm_display_helper]
[ 7.305146] Code: 00 00 00 48 8b 85 60 05 00 00 48 63 80 88 00 00 00 3b 43 28 0f 8d ce 01 00 00 48 8b 53 30 48 8d 04 80 48 8d 04 c2 48 8b 40 18 <48> 8b 40 08 4d 8d 65 38 8b 88 90 00 00 00 b8 01 00 00 00 d3 e0 41
[ 7.305152] RSP: 0018:ffffc15d40f2b710 EFLAGS: 00010293
[ 7.305157] RAX: 0000000000000000 RBX: ffff9c434c292b00 RCX: 0000000000000214
[ 7.305161] RDX: ffff9c4342b2b600 RSI: ffff9c4349994540 RDI: ffff9c434c292b00
[ 7.305164] RBP: ffff9c4347b84800 R08: 0000000000000001 R09: ffff9c4349bc5050
[ 7.305167] R10: ffffc15d40f2b830 R11: 0000000042b8f9c0 R12: 026d60dce16e8423
[ 7.305171] R13: ffff9c4342b8f9c0 R14: ffff9c4349994540 R15: 0000000000000214
[ 7.305174] FS: 00007f9611c09400(0000) GS:ffff9c465ffc0000(0000) knlGS:0000000000000000
[ 7.305179] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7.305182] CR2: 0000000000000008 CR3: 0000000119cc8000 CR4: 00000000003506e0
[ 7.305186] Call Trace:
[ 7.305191] <TASK>
[ 7.305199] compute_mst_dsc_configs_for_link+0x31d/0x9d0 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 7.305929] ? __alloc_pages+0x50/0x250
[ 7.305938] ? __mod_lruvec_page_state+0x10d/0x140
[ 7.305946] ? get_page_from_freelist+0x1508/0x1680
[ 7.305962] compute_mst_dsc_configs_for_state+0x1e1/0x250 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 7.306574] amdgpu_dm_atomic_check+0xf81/0x1230 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 7.307282] drm_atomic_check_only+0x537/0xba0
[ 7.307292] drm_atomic_commit+0x5c/0x100
[ 7.307298] ? drm_plane_get_damage_clips.cold+0x1c/0x1c
[ 7.307306] drm_atomic_helper_set_config+0x74/0xb0
[ 7.307314] drm_mode_setcrtc+0x43d/0x860
[ 7.307325] ? drm_mode_getcrtc+0x180/0x180
[ 7.307331] drm_ioctl_kernel+0xcd/0x170
[ 7.307338] drm_ioctl+0x1eb/0x450
[ 7.307343] ? drm_mode_getcrtc+0x180/0x180
[ 7.307353] amdgpu_drm_ioctl+0x4e/0x90 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 7.307940] __x64_sys_ioctl+0x94/0xd0
[ 7.307948] do_syscall_64+0x5f/0x90
[ 7.307957] ? amdgpu_drm_ioctl+0x71/0x90 [amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
[ 7.308521] ? __x64_sys_ioctl+0xac/0xd0
[ 7.308528] ? syscall_exit_to_user_mode+0x1b/0x40
[ 7.308536] ? do_syscall_64+0x6b/0x90
[ 7.308543] ? do_syscall_64+0x6b/0x90
[ 7.308550] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 7.308559] RIP: 0033:0x7f9612586c0f
[ 7.308588] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[ 7.308592] RSP: 002b:00007ffd07cb2820 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 7.308598] RAX: ffffffffffffffda RBX: 0000560ae5d23cf0 RCX: 00007f9612586c0f
[ 7.308602] RDX: 00007ffd07cb28b0 RSI: 00000000c06864a2 RDI: 000000000000000f
[ 7.308605] RBP: 00007ffd07cb28b0 R08: 0000000000000000 R09: 0000560ae5e9cbc0
[ 7.308608] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c06864a2
[ 7.308612] R13: 000000000000000f R14: 0000560ae5e9cbc0 R15: 0000000000000000
[ 7.308621] </TASK>
[ 7.308624] Modules linked in: intel_rapl_msr intel_rapl_common edac_mce_amd kvm_amd snd_hda_codec_realtek snd_hda_codec_generic rtw88_8822be kvm snd_hda_codec_hdmi rtw88_8822b ledtrig_audio irqbypass crct10dif_pclmul rtw88_pci snd_hda_intel uvcvideo crc32_pclmul videobuf2_vmalloc rtw88_core snd_intel_dspcfg polyval_clmulni polyval_generic snd_intel_sdw_acpi videobuf2_memops gf128mul btusb videobuf2_v4l2 ghash_clmulni_intel btrtl hid_multitouch snd_hda_codec sha512_ssse3 mac80211 r8169 btbcm videobuf2_common aesni_intel snd_hda_core nls_iso8859_1 btintel snd_hwdep libarc4 ucsi_acpi vfat crypto_simd fat videodev btmtk snd_pcm realtek cryptd cfg80211 rapl hp_wmi typec_ucsi sp5100_tco snd_timer mdio_devres sparse_keymap bluetooth psmouse mc joydev mousedev ecdh_generic platform_profile wmi_bmof typec k10temp i2c_piix4 snd libphy rfkill ccp soundcore roles hp_accel i2c_hid_acpi i2c_amd_mp2_plat i2c_hid wireless_hotkey lis3lv02d i2c_amd_mp2_pci acpi_cpufreq mac_hid vboxnetflt(OE)
[ 7.308748] vboxnetadp(OE) vboxdrv(OE) sg fuse bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_logitech_hidpp hid_logitech_dj usbhid amdgpu rtsx_pci_sdmmc drm_ttm_helper serio_raw atkbd ttm libps2 mmc_core vivaldi_fmap gpu_sched drm_buddy crc32c_intel drm_display_helper xhci_pci cec i8042 xhci_pci_renesas rtsx_pci video serio wmi
[ 7.308803] CR2: 0000000000000008
[ 7.308807] ---[ end trace 0000000000000000 ]---
[ 7.308810] RIP: 0010:drm_dp_atomic_find_time_slots+0x61/0x2a0 [drm_display_helper]
[ 7.308841] Code: 00 00 00 48 8b 85 60 05 00 00 48 63 80 88 00 00 00 3b 43 28 0f 8d ce 01 00 00 48 8b 53 30 48 8d 04 80 48 8d 04 c2 48 8b 40 18 <48> 8b 40 08 4d 8d 65 38 8b 88 90 00 00 00 b8 01 00 00 00 d3 e0 41
[ 7.308846] RSP: 0018:ffffc15d40f2b710 EFLAGS: 00010293
[ 7.308850] RAX: 0000000000000000 RBX: ffff9c434c292b00 RCX: 0000000000000214
[ 7.308853] RDX: ffff9c4342b2b600 RSI: ffff9c4349994540 RDI: ffff9c434c292b00
[ 7.308857] RBP: ffff9c4347b84800 R08: 0000000000000001 R09: ffff9c4349bc5050
[ 7.308860] R10: ffffc15d40f2b830 R11: 0000000042b8f9c0 R12: 026d60dce16e8423
[ 7.308863] R13: ffff9c4342b8f9c0 R14: ffff9c4349994540 R15: 0000000000000214
[ 7.308866] FS: 00007f9611c09400(0000) GS:ffff9c465ffc0000(0000) knlGS:0000000000000000
[ 7.308871] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7.308874] CR2: 0000000000000008 CR3: 0000000119cc8000 CR4: 00000000003506e0
Edited by John Lindgren - Author
Duplicate. See #2171 (closed)
- Author
@moson-mo Thanks for the info. #2171 (closed) has a really long comment thread. Is there a known fix?
@jlindgren90 ATM just downgrading the kernel. DP-MST regresses almost every time a new kernel gets released unfortunately.
Edited by Niccolò Belli
- Author
@moson-mo The second backtrace mentioned in #2314 (comment 1697491) still occurs with 6.1.9, and displays are not functional. So I don't think this is strictly a duplicate of #2171 (closed).
[ 6.876646] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 6.876653] #PF: supervisor read access in kernel mode
[ 6.876655] #PF: error_code(0x0000) - not-present page
[ 6.876657] PGD 0 P4D 0
[ 6.876661] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 6.876665] CPU: 2 PID: 555 Comm: Xorg Tainted: G OE 6.1.9-arch1-1 #1 194fa354eabeed4c73825d1889ee170f7f8942d0
[ 6.876669] Hardware name: HP HP ProBook 445 G6/85D9, BIOS R80 Ver. 01.21.01 07/28/2022
[ 6.876671] RIP: 0010:drm_dp_atomic_find_time_slots+0x61/0x2a0 [drm_display_helper]
[ 6.876690] Code: 00 00 00 48 8b 85 60 05 00 00 48 63 80 88 00 00 00 3b 43 28 0f 8d ce 01 00 00 48 8b 53 30 48 8d 04 80 48 8d 04 c2 48 8b 40 18 <48> 8b 40 08 4d 8d 65 38 8b 88 90 00 00 00 b8 01 00 00 00 d3 e0 41
[ 6.876693] RSP: 0018:ffffad1ec0fd7730 EFLAGS: 00010293
[ 6.876696] RAX: 0000000000000000 RBX: ffff89b0c6f23480 RCX: 0000000000000214
[ 6.876698] RDX: ffff89b0c9a76200 RSI: ffff89b0caa32540 RDI: ffff89b0c6f23480
[ 6.876699] RBP: ffff89b0c9641000 R08: 0000000000000001 R09: ffff89b0c7ac2850
[ 6.876701] R10: ffffad1ec0fd7850 R11: 00000000c7b8bf60 R12: ffff89b0c6f23480
[ 6.876703] R13: ffff89b0c7b8bf60 R14: ffff89b0caa32540 R15: 0000000000000214
[ 6.876705] FS: 00007fa088ac2400(0000) GS:ffff89b3dfe80000(0000) knlGS:0000000000000000
[ 6.876707] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6.876709] CR2: 0000000000000008 CR3: 0000000105032000 CR4: 00000000003506e0
[ 6.876711] Call Trace:
[ 6.876714] <TASK>
[ 6.876719] compute_mst_dsc_configs_for_link+0x2d4/0x9b0 [amdgpu 3fbb80360ac59eb17c1b8a540570f0250dbff1ea]
[ 6.877088] ? __mod_lruvec_page_state+0x10d/0x140
[ 6.877094] ? get_page_from_freelist+0x1508/0x1680
[ 6.877104] compute_mst_dsc_configs_for_state+0x1e1/0x250 [amdgpu 3fbb80360ac59eb17c1b8a540570f0250dbff1ea]
[ 6.877461] amdgpu_dm_atomic_check+0x1067/0x12e0 [amdgpu 3fbb80360ac59eb17c1b8a540570f0250dbff1ea]
[ 6.877818] drm_atomic_check_only+0x537/0xba0
[ 6.877824] drm_atomic_commit+0x5c/0x100
[ 6.877827] ? drm_plane_get_damage_clips.cold+0x1c/0x1c
[ 6.877832] drm_atomic_helper_set_config+0x74/0xb0
[ 6.877837] drm_mode_setcrtc+0x43d/0x860
[ 6.877843] ? drm_mode_getcrtc+0x180/0x180
[ 6.877847] drm_ioctl_kernel+0xcd/0x170
[ 6.877849] ? _copy_to_user+0x25/0x30
[ 6.877855] drm_ioctl+0x1eb/0x450
[ 6.877857] ? drm_mode_getcrtc+0x180/0x180
[ 6.877862] amdgpu_drm_ioctl+0x4e/0x90 [amdgpu 3fbb80360ac59eb17c1b8a540570f0250dbff1ea]
[ 6.878156] __x64_sys_ioctl+0x94/0xd0
[ 6.878161] do_syscall_64+0x5f/0x90
[ 6.878165] ? __x64_sys_ioctl+0xac/0xd0
[ 6.878169] ? syscall_exit_to_user_mode+0x1b/0x40
[ 6.878172] ? do_syscall_64+0x6b/0x90
[ 6.878174] ? syscall_exit_to_user_mode+0x1b/0x40
[ 6.878177] ? do_syscall_64+0x6b/0x90
[ 6.878180] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 6.878185] RIP: 0033:0x7fa08943eecf
[ 6.878202] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[ 6.878204] RSP: 002b:00007ffff6f4cb30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 6.878207] RAX: ffffffffffffffda RBX: 0000558fc30ce780 RCX: 00007fa08943eecf
[ 6.878209] RDX: 00007ffff6f4cbc0 RSI: 00000000c06864a2 RDI: 000000000000000f
[ 6.878210] RBP: 00007ffff6f4cbc0 R08: 0000000000000000 R09: 0000558fc310e460
[ 6.878212] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c06864a2
[ 6.878213] R13: 000000000000000f R14: 0000558fc310e460 R15: 0000000000000000
[ 6.878218] </TASK>
[ 6.878220] Modules linked in: intel_rapl_msr intel_rapl_common rtw88_8822be edac_mce_amd rtw88_8822b uvcvideo btusb rtw88_pci videobuf2_vmalloc kvm_amd snd_hda_codec_realtek btrtl rtw88_core btbcm snd_hda_codec_generic videobuf2_memops kvm ledtrig_audio snd_hda_codec_hdmi btintel irqbypass videobuf2_v4l2 snd_hda_intel btmtk mac80211 crct10dif_pclmul snd_intel_dspcfg crc32_pclmul snd_intel_sdw_acpi videobuf2_common nls_iso8859_1 libarc4 polyval_clmulni bluetooth snd_hda_codec videodev vfat polyval_generic gf128mul r8169 snd_hda_core ghash_clmulni_intel mc ecdh_generic fat hid_multitouch realtek sha512_ssse3 snd_hwdep cfg80211 snd_pcm aesni_intel ucsi_acpi typec_ucsi hp_wmi snd_timer sparse_keymap sp5100_tco mdio_devres crypto_simd cryptd snd rapl mousedev joydev psmouse platform_profile typec wmi_bmof i2c_piix4 hp_accel i2c_amd_mp2_plat soundcore ccp k10temp rfkill libphy roles i2c_amd_mp2_pci lis3lv02d i2c_hid_acpi i2c_hid wireless_hotkey acpi_cpufreq mac_hid vboxnetflt(OE)
[ 6.878284] vboxnetadp(OE) vboxdrv(OE) sg fuse bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_logitech_hidpp hid_logitech_dj usbhid amdgpu serio_raw rtsx_pci_sdmmc atkbd drm_ttm_helper libps2 ttm mmc_core vivaldi_fmap gpu_sched drm_buddy crc32c_intel drm_display_helper xhci_pci cec rtsx_pci xhci_pci_renesas i8042 video serio wmi
[ 6.878313] CR2: 0000000000000008
[ 6.878315] ---[ end trace 0000000000000000 ]---
[ 6.878316] RIP: 0010:drm_dp_atomic_find_time_slots+0x61/0x2a0 [drm_display_helper]
[ 6.878332] Code: 00 00 00 48 8b 85 60 05 00 00 48 63 80 88 00 00 00 3b 43 28 0f 8d ce 01 00 00 48 8b 53 30 48 8d 04 80 48 8d 04 c2 48 8b 40 18 <48> 8b 40 08 4d 8d 65 38 8b 88 90 00 00 00 b8 01 00 00 00 d3 e0 41
[ 6.878334] RSP: 0018:ffffad1ec0fd7730 EFLAGS: 00010293
[ 6.878336] RAX: 0000000000000000 RBX: ffff89b0c6f23480 RCX: 0000000000000214
[ 6.878338] RDX: ffff89b0c9a76200 RSI: ffff89b0caa32540 RDI: ffff89b0c6f23480
[ 6.878339] RBP: ffff89b0c9641000 R08: 0000000000000001 R09: ffff89b0c7ac2850
[ 6.878341] R10: ffffad1ec0fd7850 R11: 00000000c7b8bf60 R12: ffff89b0c6f23480
[ 6.878342] R13: ffff89b0c7b8bf60 R14: ffff89b0caa32540 R15: 0000000000000214
[ 6.878344] FS: 00007fa088ac2400(0000) GS:ffff89b3dfe80000(0000) knlGS:0000000000000000
[ 6.878346] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6.878348] CR2: 0000000000000008 CR3: 0000000105032000 CR4: 00000000003506e0
"So I don't think this is strictly a duplicate"
Well, not anymore, you could say. Despite the patches that came out of #2171 (closed), there are indeed still issues.
Edited by Mario Oenning
- Developer
What's the context of this trace you got in 6.1.9? Booting, hotplugging, suspending?
- Author
Just booting (with the USB dock attached).
Edited by John Lindgren
@jzuo So for me, booting without the hub connected works, and connecting the hub without any monitors connected also works fine. However, the system freezes as soon as I connect the first monitor to the hub. At this point, the system cannot recover and won't even respond to SysRq.
- John Lindgren mentioned in issue #2171 (closed)
- John Lindgren changed title from [regression 6.0.12->6.1.1] amdgpu crash in update_mst_stream_alloc_table to [regression 6.0.12->6.1.1] Multiple amdgpu crashes (update_mst_stream_alloc_table/drm_dp_atomic_find_time_slots)
- Mario Limonciello added DC label
I confirm I still have issues as well with a similar setup. I have an HP EliteBook G5 with a Ryzen 2500U and a Vega 8 and an HP USB-C Dock G5. Hotplugging the dock after logging in (using KDE Plasma on X11) has the exact same issues as with kernel versions 6.1.0 to 6.1.9: the built-in screen goes black and I can only see it "projected" on one of the external screens (since the laptop's screen has a lower resolution, it takes up only a part of the picture), but it doesn't react to any inputs. In the logs I see this:
Feb 11 17:17:47 Sleipnir kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Feb 11 17:17:47 Sleipnir kernel: #PF: supervisor read access in kernel mode
Feb 11 17:17:47 Sleipnir kernel: #PF: error_code(0x0000) - not-present page
I don't see any backtrace, though. I haven't tried to boot with the dock already plugged in; if I get any different results I will update this comment.
Hello,
I still have similar issues on the current 6.2-rc8 mainline kernel. I usually have my Thinkpad T495 (Ryzen 3500U with Vega 8 graphics) connected to a Gen2 Thinkpad USB-C Dock with two external monitors. Once I've logged in and switch into my window manager, the laptop freezes and only a hard-reset helps to recover.
I've noticed that the issue isn't present on the 6.0 kernel. On some later kernels (seemingly before the fixes in #2171 (closed) were introduced) the device already freezes in the initramfs during early KMS.
I spent the last few days bisecting this issue and found 4d07b0bc403403438d9cf88450506240c5faf92f to be the commit that first introduced the bug. Here's the bisect log:
git bisect start
# status: waiting for both good and bad commits
# bad: [ceaa837f96adb69c0df0397937cd74991d5d821a] Linux 6.2-rc8
git bisect bad ceaa837f96adb69c0df0397937cd74991d5d821a
# status: waiting for good commit(s), bad commit known
# good: [4fe89d07dcc2804c8b562f6c7896a45643d34b2f] Linux 6.0
git bisect good 4fe89d07dcc2804c8b562f6c7896a45643d34b2f
# bad: [14b651b22224251b35618259da714adb0b5f10ee] drm/amdgpu/dm/dp_mst: Don't grab mst_mgr->lock when computing DSC state
git bisect bad 14b651b22224251b35618259da714adb0b5f10ee
# bad: [247f34f7b80357943234f93f247a1ae6b6c3a740] Linux 6.1-rc2
git bisect bad 247f34f7b80357943234f93f247a1ae6b6c3a740
# bad: [4ae9f874dc1d662ce7bfdb8144903608bcc3706b] Merge tag 'drm-misc-next-2022-09-30' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
git bisect bad 4ae9f874dc1d662ce7bfdb8144903608bcc3706b
# good: [1c23f9e627a7b412978b4e852793c5e3c3efc555] Linux 6.0-rc2
git bisect good 1c23f9e627a7b412978b4e852793c5e3c3efc555
# bad: [02bcbd6bfc5932d4300b017dcd2ba7e7bbbffe79] drm/amd/display: Simplify bool conversion
git bisect bad 02bcbd6bfc5932d4300b017dcd2ba7e7bbbffe79
# bad: [55453c0914d9b81e75c5c83adb2dd9382da2c79d] drm/bridge: ps8640: Add double reset T4 and T5 to power-on sequence
git bisect bad 55453c0914d9b81e75c5c83adb2dd9382da2c79d
# good: [0f877398d30e1df657a31a62f7c7de1869b072b5] drm/virtio: Unlock reservations on dma_resv_reserve_fences() error
git bisect good 0f877398d30e1df657a31a62f7c7de1869b072b5
# good: [802fd5750faca181cade177642e0e5233ff25f85] drm/simpledrm: Remove pdev field from device structure
git bisect good 802fd5750faca181cade177642e0e5233ff25f85
# good: [fcfd3e5fb2f052f6f466285107f449d462277a99] drm/lcdif: Clean up headers
git bisect good fcfd3e5fb2f052f6f466285107f449d462277a99
# good: [2482fceed27b6951287e92e9f733533a657c2923] drm/display/dp_mst: Drop all ports from topology on CSNs before queueing link address work
git bisect good 2482fceed27b6951287e92e9f733533a657c2923
# bad: [eb7de496451bd969e203f02f66585131228ba4ae] drm: fix drm_mipi_dbi build errors
git bisect bad eb7de496451bd969e203f02f66585131228ba4ae
# bad: [6acb416bf49f818dbf0aa71aee9f6cae93a505a4] drm/vc4: plane: protect device resources after removal
git bisect bad 6acb416bf49f818dbf0aa71aee9f6cae93a505a4
# good: [01ad1d9c2888d51f2fb5b5ac88af8bd47d76937e] drm/radeon: Drop legacy MST support
git bisect good 01ad1d9c2888d51f2fb5b5ac88af8bd47d76937e
# bad: [227295df4e37de66b61bbb3d1f10436f0acd33cc] drm/vc4: hdmi: unlock mutex when device is unplugged
git bisect bad 227295df4e37de66b61bbb3d1f10436f0acd33cc
# bad: [4d07b0bc403403438d9cf88450506240c5faf92f] drm/display/dp_mst: Move all payload info into the atomic state
git bisect bad 4d07b0bc403403438d9cf88450506240c5faf92f
# first bad commit: [4d07b0bc403403438d9cf88450506240c5faf92f] drm/display/dp_mst: Move all payload info into the atomic state
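For anyone who wants to compare bisect results across machines, a bisect log can be summarized mechanically. The following is only a rough sketch: `parse_bisect_log` is a hypothetical helper name, and it assumes the log is in the one-entry-per-line form that `git bisect log` prints, with `# good:`, `# bad:` and `# first bad commit:` comment lines.

```python
import re

# "# good: [<40-hex sha>] <subject>" or "# bad: [<sha>] <subject>"
MARK_RE = re.compile(r"^# (good|bad): \[([0-9a-f]{40})\] (.*)$")
# "# first bad commit: [<sha>] <subject>"
FIRST_BAD_RE = re.compile(r"^# first bad commit: \[([0-9a-f]{40})\] (.*)$")


def parse_bisect_log(text):
    """Return (marks, first_bad) from the text of a bisect log.

    marks is a list of (verdict, sha, subject) tuples in log order;
    first_bad is (sha, subject), or None if the bisect did not finish.
    """
    marks = []
    first_bad = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = MARK_RE.match(line)
        if match:
            marks.append((match.group(1), match.group(2), match.group(3)))
            continue
        match = FIRST_BAD_RE.match(line)
        if match:
            first_bad = (match.group(1), match.group(2))
    return marks, first_bad
```

Feeding it the log from this comment should report 4d07b0bc403403438d9cf88450506240c5faf92f as the first bad commit; the `git bisect good`/`git bisect bad` command lines and `# status:` lines are simply skipped.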
Similarly to others still encountering this issue, the problem seems to occur in drm_dp_atomic_find_time_slots. This is the relevant part of the kernel log:
Feb 22 10:43:01 kevin-t495 kernel: [drm] Downstream port present 1, type 2
Feb 22 10:43:02 kevin-t495 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Feb 22 10:43:02 kevin-t495 kernel: #PF: supervisor read access in kernel mode
Feb 22 10:43:02 kevin-t495 kernel: #PF: error_code(0x0000) - not-present page
Feb 22 10:43:02 kevin-t495 kernel: PGD 0 P4D 0
Feb 22 10:43:02 kevin-t495 kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Feb 22 10:43:02 kevin-t495 kernel: CPU: 3 PID: 1041 Comm: sway Not tainted 6.2.0-1-mainline #1 50b6fe34c84fd50e30f7900827999bc076f5f647
Feb 22 10:43:02 kevin-t495 kernel: Hardware name: LENOVO 20NKS01Y00/20NKS01Y00, BIOS R12ET61W(1.31 ) 07/28/2022
Feb 22 10:43:02 kevin-t495 kernel: RIP: 0010:drm_dp_atomic_find_time_slots+0x5e/0x260 [drm_display_helper]
Feb 22 10:43:02 kevin-t495 kernel: Code: 01 00 00 48 8b 85 60 05 00 00 48 63 80 88 00 00 00 3b 43 28 0f 8d 2e 01 00 00 48 8b 53 30 48 8d 04 80 48 8d 04 c2 48 8b 40 18 <48> 8b 40 08 4d 8d 65 38 8b 88 90 00 00 00 b8 01 00 00 00 d3 e0 41
Feb 22 10:43:02 kevin-t495 kernel: RSP: 0018:ffffbf6682b1b748 EFLAGS: 00010293
Feb 22 10:43:02 kevin-t495 kernel: RAX: 0000000000000000 RBX: ffffa0bd7cb0a300 RCX: 0000000000000214
Feb 22 10:43:02 kevin-t495 kernel: RDX: ffffa0bd40d9d200 RSI: ffffa0bd4afb4568 RDI: ffffa0bd7cb0a300
Feb 22 10:43:02 kevin-t495 kernel: RBP: ffffa0bd4d5e1800 R08: 0000000000000001 R09: ffffa0bd786da050
Feb 22 10:43:02 kevin-t495 kernel: R10: ffffbf6682b1b868 R11: 0000000000000000 R12: ffffa0bd7cb0a300
Feb 22 10:43:02 kevin-t495 kernel: R13: ffffa0bd4b1c34e0 R14: ffffa0bd4afb4568 R15: 0000000000000214
Feb 22 10:43:02 kevin-t495 kernel: FS: 00007f1edd991980(0000) GS:ffffa0bff0ac0000(0000) knlGS:0000000000000000
Feb 22 10:43:02 kevin-t495 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 22 10:43:02 kevin-t495 kernel: CR2: 0000000000000008 CR3: 000000010e5d0000 CR4: 00000000003506e0
Feb 22 10:43:02 kevin-t495 kernel: Call Trace:
Feb 22 10:43:02 kevin-t495 kernel: <TASK>
Feb 22 10:43:02 kevin-t495 kernel: compute_mst_dsc_configs_for_link+0x2d4/0x9b0 [amdgpu 231be2e9fd2fd013a38e0573982fa0a0e0c113fc]
Feb 22 10:43:02 kevin-t495 kernel: compute_mst_dsc_configs_for_state+0x1e1/0x250 [amdgpu 231be2e9fd2fd013a38e0573982fa0a0e0c113fc]
Feb 22 10:43:02 kevin-t495 kernel: amdgpu_dm_atomic_check+0xf33/0x11b0 [amdgpu 231be2e9fd2fd013a38e0573982fa0a0e0c113fc]
Feb 22 10:43:02 kevin-t495 kernel: drm_atomic_check_only+0x5c0/0xa30
Feb 22 10:43:02 kevin-t495 kernel: drm_mode_atomic_ioctl+0x744/0xb70
Feb 22 10:43:02 kevin-t495 kernel: ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
Feb 22 10:43:02 kevin-t495 kernel: drm_ioctl_kernel+0xcd/0x170
Feb 22 10:43:02 kevin-t495 kernel: drm_ioctl+0x233/0x410
Feb 22 10:43:02 kevin-t495 kernel: ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
Feb 22 10:43:02 kevin-t495 kernel: amdgpu_drm_ioctl+0x4e/0x90 [amdgpu 231be2e9fd2fd013a38e0573982fa0a0e0c113fc]
Feb 22 10:43:02 kevin-t495 kernel: __x64_sys_ioctl+0x94/0xd0
Feb 22 10:43:02 kevin-t495 kernel: do_syscall_64+0x5f/0x90
Feb 22 10:43:02 kevin-t495 kernel: ? handle_mm_fault+0x103/0x300
Feb 22 10:43:02 kevin-t495 kernel: ? do_user_addr_fault+0x1e0/0x6a0
Feb 22 10:43:02 kevin-t495 kernel: ? do_syscall_64+0x6b/0x90
Feb 22 10:43:02 kevin-t495 kernel: ? exc_page_fault+0x74/0x170
Feb 22 10:43:02 kevin-t495 kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc
Feb 22 10:43:02 kevin-t495 kernel: RIP: 0033:0x7f1ede75753f
Feb 22 10:43:02 kevin-t495 kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
Feb 22 10:43:02 kevin-t495 kernel: RSP: 002b:00007ffe1f55d4d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Feb 22 10:43:02 kevin-t495 kernel: RAX: ffffffffffffffda RBX: 000055c9120336b0 RCX: 00007f1ede75753f
Feb 22 10:43:02 kevin-t495 kernel: RDX: 00007ffe1f55d570 RSI: 00000000c03864bc RDI: 000000000000000b
Feb 22 10:43:02 kevin-t495 kernel: RBP: 00007ffe1f55d570 R08: 0000000000000003 R09: 0000000000000003
Feb 22 10:43:02 kevin-t495 kernel: R10: 000055c910f23010 R11: 0000000000000246 R12: 00000000c03864bc
Feb 22 10:43:02 kevin-t495 kernel: R13: 000000000000000b R14: 000055c911dad890 R15: 000055c911f71110
Feb 22 10:43:02 kevin-t495 kernel: </TASK>
Feb 22 10:43:02 kevin-t495 kernel: Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq ccm cmac algif_hash algif_skcipher af_alg bnep snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci iwlmvm snd_sof_xtensa_dsp snd_ctl_led snd_sof snd_hda_codec_realtek snd_hda_codec_generic snd_sof_utils snd_hda_codec_hdmi joydev intel_rapl_msr mac80211 snd_soc_core snd_compress snd_hda_intel ac97_bus mousedev libarc4 snd_pcm_dmaengine snd_intel_dspcfg intel_rapl_common snd_usb_audio snd_pci_ps btusb snd_usbmidi_lib btrtl uvcvideo snd_rpl_pci_acp6x edac_mce_amd videobuf2_vmalloc btbcm snd_intel_sdw_acpi videobuf2_memops snd_rawmidi snd_acp_pci snd_hda_codec videobuf2_v4l2 btintel kvm_amd btmtk snd_seq_device cdc_ether kvm snd_pci_acp6x snd_hda_core snd_pci_acp5x irqbypass videodev iwlwifi usbnet snd_rn_pci_acp3x r8152 bluetooth snd_acp_config videobuf2_common sp5100_tco vfat snd_hwdep mc mii fat rapl snd_soc_acpi snd_pcm ecdh_generic r8169 think_lmi ucsi_acpi psmouse pcspkr snd_pci_acp3x wmi_bmof
Feb 22 10:43:02 kevin-t495 kernel: firmware_attributes_class realtek typec_ucsi mdio_devres k10temp i2c_piix4 snd_timer typec cfg80211 ipmi_devintf libphy ipmi_msghandler roles i2c_scmi acpi_cpufreq mac_hid dm_multipath crypto_user fuse loop bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_logitech_hidpp hid_logitech_dj dm_crypt cbc encrypted_keys trusted asn1_encoder tee dm_mod usbhid amdgpu crct10dif_pclmul drm_ttm_helper crc32_pclmul crc32c_intel polyval_clmulni thinkpad_acpi ttm serio_raw polyval_generic sdhci_pci drm_buddy atkbd gf128mul libps2 vivaldi_fmap ghash_clmulni_intel sha512_ssse3 gpu_sched aesni_intel cqhci ledtrig_audio platform_profile crypto_simd nvme sdhci snd cryptd drm_display_helper nvme_core soundcore xhci_pci mmc_core rfkill xhci_pci_renesas cec ccp nvme_common video i8042 serio wmi
Feb 22 10:43:02 kevin-t495 kernel: CR2: 0000000000000008
Feb 22 10:43:02 kevin-t495 kernel: ---[ end trace 0000000000000000 ]---
Feb 22 10:43:02 kevin-t495 kernel: RIP: 0010:drm_dp_atomic_find_time_slots+0x5e/0x260 [drm_display_helper]
Feb 22 10:43:02 kevin-t495 kernel: Code: 01 00 00 48 8b 85 60 05 00 00 48 63 80 88 00 00 00 3b 43 28 0f 8d 2e 01 00 00 48 8b 53 30 48 8d 04 80 48 8d 04 c2 48 8b 40 18 <48> 8b 40 08 4d 8d 65 38 8b 88 90 00 00 00 b8 01 00 00 00 d3 e0 41
Feb 22 10:43:02 kevin-t495 kernel: RSP: 0018:ffffbf6682b1b748 EFLAGS: 00010293
Feb 22 10:43:02 kevin-t495 kernel: RAX: 0000000000000000 RBX: ffffa0bd7cb0a300 RCX: 0000000000000214
Feb 22 10:43:02 kevin-t495 kernel: RDX: ffffa0bd40d9d200 RSI: ffffa0bd4afb4568 RDI: ffffa0bd7cb0a300
Feb 22 10:43:02 kevin-t495 kernel: RBP: ffffa0bd4d5e1800 R08: 0000000000000001 R09: ffffa0bd786da050
Feb 22 10:43:02 kevin-t495 kernel: R10: ffffbf6682b1b868 R11: 0000000000000000 R12: ffffa0bd7cb0a300
Feb 22 10:43:02 kevin-t495 kernel: R13: ffffa0bd4b1c34e0 R14: ffffa0bd4afb4568 R15: 0000000000000214
Feb 22 10:43:02 kevin-t495 kernel: FS: 00007f1edd991980(0000) GS:ffffa0bff0ac0000(0000) knlGS:0000000000000000
Feb 22 10:43:02 kevin-t495 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 22 10:43:02 kevin-t495 kernel: CR2: 0000000000000008 CR3: 000000010e5d0000 CR4: 00000000003506e0
I can also upload the system journal for all "bad" kernels in case that helps.
This regression is also present in the latest amd-drm-next.
I've been trying to track down this bug for quite a while. Up until now I've worked around it by sticking to the 5.15 LTS kernel, but I'd be glad to help if there's anything else I can do to resolve this issue.
For the current amd-drm-next kernel I've also attached a log with DRM debug logging and the stack trace from the Oops decoded: trace.decoded.log
Edited by Kevin
- Alex Deucher mentioned in issue #2424
- Author
@pschyska You are probably seeing a different issue, since this one appeared in 6.1 and did not exist in older kernels. The backtraces are different as well. Please open a new issue ticket.
My issue still exists on 6.2.1. New log attached: dmesg-6.2.1.log
@jlindgren90 You are probably right. I removed my comments to reduce the confusion.
- Alex Deucher mentioned in issue #2435
- Developer
Can you guys please try this branch?
https://gitlab.freedesktop.org/superm1/linux/-/commits/mlimonci/mst-6.3-backports-6.1/
It has some commits from 6.3 backported to 6.1.14. If they help (or at least don't make things worse) I'll suggest them to stable.
Thanks for the response :) The backported commits don't seem to make anything worse for me. Unfortunately, the issue is still the same though :/
BUG: kernel NULL pointer dereference, address: 0000000000000008
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 5 PID: 1066 Comm: sway Not tainted 6.1.14-1-amd-drm-next #5 db639ce94b1aaa652c99546804a02d2d9d703e1e
Hardware name: LENOVO 20NKS01Y00/20NKS01Y00, BIOS R12ET61W(1.31 ) 07/28/2022
RIP: 0010:drm_dp_atomic_find_time_slots+0x5e/0x260 [drm_display_helper]
Code: 01 00 00 48 8b 85 60 05 00 00 48 63 80 88 00 00 00 3b 43 28 0f 8d 2e 01 00 00 48 8b 53 30 48 8d 04 80 48 8d 04 c2 48 8b 40 18 <48> 8b 40 08 4d 8d 65 38 8b 88 90 00 00 00 b8 01 00 00 00 d3 e0 41
RSP: 0018:ffffab0502cef728 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff94681d903080 RCX: 0000000000000214
RDX: ffff946805d85400 RSI: ffff94680e7ca540 RDI: ffff94681d903080
RBP: ffff946811351800 R08: 0000000000000001 R09: ffff9468241c5050
R10: ffffab0502cef848 R11: 0000000000000000 R12: ffff94681d903080
R13: ffff946822fa1300 R14: ffff94680e7ca540 R15: 0000000000000214
FS: 00007f77c6a46980(0000) GS:ffff946ab0b40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 00000001051f8000 CR4: 00000000003506e0
Call Trace:
<TASK>
compute_mst_dsc_configs_for_link+0x2d4/0x9b0 [amdgpu 96c0c063362dc3768b1d779a9994ecb331cd5344]
? update_stream_scaling_settings+0xd1/0x140 [amdgpu 96c0c063362dc3768b1d779a9994ecb331cd5344]
compute_mst_dsc_configs_for_state+0x1e1/0x250 [amdgpu 96c0c063362dc3768b1d779a9994ecb331cd5344]
amdgpu_dm_atomic_check+0xf33/0x11b0 [amdgpu 96c0c063362dc3768b1d779a9994ecb331cd5344]
drm_atomic_check_only+0x5c0/0xa30
drm_mode_atomic_ioctl+0x744/0xb70
? ttm_eu_backoff_reservation+0x5e/0x80 [ttm a7933f105093641a9a0ac47c9f88e25535020e98]
? drm_atomic_set_property+0xb40/0xb40
drm_ioctl_kernel+0xcd/0x170
drm_ioctl+0x233/0x410
? drm_atomic_set_property+0xb40/0xb40
amdgpu_drm_ioctl+0x4e/0x90 [amdgpu 96c0c063362dc3768b1d779a9994ecb331cd5344]
__x64_sys_ioctl+0x94/0xd0
do_syscall_64+0x5f/0x90
? __pm_runtime_suspend+0x6e/0x100
? amdgpu_drm_ioctl+0x71/0x90 [amdgpu 96c0c063362dc3768b1d779a9994ecb331cd5344]
? __x64_sys_ioctl+0xac/0xd0
? syscall_exit_to_user_mode+0x1b/0x40
? do_syscall_64+0x6b/0x90
? do_syscall_64+0x6b/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f77c780c53f
Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
RSP: 002b:00007ffd5599a230 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000055d689eb8680 RCX: 00007f77c780c53f
RDX: 00007ffd5599a2d0 RSI: 00000000c03864bc RDI: 000000000000000b
RBP: 00007ffd5599a2d0 R08: 0000000000000003 R09: 0000000000000003
R10: 000055d688ecc010 R11: 0000000000000246 R12: 00000000c03864bc
R13: 000000000000000b R14: 000055d689df7960 R15: 000055d68a041cd0
</TASK>
Modules linked in: rfcomm ccm snd_seq_dummy snd_hrtimer snd_seq cmac algif_hash algif_skcipher af_alg bnep snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof snd_sof_utils snd_soc_core snd_compress iwlmvm ac97_bus snd_usb_audio snd_ctl_led snd_pcm_dmaengine uvcvideo snd_hda_codec_realtek snd_usbmidi_lib snd_pci_ps snd_hda_codec_hdmi joydev mac80211 snd_hda_codec_generic btusb snd_rawmidi videobuf2_vmalloc intel_rapl_msr videobuf2_memops btrtl libarc4 mousedev snd_seq_device intel_rapl_common snd_hda_intel snd_intel_dspcfg cdc_ether snd_intel_sdw_acpi edac_mce_amd btbcm videobuf2_v4l2 usbnet snd_rpl_pci_acp6x btintel snd_hda_codec videobuf2_common btmtk snd_acp_pci r8152 videodev kvm_amd snd_hda_core bluetooth iwlwifi snd_pci_acp6x mii mc snd_hwdep vfat kvm fat irqbypass snd_pci_acp5x ecdh_generic r8169 ucsi_acpi snd_rn_pci_acp3x snd_pcm realtek think_lmi sp5100_tco psmouse rapl pcspkr firmware_attributes_class wmi_bmof k10temp typec_ucsi cfg80211 mdio_devres snd_acp_config ipmi_devintf snd_soc_acpi i2c_piix4 typec snd_timer libphy snd_pci_acp3x ipmi_msghandler roles i2c_scmi acpi_cpufreq mac_hid dm_multipath crypto_user fuse loop bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_logitech_hidpp hid_logitech_dj dm_crypt cbc encrypted_keys trusted asn1_encoder tee dm_mod usbhid amdgpu serio_raw atkbd crct10dif_pclmul crc32_pclmul libps2 crc32c_intel vivaldi_fmap polyval_clmulni polyval_generic thinkpad_acpi drm_ttm_helper gf128mul ledtrig_audio sdhci_pci ttm cqhci platform_profile gpu_sched ghash_clmulni_intel sha512_ssse3 aesni_intel crypto_simd cryptd drm_buddy sdhci snd xhci_pci nvme drm_display_helper ccp soundcore xhci_pci_renesas mmc_core nvme_core cec rfkill nvme_common i8042 video serio wmi
CR2: 0000000000000008
---[ end trace 0000000000000000 ]---
RIP: 0010:drm_dp_atomic_find_time_slots+0x5e/0x260 [drm_display_helper]
Code: 01 00 00 48 8b 85 60 05 00 00 48 63 80 88 00 00 00 3b 43 28 0f 8d 2e 01 00 00 48 8b 53 30 48 8d 04 80 48 8d 04 c2 48 8b 40 18 <48> 8b 40 08 4d 8d 65 38 8b 88 90 00 00 00 b8 01 00 00 00 d3 e0 41
RSP: 0018:ffffab0502cef728 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff94681d903080 RCX: 0000000000000214
RDX: ffff946805d85400 RSI: ffff94680e7ca540 RDI: ffff94681d903080
RBP: ffff946811351800 R08: 0000000000000001 R09: ffff9468241c5050
R10: ffffab0502cef848 R11: 0000000000000000 R12: ffff94681d903080
R13: ffff946822fa1300 R14: ffff94680e7ca540 R15: 0000000000000214
FS: 00007f77c6a46980(0000) GS:ffff946ab0b40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 00000001051f8000 CR4: 00000000003506e0
Edited by Kevin
- Developer
> Thanks for the response :) The backported commits don't seem to make anything worse for me. Unfortunately, the issue is still the same though :/
OK thanks. I'll see if I can identify anything else pertinent.
- Author
For me, the issue bisects to a5c2c0d164e96d24f73faffcd3b7bbb607e701a9.
The backtrace at that commit is the drm_dp_atomic_find_time_slots one, same as what I posted for 6.1.9.
- Author
Possible fix:
diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c
index d701e5b819b8..dc09bb89a94d 100644
--- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
@@ -4392,7 +4392,8 @@ int drm_dp_atomic_find_time_slots(struct drm_atomic_state *state,
 		return PTR_ERR(topology_state);
 
 	conn_state = drm_atomic_get_new_connector_state(state, port->connector);
-	topology_state->pending_crtc_mask |= drm_crtc_mask(conn_state->crtc);
+	if (conn_state)
+		topology_state->pending_crtc_mask |= drm_crtc_mask(conn_state->crtc);
 
 	/* Find the current allocation for this port, if any */
 	payload = drm_atomic_get_mst_payload_state(topology_state, port);
@@ -4480,7 +4481,8 @@ int drm_dp_atomic_release_time_slots(struct drm_atomic_state *state,
 		return PTR_ERR(topology_state);
 
 	conn_state = drm_atomic_get_old_connector_state(state, port->connector);
-	topology_state->pending_crtc_mask |= drm_crtc_mask(conn_state->crtc);
+	if (conn_state)
+		topology_state->pending_crtc_mask |= drm_crtc_mask(conn_state->crtc);
 
 	payload = drm_atomic_get_mst_payload_state(topology_state, port);
 	if (WARN_ON(!payload)) {
It fixes the immediate issue (NULL pointer dereference), but I'm not sure yet whether other issues remain.
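The idea behind the guard can be sketched outside the kernel. Below is a minimal, self-contained C illustration; all types and helpers in it (struct crtc, struct connector_state, struct topology_state, crtc_mask, mark_pending_crtc) are hypothetical stand-ins made up for this example, not the kernel's definitions. It assumes a layout where the crtc pointer is the second pointer-sized member, which on x86-64 puts it at offset 8. That would be consistent with the faulting address 0x8 in the Oops if the connector state pointer was NULL, but treat this as an illustration rather than a confirmed analysis.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the real DRM structures. */
struct crtc {
	unsigned int index;
};

struct connector_state {
	void *connector;   /* first pointer-sized member */
	struct crtc *crtc; /* second member: offset 8 on x86-64 */
};

struct topology_state {
	uint32_t pending_crtc_mask;
};

/* One bit per CRTC index, similar in spirit to drm_crtc_mask(). */
static uint32_t crtc_mask(const struct crtc *crtc)
{
	return 1u << crtc->index;
}

/*
 * Guarded update, mirroring the "if (conn_state)" check in the patch:
 * when no connector state is available, the mask is left untouched
 * instead of reading conn_state->crtc through a NULL pointer.
 */
static void mark_pending_crtc(struct topology_state *topo,
			      const struct connector_state *conn_state)
{
	if (conn_state)
		topo->pending_crtc_mask |= crtc_mask(conn_state->crtc);
}
```

Calling mark_pending_crtc() with a NULL connector state leaves pending_crtc_mask unchanged, while a valid state whose CRTC has index 2 sets bit 2. Like the patch, the sketch only avoids the dereference; whether skipping the mask update is the right behaviour for the MST code is a separate question.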
- Author
This patch is working for me with 6.2.1:
https://github.com/jlindgren90/linux/commit/8cf17c25e2d2644fa6dfc3d7de6b3b35689d4db0
No idea if it's the correct fix or not, but it fixes the drm_dp_atomic_find_time_slots crash and gets the external displays working.
edit: Hotplug works as well.
Edited by John Lindgren
Thanks for your effort :) I've added the patch on top of the branch https://gitlab.freedesktop.org/superm1/linux/-/commits/mlimonci/mst-6.3-backports-6.1/ linked above and can confirm that it mostly solves the issue for me as well.
For some reason, after booting or waking up from sleep all displays (including the built-in notebook display) still remain black, but reconnecting the dock helps in that case. At least the kernel doesn't freeze anymore, which is a lot of progress :D Hotplugging (with the notebook powered on) works reliably for me too.
- Author
> For some reason, after booting or waking up from sleep all displays (including the built-in notebook display) still remain black, but reconnecting the dock helps in that case.
With more testing, I've started seeing something similar to that also. I'll continue to investigate.
As @superm1 suggested in response to my comment on a different issue (#2181 (comment 2016373)), I've tried applying this patch on top of the 6.4.7 kernel. This brought back the freeze in drm_dp_atomic_find_time_slots for me.
However, applying your patch as well seems to finally solve the docking issue for me. Booting up while connected to the dock works, as do sleep and hot-plugging, and I get output on both external monitors.
- Author
For the remaining issue of the external displays not coming up at boot (until reconnecting the dock), simply reverting 4d07b0bc4034, along with the four commits that were supposed to "fix" it (see #2171 (closed)), fixes the issue for me.
It's not a clean revert (as of 6.2.1); there are several merge conflicts, and I didn't attempt to fix the ones in nouveau.
@doesnotcompete if you like, you can try the commits I've added to https://github.com/jlindgren90/linux/commits/v6.2.1-fixes and see if they help for you as well.
Hey, thanks a lot again :) Your branch works nicely for me. My two external displays now come up right after signing into my window manager session.
The only remaining, fairly minor issue is that my external monitors don't work before that, e.g. in the initramfs with early KMS for entering the disk encryption password, and in greetd, which is my login manager. Only the internal display can be used at this point. Output on external displays at these stages worked in 5.15 for me, but this may well be due to my particular setup.
Apart from that, everything works flawlessly