Suspend/resume issue for RX5600M/4800H (Dell G5 SE)
Brief summary of the problem:
Suspend resume hangs
Hardware description:
- Laptop model: Dell G5 SE 5505 with SmartShift
- CPU: AMD Ryzen 4800H
- GPU: RX5600M + Renoir
- System Memory: 16GB
- Display(s): Dell U2720Q
- Type of Display Connection: USB-C
System information:
- Distro name and Version: Linux Mint 20 (based on Ubuntu 20.04)
- Kernel version: 5.8-rc5 from mainline PPA (happens also with 5.7.8)
- AMD package version: No package
How to reproduce the issue:
The easiest way to reproduce is to boot the Fedora-Workstation-Live Rawhide image on the Dell G5 SE 5505 with the amdgpu.runpm=0 kernel parameter (to avoid crashing due to another bug) and try to suspend twice. The first suspend attempt fails, and the second hangs the machine completely. This happens whether or not an external monitor is connected.
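Since the reproduction depends on amdgpu.runpm=0 actually being in effect, it can help to confirm the parameter on the running kernel before suspending. The following is a minimal sketch, not part of the original report: the function names are hypothetical, and it assumes command-line values contain no spaces, so a plain whitespace split is enough.

```python
def parse_cmdline(text):
    """Parse a kernel command line into a dict.

    Assumption: values contain no spaces, so splitting on whitespace is safe.
    Bare flags (for example "rw") map to True; repeated keys keep the last value.
    """
    params = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        params[key] = value.strip('"') if sep else True
    return params


def runpm_disabled(params):
    """Return True if amdgpu.runpm=0 is present on the parsed command line."""
    return params.get("amdgpu.runpm") == "0"


def running_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read()
```

On the affected machine, `runpm_disabled(parse_cmdline(running_cmdline()))` should return True before attempting the two suspends.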
Attached files:
dmesg Xorg.0.log Xorg.0.log.old
Also, the machine locks up a couple of times a day with the amdgpu.runpm=0 kernel parameter. Without it, it's worse. Confirmed by other users as well: https://www.reddit.com/r/linuxhardware/comments/gu0ge2/dell_g5_15_se_amd_linux_compatibility_dgpu/
There doesn't seem to be an easy way to disable the dedicated GPU at boot time. Ideally there would be a kernel parameter that would allow disabling the specific PCI ID. Nor is there a way to remove it, as hotplug removal is not supported.
Activity
- Shai Coleman changed the description
I'm so happy that we're giving some love to this laptop (we're now quite a few reporting and fixing issues, both on kernel and on Dell's end). Do you have BIOS 1.3.0 that just came out? Do you use the latest linux-firmware (very important)? I personally don't need amdgpu.runpm=0 for everything to work fine here. But I have to remain on AC.
- Author
I've reproduced it again with the following: I've updated the BIOS to 1.3.0, and extracted the latest firmware from: https://drivers.amd.com/drivers/linux/amdgpu-pro-20.20-1098277-ubuntu-20.04.tar.xz
to /lib/firmware/amdgpu
Updating the BIOS and the firmware didn't fix any of the stability issues I was having. I'm attaching another kernel log: dmesg.txt
Just running "sensors" twice would hard lock the machine. An easy way to replicate these issues is to start the Fedora Rawhide live USB and launch Firefox. It will immediately hang. With amdgpu.runpm=0 it's more stable, but it still hard hangs after suspend.
Kernels 5.7.x/5.8.x would hang a couple of times per day. The only way I was able to get the machine stable was to revert to kernel 5.6.19, amdgpu.runpm=0, and to avoid suspending the machine.
Edited by Shai Coleman
Hey, I just wanted to tack on that I just got this laptop as well and am having the same issues on kernel 5.8.5 and BIOS 1.3.0. As of now the problems I have found are:
- amdgpu.runpm=0 is mandatory to avoid near-immediate lockup
- Resuming from suspend seems to work in that the screen comes back and it reconnects to wifi, but attempting to run any program causes it to lock up
- HDMI to an external monitor is unusably slow - workaround for this is to disable vsync, but that comes with the downside of, well, not having vsync
- Wifi card seems to be flaky. Can't maintain a stable connection to my VPN and transfer speed drops majorly after extended use. This could be a configuration issue though, need more testing
Other than those issues though, seems to work fine!
i8kutils works well for setting a custom fan curve to keep the temps low.
Edited by matoro
- Author
@Matoro, how did you disable vsync?
From https://old.reddit.com/r/linuxhardware/comments/gu0ge2/dell_g5_15_se_amd_linux_compatibility_dgpu/fwq0yiu/ , add to /etc/drirc:
<driconf>
  <device screen="0" driver="dri2">
    <application name="Default">
      <option name="vblank_mode" value="0" />
    </application>
  </device>
</driconf>
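To confirm that a drirc file really carries the vblank_mode override quoted above, the XML can be parsed and the option read back. This is only an illustrative sketch: the function name and the SAMPLE constant are hypothetical, and the sample simply repeats the snippet from this comment.

```python
import xml.etree.ElementTree as ET

# The drirc content quoted in the comment above.
SAMPLE = """
<driconf>
  <device screen="0" driver="dri2">
    <application name="Default">
      <option name="vblank_mode" value="0" />
    </application>
  </device>
</driconf>
"""


def vblank_mode(drirc_xml, application="Default"):
    """Return the vblank_mode value set for `application`, or None if absent."""
    root = ET.fromstring(drirc_xml)
    for app in root.iter("application"):
        if app.get("name") != application:
            continue
        for opt in app.iter("option"):
            if opt.get("name") == "vblank_mode":
                return opt.get("value")
    return None
```

Passing the contents of /etc/drirc to `vblank_mode` should return "0" once the workaround is in place.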
- Owner
Does setting pcie_port_pm=force on the kernel command line in grub allow you to remove amdgpu.runpm=0?
- Author
I tried it for a short while, and pcie_port_pm=force seems stable. However, it does seem to consume more power (10-25W compared to 10-11W with amdgpu.runpm=0) when running the sensors command. The sensors command also intermittently hangs for half a second or so with that setting.
Edited by Shai Coleman
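When comparing pcie_port_pm=force against amdgpu.runpm=0, it may help to look at the runtime power-management state the kernel reports for the dGPU. The sketch below reads the standard power/control and power/runtime_status sysfs attributes. The function name is hypothetical, and the address 0000:03:00.0 is an assumption taken from the amdgpu device that fails to resume in the dmesg output posted in this thread; it may differ on other machines.

```python
from pathlib import Path


def runtime_pm_state(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Read power/control and power/runtime_status for a PCI device.

    `bdf` is the PCI address, e.g. "0000:03:00.0" (assumed here to be the dGPU).
    A value is None when the attribute cannot be read.
    """
    power_dir = Path(sysfs_root) / bdf / "power"
    state = {}
    for attr in ("control", "runtime_status"):
        try:
            state[attr] = (power_dir / attr).read_text().strip()
        except OSError:
            state[attr] = None
    return state
```

For example, `runtime_pm_state("0000:03:00.0")` can be called before and after running the sensors command to see whether the device is being woken up.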
- Owner
For someone comfortable with building their own kernel, in the function pci_bridge_d3_possible() in drivers/pci/pci.c, can you check which case is returning false for the upstream and downstream pcie ports on the dGPU?
Okay, I actually ended up patching latest git instead of 5.8. The behavior is slightly different on the 5.9 trunk - I was able to open terminals, but opening Firefox was just a blank window and did not render. Here's what the debug spits out:
[ 0.693277] aaaa pci_bridge_d3_possible 0x1478
[ 0.693550] aaaa pci_bridge_d3_possible 0x1479
[ 0.696309] aaaa pci_bridge_d3_possible 0x1478
[ 0.696310] aaaa pci_bridge_d3_possible 0x1479
[ 0.696310] aaaa pci_bridge_d3_possible 0x1479
[ 1.487273] aaaa pci_bridge_d3_possible 0x1478
[ 1.487427] aaaa pci_bridge_d3_possible 0x1479
And here's the full dmesg from when I suspend and attempt to resume:
[ 105.251396] PM: suspend entry (s2idle) [ 105.254969] Filesystems sync: 0.003 seconds [ 105.255538] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 105.256671] OOM killer disabled. [ 105.256671] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 105.257822] printk: Suspending console(s) (use no_console_suspend to debug) [ 105.258088] wlan0: deauthenticating from 10:33:bf:67:de:30 by local choice (Reason: 3=DEAUTH_LEAVING) [ 105.349926] [drm] free PSP TMR buffer [ 105.399950] [drm] free PSP TMR buffer [ 105.441306] ACPI: EC: interrupt blocked [ 105.493500] ACPI: EC: interrupt unblocked [ 105.656663] amdgpu 0000:03:00.0: refused to change power state from D3hot to D0 [ 105.656835] snd_hda_intel 0000:03:00.1: refused to change power state from D3hot to D0 [ 105.658762] [drm] PCIE GART of 1024M enabled (table at 0x000000F400900000). [ 105.658793] [drm] PSP is resuming... [ 105.658857] [drm] PCIE GART of 512M enabled (table at 0x0000008000000000). [ 105.658890] [drm] PSP is resuming... [ 105.678836] [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR [ 105.690162] [drm] reserve 0x900000 from 0x800f400000 for PSP TMR [ 105.717127] nvme nvme0: 7/0/0 default/read/poll queues [ 105.796889] r8169 0000:05:00.0 enp5s0: Link is Down [ 105.860280] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available [ 105.880375] amdgpu 0000:03:00.0: amdgpu: SMU is resuming... [ 105.970330] ata1: SATA link down (SStatus 0 SControl 300) [ 105.970761] ata2: SATA link down (SStatus 0 SControl 300) [ 105.996609] amdgpu 0000:07:00.0: amdgpu: SMU is resuming... [ 105.996653] amdgpu 0000:07:00.0: amdgpu: dpm has been disabled [ 105.996671] amdgpu 0000:07:00.0: amdgpu: SMU is resumed successfully! [ 106.176768] [drm] kiq ring mec 2 pipe 1 q 0 [ 106.194932] [drm] DMUB hardware initialized: version=0x01000000 [ 106.318989] [drm:mod_hdcp_add_display_to_topology [amdgpu]] *ERROR* Failed to add display topology, DTM TA is not initialized. 
[ 106.318991] [drm] [Link 0] WARNING MOD_HDCP_STATUS_FAILURE IN STATE HDCP_UNINITIALIZED STAY COUNT 0 [ 106.338225] [drm] VCN decode and encode initialized successfully(under DPG Mode). [ 106.338623] [drm] JPEG decode initialized successfully. [ 106.338833] amdgpu 0000:07:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0 [ 106.338833] amdgpu 0000:07:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0 [ 106.338834] amdgpu 0000:07:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0 [ 106.338835] amdgpu 0000:07:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0 [ 106.338836] amdgpu 0000:07:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0 [ 106.338836] amdgpu 0000:07:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0 [ 106.338837] amdgpu 0000:07:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0 [ 106.338837] amdgpu 0000:07:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0 [ 106.338838] amdgpu 0000:07:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0 [ 106.338839] amdgpu 0000:07:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0 [ 106.338840] amdgpu 0000:07:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1 [ 106.338840] amdgpu 0000:07:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1 [ 106.338841] amdgpu 0000:07:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1 [ 106.338841] amdgpu 0000:07:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1 [ 106.338842] amdgpu 0000:07:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1 [ 107.996660] amdgpu 0000:03:00.0: amdgpu: failed send message: RunBtc (58) param: 0x00000000 response 0xffffffc2 [ 107.996662] amdgpu 0000:03:00.0: amdgpu: RunBtc failed! [ 107.996662] amdgpu 0000:03:00.0: amdgpu: Failed to setup smc hw! [ 107.996713] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <smu> failed -62 [ 107.996751] [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-62). 
[ 107.996756] PM: dpm_run_callback(): pci_pm_resume+0x0/0xe0 returns -62 [ 107.996759] PM: Device 0000:03:00.0 failed to resume async: error -62 [ 107.997498] OOM killer enabled. [ 107.997499] Restarting tasks ... done. [ 108.036905] PM: suspend exit [ 108.451202] wlan0: authenticate with 10:33:bf:67:de:30 [ 108.455470] wlan0: send auth to 10:33:bf:67:de:30 (try 1/3) [ 108.482061] wlan0: authenticated [ 108.483240] wlan0: associate with 10:33:bf:67:de:30 (try 1/3) [ 108.485758] wlan0: RX AssocResp from 10:33:bf:67:de:30 (capab=0x1511 status=0 aid=2) [ 108.490012] wlan0: associated [ 108.526965] wlan0: Limiting TX power to 127 (127 - 0) dBm as advertised by 10:33:bf:67:de:30 [ 143.090167] urxvt[6217]: segfault at 55822d0403e4 ip 00007f9b843394ed sp 00007fffa8ed2ec0 error 4 in libc-2.32.so[7f9b84320000+14d000] [ 143.090176] Code: fe ff 49 83 c5 02 41 0f b7 6d fe 49 89 c6 4c 8d 78 fe 4d 85 e4 75 13 eb cc 0f 1f 40 00 4c 8b 63 08 48 83 c3 08 4d 85 e4 74 bb <66> 41 3b 2c 24 75 ec 49 8d 7c 24 02 4c 89 fa 4c 89 ee e8 9c 70 fe [ 153.964740] urxvt[6516]: segfault at 558648fb8827 ip 00007f6dcecbc4ed sp 00007ffd906680b0 error 4 in libc-2.32.so[7f6dceca3000+14d000] [ 153.964749] Code: fe ff 49 83 c5 02 41 0f b7 6d fe 49 89 c6 4c 8d 78 fe 4d 85 e4 75 13 eb cc 0f 1f 40 00 4c 8b 63 08 48 83 c3 08 4d 85 e4 74 bb <66> 41 3b 2c 24 75 ec 49 8d 7c 24 02 4c 89 fa 4c 89 ee e8 9c 70 fe [ 166.273407] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=353, emitted seq=355 [ 166.273517] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0 [ 166.273525] amdgpu 0000:03:00.0: amdgpu: GPU reset begin! [ 166.273558] ------------[ cut here ]------------ [ 166.273558] kernel BUG at mm/slub.c:304! [ 166.273567] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 166.273571] CPU: 2 PID: 4838 Comm: kworker/2:3 Tainted: G W 5.9.0-rc5-1-git-00044-g4cbffc461ec9 #1 [ 166.273573] Hardware name: Dell Inc. 
G5 5505/0YHTJ7, BIOS 1.3.0 06/11/2020 [ 166.273578] Workqueue: events drm_sched_job_timedout [gpu_sched] [ 166.273586] RIP: 0010:__slab_free+0x2a9/0x4a0 [ 166.273588] Code: 4c 24 30 e8 29 fb ff ff 4c 8b 54 24 30 85 c0 0f 85 9c fd ff ff eb c6 41 f7 46 08 00 0d 21 00 0f 85 f2 fe ff ff e9 e4 fe ff ff <0f> 0b 80 4c 24 6b 80 45 31 ff e9 e3 fd ff ff 48 8d 65 d8 4c 89 e6 [ 166.273590] RSP: 0018:ffff929bc8b67c40 EFLAGS: 00010246 [ 166.273592] RAX: ffff8e575d2edcf0 RBX: ffff8e575d2edce0 RCX: ffff8e575d2edce0 [ 166.273594] RDX: 000000008080007f RSI: ffffc0331f74bb40 RDI: ffff8e576c043c00 [ 166.273595] RBP: ffff929bc8b67ce8 R08: 0000000000000001 R09: ffffffffc03dce46 [ 166.273596] R10: ffff8e575d2edce0 R11: 0000000000000001 R12: ffffc0331f74bb40 [ 166.273598] R13: ffff8e575d2edce0 R14: ffff8e576c043c00 R15: ffff8e576a9e00b0 [ 166.273600] FS: 0000000000000000(0000) GS:ffff8e576f480000(0000) knlGS:0000000000000000 [ 166.273601] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 166.273603] CR2: 00007f79eedbe020 CR3: 00000007d148c000 CR4: 0000000000350ee0 [ 166.273604] Call Trace: [ 166.273612] ? __flush_work.isra.0+0x189/0x210 [ 166.273714] ? kfd_gtt_sa_free+0x56/0x80 [amdgpu] [ 166.273717] kfree+0x24e/0x270 [ 166.273814] kfd_gtt_sa_free+0x56/0x80 [amdgpu] [ 166.273914] stop_cpsch+0x87/0xc0 [amdgpu] [ 166.274011] kgd2kfd_suspend.part.0+0x2f/0x40 [amdgpu] [ 166.274104] kgd2kfd_pre_reset+0x35/0x50 [amdgpu] [ 166.274226] amdgpu_device_gpu_recover.cold+0x208/0xefc [amdgpu] [ 166.274331] amdgpu_job_timedout+0x121/0x140 [amdgpu] [ 166.274337] drm_sched_job_timedout+0x64/0xe0 [gpu_sched] [ 166.274341] process_one_work+0x1da/0x3d0 [ 166.274344] worker_thread+0x4d/0x3d0 [ 166.274346] ? rescuer_thread+0x410/0x410 [ 166.274349] kthread+0x142/0x160 [ 166.274352] ? 
__kthread_bind_mask+0x60/0x60 [ 166.274357] ret_from_fork+0x1f/0x30 [ 166.274360] Modules linked in: rfcomm bnep usbhid btusb btrtl btbcm btintel bluetooth ecdh_generic ecc uvcvideo ccm videobuf2_vmalloc algif_aead videobuf2_memops videobuf2_v4l2 videobuf2_common des_generic videodev libdes algif_skcipher mc cmac 8021q md4 garp algif_hash mrp joydev stp mousedev af_alg llc msr iwlmvm edac_mce_amd snd_hda_codec_realtek hid_multitouch dell_wmi mac80211 dell_laptop hid_generic kvm_amd snd_hda_codec_generic dell_smbios uas alienware_wmi wmi_bmof sparse_keymap dell_wmi_descriptor usb_storage dcdbas libarc4 ledtrig_audio snd_hda_codec_hdmi kvm snd_hda_intel snd_intel_dspcfg snd_hda_codec iwlwifi snd_hda_core irqbypass r8169 crct10dif_pclmul crc32_pclmul snd_hwdep nls_iso8859_1 realtek snd_pcm ghash_clmulni_intel nls_cp437 aesni_intel mdio_devres vfat snd_timer of_mdio crypto_simd cryptd fixed_phy glue_helper snd sp5100_tco psmouse rapl input_leds ccp fat i2c_piix4 cfg80211 k10temp libphy snd_pci_acp3x soundcore tpm_crb ucsi_acpi typec_ucsi typec wmi tpm_tis evdev [ 166.274418] i2c_hid tpm_tis_core battery mac_hid tpm dell_rbtn hid rfkill rng_core pinctrl_amd acpi_cpufreq acpi_tad ac pkcs8_key_parser dell_smm_hwmon crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 serio_raw atkbd libps2 crc32c_intel xhci_pci xhci_hcd i8042 serio amdgpu gpu_sched i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm agpgart [ 166.274445] ---[ end trace 4654c7902ca3b020 ]--- [ 166.274449] RIP: 0010:__slab_free+0x2a9/0x4a0 [ 166.274451] Code: 4c 24 30 e8 29 fb ff ff 4c 8b 54 24 30 85 c0 0f 85 9c fd ff ff eb c6 41 f7 46 08 00 0d 21 00 0f 85 f2 fe ff ff e9 e4 fe ff ff <0f> 0b 80 4c 24 6b 80 45 31 ff e9 e3 fd ff ff 48 8d 65 d8 4c 89 e6 [ 166.274452] RSP: 0018:ffff929bc8b67c40 EFLAGS: 00010246 [ 166.274454] RAX: ffff8e575d2edcf0 RBX: ffff8e575d2edce0 RCX: ffff8e575d2edce0 [ 166.274455] RDX: 000000008080007f RSI: ffffc0331f74bb40 RDI: 
ffff8e576c043c00 [ 166.274457] RBP: ffff929bc8b67ce8 R08: 0000000000000001 R09: ffffffffc03dce46 [ 166.274458] R10: ffff8e575d2edce0 R11: 0000000000000001 R12: ffffc0331f74bb40 [ 166.274459] R13: ffff8e575d2edce0 R14: ffff8e576c043c00 R15: ffff8e576a9e00b0 [ 166.274461] FS: 0000000000000000(0000) GS:ffff8e576f480000(0000) knlGS:0000000000000000 [ 166.274462] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 166.274464] CR2: 00007f79eedbe020 CR3: 00000007d148c000 CR4: 0000000000350ee0 [ 171.397994] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=43, emitted seq=45 [ 171.399955] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0 [ 171.399983] amdgpu 0000:03:00.0: amdgpu: GPU reset begin! [ 171.399986] [drm] Bailing on TDR for s_job:2b, as another already in progress [ 171.399994] BUG: kernel NULL pointer dereference, address: 0000000000000020 [ 171.399996] #PF: supervisor write access in kernel mode [ 171.399997] #PF: error_code(0x0002) - not-present page [ 171.399998] PGD 0 P4D 0 [ 171.400001] Oops: 0002 [#2] PREEMPT SMP NOPTI [ 171.400003] CPU: 3 PID: 576 Comm: kworker/3:2 Tainted: G D W 5.9.0-rc5-1-git-00044-g4cbffc461ec9 #1 [ 171.400004] Hardware name: Dell Inc. 
G5 5505/0YHTJ7, BIOS 1.3.0 06/11/2020 [ 171.400012] Workqueue: events drm_sched_job_timedout [gpu_sched] [ 171.400019] RIP: 0010:mutex_unlock+0x13/0x30 [ 171.400021] Code: 48 8b 43 10 48 89 e7 48 8b 70 10 e8 a7 eb 6a ff eb a1 e8 70 47 ff ff 0f 1f 44 00 00 65 48 8b 14 25 c0 7b 01 00 31 c9 48 89 d0 <f0> 48 0f b1 0f 48 39 c2 74 05 e9 ce fe ff ff c3 66 66 2e 0f 1f 84 [ 171.400023] RSP: 0018:ffff929bc3e33d98 EFLAGS: 00010246 [ 171.400025] RAX: ffff8e575b8ddb80 RBX: ffff8e5732acac00 RCX: 0000000000000000 [ 171.400028] RDX: ffff8e575b8ddb80 RSI: ffffffff9293c903 RDI: 0000000000000020 [ 171.400029] RBP: 0000000000000000 R08: 00000027e83c3ba9 R09: 0000000000000041 [ 171.400030] R10: 0000000000000240 R11: ffffffff9293c8e3 R12: ffff8e575dc00000 [ 171.400031] R13: ffff8e575dc00000 R14: ffff8e5732acac00 R15: 0000000000000000 [ 171.400033] FS: 0000000000000000(0000) GS:ffff8e576f4c0000(0000) knlGS:0000000000000000 [ 171.400035] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 171.400036] CR2: 0000000000000020 CR3: 00000007b4faa000 CR4: 0000000000350ee0 [ 171.400038] Call Trace: [ 171.400122] amdgpu_device_gpu_recover.cold+0x3c6/0xefc [amdgpu] [ 171.400215] amdgpu_job_timedout+0x121/0x140 [amdgpu] [ 171.400224] drm_sched_job_timedout+0x64/0xe0 [gpu_sched] [ 171.400230] process_one_work+0x1da/0x3d0 [ 171.400233] worker_thread+0x4d/0x3d0 [ 171.400235] ? rescuer_thread+0x410/0x410 [ 171.400236] kthread+0x142/0x160 [ 171.400240] ? 
__kthread_bind_mask+0x60/0x60 [ 171.400243] ret_from_fork+0x1f/0x30 [ 171.400246] Modules linked in: rfcomm bnep usbhid btusb btrtl btbcm btintel bluetooth ecdh_generic ecc uvcvideo ccm videobuf2_vmalloc algif_aead videobuf2_memops videobuf2_v4l2 videobuf2_common des_generic videodev libdes algif_skcipher mc cmac 8021q md4 garp algif_hash mrp joydev stp mousedev af_alg llc msr iwlmvm edac_mce_amd snd_hda_codec_realtek hid_multitouch dell_wmi mac80211 dell_laptop hid_generic kvm_amd snd_hda_codec_generic dell_smbios uas alienware_wmi wmi_bmof sparse_keymap dell_wmi_descriptor usb_storage dcdbas libarc4 ledtrig_audio snd_hda_codec_hdmi kvm snd_hda_intel snd_intel_dspcfg snd_hda_codec iwlwifi snd_hda_core irqbypass r8169 crct10dif_pclmul crc32_pclmul snd_hwdep nls_iso8859_1 realtek snd_pcm ghash_clmulni_intel nls_cp437 aesni_intel mdio_devres vfat snd_timer of_mdio crypto_simd cryptd fixed_phy glue_helper snd sp5100_tco psmouse rapl input_leds ccp fat i2c_piix4 cfg80211 k10temp libphy snd_pci_acp3x soundcore tpm_crb ucsi_acpi typec_ucsi typec wmi tpm_tis evdev [ 171.400299] i2c_hid tpm_tis_core battery mac_hid tpm dell_rbtn hid rfkill rng_core pinctrl_amd acpi_cpufreq acpi_tad ac pkcs8_key_parser dell_smm_hwmon crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 serio_raw atkbd libps2 crc32c_intel xhci_pci xhci_hcd i8042 serio amdgpu gpu_sched i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm agpgart [ 171.400325] CR2: 0000000000000020 [ 171.400329] ---[ end trace 4654c7902ca3b021 ]--- [ 171.400333] RIP: 0010:__slab_free+0x2a9/0x4a0 [ 171.400336] Code: 4c 24 30 e8 29 fb ff ff 4c 8b 54 24 30 85 c0 0f 85 9c fd ff ff eb c6 41 f7 46 08 00 0d 21 00 0f 85 f2 fe ff ff e9 e4 fe ff ff <0f> 0b 80 4c 24 6b 80 45 31 ff e9 e3 fd ff ff 48 8d 65 d8 4c 89 e6 [ 171.400338] RSP: 0018:ffff929bc8b67c40 EFLAGS: 00010246 [ 171.400340] RAX: ffff8e575d2edcf0 RBX: ffff8e575d2edce0 RCX: ffff8e575d2edce0 [ 171.400341] RDX: 
000000008080007f RSI: ffffc0331f74bb40 RDI: ffff8e576c043c00 [ 171.400343] RBP: ffff929bc8b67ce8 R08: 0000000000000001 R09: ffffffffc03dce46 [ 171.400343] R10: ffff8e575d2edce0 R11: 0000000000000001 R12: ffffc0331f74bb40 [ 171.400347] R13: ffff8e575d2edce0 R14: ffff8e576c043c00 R15: ffff8e576a9e00b0 [ 171.400349] FS: 0000000000000000(0000) GS:ffff8e576f4c0000(0000) knlGS:0000000000000000 [ 171.400350] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 171.400351] CR2: 0000000000000020 CR3: 00000007b4faa000 CR4: 0000000000350ee0
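For anyone comparing their own logs against this one, the failed resume can be spotted mechanically. The sketch below is a hypothetical helper, not something used in this thread: it assumes one message per line, as dmesg normally prints them, and looks for three substrings that appear in the log above. The error code reported here, -62, is -ETIME on Linux.

```python
import re

# Substrings from the log above that mark a failed resume.
SIGNATURES = (
    "refused to change power state",
    "failed to resume",
    "GPU reset begin",
)

RESUME_ERROR = re.compile(
    r"PM: Device (\S+) failed to resume(?: async)?: error (-?\d+)"
)


def scan_dmesg(text):
    """Return (matching_lines, {device: error_code}) for a dmesg dump."""
    hits, errors = [], {}
    for line in text.splitlines():
        if any(sig in line for sig in SIGNATURES):
            hits.append(line.strip())
        match = RESUME_ERROR.search(line)
        if match:
            errors[match.group(1)] = int(match.group(2))
    return hits, errors
```

Feeding it the output of `dmesg` after a resume attempt gives a short list of suspicious lines plus the per-device resume error codes.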
- Owner
Did you test without pcie_port_pm=force? If not, can you? If so, please try this patch without pcie_port_pm=force.
Here is my command line for both runs:
initrd=\amd-ucode.img initrd=\initramfs-linux-git.img root="UUID=204e5984-ae7f-41ab-9bbd-03f2ba2109d1" rw amdgpu.runpm=0 mitigations=off
And here's the results from v2 patch:
[ 0.688125] aaaa 0x1633 is_hotplug_bridge
[ 0.690427] aaaa 0x731f default
[ 0.690619] aaaa 0xab38 default
[ 0.690899] aaaa 0x5762 default
[ 0.691186] aaaa 0x8168 default
[ 0.691573] aaaa 0x2723 default
[ 0.691905] aaaa 0x1636 default
[ 0.692026] aaaa 0x1637 default
[ 0.692127] aaaa 0x15df default
[ 0.692232] aaaa 0x1639 default
[ 0.692341] aaaa 0x1639 default
[ 0.692439] aaaa 0x15e2 default
[ 0.692538] aaaa 0x15e3 default
[ 0.692724] aaaa 0x7901 default
[ 0.692833] aaaa 0x7901 default
[ 0.692925] aaaa 0x1633 is_hotplug_bridge
[ 1.483069] aaaa 0x1633 is_hotplug_bridge
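The debug output above can be summarized by grouping device IDs by the branch they reported. This is a small sketch with a hypothetical function name; it assumes the ad hoc "aaaa <device id> <reason>" format printed by the v2 debug patch and nothing else about the patch itself.

```python
import re
from collections import defaultdict

# Matches the ad hoc debug format: "aaaa 0x1633 is_hotplug_bridge"
DEBUG_LINE = re.compile(r"aaaa (0x[0-9a-fA-F]+) (\w+)")


def group_by_reason(text):
    """Map each reported reason to the sorted list of device IDs that hit it."""
    groups = defaultdict(set)
    for dev_id, reason in DEBUG_LINE.findall(text):
        groups[reason].add(dev_id)
    return {reason: sorted(ids) for reason, ids in groups.items()}
```

Applied to the output above, the only device ID in the is_hotplug_bridge group is 0x1633; every other ID falls into default.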
- Owner
Does the attached patch fix it?
It fixes this problem for me, I can suspend and resume without any problems now.
@Matoro: Does this alternative patch work for you as well?
- Owner
Can you provide an ACPI dump for your system?
$ acpidump > acpidump.out
- Coleman Kane mentioned in issue #1307 (closed)