Failed to terminate hdcp ta during suspend (s3) on Xen
NOTE: I tried mailing this in, but I'm not sure whether the mailing list is read by the right people: https://www.spinics.net/lists/amd-gfx/msg70998.html
Brief summary of the problem:
I'm trying to solve problems during Suspend/Resume on Qubes OS (which is running Xen).
Resume itself works, but the screen blanks out each time I type a letter on the keyboard and then comes back. After a while the screen just goes black.
If I boot the same kernel without Xen this issue is not present.
Running a kernel close to Linux tip with Xen 4.14.3-4, I get the following errors/warnings:
dom0 kernel: [drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x0)
dom0 kernel: [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x0)
dom0 kernel: [drm:psp_suspend [amdgpu]] *ERROR* Failed to terminate hdcp ta
dom0 kernel: [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <psp> failed -22
[...]
dom0 kernel: [drm:psp_hw_start [amdgpu]] *ERROR* PSP create ring failed!
dom0 kernel: [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed
dom0 kernel: [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62
dom0 kernel: amdgpu 0000:07:00.0: amdgpu: amdgpu_device_ip_resume failed (-62).
dom0 kernel: PM: dpm_run_callback(): pci_pm_resume+0x0/0xe0 returns -62
dom0 kernel: amdgpu 0000:07:00.0: PM: failed to resume async: error -62
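For reference, the numeric codes in these messages are negated errno values. The small helper below is only a sketch for decoding them outside the kernel; `errno_name` is a hypothetical name, not something from amdgpu, and the numbers assume Linux errno numbering (which the static_assert checks at compile time).

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Assumption: Linux errno numbering, as used by the kernel that produced the log. */
static_assert(ENODEV == 19 && EINVAL == 22 && ETIME == 62,
              "errno values differ from the Linux numbering assumed here");

/* Hypothetical helper: map a negative kernel-style return code to its errno name. */
static const char *errno_name(int ret)
{
    switch (-ret) {
    case ENODEV: return "ENODEV"; /* -19: the xen_acpi_processor hypervisor errors */
    case EINVAL: return "EINVAL"; /* -22: suspend of IP block <psp> */
    case ETIME:  return "ETIME";  /* -62: resume of IP block <psp> */
    default:     return "unknown";
    }
}
```

So the suspend failure (-22) is -EINVAL and the resume failure (-62) is -ETIME, i.e. a timer expiry rather than an invalid argument.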
I have done my best to follow the source code. My current analysis, following the 'Failed to terminate hdcp ta' path, is:
The psp gfx warning (amdgpu_psp.c: psp_cmd_submit_buf, lines 491-493):
DRM_WARN("psp gfx command %s(0x%X) failed and response status is (0x%X)\n",
         psp_gfx_cmd_name(psp->cmd_buf_mem->cmd_id),
         psp->cmd_buf_mem->cmd_id,
         psp->cmd_buf_mem->resp.status);
The last assignment to ret (amdgpu_psp.c: psp_cmd_submit_buf, line 452):
ret = psp_ring_cmd_submit(psp, psp->cmd_buf_mc_addr, fence_mc_addr, index);
The line setting -EINVAL (amdgpu_psp.c: psp_ring_cmd_submit, lines 2767-2771):
if ((write_frame < ring_buffer_start) || (ring_buffer_end < write_frame)) {
        DRM_ERROR("ring_buffer_start = %p; ring_buffer_end = %p; write_frame = %p\n",
                  ring_buffer_start, ring_buffer_end, write_frame);
        DRM_ERROR("write_frame is pointing to address out of bounds\n");
        return -EINVAL;
}
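To make the condition concrete, here is a standalone model of just that bounds check. It is a sketch, not the driver code: `struct frame` and `check_write_frame` are hypothetical stand-ins, and only the comparison itself is taken from the snippet quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's ring frame type; only its size matters. */
struct frame {
    uint32_t dw[16];
};

/*
 * Model of the quoted check: the write frame must lie within
 * [ring_buffer_start, ring_buffer_end], both ends inclusive.
 * Returns 0 when in bounds, -EINVAL otherwise.
 */
static int check_write_frame(const struct frame *ring_buffer_start,
                             const struct frame *ring_buffer_end,
                             const struct frame *write_frame)
{
    if ((write_frame < ring_buffer_start) || (ring_buffer_end < write_frame))
        return -EINVAL;
    return 0;
}
```

In this model the only way to get -EINVAL is a write pointer outside the ring, and in the driver that return is preceded by the two DRM_ERROR lines.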
What prompted this report is that I'm not getting those DRM_ERROR messages in my dmesg. I see that they have existed in the source for 2 years, so if this were the path returning -EINVAL I really should have them.
Any clues? Is there anything I can test or add, for example printk's somewhere that would help diagnose this?
Hardware description:
- CPU: AMD Ryzen 7 Pro 4800H
- GPU: Renoir
- System Memory: 48G
- Type of Display Connection: eDP
System information:
- Distro name and Version: Qubes OS R4.1
- Kernel version: 5.16-rc1
- Custom kernel: -
- AMD official driver version: -
How to reproduce the issue:
systemctl suspend
Attached files:
Log files (for system lockups / game freezes / crashes)
- Dmesg log
dom0 kernel: PM: suspend entry (deep)
dom0 kernel: Filesystems sync: 0.010 seconds
dom0 kernel: Freezing user space processes ... (elapsed 0.001 seconds) done.
dom0 kernel: OOM killer disabled.
dom0 kernel: Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done.
dom0 kernel: printk: Suspending console(s) (use no_console_suspend to debug)
dom0 kernel: [drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x0)
dom0 kernel: [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x0)
dom0 kernel: [drm:psp_suspend [amdgpu]] *ERROR* Failed to terminate hdcp ta
dom0 kernel: [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <psp> failed -22
dom0 kernel: PM: suspend devices took 4.705 seconds
dom0 kernel: ACPI: EC: interrupt blocked
dom0 kernel: ACPI: PM: Preparing to enter system sleep state S3
dom0 kernel: ACPI: EC: event blocked
dom0 kernel: ACPI: EC: EC stopped
dom0 kernel: ACPI: PM: Saving platform NVS memory
dom0 kernel: Disabling non-boot CPUs ...
dom0 kernel: josef-debug: cr3: 2810000, build_cr3: 1036aa000, (ffff8881036aa000, 0)
dom0 kernel: smpboot: CPU 1 is now offline
dom0 kernel: josef-debug: cr3: 2810000, build_cr3: 1036aa000, (ffff8881036aa000, 0)
dom0 kernel: smpboot: CPU 2 is now offline
dom0 kernel: josef-debug: cr3: 2810000, build_cr3: 1036aa000, (ffff8881036aa000, 0)
dom0 kernel: smpboot: CPU 3 is now offline
dom0 kernel: josef-debug: cr3: 2810000, build_cr3: 51f4000, (ffff8880051f4000, 0)
dom0 kernel: smpboot: CPU 4 is now offline
dom0 kernel: josef-debug: cr3: 2810000, build_cr3: 1036aa000, (ffff8881036aa000, 0)
dom0 kernel: smpboot: CPU 5 is now offline
dom0 kernel: josef-debug: cr3: 2810000, build_cr3: 1036aa000, (ffff8881036aa000, 0)
dom0 kernel: smpboot: CPU 6 is now offline
dom0 kernel: josef-debug: cr3: 2810000, build_cr3: 1036aa000, (ffff8881036aa000, 0)
dom0 kernel: smpboot: CPU 7 is now offline
dom0 kernel: ACPI: PM: Low-level resume complete
dom0 kernel: ACPI: EC: EC started
dom0 kernel: ACPI: PM: Restoring platform NVS memory
dom0 kernel: xen_acpi_processor: Uploading Xen processor PM info
dom0 kernel: xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU1
dom0 kernel: xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU3
dom0 kernel: xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU5
dom0 kernel: xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU7
dom0 kernel: xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU9
dom0 kernel: xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU11
dom0 kernel: xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU13
dom0 kernel: xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU15
dom0 kernel: Enabling non-boot CPUs ...
dom0 kernel: installing Xen timer for CPU 1
dom0 kernel: cpu 1 spinlock event irq 67
dom0 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
dom0 kernel: ACPI: \_SB_.PLTF.C001: Found 3 idle states
dom0 kernel: ACPI: FW issue: working around C-state latencies out of order
dom0 kernel: CPU1 is up
dom0 kernel: installing Xen timer for CPU 2
dom0 kernel: cpu 2 spinlock event irq 73
dom0 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
dom0 kernel: ACPI: \_SB_.PLTF.C002: Found 3 idle states
dom0 kernel: ACPI: FW issue: working around C-state latencies out of order
dom0 kernel: CPU2 is up
dom0 kernel: installing Xen timer for CPU 3
dom0 kernel: cpu 3 spinlock event irq 79
dom0 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
dom0 kernel: ACPI: \_SB_.PLTF.C003: Found 3 idle states
dom0 kernel: ACPI: FW issue: working around C-state latencies out of order
dom0 kernel: CPU3 is up
dom0 kernel: installing Xen timer for CPU 4
dom0 kernel: cpu 4 spinlock event irq 85
dom0 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
dom0 kernel: ACPI: \_SB_.PLTF.C004: Found 3 idle states
dom0 kernel: ACPI: FW issue: working around C-state latencies out of order
dom0 kernel: CPU4 is up
dom0 kernel: installing Xen timer for CPU 5
dom0 kernel: cpu 5 spinlock event irq 91
dom0 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
dom0 kernel: ACPI: \_SB_.PLTF.C005: Found 3 idle states
dom0 kernel: ACPI: FW issue: working around C-state latencies out of order
dom0 kernel: CPU5 is up
dom0 kernel: installing Xen timer for CPU 6
dom0 kernel: cpu 6 spinlock event irq 97
dom0 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
dom0 kernel: ACPI: \_SB_.PLTF.C006: Found 3 idle states
dom0 kernel: ACPI: FW issue: working around C-state latencies out of order
dom0 kernel: CPU6 is up
dom0 kernel: installing Xen timer for CPU 7
dom0 kernel: cpu 7 spinlock event irq 103
dom0 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
dom0 kernel: ACPI: \_SB_.PLTF.C007: Found 3 idle states
dom0 kernel: ACPI: FW issue: working around C-state latencies out of order
dom0 kernel: CPU7 is up
dom0 kernel: ACPI: PM: Waking up from system sleep state S3
dom0 kernel: ACPI: EC: interrupt unblocked
dom0 kernel: ACPI: EC: event unblocked
dom0 kernel: [drm] PCIE GART of 1024M enabled.
dom0 kernel: [drm] PTB located at 0x000000F400900000
dom0 kernel: [drm] PSP is resuming...
dom0 kernel: usb usb2: root hub lost power or was reset
dom0 kernel: usb usb3: root hub lost power or was reset
dom0 kernel: nvme nvme0: 8/0/0 default/read/poll queues
dom0 kernel: usb 4-4: reset full-speed USB device number 5 using xhci_hcd
dom0 kernel: [drm:psp_hw_start [amdgpu]] *ERROR* PSP create ring failed!
dom0 kernel: [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed
dom0 kernel: [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62
dom0 kernel: amdgpu 0000:07:00.0: amdgpu: amdgpu_device_ip_resume failed (-62).
dom0 kernel: PM: dpm_run_callback(): pci_pm_resume+0x0/0xe0 returns -62
dom0 kernel: amdgpu 0000:07:00.0: PM: failed to resume async: error -62
dom0 kernel: usb 2-2: reset high-speed USB device number 2 using xhci_hcd
dom0 kernel: PM: resume devices took 0.546 seconds
dom0 kernel: OOM killer enabled.