CML-R + TGP-H all video output lost after resume from S3

Adding one comment: this issue can be reproduced on Intel RVP (CML CPU + generic Ubuntu + 5.10.0-6008-oem). To reproduce, disable modern standby on the RVP: Intel Advanced Menu / ACPI Settings / Low Power S0 Idle Capability, set to Disabled.

assigned to @hariomp

We are not able to reproduce this issue on Intel RVP with CML-S CPU + TGP PCH. Can I request a conf call with the test/dev engineer from your side to understand the steps to reproduce?

For us, the Intel RVP is going into S3 & resuming fine with no loss of video/audio.

@hariomp We have a daily call and you can join it. BTW, you previously mentioned that on your RVP the system auto-resumes, but your latest comment says S3 works well. Did anything change in the test environment?

@sunnycrown

I have locally built drm-tip kernel debs with the video and audio patches. Could you ask the ODM to test them on Intel RVP to see if we are aligned?

linux-image-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd64.deb
linux-headers-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd64.deb

The md5sum of both debs:
26b3d8fdf7fd84a38f142dd26a921c24 linux-headers-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd64.deb
b9151d25d2d46b435c89d9055667fb3c linux-image-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd64.deb
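For anyone downloading these debs, the checksums above can be checked before installing. The following is a minimal sketch, assuming both files sit in the current directory; the helper names `md5_of` and `verify_debs` are hypothetical and exist only for this illustration.

```python
import hashlib
from pathlib import Path

# Checksums copied from the comment above.
EXPECTED_MD5 = {
    "linux-headers-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd64.deb":
        "26b3d8fdf7fd84a38f142dd26a921c24",
    "linux-image-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd64.deb":
        "b9151d25d2d46b435c89d9055667fb3c",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_debs(directory="."):
    """Map each expected file name to True (match), False (mismatch),
    or None (file not found in `directory`)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, ok in verify_debs().items():
        print(f"{labels[ok]:8} {name}")
```

A mismatch or a missing file means the ODM and Intel may not be testing the same build, which matters here because the two sides saw different results on the RVP.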

@kaichuan.hsieh With this test kernel, today Wistron reproduced the S3 issue on Intel's RVP.

@hariomp

hello,

I created kernel debs: linux-image-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd.deb + linux-headers-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd64.deb.

It is drm-tip, based on commit drm-tip: 2021y-01m-12d-13h-59m-44s UTC integration manifest + 0001-ALSA-hda-Add-Cometlake-R-PCI-ID.patch + 0002-drm-i915-gen9_bc-Add-TGP-PCH-support.patch.

The ODM reports that they can reproduce the issue on RVP with my test debs. If you think the patch is valid, can you also try it on your RVP? I have also attached the .config I used to build the kernel: build-config

Please help us to clarify, since you said it works on your RVP, maybe we can align the environment to see what might be missing on Dell's CML-S refreshed + TGP platform.

Thanks,

Hi @kaichuan.hsieh

Could you please double confirm on GF AIO? I just tried the patch but found there is no display on external monitor on RKL config.

@yu081030

May I know which kernel debs you are referring to?

Here are the builds and their purposes:

1. linux-image-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd.deb + linux-headers-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd64.deb

This is for testing on Intel RVP, not for the Dell project; it doesn't contain the patch for AIO RKL.

2. https://launchpad.net/~kchsieh/+archive/ubuntu/bug-1905814

This is the kernel we are going to release in the AIO IEVR image. Are you talking about this one? I just verified that it works on Giraffe UMA with both the built-in and external displays.

3. https://bugs.launchpad.net/somerville/+bug/1905814/comments/25

This is our HWE-built kernel, but it only contains the video patch. I took that patch and added the audio patch to construct build 2.

Please specify which build you are referring to.

Thanks,

Can we document the topology of the RVP setup? Like what monitor is connected to which port, are there dongles used, etc.? And maybe also state the version of the BIOS?

It seems that when we use the same OS image on 2 different RVPs, and one can reproduce and the other one can't, that we should make sure that everything else is the same too.

@hariomp @cjechlitschek @sunnycrown Here are the ODM's steps and BIOS settings on Intel RVP: gitlab_2915_ODM_steps_on_RVP.pdf

@hariomp @cjechlitschek Can you reproduce the issue on Intel RVP? We can reproduce it on Intel RVP "Rocket Lake S UDIMM 6L RVP" with a "CML-R/CML" CPU.

I cannot reproduce this issue on the SO DIMM RVP. Here are my HDMI results and references (pw: intel).
The kernel is 1009 (ppa:kchsieh/bug-1905814), on top of 20.04 generic.

SODIMM_RVP_result.7z

Note: DP port status still needs to be tested.

Updated: DP port resuming from S3 shows black-screen too on SO DIMM RVP.

Hi Kai-chuan,

Thanks for letting me know. I was using #1; looks like I should use #2.

@kaichuan.hsieh Could you share what I need to install from your PPA? @sunnycrown would like me to double check too.
https://launchpad.net/~kchsieh/+archive/ubuntu/bug-1905814

@yu081030

Please disable secure boot, since the kernel is not signed.

Steps:

$ sudo add-apt-repository ppa:kchsieh/bug-1905814

$ sudo apt-get update

$ sudo apt install linux-oem-5.10-headers-5.10.0-1009

$ sudo apt install linux-headers-5.10.0-1009-oem

$ sudo apt install linux-modules-5.10.0-1009-oem

$ sudo apt install linux-image-unsigned-5.10.0-1009-oem

$ sudo reboot

@sunnycrown

Could you describe the RVP test environment in detail? This was requested in #2915 (comment 765480).

Thanks,

@kaichuan.hsieh, could you please apply the debug patch on top of these 2 changes (0001-ALSA-hda-Add-Cometlake-R-PCI-ID.patch, 0002-drm-i915-gen9_bc-Add-TGP-PCH-support.patch) and share the kernel log with me? Debug patch: 0001-debug-drm-i915-add-more-printk-for-debug.patch

@ShawnC

Here is the kernel log with your patch. journalctl.log

mentioned in issue #2925 (closed)

Updates on our observation on Intel RVP...

This issue does not happen with HDMI; it is observed only on the DP port (this observation matches the log received from Kaichuan in this ticket). @kaichuan.hsieh, can you confirm that you have never seen this issue on the HDMI port?

From the preliminary analysis, we see that the connector state is not restored properly for DP across a suspend-resume cycle.

Below is the line from the Canonical log which shows the DP connector status changing from connected to disconnected during resume, resulting in a blank display:
08 11:23:13 CANONICALID kernel: [drm:drm_helper_hpd_irq_event [drm_kms_helper]] [CONNECTOR:112:DP-2] status updated from connected to disconnected

We observe the same in the logs collected on our RVP for the DP case:
Jan 15 15:51:10 rkl-Rocket-Lake-Client-Platform kernel: [ 56.437496] [drm:drm_helper_hpd_irq_event] [CONNECTOR:116:DP-2] status updated from connected to disconnected
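To compare logs across setups, lines of this form can be pulled out of a saved kernel log automatically. This is a rough sketch, not a tool used in this thread: the function names are hypothetical, and the regular expression only assumes the message format seen in the two log lines quoted above.

```python
import re

# Matches the hotplug status message format quoted in the logs above.
HPD_STATUS_RE = re.compile(
    r"\[CONNECTOR:(?P<id>\d+):(?P<name>[^\]]+)\]\s+"
    r"status updated from (?P<old>\w+) to (?P<new>\w+)"
)


def find_status_changes(lines):
    """Return (connector_id, name, old_status, new_status) for each match."""
    changes = []
    for line in lines:
        m = HPD_STATUS_RE.search(line)
        if m:
            changes.append((int(m["id"]), m["name"], m["old"], m["new"]))
    return changes


def lost_connectors(lines):
    """Keep only transitions from connected to disconnected."""
    return [
        change for change in find_status_changes(lines)
        if change[2] == "connected" and change[3] == "disconnected"
    ]


# The two log lines quoted in this comment.
SAMPLE_LOG = [
    "08 11:23:13 CANONICALID kernel: [drm:drm_helper_hpd_irq_event "
    "[drm_kms_helper]] [CONNECTOR:112:DP-2] status updated from connected "
    "to disconnected",
    "Jan 15 15:51:10 rkl-Rocket-Lake-Client-Platform kernel: [ 56.437496] "
    "[drm:drm_helper_hpd_irq_event] [CONNECTOR:116:DP-2] status updated "
    "from connected to disconnected",
]

if __name__ == "__main__":
    for conn_id, name, old, new in lost_connectors(SAMPLE_LOG):
        print(f"connector {conn_id} ({name}): {old} -> {new}")
```

Running the same filter over logs from the HDMI and DP cases would show whether the HDMI connector goes through the same transition on resume.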

Issue is still under analysis to identify the root cause of this. I will keep this ticket updated.

@sunnycrown

Could you please request ODM to see if we are aligned with @hariomp ?

Thanks,

@hariomp

ODM reports that HDMI port of RVP can also reproduce this issue.

Consolidated status:
Wistron RVP (UDIMM): the issue is observed on both the DP and HDMI ports.
Pegatron RVP (SODIMM): the HDMI port works well; the issue is observed on the DP port.

@hariomp Any further update?

@kaichuan.hsieh On Wistron's RVP, with Modern Standby enabled (the BIOS default setting), the issue cannot be reproduced on either HDMI or DP. With Modern Standby disabled (switched to legacy S3), the issue can be reproduced on both HDMI and DP.
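Since the result depends on the suspend mode, it may help to record which mode the kernel will actually use before each test run. Linux exposes this in `/sys/power/mem_sleep`, where the selected mode is shown in square brackets (for example `s2idle [deep]`). The sketch below parses that file; the function names are hypothetical, and it assumes the BIOS Low Power S0 Idle setting is what changes the kernel's default here.

```python
from pathlib import Path

MEM_SLEEP_PATH = Path("/sys/power/mem_sleep")

# Human-readable labels for the modes the kernel documents for mem_sleep.
MODE_LABELS = {
    "s2idle": "suspend-to-idle (S0ix / Modern Standby path)",
    "shallow": "standby (ACPI S1)",
    "deep": "suspend-to-RAM (legacy ACPI S3)",
}


def parse_mem_sleep(text):
    """Parse e.g. 's2idle [deep]' into (selected_mode, available_modes)."""
    selected = None
    available = []
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            selected = token
        available.append(token)
    return selected, available


def describe(selected):
    """Return a label for the selected mode, or 'unknown'."""
    return MODE_LABELS.get(selected, "unknown")


if __name__ == "__main__":
    if MEM_SLEEP_PATH.exists():
        mode, modes = parse_mem_sleep(MEM_SLEEP_PATH.read_text())
        print(f"available: {modes}")
        print(f"selected:  {mode} -> {describe(mode)}")
    else:
        print(f"{MEM_SLEEP_PATH} not found; is this a Linux system?")
```

Attaching this output to each test report would make it explicit whether a given pass or fail was on the S3 path or the suspend-to-idle path.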

Please share the logs from when you reproduce the issue (the log shared earlier seems to have only DP connected). Also add display_info. One thing to confirm: when the issue is reproduced, is it pre-OS, post-OS, or both? I believe in both cases (Modern Standby enabled, or disabled and switched to legacy S3) you are running a suspend/resume sequence; please confirm.

@hariomp and @tejaskux Are you able to reproduce the issue on the Seal and Caribou proto systems? Do you need help in setting them up?

@hariomp Are you available later today to briefly chat with Zorro @sunnycrown ?

@jeffkao - Hi Jeffrey - The issue is still under debug from our side but no leads yet. I am available to discuss but no new progress to share as we are still finding the fix for this issue. Allow me time till tomorrow to do some more analysis and then we can discuss our findings

Can you help with the information below?

Does the same use case work under Windows?
Which BIOS version are you using? Please provide exact BIOS details.

Thanks, Tejas

Zorro @sunnycrown and Fanny @Fanny.C , spoke with @hariomp this afternoon. He will call into the daily call today. One request is to move all communications on this issue to gitlab going forward for easier tracking by his team. Thank you.

@tejaskux @jeffkao @hariomp

Yes, the same use case works under Windows.
We follow the WW03 and WW02 BKC.
Cons: Rocket lake-S/Comet lake-S + RKL PCH-H Consumer Production Version (PV) Best-Known Configuration (BKC) Software Package (Microsoft Windows* 10 – 64 bit 20H2) WW03’2021
Corp: Rocket lake-S/Cometlake-S + RKL PCH-H Consumer Production Candidate(PC) Software Package (Microsoft Windows* 10 -64 bit 20H2) WW02’2021

Update as of 21-Jan:
(i) Two code changes have been identified towards fixing this issue.
(ii) Both changes are in the drm driver and will be available via one patch.
(iii) A workable patch can be provided by tomorrow (22-Jan) for validation.

gen9tgps3Resume.patch

Please try attached patch, it works on Intel UDIMM/SODIMM RKL RVP with CML CPU and S3 resume.

Thanks, Tejas

@tejaskux

Your patch solves the problem. DP-1, DP-2, and DP-3 all work after S3 on CRBU-MTE-C7 CML-R hardware.

Here are the kernel debs for anyone who wants to try:
linux-headers-5.11.0-rc3-hpd-after-s3+_5.11.0-rc3-hpd-after-s3+-10.00.Custom_amd64.deb
linux-image-5.11.0-rc3-hpd-after-s3+_5.11.0-rc3-hpd-after-s3+-10.00.Custom_amd64.deb

Thanks,

@kaichuan.hsieh @sunnycrown I checked on all DP ports, including optional DP, this issue is solved on RDT + CML.

But is this patch suitable for AIO? I checked on GF UMA + RKL, there is no display either from DP or HDMI out.

Thanks for update. Can you please give details of AIO setup? which CPU/RVP/ports/usecase you are trying on?

@yu081030

I think you don't have to worry about AIO RKL, since this is not the kernel we are going to release in the image. It is the drm-tip kernel tree. Our oem kernel already works on AIO RKL, so I don't think this will impact the RKL AIO result. Thanks for your testing.

If you want to check if the pure drm-tip works on RKL AIO, you can use kernel here https://kernel.ubuntu.com/~kernel-ppa/mainline/drm-tip/current/ to test.

Thanks,

@kaichuan.hsieh thanks for the confirmation, I was thinking about the same thing. @sunnycrown I will update RDT + RKL result later.

@yu081030 Here is the updated test kernel including Tejas' patch (gen9tgps3Resume.patch). The test kernel supports RKL/CML for both desktop and AIO. https://people.canonical.com/~acelan/bugs/lp1912745

@emily.chien @sunnycrown RKL CPU S3 resume issue PASS on Caribou and seal.

@Fanny.C Thank you. @tejaskux As discussed, please submit the patch today, thank you.

https://patchwork.freedesktop.org/patch/416162/ upstreamed patch on patchwork.

@sunnycrown @yu081030 @Fanny.C

We have built an image including the #416162 patch that @tejaskux submitted to upstream: http://162.213.32.53/LPCdfcS1ioKIbw1boDmiNUOE/fossa-cubone-rkl/X69/

We need the ODM to use this image to verify that video output before/after S3 works as expected on both the RKL and CML platforms. Once this is confirmed working, we will start backporting the patches to the Ubuntu oem kernel.

@emily.chien Thanks. BTW, has Canonical run a basic sanity check on this image?

@sunnycrown Yes, we have verified the image can work at our side. Need test result from ODM to proceed next.

@emily.chien Let me call you to have a quick sync up.

@Fanny.C @yu081030, please download the image with Intel's submitted patch and run a quick test. Canonical will start the SRU process based on the result. Thanks.

@sunnycrown X69 image release, S3 issue all pass on CML/RKL with Caribou/SEAL

@tejaskux @kaichuan.hsieh @sunnycrown Caribou DP1 DP2 and Seal VGA test result are pass.

@hariomp @tejaskux Would you please help submit the patch based on ODM's result? Thanks.

As I understand for both RKL and CML CPU, S3 suspend/resume does not have any issues, please help me to confirm so I can start upstreaming patch.

Thanks, Tejas

assigned to @tejaskux and unassigned @hariomp

Could we please get an update on the upstreaming of this patch? It's been sitting on the mailing list for a while now with no updates, despite having review comments from other folks at Intel.

Hi Paul, we need to wait to allow time for some more people to give review comments, before responding to them to avoid multiple versions of patches.

Thanks, Tejas

Just an FYI - if it takes multiple days to get a response, it's typically acceptable to poke someone again to see what's holding things up - especially because many times answers to various questions you ask on a mailing list might have just been missed by accident due to the amount of email most folks have to go through. I've definitely been guilty of this a couple of times. This actually ended up being the case with your gen9bc suspend/resume patch, and I was able to get an answer from Anshuman (I'll copy it here, because it accidentally didn't get forwarded to the ML I believe):

"My sincere apology, I had missed this thread. We have decided to keep the alternative WA i.e setting/clearing 0xC2000 bit 7 before entering and after exiting s0ix to fix the deeper s0ix power consumption issues on ICL_PCH families platforms. This alternative WA was added to B.Spec on our request. But on TGL_PCH first alternative WA logic i.e in irq_reset() was working to attain deeper s0ix residencies so we haven't changed that."

Also, the other gen9_bc patch you had for adding HPD/DDI pin/recognizing gen9/TGP PCH combos was waiting on the review comments it already had to be addressed. I've already respun the second patch and have reviews on all but the last patch for adding the STRAP workarounds:

https://patchwork.freedesktop.org/series/86918/

I poked Vivi in response because I think they might have misread the code around the last patch. From a glance at intel_setup_outputs(), it seems you're probably right with forcing the straps here - as it does appear that real gen9 did have these strap registers for all but PORT_A and PORT_E. So I'm fairly sure that forcing the straps is the correct way of implementing this, but I've poked Vivi on IRC to confirm (if they don't respond I'm probably going to poke them on the ML as well, since I know not everyone has their IRC clients setup to catch highlights at all times of the day).

Anyway, I'm going to push the first three patches in that series and wait until I get confirmation on the fourth before pushing it.

agh. Was double checking things before I pushed anything and I noticed that patch 3 technically hasn't been tested since I respun it, so I am going to hold off on pushing patch 3 as well until I can get confirmation that this has been tested. Patches 1 and 2 should be fine as the only thing that I changed was splitting them out of the original TGP PCH support patch (along with renaming one function).

I've already poked the OEM that asked for these patches to retest, in the mean time though if anyone has access to one of these systems could you please test that the patch series I linked to in #2915 (comment 798144) still fixes all of the hotplugging issues except for the suspend/resume issues talked about here? Just to be clear, this is just a newer version of the patch series for adding TGP PCH support that was linked in the description of this issue.

Thanks Lyude for taking action to get this reviewed faster. I think having the OEM and Canonical test and confirm is the best way to go.

I agree with your last 2 comments and I saw your patches they do not change much so results would be same. Please merge the patches and we can merge rest of them post test for suspend S3.

Unfortunately I'm now waiting on an OK from Rodrigo Vivi for patch 3, as they wanted Jani Nikula to take a look at it before pushing it

Whoops! Just talked to Rodrigo and it looks like I misunderstood, they were talking about an entirely different patch series. I'm going to go ahead and push the last two patches in this series, which just leaves the gen9bc suspend/resume fix

Lyude thank you so much for helping to get patches in.

Hey everyone! I still don't have access to this hardware unfortunately (looks like I was likely sent the wrong machine by mistake), would anyone be able to test the latest revision of the suspend/resume fixes here?

https://patchwork.freedesktop.org/series/87148/

@lyudess @tejaskux

ODM reply that the patch verify pass on CML-R platform, I think you can proceed upstream the commit.

Thanks a lot.

@kaichuan.hsieh I think maybe you are not familiar with ODM's term, VP means the issue was reproduced:(

@sunnycrown

ah, got it.

@lyudess @tejaskux

Please ignore my previous message. The ODM's result is negative, so please hold the submission. I'll try to find hardware to validate it.

Thanks,

@lyudess @tejaskux

The DP connector doesn't report active after S3 with the patch https://patchwork.freedesktop.org/series/87148/ Here is the kernel log: kernel.log

Thanks,

OK-unfortunately I still don't have access to any gen9bc systems so I just had to stare at the screen until I noticed

hack.patch

If I'm interpreting the code right this is the only real change I think there is between the two series, could you apply this patch on top of the fixes on the ML and see if things work then?

Also just to confirm - the original patches from Tejas did fix your issue correct?

@lyudess

For comment #2915 (comment 816387), I didn't apply https://patchwork.freedesktop.org/patch/421198/?series=87148&rev=2; I only applied https://patchwork.freedesktop.org/patch/421258/?series=87148&rev=2. May I know if they are both required?

Thanks,

Here are the kernel debs. They have https://patchwork.freedesktop.org/patch/421258/?series=87148&rev=2 + https://gitlab.freedesktop.org/drm/intel/uploads/064e472db9de93662e82642c4990050c/hack.patch applied, and are based on drm-tip: 2021y-02m-17d-17h-34m-06s UTC integration manifest.

linux-headers-5.11.0+2915+hack+_5.11.0+2915+hack+-10.00.Custom_amd64.deb
linux-image-5.11.0+2915+hack+_5.11.0+2915+hack+-10.00.Custom_amd64.deb

@kaichuan.hsieh yes both https://patchwork.freedesktop.org/patch/421198/?series=87148&rev=2 and https://patchwork.freedesktop.org/patch/421258/?series=87148&rev=2 are required - can you test with both of those patches + drm-tip then? (don't worry about the hack.patch that I posted, just use the two patchwork links without it)

@lyudess

I tested the kernel from #2915 (comment 821875), and it works. Next I'll verify all the patches in https://patchwork.freedesktop.org/series/87148/.

Thanks,

@kaichuan.hsieh Please also release it to me and I will have the ODM verify it on various platforms. Thanks.

@lyudess

I tested the two patches in https://patchwork.freedesktop.org/series/87148/, and they work.

@sunnycrown

The test kernel:

linux-headers-5.11.0+2915+v3+v4+_5.11.0+2915+v3+v4+-10.00.Custom_amd64.deb
linux-image-5.11.0+2915+v3+v4+_5.11.0+2915+v3+v4+-10.00.Custom_amd64.deb

ODM tested with the latest patch and it's VNP, thanks.

Hi everyone! Thank you for providing testing for this, I've just pushed the fixes for this upstream. As such I think this issue can be closed, but please feel free to re-open if I'm wrong. Thanks!

closed

https://cgit.freedesktop.org/drm-tip/commit/?id=59b7cb44cffde6ab5b8ace1aef9b79f50be1c3eb https://cgit.freedesktop.org/drm-tip/commit/?id=cec3295b246b5555f6de7570d25a13a2754de245

Posting merged commits which resolved this issue just for record.
