CML-R + TGP-H all video output lost after resume from S3
One additional comment: the issue can be reproduced on the Intel RVP with the setup below.
Configuration: CML CPU + generic Ubuntu + 5.10.0-6008-oem kernel.
Disable modern standby on the RVP: set Intel Advanced Menu / ACPI Settings / Low Power S0 Idle Capability to Disabled.
We are not able to reproduce this issue on the Intel RVP with a CML-S CPU + TGP PCH. Can I request a conf call with the test/dev engineer from your side to understand the steps to reproduce?
For us, the Intel RVP goes into S3 and resumes fine, with no loss of video or audio.
@hariomp We have a daily call and you can join it.
BTW, you previously mentioned that on your RVP the system auto-resumes, but your latest comment says S3 works well. Did anything change in the test environment?
The md5sum of both debs:
26b3d8fdf7fd84a38f142dd26a921c24 linux-headers-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd64.deb
b9151d25d2d46b435c89d9055667fb3c linux-image-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd64.deb
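For anyone downloading the debs, the checksums above can be checked before installing. Below is a minimal sketch in Python, assuming the two files sit in the current directory; the helper names (`md5_of_file`, `parse_checksum_lines`, `verify`) are made up for illustration and are not part of any tool used in this thread.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_lines(text):
    """Parse '<md5> <filename>' lines into a {filename: md5} dict."""
    expected = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            expected[parts[1].strip()] = parts[0].lower()
    return expected


def verify(expected, directory="."):
    """Return {filename: True/False}; a missing file counts as False."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == checksum
    return results


if __name__ == "__main__":
    # Checksums copied from the comment above.
    listing = """\
26b3d8fdf7fd84a38f142dd26a921c24 linux-headers-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd64.deb
b9151d25d2d46b435c89d9055667fb3c linux-image-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd64.deb
"""
    for name, ok in verify(parse_checksum_lines(listing)).items():
        print("OK      " if ok else "MISMATCH", name)
```

A `MISMATCH` line means the file is either missing or does not match the posted checksum, so it is worth re-downloading before testing.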
I created the kernel debs linux-image-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd64.deb and linux-headers-5.11.0-rc3-cmlr.audio+_5.11.0-rc3-cmlr.audio+-10.00.Custom_amd64.deb.
The ODM reports that they can reproduce the issue on the RVP with my test deb. If you think the patch is valid, can you also try it on your RVP? I have also attached the .config I used to build the kernel: build-config
Please help us to clarify, since you said it works on your RVP, maybe we can align the environment to see what might be missing on Dell's CML-S refreshed + TGP platform.
This is the kernel we are going to release with the AIO IEVR image. Are you talking about this one? I just verified that it works okay on Giraffe UMA with both the built-in and external displays.
Can we document the topology of the RVP setup? Like what monitor is connected to which port, are there dongles used, etc.? And maybe also state the version of the BIOS?
It seems that when we use the same OS image on 2 different RVPs, and one can reproduce and the other one can't, that we should make sure that everything else is the same too.
I cannot reproduce this issue on the SODIMM RVP. Here are my results on HDMI and references. pw:intel.
The kernel is 1009 (ppa:kchsieh/bug-1905814), on top of 20.04 generic.
@kaichuan.hsieh, could you please apply the debug patch on top of these 2 changes (0001-ALSA-hda-Add-Cometlake-R-PCI-ID.patch, 0002-drm-i915-gen9_bc-Add-TGP-PCH-support.patch) and share the kernel log with me?
0001-debug-drm-i915-add-more-printk-for-debug.patch
This issue does not happen with HDMI; it is observed only on the DP port (this matches the log provided by Kaichuan in this ticket). @kaichuan.hsieh, can you confirm that you have never seen this issue on the HDMI port?
From the initial analysis, we see that the connector state is not restored properly for DP across a suspend-resume cycle.
The line below, from the Canonical log, shows the DP connector status changing from connected to disconnected during resume, which results in a blank display:
08 11:23:13 CANONICALID kernel: [drm:drm_helper_hpd_irq_event [drm_kms_helper]] [CONNECTOR:112:DP-2] status updated from connected to disconnected
We observe the same in the logs collected on our RVP for the DP case:
Jan 15 15:51:10 rkl-Rocket-Lake-Client-Platform kernel: [ 56.437496] [drm:drm_helper_hpd_irq_event] [CONNECTOR:116:DP-2] status updated from connected to disconnected
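To make it easier to spot this pattern in other kernel logs, here is a rough sketch of a filter for the "status updated from connected to disconnected" message. The regular expression is based only on the two log lines quoted above, and the function names are hypothetical; other kernel versions may format the message differently.

```python
import re

# Matches e.g. "[CONNECTOR:112:DP-2] status updated from connected to disconnected"
STATUS_RE = re.compile(
    r"\[CONNECTOR:(?P<id>\d+):(?P<name>[^\]]+)\] "
    r"status updated from (?P<old>\w+) to (?P<new>\w+)"
)


def connector_transitions(log_text):
    """Return (name, id, old_status, new_status) for every status update found."""
    return [
        (m["name"], int(m["id"]), m["old"], m["new"])
        for m in STATUS_RE.finditer(log_text)
    ]


def lost_connectors(log_text):
    """Keep only the transitions from 'connected' to 'disconnected'."""
    return [
        t for t in connector_transitions(log_text)
        if t[2] == "connected" and t[3] == "disconnected"
    ]


if __name__ == "__main__":
    # The two log lines quoted in this thread.
    sample = (
        "08 11:23:13 CANONICALID kernel: [drm:drm_helper_hpd_irq_event [drm_kms_helper]] "
        "[CONNECTOR:112:DP-2] status updated from connected to disconnected\n"
        "Jan 15 15:51:10 rkl-Rocket-Lake-Client-Platform kernel: [ 56.437496] "
        "[drm:drm_helper_hpd_irq_event] [CONNECTOR:116:DP-2] status updated from "
        "connected to disconnected\n"
    )
    for name, conn_id, old, new in lost_connectors(sample):
        print(f"{name} (id {conn_id}): {old} -> {new}")
```

Feeding it a full log captured across a suspend/resume cycle would list every connector that dropped, which could help compare the HDMI and DP cases side by side.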
Issue is still under analysis to identify the root cause of this. I will keep this ticket updated.
Consolidated status:
Wistron RVP (UDIMM): the issue is observed on both the DP and HDMI ports.
Pegatron RVP (SODIMM): the HDMI port works well; the issue is observed on the DP port.
@kaichuan.hsieh On Wistron's RVP, with Modern Standby enabled (the BIOS default), neither HDMI nor DP reproduces the issue; with Modern Standby disabled (switching to legacy S3), both HDMI and DP reproduce it.
Please share logs from when you reproduce the issue (the earlier log seems to have only DP connected). Also add display_info. One thing to confirm: when the issue is reproduced, is it pre-OS, post-OS, or both? I believe that in both cases (Modern Standby enabled, and Modern Standby disabled to switch to legacy S3) you are running a suspend/resume sequence; I just want to confirm.
@jeffkao - Hi Jeffrey - The issue is still under debug from our side but no leads yet. I am available to discuss but no new progress to share as we are still finding the fix for this issue. Allow me time till tomorrow to do some more analysis and then we can discuss our findings
Zorro @sunnycrown and Fanny @Fanny.C , spoke with @hariomp this afternoon. He will call into the daily call today. One request is to move all communications on this issue to gitlab going forward for easier tracking by his team. Thank you.
Update as of 21-Jan:
(i) Two code changes have been identified towards fixing this issue.
(ii) Both changes are in the drm driver and will be available via one patch.
(iii) A workable patch can be provided by tomorrow (22-Jan) for validation.
I think you don't have to worry about AIO RKL, since this is not the kernel we are going to release with the image. It is the drm-tip kernel tree; our oem kernel works on AIO RKL, so I don't think this will affect the RKL AIO results. Thanks for your testing.
We need the ODM to use this image to verify that video output before and after S3 works as expected on both the RKL and CML platforms. Once this is confirmed working, we will start backporting the patches to the Ubuntu oem kernel.
@Fanny.C @yu081030, please download the image with Intel's submitted patch and run a quick test. Canonical will start the SRU process based on the result.
Thanks.
Could we please get an update on the upstreaming of this patch? It's been sitting on the mailing list for a while now with no updates, despite having review comments from other folks at Intel.
Just an FYI - if it takes multiple days to get a response, it's typically acceptable to poke someone again to see what's holding things up - especially because many times answers to various questions you ask on a mailing list might have just been missed by accident due to the amount of email most folks have to go through. I've definitely been guilty of this a couple of times. This actually ended up being the case with your gen9bc suspend/resume patch, and I was able to get an answer from Anshuman (I'll copy it here, because it accidentally didn't get forwarded to the ML I believe):
"My sincere apology, I had missed this thread.
We have decided to keep the alternative WA i.e setting/clearing 0xC2000 bit #7
before entering after exiting s0ix to fix the deeper s0ix power consumption issues on ICL_PCH
families platforms. This alternative WA was added to B.Spec on our request.
But on TGL_PCH first alternative WA logic i.e in irq_reset() was working to attain deeper s0ix residencies so we haven't changed that."
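For readers less familiar with this kind of workaround, the set/clear pattern described in the quote can be illustrated with a small, purely hypothetical sketch. It models the register at offset 0xC2000 as an entry in a Python dict; none of the names below come from i915, and the ordering (set before entering s0ix, clear after exiting) is an assumption based on my reading of the quoted text.

```python
REG_OFFSET = 0xC2000   # register offset mentioned in the quote
WA_BIT = 1 << 7        # bit 7, as mentioned in the quote
MASK32 = 0xFFFFFFFF    # assume a 32-bit register


def rmw(regs, offset, clear, set_):
    """Read-modify-write: clear the 'clear' bits, then set the 'set_' bits."""
    old = regs.get(offset, 0)
    new = ((old & ~clear) | set_) & MASK32
    regs[offset] = new
    return new


def before_s0ix_entry(regs):
    """Hypothetical hook: set the workaround bit before entering s0ix."""
    return rmw(regs, REG_OFFSET, 0, WA_BIT)


def after_s0ix_exit(regs):
    """Hypothetical hook: clear the workaround bit after exiting s0ix."""
    return rmw(regs, REG_OFFSET, WA_BIT, 0)


if __name__ == "__main__":
    regs = {REG_OFFSET: 0x00000001}          # simulated register file
    print(hex(before_s0ix_entry(regs)))      # bit 7 set, bit 0 untouched
    print(hex(after_s0ix_exit(regs)))        # bit 7 cleared again
```

The point of the read-modify-write helper is that only bit 7 changes; every other bit in the register keeps its previous value.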
Also, the other gen9_bc patch you had for adding HPD/DDI pin/recognizing gen9/TGP PCH combos was waiting on the review comments it already had to be addressed. I've already respun the second patch and have reviews on all but the last patch for adding the STRAP workarounds:
I poked Vivi in response because I think they might have misread the code around the last patch. From a glance at intel_setup_outputs(), it seems you're probably right with forcing the straps here - as it does appear that real gen9 did have these strap registers for all but PORT_A and PORT_E. So I'm fairly sure that forcing the straps is the correct way of implementing this, but I've poked Vivi on IRC to confirm (if they don't respond I'm probably going to poke them on the ML as well, since I know not everyone has their IRC clients setup to catch highlights at all times of the day).
Anyway, I'm going to push the first three patches in that series and wait until I get confirmation on the fourth before pushing it.
agh. Was double checking things before I pushed anything and I noticed that patch 3 technically hasn't been tested since I respun it, so I am going to hold off on pushing patch 3 as well until I can get confirmation that this has been tested. Patches 1 and 2 should be fine as the only thing that I changed was splitting them out of the original TGP PCH support patch (along with renaming one function).
I've already poked the OEM that asked for these patches to retest, in the mean time though if anyone has access to one of these systems could you please test that the patch series I linked to in #2915 (comment 798144) still fixes all of the hotplugging issues except for the suspend/resume issues talked about here? Just to be clear, this is just a newer version of the patch series for adding TGP PCH support that was linked in the description of this issue.
Thanks, Lyude, for taking action to get this reviewed faster. I think having the OEM and Canonical test and confirm is the best way to go.
I agree with your last 2 comments. I looked at your patches and they do not change much, so the results should be the same. Please merge the patches, and we can merge the rest of them after the S3 suspend test.
Whoops! Just talked to Rodrigo and it looks like I misunderstood, they were talking about an entirely different patch series. I'm going to go ahead and push the last two patches in this series, which just leaves the gen9bc suspend/resume fix
Hey everyone! I still don't have access to this hardware unfortunately (looks like I was likely sent the wrong machine by mistake), would anyone be able to test the latest revision of the suspend/resume fixes here?
If I'm interpreting the code right this is the only real change I think there is between the two series, could you apply this patch on top of the fixes on the ML and see if things work then?
Also, just to confirm: the original patches from Tejas did fix your issue, correct?
Hi everyone! Thank you for providing testing for this, I've just pushed the fixes for this upstream. As such I think this issue can be closed, but please feel free to re-open if I'm wrong. Thanks!