DisplayPort connected monitor wakes from DPMS sleep with a blank screen
On both Plasma and GNOME Wayland, after I lock the screen and my monitor goes into sleep mode, the monitor wakes up on its own about 10-15 seconds later and shows a blank screen; only the backlight is on.
I noticed that after the monitor sleeps, the graphics card seems to "turn off" (the red RADEON light on my card turns off). A few seconds later, just prior to the monitor waking, the RADEON light turns back on and the display wakes with a blank screen until you move the mouse or do something to get the lock screen to display.
Since I thought this may be power management related, I added the following kernel option:
amdgpu.runpm=0
This seems to resolve the issue. The RADEON light on my card stays on and the display stays asleep.
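To confirm the option actually reached the kernel, one can look for it in /proc/cmdline. The following is a minimal sketch, not part of amdgpu: the helper name find_amdgpu_runpm is made up for illustration, and it simply reports the last amdgpu.runpm=<number> token it finds in a command-line string.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: look for "amdgpu.runpm=<int>" as a whole token in a
 * kernel command line (for example the contents of /proc/cmdline).
 * Returns 1 and stores the value in *value if found, 0 otherwise.
 * If the option appears more than once, this sketch reports the last one.
 */
int find_amdgpu_runpm(const char *cmdline, long *value)
{
    static const char key[] = "amdgpu.runpm=";
    const size_t keylen = sizeof(key) - 1;
    const char *p = cmdline;
    int found = 0;

    while (*p != '\0') {
        /* Skip separators between tokens. */
        while (*p == ' ' || *p == '\t' || *p == '\n')
            p++;
        if (*p == '\0')
            break;

        if (strncmp(p, key, keylen) == 0) {
            unsigned char c = (unsigned char)p[keylen];
            /* Accept only a value that starts with a digit or a minus sign. */
            if (c == '-' || isdigit(c)) {
                *value = strtol(p + keylen, NULL, 10);
                found = 1;
            }
        }

        /* Advance to the end of the current token. */
        while (*p != '\0' && *p != ' ' && *p != '\t' && *p != '\n')
            p++;
    }
    return found;
}
```

Reading /proc/cmdline into a buffer and passing it to this function should yield 1 with a value of 0 when the option above is in effect.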
I'm not exactly sure what would help troubleshoot this, so please let me know what logs or other information is needed to help debug.
I think this is what's happening based on my observation:
DPMS signal is sent to monitor.
Monitor goes into "deep-sleep", which drops the DP link. (as I understand it, LG monitors are notorious for this)
amdgpu thinks nothing is connected to the card, so it turns it off (hence the relation to runpm).
The monitor detects that something changed with DP and tries to renegotiate.
Something bugs in amdgpu and the card turns back on, monitor turns back on, and you get a blank screen.
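The steps above can also be observed without watching the LED on the card, by polling the device's runtime PM state in sysfs. A rough sketch follows; the two helper names are made up for illustration, and the PCI address to pass in depends on the system (0000:08:00.0 is the one that shows up in the dmesg output later in this thread).

```c
#include <stdio.h>
#include <string.h>

/* Build the sysfs path of a PCI device's runtime PM status file. */
int runtime_status_path(const char *bdf, char *out, size_t len)
{
    int n = snprintf(out, len,
                     "/sys/bus/pci/devices/%s/power/runtime_status", bdf);
    /* Fail if snprintf reported an error or the path was truncated. */
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Read the first line of a small text file, without the trailing newline. */
int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *f;

    if (len == 0)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

Calling read_first_line() on the generated path once a second while the screen is locked should show the status changing between values such as "active" and "suspended" if the GPU really is powering down and back up.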
Alright, got my source tree back up to 5.15.10-arch1 and reverted e3b39825ed0813f787cb3ebdc5ecaa5131623647.
(I do not have the patch applied to this build, I only reverted the commit)
Monitor stays off after going into sleep mode. Left it for a few minutes and it seems to stay off.
A couple things I noticed when testing:
Applying the patch caused the GPU to power down when the monitor went into sleep mode (RADEON light turned off on the GPU).
Reverting the commit caused the GPU to stay powered up when the monitor went into sleep mode (RADEON light stayed on)
Either way, both methods allowed the monitor to stay asleep as it should.
What I think is happening is that when the displays are turned off and the GPU is idle, the driver enters runtime suspend and powers down the GPU. At some point, some process queries the GPU, which causes the driver to runtime resume in order to wake the GPU up. As part of that runtime resume, the driver sends a hotplug event to userspace in case something on the display side was changed while the GPU was off. The desktop then sees the hotplug event, re-probes the displays, and lights them up again. e3b39825ed0813f787cb3ebdc5ecaa5131623647 fixed a bug which prevented runtime pm from being enabled at all: efifb never released its runtime reference, so the device never runtime suspended and therefore never runtime resumed, which means the hotplug event was never generated.
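The reference-counting argument can be illustrated with a toy model. This is only a sketch of the reasoning above, not the kernel's runtime PM code: the struct and the toy_get/toy_put names are invented, and the real framework uses autosuspend delays and callbacks that are left out here.

```c
#include <stdbool.h>

/* Toy model only; these are not the kernel's runtime PM data structures. */
struct toy_dev {
    int usage_count;     /* outstanding runtime PM references */
    bool suspended;      /* true while the device is powered down */
    int hotplug_events;  /* hotplug events sent to userspace on resume */
};

/* Take a reference; resuming a suspended device emits a hotplug event. */
void toy_get(struct toy_dev *d)
{
    d->usage_count++;
    if (d->suspended) {
        d->suspended = false;
        d->hotplug_events++;
    }
}

/* Drop a reference; the device may suspend only when none remain. */
void toy_put(struct toy_dev *d)
{
    if (d->usage_count > 0)
        d->usage_count--;
    if (d->usage_count == 0)
        d->suspended = true;
}
```

Starting with usage_count at 1 (the reference efifb never dropped), a get/put pair from some process querying the GPU never suspends the device, so no hotplug event is counted. Starting at 0, the same pair suspends the device and the next query resumes it and counts one event, which matches the wake-up described above.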
Had some time to go back and re-test all three patches individually. I found that all three patches still expose the issue in some form.
Wayland seems to expose it the best. When I do something like "xset dpms force suspend" in XOrg like in the other ticket, the display stays off.
If I let the desktop (GNOME or KDE) lock the screen (either Wayland or XOrg), the GPU powers down and comes back on within 10-15 seconds and powers up the monitor.
Not sure why the first patch (bug215203.diff) worked the first few times I tried it.
I inserted a couple of dev_info lines into amdgpu_kms.c to see if is_fw_fb is returning anything:
    dev_info(adev->dev, "We are running custom module\n");
    /* Not sure if I did the following right, but it returns? */
    dev_info(adev->dev, "FW FB: %d \n", is_fw_fb);
    if (is_fw_fb) {
        dev_info(adev->dev, "Primary adapter detected, disabling runtime pm\n");
        adev->runpm = false;
    }
    if (adev->runpm)
        dev_info(adev->dev, "Using BACO for runtime pm\n");
Here's the dmesg output I get:
[ 11.511682] amdgpu 0000:08:00.0: amdgpu: We are running custom module
[ 11.511683] amdgpu 0000:08:00.0: amdgpu: FW FB: 0
[ 11.511684] amdgpu 0000:08:00.0: amdgpu: Using BACO for runtime pm
It looks like amdgpu_is_conflicting_framebuffer(base, size) isn't returning anything, since "Primary adapter detected..." doesn't show up in the log, and FW FB is 0 (but I'm not sure I did that right...). Hopefully that is helpful?
One additional thought: don't we want "is_fw_fb" to be false to disable runpm? If I'm reading the code right (I'm no programmer, but I know a little C/C++), is_fw_fb -> amdgpu_is_conflicting_framebuffer -> is_conflicting_framebuffer returning false would be correct, since there isn't another GPU in my system.
We want to disable runtime pm if the board is the same one used by efifb. It restores the behavior that was there before e3b39825ed0813f787cb3ebdc5ecaa5131623647 was applied (i.e., runtime pm never kicks in because efifb never dropped its runtime pm reference).
We want to disable runtime pm if the board is the same one used by efifb.
Why is that, though? amdgpu supplants efifb. Why does the GPU driven by amdgpu need to stay on just because it used to be driven by efifb?
This smells like a convoluted workaround for the real issue (which could presumably still happen with non-primary GPUs), which is that monitors stay on when they should be off after runpm resume.
BTW, I wonder if #662 might be fundamentally the same issue after all. In both cases, monitors stay on when they should be off after a hotplug event. Maybe amdgpu accidentally causes the monitors to turn on as part of the hotplug event or as part of the resulting probe of display connections?
It's just a workaround for 5.16 to restore the previous behavior rather than reverting e3b39825ed0813f787cb3ebdc5ecaa5131623647 until we can sort out a better solution.
Hmmm, are you sure it got applied correctly? Can you add some debugging output to see what is_firmware_framebuffer() returns in amdgpu_is_fw_framebuffer()?
Good news! I swapped over to Arch, compiled a fresh mainline 5.16-rc7 build with these two patches applied and it does indeed work. GPU stays on, monitors sleep properly.
On Fedora, I double checked everything and made sure the patches were applied - couldn't figure it out. Only thought I had was that they're in the middle of switching over to using simpledrm/fbdev emulation for F36 and it was being used in the 5.16-rc6 kernel I pulled from their git tree (you can see it in the dmesg log).
That would be why the patches didn’t work on Fedora. On their 5.16rc6 branch, they have CONFIG_FB_EFI disabled in the config because of the changes they’re making for Fedora 36 and disabling the legacy fbdev driver (which will probably make this workaround not work when it’s released).
I will try again when I have a minute and recompile on Fedora using a config from Fedora 35 with CONFIG_FB_EFI=y. I imagine it should work fine.
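Before rebuilding, it may be worth checking the .config for the option mentioned above. A minimal sketch, assuming the usual .config text format where an enabled bool option appears as SYMBOL=y and a disabled one as "# SYMBOL is not set"; the function name config_is_builtin is made up for illustration.

```c
#include <string.h>

/*
 * Report whether a Kconfig symbol is set to "y" in .config-style text.
 * Returns 1 for a line that is exactly "SYMBOL=y", 0 otherwise
 * (including "# SYMBOL is not set" and a missing symbol).
 */
int config_is_builtin(const char *config_text, const char *symbol)
{
    size_t symlen = strlen(symbol);
    const char *line = config_text;

    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t linelen = eol ? (size_t)(eol - line) : strlen(line);

        /* A match is exactly "<symbol>=y" occupying the whole line. */
        if (linelen == symlen + 2 &&
            strncmp(line, symbol, symlen) == 0 &&
            line[symlen] == '=' && line[symlen + 1] == 'y')
            return 1;

        if (eol == NULL)
            break;
        line = eol + 1;
    }
    return 0;
}
```

Passing the contents of the kernel .config and "CONFIG_FB_EFI" should return 0 for the Fedora 5.16-rc configuration described above and 1 for a config with CONFIG_FB_EFI=y.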
Usually on bootup, BAR 0 of the AMD GPU is assigned to efifb. Later, amdgpudrmfb takes over the framebuffer from efifb. However, checking the log posted by @trilantis, I did not see such a process. Instead, amdgpudrmfb seems to be the first and only owner of the framebuffer. Kind of weird...
Hmm, maybe an update to Alex's original patch set (placing the is_fw_fb check after amdgpu_device_init(), where amdgpudrmfb takes over the framebuffer) can work for this. @trilantis, can you give the patches below a try on your Fedora and Arch systems?
@equan I don't think the patches need to be changed - I think the reason it didn't work on Fedora was due to CONFIG_FB_EFI=n in the Fedora kernel on their 5.16rc configs. See my response to Alex above.
Going to test again using a Fedora 35 .config and make sure CONFIG_FB_EFI=y.
Hmm, looking at the runtime pm documentation more closely, I don't think these patches will help. If the driver enables autosuspend, the idle callback is not used. It sounds like maybe this is a race between driver init and runtime suspend. Does the attached patch help?
I have a similar issue, but I'm not sure if it is exactly the same. Recently, my monitor (connected via DP) has started exhibiting this blanking behavior after shutdowns. I actually have to physically disconnect power from the monitor (and reconnect it) to get it in working order again. In fact, the on-screen display normally activated by a button on the back of the monitor isn't even functional when it is put into this "zombie" power state at a shutdown. No reboot, warm or cold, helps: only cutting power to the monitor does. Has anyone in this thread had this issue? This occurs only with Linux. On a shutdown of Windows or macOS, no such issue happens.
Experiencing the same issue after upgrading my GPU from an NVIDIA card to an AMD one (Polaris). I have multiple monitors: one is connected via HDMI to the AMD card and the other to the iGPU. Both monitors wake up a few seconds after triggering DPMS suspend. Blacklisting the amdgpu kernel module and booting with just i915 makes the issue go away, but only the iGPU works in that case.
I have been having the same issue that @justinkb mentioned in #1840 (comment 1219268). After shutdown, my main DP monitor stays on (just the backlight), and the only way to fix it is to unplug and re-plug the monitor.