Black screen when detaching HDMI cable (AMD A10-9620P)

📎 Carlo Caione uploaded an attachment:

This is what we get in the log when we run 'xset dpms force off; xset dpms force on'.

Attachment 132062, "dmps_off_on":
log2

💬 Michel Dänzer @daenzer said:

Jun 19 17:09:49 endless kernel: [drm] Detected VRAM RAM=32M, BAR=32M

I suspect the core problem is that there's only 32 MB of VRAM available. Is it possible to increase this in the BIOS setup?

💬 Carlo Caione said:

> I suspect the core problem is that there's only 32 MB of VRAM available.
> Is it possible to increase this in the BIOS setup?

It is not. There is nothing in the BIOS related to VRAM.

💬 Michel Dänzer @daenzer said:

(In reply to Carlo Caione from comment 3)

> There is nothing in the BIOS related to VRAM.

FWIW, it wouldn't say "VRAM" but rather "integrated graphics memory" or something like that.

💬 Carlo Caione said:

> FWIW, it wouldn't say "VRAM" but rather "integrated graphics memory" or
> something like that.

Yeah :) Let me put it this way: there is nothing in the BIOS related to the graphics controller / GPU / video in general.

FWIW the BIOS is InsydeH2O v0.09

💬 Carlo Caione said:

Also I guess you are looking at the wrong controller:

[ 2.111381] amdgpu 0000:00:01.0: VRAM: 32M 0x000000F400000000 - 0x000000F401FFFFFF (32M used)
[ 2.111390] [drm] Detected VRAM RAM=32M, BAR=32M
[ 2.111511] [drm] amdgpu: 32M of VRAM memory ready
[ 6.560772] amdgpu 0000:03:00.0: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
[ 6.560785] [drm] Detected VRAM RAM=2048M, BAR=256M
[ 6.560805] [drm] amdgpu: 2048M of VRAM memory ready

So I guess in this laptop there is an integrated controller with 32MB of VRAM and the GPU with 2GB?

💬 Michel Dänzer @daenzer said:

There are two GPUs, the integrated one in the APU (Carrizo family) and a dedicated one (Polaris 12 family). Xorg is using the integrated one, and there's no way around that, because only the integrated GPU has display outputs hooked up. The dedicated GPU could only be used via PRIME render offloading.
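
To see at a glance which PCI device reports how much VRAM, the `VRAM:` lines quoted earlier can be pulled out of the kernel log. The following is only a rough Python sketch: `vram_by_device` is a hypothetical helper name, and the regular expression assumes the exact line format shown in the log excerpt above.

```python
import re

# Matches kernel log lines of the form quoted in this thread, e.g.
#   amdgpu 0000:00:01.0: VRAM: 32M 0x... - 0x... (32M used)
# The format is assumed from that excerpt; other kernel versions may differ.
VRAM_LINE = re.compile(
    r"amdgpu (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"VRAM: (?P<size>\d+)M"
)


def vram_by_device(log_text):
    """Return {pci_address: reported_vram_in_M} for every matching line."""
    return {
        m.group("pci"): int(m.group("size"))
        for m in VRAM_LINE.finditer(log_text)
    }


# The two VRAM lines from the log excerpt in this thread.
SAMPLE = """\
[ 2.111381] amdgpu 0000:00:01.0: VRAM: 32M 0x000000F400000000 - 0x000000F401FFFFFF (32M used)
[ 6.560772] amdgpu 0000:03:00.0: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
"""

if __name__ == "__main__":
    for pci, size in vram_by_device(SAMPLE).items():
        print(f"{pci}: {size}M")
```

In this thread, 0000:00:01.0 is the integrated GPU that drives the display outputs, so its 32M figure is the one that matters for scanout.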

💬 Carlo Caione said:

Interesting. Ok then, back to square one: no BIOS options to tweak the integrated graphics memory / controller.

On a side note: is it normal to see so many error messages in the journal? Like:

kernel: amdgpu 0000:00:01.0: ffff9cccade4d800 pin failed
kernel: [drm:amdgpu_crtc_page_flip_target [amdgpu]] *ERROR* failed to pin new abo buffer before flip
gdm-Xorg-:0[672]: (WW) AMDGPU(0): flip queue failed: Cannot allocate memory
gdm-Xorg-:0[672]: (WW) AMDGPU(0): Page flip failed: Cannot allocate memory
gdm-Xorg-:0[672]: (EE) AMDGPU(0): present flip failed
...
gdm-Xorg-:0[672]: (WW) AMDGPU(0): get vblank counter failed: Invalid argument

or

kernel: amdgpu: [powerplay] min_core_set_clock not set

💬 Michel Dänzer @daenzer said:

(In reply to Carlo Caione from comment 8)

> On a side note: is it normal to see so many error messages in the journal? Like:
>
> kernel: amdgpu 0000:00:01.0: ffff9cccade4d800 pin failed
> kernel: [drm:amdgpu_crtc_page_flip_target [amdgpu]] *ERROR* failed to pin new abo buffer before flip
> gdm-Xorg-:0[672]: (WW) AMDGPU(0): flip queue failed: Cannot allocate memory
> gdm-Xorg-:0[672]: (WW) AMDGPU(0): Page flip failed: Cannot allocate memory
> gdm-Xorg-:0[672]: (EE) AMDGPU(0): present flip failed
> ...
> gdm-Xorg-:0[672]: (WW) AMDGPU(0): get vblank counter failed: Invalid argument

I think all of these are triggered by VRAM being too small to fit the scanout buffers covering the laptop panel + external monitor.

> kernel: amdgpu: [powerplay] min_core_set_clock not set

Not sure about this one, might be harmless.
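
To get a feel for the numbers behind the scanout-buffer explanation, here is a back-of-the-envelope Python sketch. The resolutions are assumptions (the thread does not state them), the buffer is assumed to be 32 bits per pixel and to span both outputs side by side, and pitch alignment, tiling and any other VRAM allocations are ignored, so the real footprint can only be larger.

```python
MIB = 1024 * 1024

# From the "Detected VRAM RAM=32M" log line for the integrated GPU.
VRAM_BYTES = 32 * MIB


def scanout_bytes(width, height, bytes_per_pixel=4):
    """Lower bound for a linear scanout buffer (no pitch alignment)."""
    return width * height * bytes_per_pixel


# Hypothetical resolutions, NOT taken from the thread.
PANEL = (1920, 1080)
EXTERNAL = (1920, 1080)

# One buffer covering both outputs placed side by side.
combined = scanout_bytes(PANEL[0] + EXTERNAL[0], max(PANEL[1], EXTERNAL[1]))

# A page flip needs the old and the new buffer pinned at the same time.
flip_footprint = 2 * combined

if __name__ == "__main__":
    print(f"combined buffer: {combined / MIB:.2f} MiB")
    print(f"during a flip:   {flip_footprint / MIB:.2f} MiB of {VRAM_BYTES // MIB} MiB")
    print(f"left over:       {(VRAM_BYTES - flip_footprint) / 1024:.0f} KiB")
```

Under these assumptions a single buffer is about 15.8 MiB, so two of them leave well under 1 MiB of the 32 MiB for everything else. That would be consistent with the "Cannot allocate memory" flip failures, but it is an estimate, not a measurement from this machine.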

💬 Carlo Caione said:

> I think all of these are triggered by VRAM being too small to fit the
> scanout buffers covering the laptop panel + external monitor.

Probably I'm missing something, but when the HDMI cable is connected everything works fine, with the scanout buffer correctly displayed on the laptop panel + external monitor. The problem starts when we disconnect the HDMI cable.

Also, if the problem were VRAM being too small, why does toggling DPMS make the laptop panel work fine again?

💬 Michel Dänzer @daenzer said:

(In reply to Carlo Caione from comment 10)

> Probably I'm missing something, but when the HDMI is connected everything
> works fine, with the scanout buffer correctly displayed on the laptop panel
> + external monitor. The problem starts when we disconnect the HDMI cable.

At least some of the errors you referenced in comment 8 already happen before that. They're related to failed attempts at page flipping. xf86-video-amdgpu manages to chug along regardless.

When you unplug the HDMI cable is presumably when

> Jun 19 17:10:31 endless gdm-Xorg-:0[672]: (EE) AMDGPU(0): failed to set mode: Invalid argument

appears, i.e. drmModeSetCrtc() fails, presumably (not 100% sure about this part though) because the new, smaller scanout buffer cannot fit into VRAM while the old, larger one is still being scanned out.

> Also, if the problem were VRAM being too small, why does toggling DPMS
> make the laptop panel work fine again?

Toggling DPMS off disables scanout, which allows the old scanout buffer to be moved out of VRAM, so the new one can be moved in.

Some details might differ from the above, but that should be roughly what's happening.

📎 Carlo Caione uploaded an attachment:

Interesting. Thank you for the explanation and for your time.

I just tried the HEAD of xf86-video-amdgpu and now instead of having a black screen I have the image corruption as shown in the picture.

Anything I can do to debug / have this fixed? Is it something I need to fix with the ODM (Acer)?

Attachment 132081, "Corruption using xf86-video-amdgpu HEAD":

💬 Carlo Caione said:

> I just tried the HEAD of xf86-video-amdgpu and now instead of
> having a black screen I have the image corruption as shown
> in the picture.

Just FYI this is due to commit b09fde0d81 ("Use reference counting for tracking KMS framebuffer lifetimes").

💬 Michel Dänzer @daenzer said:

(In reply to Carlo Caione from comment 12)

> I just tried the HEAD of xf86-video-amdgpu and now instead of having a black
> screen I have the image corruption as shown in the picture.

Without seeing the corresponding Xorg log, I guess that's just a different symptom triggered by the same issue.

> Anything I can do to debug / have this fixed?

We need to make scanout work with buffers outside of VRAM somehow. I've kicked off an internal discussion about this.

With an amd-staging-* kernel branch and DC enabled, you can try tweaking dce_v11_0_crtc_do_set_base to pass AMDGPU_GEM_DOMAIN_GTT instead of / in addition to AMDGPU_GEM_DOMAIN_VRAM to amdgpu_bo_pin. The DC code should already handle this correctly, but we're not sure whether or not there are additional constraints on system memory used for scanout. If there are, it probably won't work correctly yet.

> Is it something I need to fix with the ODM (Acer)?

If you can get a BIOS which allows setting up larger VRAM, that might allow you to move forward faster.
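
For readers unfamiliar with the domain constants mentioned in the suggested tweak: they are bit flags, so "in addition to" means OR-ing them into one mask. A small Python illustration follows. The numeric values are believed to match the kernel UAPI header include/uapi/drm/amdgpu_drm.h, but check them against your tree; `domain_names` is a hypothetical helper for this example only and does not exist in the kernel.

```python
# Values as believed to be defined in include/uapi/drm/amdgpu_drm.h;
# verify against the kernel tree you are building.
AMDGPU_GEM_DOMAIN_CPU = 0x1
AMDGPU_GEM_DOMAIN_GTT = 0x2   # system memory mapped through the GART
AMDGPU_GEM_DOMAIN_VRAM = 0x4  # video memory (the 32M carve-out here)

_NAMES = {
    AMDGPU_GEM_DOMAIN_CPU: "CPU",
    AMDGPU_GEM_DOMAIN_GTT: "GTT",
    AMDGPU_GEM_DOMAIN_VRAM: "VRAM",
}


def domain_names(mask):
    """Hypothetical helper: list the domains enabled in a domain mask."""
    return [name for bit, name in sorted(_NAMES.items()) if mask & bit]


if __name__ == "__main__":
    # "instead of": allow only GTT
    print(domain_names(AMDGPU_GEM_DOMAIN_GTT))
    # "in addition to": allow either GTT or VRAM
    print(domain_names(AMDGPU_GEM_DOMAIN_GTT | AMDGPU_GEM_DOMAIN_VRAM))
```

Whether the display hardware can actually scan out from GTT on this APU is exactly the open question in this thread; the sketch only shows how the mask is composed.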

📎 Carlo Caione uploaded an attachment:

> Without seeing the corresponding Xorg log, I guess that's just
> a different symptom triggered by the same issue.

Attached the log. Yeah, not much different.

> With an amd-staging-* kernel branch and DC enabled, you can try tweaking
> dce_v11_0_crtc_do_set_base to pass AMDGPU_GEM_DOMAIN_GTT instead of / in
> addition to AMDGPU_GEM_DOMAIN_VRAM to amdgpu_bo_pin. The DC code should
> already handle this correctly, but we're not sure whether or not there are
> additional constraints on system memory used for scanout. If there are, it
> probably won't work correctly yet.

I tried amd-staging-4.11 and, interestingly, dce_v11_0_crtc_do_set_base is called only when DC is disabled. With DRM_AMD_DC=y the function is never called.

I also tried making the s/AMDGPU_GEM_DOMAIN_VRAM/AMDGPU_GEM_DOMAIN_GTT/ change with DC disabled. With that change I get some kind of intermittent display corruption on both screens when _connecting_ the HDMI cable, but on detaching at least everything is fine on the laptop panel.

Attachment 132118, "journal_HDMI_detaching_corruption":
log

💬 Michel Dänzer @daenzer said:

(In reply to Carlo Caione from comment 15)

> > With an amd-staging-* kernel branch and DC enabled, you can try tweaking
> > dce_v11_0_crtc_do_set_base to pass AMDGPU_GEM_DOMAIN_GTT instead of / in
> > addition to AMDGPU_GEM_DOMAIN_VRAM to amdgpu_bo_pin. [...]
>
> I tried amd-staging-4.11 and interestingly dce_v11_0_crtc_do_set_base is
> called only when DC is disabled.

Right, sorry, with DC you need to tweak dm_plane_helper_prepare_fb instead.

📎 Carlo Caione uploaded an attachment:

> Right, sorry, with DC you need to tweak dm_plane_helper_prepare_fb instead.

Yup, I tried this but without much luck. Attached is what I get when using AMDGPU_GEM_DOMAIN_GTT. Using AMDGPU_GEM_DOMAIN_GTT | AMDGPU_GEM_DOMAIN_VRAM is pretty much the same, but the cursor is correctly displayed (even though I cannot move it).

The log is filled with:

[drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin framebuffer

/me scratches his head

~~Attachment 132130~~, "dm_plane_helper_prepare_fb with AMDGPU_GEM_DOMAIN_GTT":

💬 Carlo Caione said:

We have found another laptop with exactly the same issue (and again 32MB of VRAM for the embedded video controller). We have also asked Acer for a new BIOS with a larger VRAM size and are waiting to receive it (hopefully).

Any news about the internal discussion about this issue? We are available to test any fix / workaround / proposal :)

Thanks,

💬 Michel Dänzer @daenzer said:

No news I'm afraid.

(In reply to Carlo Caione from comment 17)

> Attached is what I get when using AMDGPU_GEM_DOMAIN_GTT.
> [...]
> [drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin framebuffer

Do you also get those errors with AMDGPU_GEM_DOMAIN_GTT? If so, it might be interesting to track down the origin of the errors. Otherwise, it looks like there's still something missing.

Submitted by Carlo Caione