r600: “*ERROR* UVD not responding” at boot, then the computer hangs when plugging in a screen
I get this in dmesg at boot time with a Radeon HD 6870 (TeraScale 2, Barts XT):
[ 5.310031] [drm] radeon: irq initialized.
[ 5.326821] [drm] ring test on 0 succeeded in 2 usecs
[ 5.327234] [drm] ring test on 3 succeeded in 6 usecs
[ 6.499632] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 7.508851] [drm] amdgpu kernel modesetting enabled.
[ 7.516132] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 7.518899] amdgpu: CRAT table disabled by module option
[ 7.519186] amdgpu: Virtual CRAT table created for CPU
[ 7.519469] amdgpu: Topology: Add CPU node
[ 8.532609] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 9.549149] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 10.565684] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 11.582238] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 12.598796] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 13.615375] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 14.631897] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 15.648469] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 15.669136] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, giving up!!!
[ 15.669932] radeon 0000:01:00.0: failed initializing UVD (-1).
The computer boots when no screen is attached to that card (the screen is attached to another card instead).
Once I plug a screen into that card, I get these additional errors, including a GPU reset, and the computer hangs:
[ 224.950878] amdgpu 0000:07:00.0: [drm] User-defined mode not supported: "1024x768": 60 65000 1024 1048 1184 1344 768 771 777 806 0x60 0xa
[ 229.927522] [drm] fb mappable at 0xFFF03BF000
[ 229.927527] [drm] vram apper at 0xFFF0000000
[ 229.927528] [drm] size 3145728
[ 229.927528] [drm] fb depth is 24
[ 229.927529] [drm] pitch is 4096
[ 229.927631] radeon 0000:01:00.0: [drm] fb1: radeondrmfb frame buffer device
[ 240.597775] radeon 0000:01:00.0: ring 0 stalled for more than 10240msec
[ 240.597791] radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000000004 last fence id 0x0000000000000005 on ring 0)
[ 241.169931] radeon 0000:01:00.0: Saved 23 dwords of commands on ring 0.
[ 241.169956] radeon 0000:01:00.0: GPU softreset: 0x00000219
[ 241.169960] radeon 0000:01:00.0: GRBM_STATUS = 0xE5704CA0
[ 241.169963] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0xFE000001
[ 241.169965] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0xFE000001
[ 241.169968] radeon 0000:01:00.0: SRBM_STATUS = 0x20080FC0
[ 241.169971] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
[ 241.169973] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x01000000
[ 241.169976] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00011000
[ 241.169978] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00068400
[ 241.169980] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80870243
[ 241.169983] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[ 241.620578] radeon 0000:01:00.0: Wait for MC idle timedout !
[ 241.620580] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00007F6B
[ 241.620634] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00020100
[ 241.621782] radeon 0000:01:00.0: GRBM_STATUS = 0x00003828
[ 241.621784] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000007
[ 241.621785] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000007
[ 241.621787] radeon 0000:01:00.0: SRBM_STATUS = 0x20080EC0
[ 241.621789] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
[ 241.621790] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
[ 241.621792] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
[ 241.621793] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
[ 241.621795] radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000
[ 241.621797] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[ 241.621817] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[ 241.635604] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[ 242.141194] radeon 0000:01:00.0: Wait for MC idle timedout !
I have reproduced it on two different motherboards, on both Ubuntu 23.10 and 22.04 LTS.
I would like to know if that's a driver bug or if the card is faulty.
The web has plenty of ten-year-old threads about similar error logs, with people recommending disabling modesetting to work around the issue, but modesetting is not really optional today.
Edit: The amdgpu log lines appear because there is another GPU (here, a Vega integrated into the CPU); this is what allows me to boot the system without plugging a screen into the HD 6870. I also tested on another computer where the screen was plugged into another TeraScale GPU. This way I know everything works until I start to use the HD 6870.
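Since the log mixes messages from two drivers, here is a minimal Python sketch of how the radeon lines could be separated from the amdgpu ones when reading the output. The function name `split_by_driver` and the "other" bucket are my own hypothetical choices, not part of any tool, and the sample lines are copied from the logs above.

```python
import re
from collections import defaultdict

# Matches a dmesg timestamp such as "[ 5.310031]" at the start of a line.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def split_by_driver(lines, drivers=("radeon", "amdgpu")):
    """Group dmesg lines by the first driver name found after the timestamp.

    Hypothetical helper: lines that name neither driver go to "other".
    """
    groups = defaultdict(list)
    for line in lines:
        body = TIMESTAMP.sub("", line)
        # Pick the driver whose name appears earliest in the message body.
        hits = [(body.find(d), d) for d in drivers if d in body]
        key = min(hits)[1] if hits else "other"
        groups[key].append(line)
    return dict(groups)


sample = [
    "[ 6.499632] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!",
    "[ 7.508851] [drm] amdgpu kernel modesetting enabled.",
    "[ 7.518899] amdgpu: CRAT table disabled by module option",
    "[ 15.669932] radeon 0000:01:00.0: failed initializing UVD (-1).",
    "[ 5.326821] [drm] ring test on 0 succeeded in 2 usecs",
]

if __name__ == "__main__":
    for driver, msgs in split_by_driver(sample).items():
        print(driver, len(msgs))
```

Generic "[drm]" lines that name no driver (such as the ring tests) cannot be attributed this way and end up in "other", so this only helps with lines that carry a driver name or module tag.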