>mesa-20.1.10 breaks amdgpu driver when replaying a video in WebKitGTK/gstreamer
$ uname -a
Linux cloudchaser 5.4.80-gentoo-r1 #3 SMP Sat Dec 26 23:19:37 -00 2020 x86_64 AMD Ryzen 5 PRO 3500U w/ Radeon Vega Mobile Gfx AuthenticAMD GNU/Linux
And
$ uname -a
Linux cloudchaser 5.10.8-gentoo #1 SMP Thu Jan 21 18:21:13 -00 2021 x86_64 AMD Ryzen 5 PRO 3500U w/ Radeon Vega Mobile Gfx AuthenticAMD GNU/Linux
- Mesa versions where it happens: 20.2.4, 20.3.0, 20.3.2, 20.3.3
- Mesa versions where it doesn't happen: 20.1.10 and earlier
Which is why I think it's a bug in Mesa.
With Linux 5.4.80:
2021-01-15T12:05:37.196652+00:00—[856703.322095] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
2021-01-15T12:05:37.196684+00:00—[856708.291651] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=104666115, emitted seq=104666117
2021-01-15T12:05:37.196693+00:00—[856708.291789] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process WebKitWebProces pid 24211 thread WebKitWebP:cs0 pid 24227
2021-01-15T12:05:37.196695+00:00—[856708.291795] amdgpu 0000:06:00.0: GPU reset begin!
2021-01-15T12:05:37.488451+00:00—[856708.596143] amdgpu 0000:06:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
2021-01-15T12:05:37.488473+00:00—[856708.596194] [drm:gfx_v9_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
2021-01-15T12:05:37.508887+00:00—[856708.617516] amdgpu 0000:06:00.0: GPU reset succeeded, trying to resume
2021-01-15T12:05:37.508902+00:00—[856708.618692] [drm] PCIE GART of 1024M enabled (table at 0x000000F400900000).
2021-01-15T12:05:37.508904+00:00—[856708.619369] [drm] PSP is resuming...
2021-01-15T12:05:37.528900+00:00—[856708.639395] [drm] reserve 0x400000 from 0xf47f800000 for PSP TMR
2021-01-15T12:05:37.568432+00:00—[856708.671446] [drm] psp command failed and response status is (0x7)
2021-01-15T12:05:38.559248+00:00—[856709.663299] amdgpu 0000:06:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
2021-01-15T12:05:38.559273+00:00—[856709.663354] [drm:gfx_v9_0_hw_init [amdgpu]] *ERROR* KCQ enable failed
2021-01-15T12:05:38.559275+00:00—[856709.663399] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_0> failed -110
2021-01-15T12:05:38.559276+00:00—[856709.663419] amdgpu 0000:06:00.0: GPU reset(2) failed
2021-01-15T12:05:38.559278+00:00—[856709.663421] amdgpu 0000:06:00.0: GPU reset end with ret = -110
2021-01-15T12:05:48.708602+00:00—[856719.812356] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
2021-01-15T12:05:58.938819+00:00—[856730.041938] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
With Linux 5.10.8 (a few lines differ slightly):
2021-01-21T19:33:13.111230+00:00—[ 1056.467476] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
2021-01-21T19:33:13.111251+00:00—[ 1061.834456] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=57084, emitted seq=57086
2021-01-21T19:33:13.111254+00:00—[ 1061.834626] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process WebKitWebProces pid 26496 thread WebKitWebP:cs0 pid 26512
2021-01-21T19:33:13.111257+00:00—[ 1061.834633] amdgpu 0000:06:00.0: amdgpu: GPU reset begin!
2021-01-21T19:33:13.431316+00:00—[ 1062.154221] amdgpu 0000:06:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
2021-01-21T19:33:13.531305+00:00—[ 1062.253720] [drm] free PSP TMR buffer
2021-01-21T19:33:13.581236+00:00—[ 1062.304875] amdgpu 0000:06:00.0: amdgpu: MODE2 reset
2021-01-21T19:33:13.581250+00:00—[ 1062.305530] amdgpu 0000:06:00.0: amdgpu: GPU reset succeeded, trying to resume
2021-01-21T19:33:13.581252+00:00—[ 1062.305904] [drm] PCIE GART of 1024M enabled (table at 0x000000F400900000).
2021-01-21T19:33:13.581253+00:00—[ 1062.306285] [drm] PSP is resuming...
2021-01-21T19:33:13.601223+00:00—[ 1062.326323] [drm] reserve 0x400000 from 0xf47fc00000 for PSP TMR
2021-01-21T19:33:14.111718+00:00—[ 1062.833796] amdgpu 0000:06:00.0: amdgpu: RAS: optional ras ta ucode is not available
2021-01-21T19:33:14.171233+00:00—[ 1062.893739] amdgpu 0000:06:00.0: amdgpu: RAP: optional rap ta ucode is not available
2021-01-21T19:33:14.421213+00:00—[ 1063.152471] [drm] kiq ring mec 2 pipe 1 q 0
2021-01-21T19:33:14.661249+00:00—[ 1063.391508] amdgpu 0000:06:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
2021-01-21T19:33:14.661269+00:00—[ 1063.391592] [drm:amdgpu_gfx_enable_kcq.cold [amdgpu]] *ERROR* KCQ enable failed
2021-01-21T19:33:14.661271+00:00—[ 1063.391646] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_0> failed -110
2021-01-21T19:33:14.661273+00:00—[ 1063.391663] amdgpu 0000:06:00.0: amdgpu: GPU reset(2) failed
2021-01-21T19:33:14.661275+00:00—[ 1063.391666] amdgpu 0000:06:00.0: amdgpu: GPU reset end with ret = -110
2021-01-21T19:33:25.261577+00:00—[ 1073.978694] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
It didn't actually recover: my LCD freezes and then goes blank. SSH'ing in still works, though.
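For anyone comparing their own kernel log against the ones above, here is a minimal sketch that picks out the two lines that characterise this failure: the gfx ring timeout with its signaled/emitted sequence numbers, and the final return code of the GPU reset. The function name `summarize` and the two regular expressions are my own, written against the log format shown in this report; other kernel versions may word these messages differently.

```python
import re

# Matches e.g. "*ERROR* ring gfx timeout, signaled seq=57084, emitted seq=57086".
# The "ring gfx timeout, but soft recovered" lines are deliberately not matched.
RING_TIMEOUT = re.compile(
    r"\*ERROR\* ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)

# Matches e.g. "GPU reset end with ret = -110" (-110 is -ETIMEDOUT on Linux).
RESET_END = re.compile(r"GPU reset end with ret = (?P<ret>-?\d+)")


def summarize(lines):
    """Return (ring timeouts, reset return codes) found in kernel log lines.

    Each ring timeout is a (ring name, signaled seq, emitted seq) tuple.
    """
    timeouts = []
    reset_codes = []
    for line in lines:
        match = RING_TIMEOUT.search(line)
        if match:
            timeouts.append(
                (match["ring"], int(match["signaled"]), int(match["emitted"]))
            )
            continue
        match = RESET_END.search(line)
        if match:
            reset_codes.append(int(match["ret"]))
    return timeouts, reset_codes
```

It accepts any iterable of lines, so a saved log can be passed as an open file object, for example `summarize(open(path))` where `path` is wherever your kernel log was saved. A non-zero reset code together with a ring timeout is the pattern seen in both logs above.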
Reproducing:
- Play a video (MP4?) in a WebKitGTK browser (gstreamer alone might be enough)
- Once it has finished playing, click play again
Edited by Haelwenn Monnier