VAAPI on VCN: bad stream may crash whole gfx system
I have a damaged DVD that I tried to rescue with ddrescue; it recovered 98.4% of the data. When I play the ISO image in VLC, using VAAPI on my Ryzen 4600, playback eventually (of course) reaches the damaged area, and the whole gfx system in the kernel seems to give up instead of only VLC. This is what dmesg shows when it happens:
Aug 14 09:37:09 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:40 vmid:2 pasid:32788, for process vlc pid 25313 thread vlc:cs0 pid 25324)
Aug 14 09:37:09 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: in page starting at address 0x0000800101222000 from IH client 0x12 (VMC)
Aug 14 09:37:09 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00243850
Aug 14 09:37:09 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: \x09 Faulty UTCL2 client ID: VCN (0x1c)
Aug 14 09:37:09 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: \x09 MORE_FAULTS: 0x0
Aug 14 09:37:09 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: \x09 WALKER_ERROR: 0x0
Aug 14 09:37:09 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: \x09 PERMISSION_FAULTS: 0x5
Aug 14 09:37:09 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: \x09 MAPPING_ERROR: 0x0
Aug 14 09:37:09 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: \x09 RW: 0x1
Aug 14 09:37:19 desktopjp kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_dec timeout, signaled seq=1159, emitted seq=1161
Aug 14 09:37:19 desktopjp kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process vlc pid 25313 thread vlc:cs0 pid 25324
Aug 14 09:37:19 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: GPU reset begin!
Aug 14 09:37:20 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: MODE2 reset
Aug 14 09:37:20 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: GPU reset succeeded, trying to resume
Aug 14 09:37:21 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: RAS: optional ras ta ucode is not available
Aug 14 09:37:21 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: RAP: optional rap ta ucode is not available
Aug 14 09:37:21 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Aug 14 09:37:21 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: SMU is resuming...
Aug 14 09:37:21 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: SMU is resumed successfully!
Aug 14 09:37:21 desktopjp kernel: amdgpu 0000:30:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vcn_dec test failed (-110)
Aug 14 09:37:21 desktopjp kernel: [drm:amdgpu_do_asic_reset [amdgpu]] *ERROR* resume of IP block <vcn_v2_0> failed -110
Aug 14 09:37:21 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: GPU reset(1) failed
Aug 14 09:37:21 desktopjp kernel: amdgpu 0000:30:00.0: amdgpu: GPU reset end with ret = -110
Aug 14 09:37:21 desktopjp kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -110
Aug 14 09:37:21 desktopjp kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
I can't imagine this is how it's supposed to work: a damaged video stream crashes the whole GPU, and nothing is visible on the display anymore (though over ssh from another PC you can still see the machine is alive).
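For anyone who wants to cross-check the fault line in the log, here is a minimal Python sketch that splits the raw VM_L2_PROTECTION_FAULT_STATUS value (0x00243850) into the fields the kernel printed. The bit offsets and widths in `FIELDS` are my assumption about the register layout, not something taken from this log; they are only checked against the decoded values shown above (client ID 0x1c, PERMISSION_FAULTS 0x5, RW 0x1, vmid 2). The helper name `decode_fault_status` is made up for illustration.

```python
# Assumed layout of VM_L2_PROTECTION_FAULT_STATUS: (field name, bit shift, bit width).
# These offsets are an assumption; they reproduce the values the kernel
# printed in the dmesg output above, but are not taken from the log itself.
FIELDS = [
    ("MORE_FAULTS", 0, 1),
    ("WALKER_ERROR", 1, 3),
    ("PERMISSION_FAULTS", 4, 4),
    ("MAPPING_ERROR", 8, 1),
    ("CID", 9, 9),    # the "Faulty UTCL2 client ID" line
    ("RW", 18, 1),
    ("VMID", 20, 4),  # the "vmid:2" part of the page fault line
]


def decode_fault_status(status: int) -> dict:
    """Extract each field by shifting it down and masking it to its width."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, shift, width in FIELDS
    }


decoded = decode_fault_status(0x00243850)
for name, value in decoded.items():
    print(f"{name}: {value:#x}")
```

With this layout the client ID comes out as 0x1c, which the kernel labels as VCN, so the fault is attributed to the video decoder block rather than the 3D engine.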