VAAPI: EFC on VCN2 produces broken H264 video and crashes the HEVC encoder
Enabling EFC on VCN2 (Renoir, 5700G) causes two issues:
- The H264 encoder produces broken video.
- The HEVC encoder hangs the vcn_enc0 ring, which times out and forces a GPU reset.
Commands to reproduce (H264 first, then HEVC):
ffmpeg -vaapi_device /dev/dri/renderD128 -f lavfi -i testsrc=s=1280x720,format=bgra -vf hwupload,scale_vaapi=format=nv12 -c:v h264_vaapi -b:v 4M -maxrate 4M -maxrate 8M -async_depth 4 -y /tmp/out.mp4
ffmpeg -vaapi_device /dev/dri/renderD128 -f lavfi -i testsrc=s=1280x720,format=bgra -vf hwupload,scale_vaapi=format=nv12 -c:v hevc_vaapi -b:v 4M -maxrate 4M -maxrate 8M -async_depth 4 -y /tmp/out.mp4
Kernel log from the HEVC run:
[ 520.247551] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_enc0 timeout, signaled seq=1637, emitted seq=1638
[ 520.247905] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
[ 520.248234] amdgpu 0000:0b:00.0: amdgpu: GPU reset begin!
[ 520.443137] [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002
[ 520.639900] [drm] Register(0) [mmUVD_RB_RPTR] failed to reach value 0x00000080 != 0x00000040
[ 520.829040] [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002
[ 520.867002] amdgpu 0000:0b:00.0: amdgpu: MODE2 reset
[ 520.867067] amdgpu 0000:0b:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 520.867198] [drm] PCIE GART of 1024M enabled.
[ 520.867200] [drm] PTB located at 0x000000F41FC00000
[ 520.867249] [drm] PSP is resuming...
[ 521.568365] [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
[ 521.858437] amdgpu 0000:0b:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 521.868948] amdgpu 0000:0b:00.0: amdgpu: RAP: optional rap ta ucode is not available
[ 521.868949] amdgpu 0000:0b:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[ 521.868951] amdgpu 0000:0b:00.0: amdgpu: SMU is resuming...
[ 521.869941] amdgpu 0000:0b:00.0: amdgpu: SMU is resumed successfully!
[ 521.870485] [drm] DMUB hardware initialized: version=0x01010027
[ 522.058836] [drm] kiq ring mec 2 pipe 1 q 0
[ 522.063130] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[ 522.063174] [drm] JPEG decode initialized successfully.
[ 522.063175] amdgpu 0000:0b:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
[ 522.063176] amdgpu 0000:0b:00.0: amdgpu: ring gfx_low uses VM inv eng 1 on hub 0
[ 522.063177] amdgpu 0000:0b:00.0: amdgpu: ring gfx_high uses VM inv eng 4 on hub 0
[ 522.063178] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 5 on hub 0
[ 522.063178] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 6 on hub 0
[ 522.063179] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 7 on hub 0
[ 522.063179] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 8 on hub 0
[ 522.063180] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 9 on hub 0
[ 522.063180] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 10 on hub 0
[ 522.063181] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 11 on hub 0
[ 522.063181] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 12 on hub 0
[ 522.063182] amdgpu 0000:0b:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 13 on hub 0
[ 522.063182] amdgpu 0000:0b:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
[ 522.063183] amdgpu 0000:0b:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1
[ 522.063183] amdgpu 0000:0b:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1
[ 522.063184] amdgpu 0000:0b:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1
[ 522.063184] amdgpu 0000:0b:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1
[ 522.065114] amdgpu 0000:0b:00.0: amdgpu: recover vram bo from shadow start
[ 522.065115] amdgpu 0000:0b:00.0: amdgpu: recover vram bo from shadow done
[ 522.065127] amdgpu 0000:0b:00.0: amdgpu: GPU reset(1) succeeded!
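One way to confirm the hang after a run is to search the kernel log for the encode ring timeout. This is only a minimal sketch: `vcn_ring_timed_out` is a hypothetical helper name, and the pattern is taken solely from the log lines above, so other kernel versions may word the message differently.

```shell
# Hypothetical helper: succeeds if the kernel log on stdin contains
# an amdgpu timeout for a VCN encode ring (vcn_enc0, vcn_enc1, ...).
vcn_ring_timed_out() {
    grep -q 'ring vcn_enc[0-9]* timeout'
}

# Usage (reading the kernel log may require root):
#   dmesg | vcn_ring_timed_out && echo "VCN encode ring timed out"
```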
Adding AMD_DEBUG=noefc fixes both issues.
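For reference, the variable can be set for a single process rather than globally. The wrapper below is a sketch under that assumption; `ffmpeg_noefc` is a hypothetical name introduced here for illustration and is not part of Mesa or ffmpeg.

```shell
# Hypothetical wrapper: run ffmpeg with EFC disabled for this process only.
# Note: this replaces any AMD_DEBUG value already set in the environment.
ffmpeg_noefc() {
    AMD_DEBUG=noefc ffmpeg "$@"
}

# Usage: pass the same arguments as either reproducer command, e.g.
#   ffmpeg_noefc <arguments of the hevc_vaapi command>
```

Scoping the variable to one invocation keeps other applications on the default path, which makes it easier to compare output with and without EFC.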
Kernel: 6.4.5
Mesa: 23.3.0-devel
Firmware: [ 8.251016] [drm] Found VCN firmware Version ENC: 1.20 DEC: 6 VEP: 0 Revision: 0
cc @thongthai