AMD/RX 6600 - VA-API video output is corrupted if decoded surfaces are exported with vaExportSurfaceHandle() and then quickly returned to the ffmpeg/VA-API decoder and reused
Fedora 37 / Mesa 23.0.3 / Radeon RX 6600 / H.264 clip
Firefox bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1832080 Affected clip: https://bugzilla.mozilla.org/attachment.cgi?id=9332399
Reproduction steps:
- Install latest firefox nightly - https://www.mozilla.org/en-US/firefox/channel/desktop/
- Open clip from https://bugzilla.mozilla.org/attachment.cgi?id=9332399
- The video output is corrupted
Dumped decoded va-api frames and related logs are available at: https://bugzilla.mozilla.org/attachment.cgi?id=9332401
This bug is triggered by calling vaExportSurfaceHandle() on a decoded surface (produced by the ffmpeg/VA-API decoder on Radeon) and then returning the surface to the decoder. There is no need to render from it or create an EGLImage over it. To reproduce the bug you only need to call vaExportSurfaceHandle() on a decoded surface and then return it to ffmpeg for reuse. If vaExportSurfaceHandle() is not called and the surface is returned to ffmpeg for reuse, the bug doesn't occur. Also, if the first 4-5 frames are skipped (vaExportSurfaceHandle() is not called on them), the bug doesn't occur.
Here is the related log from Firefox:
[RDD 212388: MediaPDecoder #1]: D/PlatformDecoderModule FFMPEG: avcodec_send_packet
[RDD 212388: MediaPDecoder #1]: D/PlatformDecoderModule FFMPEG: VA-API Got one frame output ID 0x18 key 1 with pts=0 dts=40000 duration=40000 opaque=-9223372036854775808
[RDD 212388: MediaPDecoder #1]: D/PlatformDecoderModule FFMPEG: avcodec_send_packet
[RDD 212388: MediaPDecoder #1]: D/PlatformDecoderModule FFMPEG: VA-API Got one frame output ID 0x17 key 0 with pts=40000 dts=80000 duration=40000 opaque=-9223372036854775808
[RDD 212388: MediaPDecoder #1]: D/PlatformDecoderModule FFMPEG: avcodec_send_packet
[RDD 212388: MediaPDecoder #1]: D/PlatformDecoderModule FFMPEG: VA-API Got one frame output ID 0x18 key 1 with pts=80000 dts=120000 duration=40000 opaque=-9223372036854775808
[RDD 212388: MediaPDecoder #1]: D/PlatformDecoderModule FFMPEG: avcodec_send_packet
[RDD 212388: MediaPDecoder #1]: D/PlatformDecoderModule FFMPEG: VA-API Got one frame output ID 0x17 key 1 with pts=120000 dts=160000 duration=40000 opaque=-9223372036854775808
[RDD 212388: MediaPDecoder #1]: D/PlatformDecoderModule FFMPEG: avcodec_send_packet
[RDD 212388: MediaPDecoder #1]: D/PlatformDecoderModule FFMPEG: VA-API Got one frame output ID 0x16 key 0 with pts=160000 dts=200000 duration=40000 opaque=-9223372036854775808
At this point surface 0x16 is corrupted, and so are all following non-keyframes.
On the other hand, mpv is not affected because it holds references to the frames longer:
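To make the quick surface reuse in the Firefox log easier to see, here is a minimal Python sketch that measures how many frames pass before the same surface ID is output again. The function name `reuse_distances` and the regular expression are illustrative assumptions, not part of Firefox or ffmpeg, and the sample lines are abbreviated copies of the log above.

```python
import re

# Matches the surface ID in Firefox log lines such as
# "... VA-API Got one frame output ID 0x18 key 1 with pts=0 ..."
FRAME_RE = re.compile(r"Got one frame output ID (0x[0-9a-fA-F]+)")


def reuse_distances(log_text):
    """Return (surface_id, frames_since_previous_output) for each reused surface."""
    last_index = {}   # surface ID -> index of the frame that last used it
    distances = []
    index = 0         # counts decoded frames only, not every log line
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if not match:
            continue
        surface = match.group(1)
        if surface in last_index:
            distances.append((surface, index - last_index[surface]))
        last_index[surface] = index
        index += 1
    return distances


# Abbreviated lines from the Firefox log (prefix and trailing fields trimmed).
firefox_sample = """\
FFMPEG: avcodec_send_packet
FFMPEG: VA-API Got one frame output ID 0x18 key 1 with pts=0
FFMPEG: avcodec_send_packet
FFMPEG: VA-API Got one frame output ID 0x17 key 0 with pts=40000
FFMPEG: avcodec_send_packet
FFMPEG: VA-API Got one frame output ID 0x18 key 1 with pts=80000
FFMPEG: avcodec_send_packet
FFMPEG: VA-API Got one frame output ID 0x17 key 1 with pts=120000
FFMPEG: avcodec_send_packet
FFMPEG: VA-API Got one frame output ID 0x16 key 0 with pts=160000
"""

print(reuse_distances(firefox_sample))
```

On these lines, surfaces 0x18 and 0x17 each come back two frames after their previous output, which is the fast reuse pattern described in this report.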
avcodec_receive_frame() ID 0x16 key 1
avcodec_receive_frame() ID 0x15 key 0
vaExportSurfaceHandle 0x7fe0c43e13f0 ID 0x16
avcodec_receive_frame() ID 0x14 key 1
avcodec_receive_frame() ID 0x13 key 1
vaExportSurfaceHandle 0x7fe0c43e13f0 ID 0x15
vaExportSurfaceHandle 0x7fe0c43e13f0 ID 0x14
avcodec_receive_frame() ID 0x15 key 0
avcodec_receive_frame() ID 0x12 key 0
vaExportSurfaceHandle 0x7fe0c43e13f0 ID 0x13
avcodec_receive_frame() ID 0x14 key 0
vaExportSurfaceHandle 0x7fe0c43e13f0 ID 0x15
vaExportSurfaceHandle 0x7fe0c43e13f0 ID 0x12
avcodec_receive_frame() ID 0x16 key 0
vaExportSurfaceHandle 0x7fe0c43e13f0 ID 0x14
avcodec_receive_frame() ID 0x13 key 0
avcodec_receive_frame() ID 0x15 key 0
vaExportSurfaceHandle 0x7fe0c43e13f0 ID 0x16
vaExportSurfaceHandle 0x7fe0c43e13f0 ID 0x13
avcodec_receive_frame() ID 0x11 key 0
vaExportSurfaceHandle 0x7fe0c43e13f0 ID 0x15
avcodec_receive_frame() ID 0x14 key 0
vaExportSurfaceHandle 0x7fe0c43e13f0 ID 0x11
avcodec_receive_frame() ID 0x12 key 0
avcodec_receive_frame() ID 0x16 key 0
...
As you can see, mpv uses decoded surfaces with a delay of 2-3 frames, so it is not affected by this bug.
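The lag between receiving and exporting a surface can also be read from the mpv trace mechanically. The following is a rough Python sketch, assuming the two line formats shown in the mpv log above; `export_delays` is a hypothetical helper name, and the counting convention (frames received after the surface itself, up to the export call) is my own choice, so the absolute numbers may differ by one from other ways of counting.

```python
import re

# Line formats taken from the mpv trace in this report.
RECV_RE = re.compile(r"avcodec_receive_frame\(\) ID (0x[0-9a-fA-F]+)")
EXPORT_RE = re.compile(r"vaExportSurfaceHandle \S+ ID (0x[0-9a-fA-F]+)")


def export_delays(log_text):
    """Return (surface_id, frames_received_since_that_surface) per export call."""
    received = 0      # number of frames received so far
    last_recv = {}    # surface ID -> value of `received` when it last arrived
    delays = []
    for line in log_text.splitlines():
        match = RECV_RE.search(line)
        if match:
            received += 1
            last_recv[match.group(1)] = received
            continue
        match = EXPORT_RE.search(line)
        if match and match.group(1) in last_recv:
            surface = match.group(1)
            delays.append((surface, received - last_recv[surface]))
    return delays


# First lines of the mpv trace, copied from the log above.
mpv_sample = """\
avcodec_receive_frame() ID 0x16 key 1
avcodec_receive_frame() ID 0x15 key 0
vaExportSurfaceHandle 0x7fe0c43e13f0 ID 0x16
avcodec_receive_frame() ID 0x14 key 1
avcodec_receive_frame() ID 0x13 key 1
vaExportSurfaceHandle 0x7fe0c43e13f0 ID 0x15
vaExportSurfaceHandle 0x7fe0c43e13f0 ID 0x14
"""

print(export_delays(mpv_sample))
```

A delay of zero would mean a surface is exported before any further frame is received; in the mpv lines above every export happens only after at least one later frame has already arrived.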