libs: decoder: h264, h265: in context at least 16 reference surfaces

Registering only the stream's DPB-size number of surfaces as decoding VA
surfaces brings issues for certain streams. This change registers the
maximum possible number of reference surfaces in a stream, which is 16.

Fixes: #94
@@ -1623,7 +1623,7 @@ ensure_context (GstVaapiDecoderH264 * decoder, GstH264SPS * sps)
   info.chroma_type = priv->chroma_type;
   info.width = sps->width;
   info.height = sps->height;
-  info.ref_frames = dpb_size;
+  info.ref_frames = 16;
   if (!gst_vaapi_decoder_ensure_context (GST_VAAPI_DECODER (decoder), &info))
     return GST_VAAPI_DECODER_STATUS_ERROR_UNKNOWN;
@@ -1201,7 +1201,7 @@ ensure_context (GstVaapiDecoderH265 * decoder, GstH265SPS * sps)
   info.chroma_type = priv->chroma_type;
   info.width = sps->width;
   info.height = sps->height;
-  info.ref_frames = dpb_size;
+  info.ref_frames = 16;
   if (!gst_vaapi_decoder_ensure_context (GST_VAAPI_DECODER (decoder), &info))
     return GST_VAAPI_DECODER_STATUS_ERROR_UNKNOWN;
  • This will increase the memory footprint significantly. If the DPB size is 3, this will add 13 * 132M (8096 * 4320 * 4 bytes per surface) = 1.8G of memory for an 8K 10-bit stream (a rough check of that arithmetic is sketched below).

    Could we just report an error for the stream and keep the original code? @vjaquez @He_Junyan @ndufresne
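
    A rough check of that arithmetic, assuming the 132M figure corresponds to a padded 8096 x 4320 10-bit surface at 4 bytes per pixel (the real per-surface size depends on the driver's tiling and alignment):

    /* Not gstreamer-vaapi code: a standalone check of the numbers quoted
     * above.  16 forced references minus a DPB of 3 leaves 13 extra
     * surfaces of roughly 132M each. */
    #include <stdio.h>

    int
    main (void)
    {
      const long long width = 8096, height = 4320, bytes_per_pixel = 4;
      const long long surface = width * height * bytes_per_pixel;
      const int extra_surfaces = 16 - 3;    /* forced refs minus DPB size */

      printf ("per surface: %lld bytes (~%lld MB)\n", surface, surface >> 20);
      printf ("extra memory: %.1f GB\n", extra_surfaces * surface / 1e9);
      return 0;
    }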

  • Yes, that increases the memory footprint, but it's also the safest option, since the reference list can hold 16 pictures regardless of the specified/calculated DPB. FFmpeg uses that number.

    Do you have a real use case where this increase in memory brings a problem?

    Be aware that now, with boundless surfaces in the context, the number of allocated surfaces can grow without limit (for example, with encoders or filters that have a big latency).

  • @vjaquez This patch will force every stream to reach the maximum surface number, even when the DPB size is only 1.

    For IoT use cases every bit of memory matters; some users may have only 8G of memory to handle many streams.

    FFmpeg is not a role model for memory usage; it has a big problem there. For a typical 4-channel 8k@30fps case it uses 4 times more memory than gst-vaapi (before the current patch), which is more than 8G. I can send you a detailed report if you are interested.

    > Be aware that now, with boundless surfaces in the context, the number of allocated surfaces can grow without limit (for example, with encoders or filters that have a big latency).

    The encoder will report a minimum buffer count. If we use that, plus the DPB size, plus the scratch buffer count, as the maximum pool size, it may not be an issue (see the sketch below).

    I am not sure whether we can find another way to handle erroneous bitstreams.
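
    A minimal sketch of that sizing rule; none of these names exist in gstreamer-vaapi, they only illustrate the proposed arithmetic:

    /* Hypothetical cap for the context's surface pool: downstream's
     * reported minimum, plus the stream's DPB size, plus a few scratch
     * surfaces, instead of always allocating 16 references. */
    #include <glib.h>

    static guint
    max_pool_size (guint downstream_min_buffers, guint dpb_size,
        guint scratch_buffers)
    {
      return downstream_min_buffers + dpb_size + scratch_buffers;
    }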

  • A problem is that we cannot know the exact buffer count of downstream when we create the decoder context. Only after the first frame is decoded and is ready to be pushed downstream can we send caps to downstream and learn its exact buffer count (see the sketch below).

    And the surfaces created by context->pool may outnumber the DPB size. For example, a downstream deep learning element may hold 100 frames for reference (if we have enough memory), but the DPB is only 16.
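
    For illustration, this is where that count first becomes visible: downstream's pool requirements arrive with the ALLOCATION query after caps negotiation, which is too late for ensure_context (). This uses plain GStreamer API, not gstreamer-vaapi internals:

    #include <gst/gst.h>

    /* Read downstream's minimum buffer count from an ALLOCATION query,
     * e.g. inside a decide_allocation () handler -- information that
     * only becomes available after the first caps negotiation. */
    static guint
    downstream_min_buffers (GstQuery * query)
    {
      GstBufferPool *pool = NULL;
      guint size, min_buffers = 0, max_buffers;

      if (gst_query_get_n_allocation_pools (query) > 0) {
        gst_query_parse_nth_allocation_pool (query, 0, &pool, &size,
            &min_buffers, &max_buffers);
        if (pool)
          gst_object_unref (pool);
      }
      return min_buffers;
    }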

  • Can the surfaceless context patch solve that stream issue?

  • > Can the surfaceless context patch solve that stream issue?

    It probably would, but more by chance than by design.

  • > For IoT use cases every bit of memory matters; some users may have only 8G of memory to handle many streams. FFmpeg is not a role model for memory usage; it has a big problem there. For a typical 4-channel 8k@30fps case it uses 4 times more memory than gst-vaapi (before the current patch), which is more than 8G. I can send you a detailed report if you are interested.

    That's a valid observation, though it's still theoretical so far, isn't it? Or are there any tests or use cases showing regressions?

    Still, even if it's incorrect from the spec's point of view, using only the DPB size has worked since the beginning; and with the surfaceless context patch the deadlock problems are gone. Perhaps reverting, with a comment in there, would make sense. Dunno.
