subtitleoverlay: Handle video sink pad CAPS query earlier
The internal elements are only created when caps on both video and subtitle pads are known.
Prior to that, a GST_QUERY_CAPS on the video sink pad would just return ANY instead of giving a hint of what downstream can actually handle and prefers. As a result, upstream elements (such as decoders) could end up choosing non-optimal caps in the best case, or caps that the elements downstream of subtitleoverlay cannot handle in the worst case.
To fix that, we assume that all subtitle "elements" handle the subtitle overlay composition feature/meta, and we answer GST_QUERY_CAPS ourselves as long as the internal elements aren't present yet.
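
The decision described above can be sketched roughly as follows. This is a simplified, self-contained model and not the actual patch: `SubOverlayModel` and `model_video_sink_query_caps` are hypothetical names, caps are modelled as plain strings instead of `GstCaps`, and the ordering of the returned alternatives is an assumption. Only the feature name `meta:GstVideoOverlayComposition` is taken from GStreamer itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Caps feature name used by GStreamer for overlay composition meta. */
#define OVERLAY_META "meta:GstVideoOverlayComposition"

/* Hypothetical stand-in for the element state (not GstSubtitleOverlay). */
typedef struct {
  bool have_internal_elements;   /* true once the internal chain is built */
  const char *internal_caps;     /* answer of the internal chain, if built */
  const char *downstream_media;  /* e.g. "video/x-raw", NULL without a peer */
  const char *downstream_fields; /* e.g. "format=NV12" */
} SubOverlayModel;

/* Model of a CAPS query on the video sink pad; writes a caps string. */
static int
model_video_sink_query_caps (const SubOverlayModel *m, char *out, size_t len)
{
  /* Internal elements exist: their answer is used as-is. */
  if (m->have_internal_elements)
    return snprintf (out, len, "%s", m->internal_caps);

  /* Nothing known about downstream: ANY is the only possible answer. */
  if (m->downstream_media == NULL)
    return snprintf (out, len, "ANY");

  /* Assumption: offer the downstream caps with the overlay composition
   * feature first, followed by the plain downstream caps. */
  return snprintf (out, len, "%s(%s), %s; %s, %s",
      m->downstream_media, OVERLAY_META, m->downstream_fields,
      m->downstream_media, m->downstream_fields);
}
```

In this model, upstream only sees ANY when there is no downstream peer to ask. In every other case the answer reflects what downstream reports, even before the internal elements exist.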
Fixes #3171, #3160, #3120, #1746, #3248