videoparsers: av1: Add the AV1 parser.
There is no start code defined in an AV1 stream, so unit boundaries can not be located by scanning byte-aligned input; the minimal unit is the OBU. There are four types of alignment in the AV1 stream.
alignment: byte, obu, tu, frame
- Aligned to byte. The basic and default one for input.
- Aligned to obu (Open Bitstream Unit). The default one for output.
- Aligned to tu (Temporal Unit). A temporal unit consists of all the OBUs that are associated with a specific, distinct time instant. When scalability is disabled, it contains exactly one shown frame (and may contain several frames that are not shown). When scalability is enabled, the frames it contains depend on the layer number. It should begin with a temporal delimiter OBU. It may be useful for mux/demux to index the data for a given timestamp.
- Aligned to frame. This ensures that each buffer contains only one frame of the base layer or of a sub layer. It is useful for the decoder.
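
To illustrate the obu and tu alignments above, here is a minimal sketch in Python (not the element's C code). It assumes every OBU carries obu_size, as in a plain obu-stream, and uses the OBU header bit layout and leb128 coding from the AV1 specification. The function names are hypothetical.

```python
from typing import List, Tuple

OBU_TEMPORAL_DELIMITER = 2  # obu_type value from the AV1 specification


def read_leb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode a leb128 value at pos; return (value, bytes consumed)."""
    value = 0
    for i in range(8):  # the specification caps leb128 at 8 bytes
        if pos + i >= len(data):
            raise ValueError("truncated leb128 value")
        byte = data[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("leb128 value longer than 8 bytes")


def split_obus(data: bytes) -> List[Tuple[int, bytes]]:
    """Split an obu-stream into (obu_type, complete OBU bytes) pairs."""
    obus = []
    pos = 0
    while pos < len(data):
        header = data[pos]
        obu_type = (header >> 3) & 0x0F
        has_extension = (header >> 2) & 1
        has_size_field = (header >> 1) & 1
        if not has_size_field:
            # Without obu_size the length has to come from the container.
            raise ValueError("OBU without obu_size")
        header_len = 1 + has_extension
        payload_size, size_len = read_leb128(data, pos + header_len)
        end = pos + header_len + size_len + payload_size
        if end > len(data):
            raise ValueError("truncated OBU")
        obus.append((obu_type, data[pos:end]))
        pos = end
    return obus


def group_temporal_units(obus: List[Tuple[int, bytes]]) -> List[List[bytes]]:
    """Start a new temporal unit at every temporal delimiter OBU."""
    units: List[List[bytes]] = []
    for obu_type, obu in obus:
        if obu_type == OBU_TEMPORAL_DELIMITER or not units:
            units.append([])
        units[-1].append(obu)
    return units
```

Frame alignment needs more than this: the frame header has to be parsed to find out where one frame ends, which the sketch leaves out.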
Annex B defines a special format for the temporal unit. The size of each temporal unit is extracted into the header of the buffer, and there is no size field inside each OBU. There are two stream formats:
stream-format: obu-stream, annexb
- obu-stream. The basic and default one.
- annexb. A special stream of temporal units. It also implies that the alignment should be tu.
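
As a rough illustration of the annexb layout, the Python sketch below reads one temporal unit. It follows the length-delimited syntax of Annex B in the AV1 specification: temporal_unit_size, then for each frame unit a frame_unit_size, then for each OBU an obu_length. The function names are hypothetical and this is not the element's code.

```python
from typing import List, Tuple


def read_leb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode a leb128 value at pos; return (value, bytes consumed)."""
    value = 0
    for i in range(8):  # the specification caps leb128 at 8 bytes
        if pos + i >= len(data):
            raise ValueError("truncated leb128 value")
        byte = data[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("leb128 value longer than 8 bytes")


def parse_annexb_temporal_unit(
    data: bytes, pos: int = 0
) -> Tuple[List[List[bytes]], int]:
    """Return (frame units, offset of the next temporal unit).

    Each frame unit is a list of OBUs stored without their length prefix.
    """
    tu_size, consumed = read_leb128(data, pos)
    pos += consumed
    tu_end = pos + tu_size
    if tu_end > len(data):
        raise ValueError("truncated temporal unit")

    frame_units: List[List[bytes]] = []
    while pos < tu_end:
        fu_size, consumed = read_leb128(data, pos)
        pos += consumed
        fu_end = pos + fu_size
        if fu_end > tu_end:
            raise ValueError("frame unit overruns its temporal unit")

        obus: List[bytes] = []
        while pos < fu_end:
            obu_length, consumed = read_leb128(data, pos)
            pos += consumed
            if pos + obu_length > fu_end:
                raise ValueError("OBU overruns its frame unit")
            obus.append(data[pos:pos + obu_length])
            pos += obu_length
        frame_units.append(obus)
    return frame_units, tu_end
```

Because every level carries its size up front, a muxer or demuxer can skip a whole temporal unit or frame unit without looking inside the OBUs.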
This AV1 parser implements the conversion between the alignments and the stream-formats. If the input and output have the same alignment and the same stream-format, it checks the data and passes it through unchanged.
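
One direction of that conversion, obu-stream to annexb, can be sketched in Python as below. It assumes the OBUs have already been grouped into frame units, drops obu_size from each OBU and writes the Annex B length prefixes instead. All helper names are hypothetical; the element itself is written in C and may do this differently.

```python
from typing import List, Tuple


def read_leb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode a leb128 value at pos; return (value, bytes consumed)."""
    value = 0
    for i in range(8):  # the specification caps leb128 at 8 bytes
        if pos + i >= len(data):
            raise ValueError("truncated leb128 value")
        byte = data[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("leb128 value longer than 8 bytes")


def write_leb128(value: int) -> bytes:
    """Encode a non-negative integer as leb128, shortest form."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def strip_size_field(obu: bytes) -> bytes:
    """Remove obu_size and clear obu_has_size_field in the header."""
    header = obu[0]
    if not header & 0x02:
        return obu  # already without a size field
    header_len = 1 + ((header >> 2) & 1)  # plus the extension byte, if any
    _, size_len = read_leb128(obu, header_len)
    return bytes([header & 0xFD]) + obu[1:header_len] + obu[header_len + size_len:]


def pack_annexb_temporal_unit(frame_units: List[List[bytes]]) -> bytes:
    """Pack frame units (lists of OBUs) into one annexb temporal unit."""
    tu_body = bytearray()
    for obus in frame_units:
        fu_body = bytearray()
        for obu in obus:
            bare = strip_size_field(obu)
            fu_body += write_leb128(len(bare)) + bare  # obu_length + OBU
        tu_body += write_leb128(len(fu_body)) + fu_body  # frame_unit_size
    return write_leb128(len(tu_body)) + bytes(tu_body)  # temporal_unit_size
```

The sizes are written from the inside out because each outer size depends on the encoded length of everything it contains.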
TODO:
- May need an operating_point property to filter the OBUs.
- May add a property to disable deep parsing.
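
For the first TODO item, one possible shape of the filter is sketched below in Python. It applies the OBU drop rule from the OBU syntax in the AV1 specification and assumes operating_point_idc was already read from the sequence header for the chosen operating point. The function name is hypothetical and nothing like this exists in the element yet.

```python
OBU_SEQUENCE_HEADER = 1     # obu_type values from the AV1 specification
OBU_TEMPORAL_DELIMITER = 2


def keep_obu(obu: bytes, operating_point_idc: int) -> bool:
    """Return True if the OBU belongs to the selected operating point."""
    header = obu[0]
    obu_type = (header >> 3) & 0x0F
    has_extension = (header >> 2) & 1

    # Sequence headers, temporal delimiters and OBUs without layer
    # information are never dropped; idc == 0 means no scalability info.
    if (
        obu_type in (OBU_SEQUENCE_HEADER, OBU_TEMPORAL_DELIMITER)
        or operating_point_idc == 0
        or not has_extension
    ):
        return True

    extension = obu[1]
    temporal_id = extension >> 5          # top 3 bits
    spatial_id = (extension >> 3) & 0x03  # next 2 bits

    # Low 8 bits of the idc select temporal layers, the bits from 8 up
    # select spatial layers.
    in_temporal_layer = (operating_point_idc >> temporal_id) & 1
    in_spatial_layer = (operating_point_idc >> (spatial_id + 8)) & 1
    return bool(in_temporal_layer and in_spatial_layer)
```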