awss3sink: Added "Soft Seek" feature
This PR adds an in-memory "part cache" that the multipart uploader uses to handle seek operations. Integration tests are included to demonstrate the expected behaviors. One of the main use cases our team had for the older C/C++-based `s3sink` element was to allow `mp4mux` to seek back into the header of the file and rewrite it before EOS. This PR ports a significant portion of that logic to the Rust-based element here.
The feature is controlled by the `num-cache-parts` property, which defines the number of multipart uploader "parts" to store in memory, either at the head (positive number) or at the tail (negative number), up to the AWS S3 limit of 10,000 parts. With the default and minimum part size being 5 megabytes, care must be taken not to consume all of the system's memory.
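
To make the memory warning concrete, here is a minimal sketch of a worst-case estimate. The helper `cache_memory_bytes` is hypothetical and not part of this PR; it assumes that only the magnitude of `num-cache-parts` determines how many parts are held, and that a part is 5 MiB (5 * 1024 * 1024 bytes).

```rust
/// AWS S3 allows at most 10,000 parts per multipart upload.
const MAX_PARTS: u64 = 10_000;

/// Hypothetical helper (not part of the element): worst-case number of bytes
/// the part cache could hold. Assumption: the sign of `num_cache_parts` only
/// selects head (positive) or tail (negative) caching, while its magnitude is
/// the number of parts kept in memory.
pub fn cache_memory_bytes(num_cache_parts: i64, part_size: u64) -> u64 {
    // Clamp to the S3 part limit, then multiply without overflowing.
    let parts = num_cache_parts.unsigned_abs().min(MAX_PARTS);
    parts.saturating_mul(part_size)
}
```

Under these assumptions, caching 10 parts of 5 MiB each holds about 50 MiB, while caching the full 10,000 parts would need roughly 48.8 GiB, which is why the value should be chosen with the available memory in mind.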
Expected behaviors:
- Seeking into a cached part via segment event is now possible (seek-then-write).
- Continued writes into subsequent parts from the cache result in the preceding part being flushed to S3 and the next part being loaded from the cache.
- Writing up to the last byte stored in the cache will produce a debug message indicating that the next write will cause a cache miss. An EOS at this point will still finalize the file without error.
- Writing past the last byte stored in the cache will produce a flow error and log message describing the cache miss.
- Seeking into the active part's buffer is also possible, with or without the cache enabled; this is equivalent to `num-cache-parts=-1`.
- Querying for seek limits will return "everything" (0 to infinity), since there is no clear way in the API to describe seekable regions; a `false` return from the segment event is the way to check whether a specific offset is available in the cache.
- The position query is now implemented.
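
The head/tail rules above can be illustrated with a toy model. This is only a sketch of one reading of the description, not the code in this PR: the `PartCache` type, its fields, and the 1-based part numbering are all hypothetical, and it assumes fixed-size parts, that the active part is always seekable, and that a negative `num-cache-parts` counts the active part as part of the tail.

```rust
/// Hypothetical model of which parts a seek can reach (illustration only).
pub struct PartCache {
    /// Positive: keep the first N parts; negative: keep the last N parts.
    pub num_cache_parts: i64,
    /// Size of every part in bytes (assumed fixed in this sketch).
    pub part_size: u64,
}

impl PartCache {
    /// Returns true if 1-based part number `part` is still in memory while
    /// part `active` is the one currently being filled.
    pub fn is_cached(&self, part: u64, active: u64) -> bool {
        if part == 0 || part > active {
            return false; // no such part yet
        }
        if part == active {
            return true; // the active part's buffer is always seekable
        }
        let n = self.num_cache_parts.unsigned_abs();
        if self.num_cache_parts > 0 {
            // Head caching: the first `n` parts stay in memory.
            part <= n
        } else {
            // Tail caching: the last `n` parts, counting the active one.
            // A value of 0 behaves like -1 (active part only).
            part.saturating_add(n.max(1)) > active
        }
    }

    /// Maps a byte offset to its part and reports whether a seek would hit.
    pub fn seek_hits(&self, offset: u64, active: u64) -> bool {
        match offset.checked_div(self.part_size) {
            Some(index) => self.is_cached(index.saturating_add(1), active),
            None => false, // a part size of zero is invalid
        }
    }
}
```

In this model, with `num_cache_parts` set to 2 and part 10 active, a seek to offset 0 hits (part 1 is in the head cache), while a seek into part 3 misses, which corresponds to the segment event returning `false`.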
This was originally developed against GStreamer version 1.24.2.