Skip to content

Add GstIdStr for holding structure names, field names and caps features instead of using GQuark

Sebastian Dröge requested to merge slomo/gstreamer:structure-fields into main

See individual commits for adding the implementation and then switching code over from quarks to the new string type.

GstStrId allows storing strings with up to 16 bytes (including NUL-terminator) inline, and bigger strings are stored as pointers. Pointer strings can be stored statically allocated (copying is for free), or heap allocated (requires actual allocations, also for copying). Special care was taken in the implementation to make equality checks cheap, e.g. by also storing the length of pointer strings.

Theoretically this can lead to more heap allocations for these strings, but in practice very few strings are longer than 16 bytes and most of them are statically allocated (message/query/event structure names). In addition, very little code was actually making use of the GQuark API so instead of lot of code paths had to go through the (global RW lock protected!) quark table to convert between strings and quarks, which makes e.g. equality comparisons of GstIdStrs faster now.

The main reason for this change is that currently it is possible for an attacker to use up arbitrary amounts of memory via GQuark strings because we generate field names from SDP content, RTP SSRCs, etc.. See for example #3567 (closed) . This is obviously not good.

Breaking Changes

Previously gst_structure_get_name() and any API that would return the name of a structure field (gst_structure_nth_field_name(), gst_structure_foreach() etc) would return an interned string, i.e. a string with static lifetime. This is no longer the case but instead the lifetime of the string is bound to the lifetime of the structure itself now.

The static lifetime was not documented anywhere and it doesn't look like anything in the monorepo except for validate makes use of this property. The Rust bindings unfortunately do make use of it (that needs to be fixed), but it seems hard to imagine actual code that would make use of this property and that would fail now. An initial backwards-compatible fix for the Rust bindings is in gstreamer-rs!1519 (merged).

This behaviour change will have to be mentioned clearly in the release notes.

Benchmark

In practice, the new implementation seems to be faster or approximately the same. See below for details. I've run everything 6 times, removed the biggest outlier and average the remaining 5 numbers. Allocations / memory usage are measured with heaptrack.

One important thing to note here is that benchmarking on my machine seems to be not very reliable (AMD CPU? Temperature? Who knows!). Everything that is <5% difference can be considered measurement noise, unfortunately. I'll have to figure that out at a later time.

The Cast

  • before: First run with current git main 3c7ddf90
  • idstr: This branch as of commit 183452f6
  • idstr-coreinline: Same as above, but only inlining the GstIdStr implementation inside core
  • idstr-noinline: Same as above, but not inlining the GstIdStr implementation at all
  • idstr-ref: Same as above but using GRefString for heap allocated strings to make copies free
  • idstr-refint: Same as above but using interned GRefStrings. This a) allows for fewer allocations (all pixel-aspect-ratio are the same allocation), b) allows for comparison by simply checking for pointer equality
  • after: Last run after everything else with git main

Results

capsnego audio

build/subprojects/gstreamer/tests/benchmarks/capsnego --loops 200 --depth 5 --flavour audio

(10 loops for heaptrack)

time (s, %) allocations temporary allocs peak memory (MB)
before 12.648 100% 38.533.166 143.816 30.7
idstr 10.040 79% 38.532.800 142.365 32.0
idstr-coreinline 10.025 79% 38.532.800 142.365 32.0
idstr-noinline 10.128 80% 38.532.800 142.365 32.0
idstr-ref 10.022 79% 38.529.375 143.088 32.0
idstr-refint 9.988 79% 38.528.496 144.409 32.0
after 12.761 101%

capsnego video

build/subprojects/gstreamer/tests/benchmarks/capsnego --loops 200 --depth 4 --flavour video

(10 loops for heaptrack)

time (s, %) allocations temporary allocs peak memory (MB)
before 13.142 100% 22.351.253 367.438 209.6
idstr 13.204 100% 22.627.363 367.056 208.7
idstr-coreinline 13.252 100% 22.627.363 367.056 208.7
idstr-noinline 13.353 102% 22.627.363 367.056 208.7
idstr-ref 13.298 101% 22.369.393 366.312 211.5
idstr-refint 13.292 101% 22.351.256 366.108 210.9
after 13.309 101%

microbenchmark

test.c

time (s, %) allocations temporary allocs peak memory (MB)
before 9.629 100% 158.047.993 2.000.797 2.8
idstr 9.981 104% 161.048.091 2.000.797 2.9
idstr-coreinline 9.890 102% 161.048.091 2.000.797 2.9
idstr-noinline 9.942 103% 161.048.091 2.000.797 2.9
idstr-ref 9.819 102% 159.048.091 2.000.797 2.9
idstr-refint 9.783 102% 158.048.078 2.000.801 2.9
after 9.891 103%

Conclusion

My takeaway from this is that the different variants all don't really matter in practice. The GstIdStr variant is considerably faster in the "capsnego audio" case for some reason, but otherwise it's about the same.

So for the time being, the implementation of GstIdStr will be inlined only in core, which will allow us to change the implementation in any way we want as long as we only need 16 bytes and pointer alignment for it. And things like GRefString or other clever solutions can be implemented later on top of this once it is shown that it actually makes a meaningful difference.

Future Optimizations

There's also some potential for further optimizations here:

  • Make use of the static string API in more places
  • interned GRefString-style strings (but our own implementation because GLib's is broken), "German strings", etc. which can also be done later as long as it fits into 16 bytes.
Edited by Sebastian Dröge

Merge request reports