Add GstIdStr for holding structure names, field names and caps features instead of using GQuark
See individual commits for adding the implementation and then switching code over from quarks to the new string type.
GstStrId
allows storing strings with up to 16 bytes (including NUL-terminator) inline, and bigger strings are stored as pointers. Pointer strings can be stored statically allocated (copying is for free), or heap allocated (requires actual allocations, also for copying). Special care was taken in the implementation to make equality checks cheap, e.g. by also storing the length of pointer strings.
Theoretically this can lead to more heap allocations for these strings, but in practice very few strings are longer than 16 bytes and most of them are statically allocated (message/query/event structure names). In addition, very little code was actually making use of the GQuark
API so instead of lot of code paths had to go through the (global RW lock protected!) quark table to convert between strings and quarks, which makes e.g. equality comparisons of GstIdStr
s faster now.
The main reason for this change is that currently it is possible for an attacker to use up arbitrary amounts of memory via GQuark
strings because we generate field names from SDP content, RTP SSRCs, etc.. See for example #3567 (closed) . This is obviously not good.
Breaking Changes
Previously gst_structure_get_name()
and any API that would return the name of a structure field (gst_structure_nth_field_name()
, gst_structure_foreach()
etc) would return an interned string, i.e. a string with static lifetime. This is no longer the case but instead the lifetime of the string is bound to the lifetime of the structure itself now.
The static lifetime was not documented anywhere and it doesn't look like anything in the monorepo except for validate makes use of this property. The Rust bindings unfortunately do make use of it (that needs to be fixed), but it seems hard to imagine actual code that would make use of this property and that would fail now. An initial backwards-compatible fix for the Rust bindings is in gstreamer-rs!1519 (merged).
This behaviour change will have to be mentioned clearly in the release notes.
Benchmark
In practice, the new implementation seems to be faster or approximately the same. See below for details. I've run everything 6 times, removed the biggest outlier and average the remaining 5 numbers. Allocations / memory usage are measured with heaptrack.
One important thing to note here is that benchmarking on my machine seems to be not very reliable (AMD CPU? Temperature? Who knows!). Everything that is <5% difference can be considered measurement noise, unfortunately. I'll have to figure that out at a later time.
The Cast
-
before
: First run with current git main 3c7ddf90 -
idstr
: This branch as of commit 183452f6 -
idstr-coreinline
: Same as above, but only inlining theGstIdStr
implementation inside core -
idstr-noinline
: Same as above, but not inlining theGstIdStr
implementation at all -
idstr-ref
: Same as above but usingGRefString
for heap allocated strings to make copies free -
idstr-refint
: Same as above but using internedGRefString
s. This a) allows for fewer allocations (allpixel-aspect-ratio
are the same allocation), b) allows for comparison by simply checking for pointer equality -
after
: Last run after everything else with git main
Results
capsnego audio
build/subprojects/gstreamer/tests/benchmarks/capsnego --loops 200 --depth 5 --flavour audio
(10 loops for heaptrack)
time (s, %) | allocations | temporary allocs | peak memory (MB) | |
---|---|---|---|---|
before | 12.648 100% | 38.533.166 | 143.816 | 30.7 |
idstr | 10.040 79% | 38.532.800 | 142.365 | 32.0 |
idstr-coreinline | 10.025 79% | 38.532.800 | 142.365 | 32.0 |
idstr-noinline | 10.128 80% | 38.532.800 | 142.365 | 32.0 |
idstr-ref | 10.022 79% | 38.529.375 | 143.088 | 32.0 |
idstr-refint | 9.988 79% | 38.528.496 | 144.409 | 32.0 |
after | 12.761 101% |
capsnego video
build/subprojects/gstreamer/tests/benchmarks/capsnego --loops 200 --depth 4 --flavour video
(10 loops for heaptrack)
time (s, %) | allocations | temporary allocs | peak memory (MB) | |
---|---|---|---|---|
before | 13.142 100% | 22.351.253 | 367.438 | 209.6 |
idstr | 13.204 100% | 22.627.363 | 367.056 | 208.7 |
idstr-coreinline | 13.252 100% | 22.627.363 | 367.056 | 208.7 |
idstr-noinline | 13.353 102% | 22.627.363 | 367.056 | 208.7 |
idstr-ref | 13.298 101% | 22.369.393 | 366.312 | 211.5 |
idstr-refint | 13.292 101% | 22.351.256 | 366.108 | 210.9 |
after | 13.309 101% |
microbenchmark
time (s, %) | allocations | temporary allocs | peak memory (MB) | |
---|---|---|---|---|
before | 9.629 100% | 158.047.993 | 2.000.797 | 2.8 |
idstr | 9.981 104% | 161.048.091 | 2.000.797 | 2.9 |
idstr-coreinline | 9.890 102% | 161.048.091 | 2.000.797 | 2.9 |
idstr-noinline | 9.942 103% | 161.048.091 | 2.000.797 | 2.9 |
idstr-ref | 9.819 102% | 159.048.091 | 2.000.797 | 2.9 |
idstr-refint | 9.783 102% | 158.048.078 | 2.000.801 | 2.9 |
after | 9.891 103% |
Conclusion
My takeaway from this is that the different variants all don't really matter in practice. The GstIdStr
variant is considerably faster in the "capsnego audio" case for some reason, but otherwise it's about the same.
So for the time being, the implementation of GstIdStr
will be inlined only in core, which will allow us to change the implementation in any way we want as long as we only need 16 bytes and pointer alignment for it. And things like GRefString
or other clever solutions can be implemented later on top of this once it is shown that it actually makes a meaningful difference.
Future Optimizations
There's also some potential for further optimizations here:
- Make use of the static string API in more places
- interned
GRefString
-style strings (but our own implementation because GLib's is broken), "German strings", etc. which can also be done later as long as it fits into 16 bytes.