Skip to content

media-session: deal with global id race conditions

P V requested to merge pvir/pipewire:sm-sync-fix into master

Address some async races in media-session, which occur because one of the monitor and policy cores may be processing events late.

I think it is async-safe with these changes, and avoids having duplicate objects with the same id. There are two caveats, which seem to require redesign elsewhere to fix.


To resolve monitor and policy core global ids racing with each other, use separate registry event handlers for both cores. Each handles only their own objects, determined by where the object handle was created.

Postpone handling of policy core new global events after monitor sync, which orders them after the corresponding monitor proxy and registry events. Monitor core is then more up-to-date, so we resolve id clashes in favor of monitor globals, which avoids duplicate objects.

Fix use-after-free by tracking whether a monitor holds references to sm_object. Keep also objects pending for id in a list, so that they can be cleaned up on session_shutdown (monitors may leak objects at shutdown, because spa objectinfo events won't be handled then).

Caveats:

Zombie objects may still created if policy core is late by several events, but in those cases the corresponding remove events are already in the queue.

Also, there's a (theoretical) possibility that pw_registry_bind will bind the wrong object, if the registry event is handled too late and an id is reused by the server.

For details, see reverted 77e4fdb1 for which this is a another approach.


libcamera + bluez5 devices should be freed/unloaded via the sm_object free callback, similarly as in alsa and v4l2 monitors. This ensures they are run at session_shutdown.


Reproducing race conditions in practice:

bluetoothctl connect XX:XX:XX:XX:XX:XX
pactl list cards
export OBJID=60  # bluez device id
while true; do pw-cli set-param $OBJID 9 '{ "Spa:Pod:Object:Param:Profile:index": 1 }'; pw-cli set-param $OBJID 9 '{ "Spa:Pod:Object:Param:Profile:index": 0 }'; done

(Now that pipewire-pulse waits for reply after each operation, the pactl route doesn't repro any more.) On current master, this leads to id clashes like

[D][000011482.542554][media-session.c:1254 registry_global()] media-session 0x7fff09ab00b0: new global '56' PipeWire:Interface:Client/3
[D][000011482.542651][media-session.c:1204 bind_object()] media-session 0x7fff09ab00b0: bound new object 0x55dbf4c2de08 proxy 0x55dbf4c2dd90 id:56
[D][000011482.546846][media-session.c:1181 create_object()] media-session 0x7fff09ab00b0: created new object 0x55dbf5017978 proxy:(nil) handle:0x55dbf5017900
[D][000011482.547609][media-session.c:1070 bound_proxy()] bound 0x55dbf5017978 proxy (nil) handle 0x55dbf5017900 id:-1->56
[D][000011482.548101][media-session.c:1397 registry_global_remove()] media-session 0x7fff09ab00b0: remove global '56'
[D][000011482.548112][media-session.c:281 sm_object_destroy()] media-session 0x7fff09ab00b0: object 56 proxy:(nil) handle:0x55dbf5017900
[D][000011482.548366][media-session.c:1254 registry_global()] media-session 0x7fff09ab00b0: new global '56' PipeWire:Interface:Node/3
[D][000011482.548481][media-session.c:1204 bind_object()] media-session 0x7fff09ab00b0: bound new object 0x55dbf4a9da88 proxy 0x55dbf4a9da10 id:56
[D][000011482.549073][media-session.c:1397 registry_global_remove()] media-session 0x7fff09ab00b0: remove global '56'
[D][000011482.549094][media-session.c:281 sm_object_destroy()] media-session 0x7fff09ab00b0: object 56 proxy:0x55dbf4a9da10 handle:0x55dbf4a9da10

where the wrong object gets removed and duplicates get created.

Edited by P V

Merge request reports