spa: loop: TSS issues
1. In `flush_all_queues()`, I believe unlocking the lock when invoking the callback may be problematic because the owning thread might go away. Commit 494600d4 added the lock/unlock sequence around the callback with the comment that

   > We don't actually need to hold the lock while calling the invoke function, we only need the lock to protect the list of queues.

   However, if the queue goes away, then `item` will be dangling, so any access to `item` after the unlock can be problematic.
2. Leaking of `queue`s. This happens because after (for some definition of "after") `tss_delete()` returns, no more destructors are called. So if there is a thread that has a queue, but the loop is destroyed before that thread ends, then its `queue` object will never be freed.
3. TSS destructors racing with `impl_clear()`. This can cause two instances of `loop_queue_destroy()` to run for the same queue (one from the TSS destructor and the other from `impl_clear()`). This can result in the queue being removed twice from the list, or being freed twice, etc.

I think (3) can be addressed by some more locking in `impl_clear()` to avoid the races with TSS destructors.
As far as I can see, one way to solve (2) is to defer the deletion of the TSS key until the last `queue` of the particular loop has been destroyed, for example by putting the key into a separate reference-counted allocation.
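
To make the reference-counting idea concrete, here is a minimal sketch. It is not taken from the loop code: `struct tss_holder`, `tss_holder_new()`, `tss_holder_ref()`, `tss_holder_unref()` and the `keys_deleted` counter are all hypothetical names introduced for illustration. The assumption is that the loop holds one reference and every live `queue` holds one more, so the key is only deleted when the last of them lets go.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <threads.h>

/* Hypothetical holder: owns the TSS key and may outlive the loop. */
struct tss_holder {
    tss_t key;
    atomic_int refcount; /* one ref for the loop, one per live queue */
};

/* Instrumentation for this sketch only: counts tss_delete() calls. */
atomic_int keys_deleted;

struct tss_holder *tss_holder_new(tss_dtor_t dtor)
{
    struct tss_holder *h = malloc(sizeof(*h));
    if (h == NULL)
        return NULL;
    if (tss_create(&h->key, dtor) != thrd_success) {
        free(h);
        return NULL;
    }
    atomic_init(&h->refcount, 1); /* the loop's own reference */
    return h;
}

/* Taken whenever a thread creates its queue for this loop. */
void tss_holder_ref(struct tss_holder *h)
{
    atomic_fetch_add(&h->refcount, 1);
}

/* Dropped by the loop on clear, and by each queue when it is destroyed. */
void tss_holder_unref(struct tss_holder *h)
{
    /* atomic_fetch_sub returns the previous value: 1 means we were last. */
    if (atomic_fetch_sub(&h->refcount, 1) == 1) {
        tss_delete(h->key);
        atomic_fetch_add(&keys_deleted, 1);
        free(h);
    }
}
```

With this shape the last reference may well be dropped from inside a TSS destructor. POSIX states that `pthread_key_delete()` is callable from destructor functions; whether a given C11 `tss_delete()` implementation gives the same guarantee would need to be checked.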
I am not sure if it is possible to delete the TSS key earlier. I have looked at the musl implementation and it seems possible that a TSS destructor is invoked after `tss_delete()` returns in the calling thread. So it appears impossible to determine which `queue` objects would need to be freed, which means that a thread's `queue` can only really be safely freed in the thread that "owns" it, that is, in the TSS destructor (or, as a special case, the executing thread's `queue` can be freed in `impl_clear()` itself). However, that means that the TSS key cannot be destroyed until all `queue`s have been freed, otherwise the destructors wouldn't be called.
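
The last point, that destructors stop running once the key is gone, can be checked with a small standalone program. This is only a sketch with made-up names (`run_leak_demo()`, `demo_worker()`, `demo_dtor()`), unrelated to the actual loop code. The worker exits strictly after `tss_delete()` has returned, so it shows the leak from (2) and deliberately does not exercise the race described above. It assumes a `<threads.h>` implementation layered on POSIX threads, where a deleted key's destructor is no longer called at thread exit.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <threads.h>

static atomic_int dtor_calls;
static atomic_int stage;     /* 0: start, 1: value set, 2: key deleted */
static tss_t demo_key;
static void *demo_value;     /* stands in for a thread's queue object */

static void demo_dtor(void *p)
{
    atomic_fetch_add(&dtor_calls, 1);
    free(p);
}

static int demo_worker(void *arg)
{
    (void)arg;
    demo_value = malloc(16);
    tss_set(demo_key, demo_value);
    atomic_store(&stage, 1);
    /* Stay alive until the "loop" has deleted the key. */
    while (atomic_load(&stage) != 2)
        thrd_yield();
    return 0; /* thread exit: the destructor would normally run here */
}

/* Returns how many times the destructor ran, or -1 on setup failure. */
int run_leak_demo(void)
{
    thrd_t t;

    atomic_store(&dtor_calls, 0);
    atomic_store(&stage, 0);
    if (tss_create(&demo_key, demo_dtor) != thrd_success)
        return -1;
    if (thrd_create(&t, demo_worker, NULL) != thrd_success) {
        tss_delete(demo_key);
        return -1;
    }
    while (atomic_load(&stage) != 1)
        thrd_yield();
    tss_delete(demo_key); /* loop destroyed before the thread ends */
    atomic_store(&stage, 2);
    thrd_join(t, NULL);

    /* Freed by hand only to keep the demo clean; in the scenario from
     * (2) nothing would do this and the allocation would leak. */
    free(demo_value);
    return atomic_load(&dtor_calls);
}
```

Under the stated assumption, `run_leak_demo()` returns 0: the value was still set when the thread exited, but no destructor ran for it.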
Unfortunately, deferring the deletion of the TSS key is not ideal because it essentially wastes a TSS slot for an unbounded amount of time (e.g. on musl there are only 128 TSS slots).