V4L2 buffers are lost if the client responds too slowly
The v4l2 plugin leaks v4l2 buffers if the client returns buffers too late or not at all.
Steps to reproduce
- Start a client with a v4l2 source device and `PIPEWIRE_DEBUG=7`, e.g., a GStreamer pipeline: `gst-launch-1.0 pipewiresrc ! fakesink`.
- Stop the client, e.g., by suspending the process or stopping it with gdb.
- Wait a little and continue execution.
In the trace you can see that the client now always receives the same buffer with the same ID from the server, and no other buffers anymore.
Suspending the client is not strictly required; the same problem also occurs under high system load.
Analysis
The cause is that the activation from `mmap_read()` in v4l2-utils.c races with the client's recycle-buffer call, which is handled in `impl_node_process()` in v4l2-source.c:
1. The v4l2 device triggers an fd event.
2. The v4l2 source dequeues the buffer from the v4l2 device, puts the buffer ID into the io of the v4l2 source, and sets the status to `HAVE_DATA`.
3. The output port of the v4l2 source processes the buffer in `tee_process()` and sets the status to `NEED_DATA`.
4. The client reads the buffer and sends a recycle event when it is done.
5. The client's response triggers an fd event on the socket.
6. The v4l2 source processes the io, sees the `NEED_DATA` with a buffer ID, and queues the buffer back into the v4l2 device.
If the client is too late sending its response, step 1 is triggered again between steps 3 and 5. Furthermore, `mmap_read()` does not check that the buffer in the io has not yet been queued back into the v4l2 device and puts the new buffer into the io anyway. At that point, the buffer that was there is lost.
I worked around the problem by adding a check in `mmap_read()` that recycles the v4l2 buffer right before putting a new buffer into the io. The buffers are no longer lost, but this feels like a workaround for an issue that shouldn't exist in the first place.
The deeper issue is that the io has only a single slot that is used for both directions, and the `NEED_DATA` status is not only used to signal that more data is required, but is also misused to tell the producer to recycle the buffer currently in the io. Furthermore, having two different events that can both result in a new buffer makes the behavior hard to reason about.
Therefore, I think this needs to be solved at the architectural level, but I don't know how to do that.