souphttpsrc: stall and high memory usage if upstream connection is fast
Since commit 0bcefa73, souhttpsrc will stall for an undefinite period of time and keep accumulating buffers internally if the upstream connection is fast enough. This is because the code relies on an idle source to trigger reading from soup internal buffers using g_input_stream_read_async
; in the case where the data can be downloaded very quickly from the server, the main loop might not trigger idle events for a long time, while soup accumulates data internally.
Additionally, the idle callback does not try to dequeue as much data as possible so only one read is done when the callback triggers.
Example (at interrupt 110MB were used by gst-launch):
# GST_DEBUG='soup*:7,queue2:6,uridecodebin:5' gst-launch -v souphttpsrc location=https://absolut.zogzog.org/share/1G ! fakesink silent=false
Setting pipeline to PAUSED ...
0:00:00.082892581 4761 0x838abc0 DEBUG souphttpsrc gstsouphttpsrc.c:2332:gst_soup_http_src_start:<souphttpsrc0> start("https://absolut.zogzog.org/share/1G")
0:00:00.083209620 4761 0x838abc0 DEBUG souphttpsrc gstsouphttpsrc.c:1156:gst_soup_http_src_session_open:<souphttpsrc0> Creating session (can share 1)
0:00:00.083415530 4761 0x838abc0 DEBUG souphttpsrc gstsouphttpsrc.c:1161:gst_soup_http_src_session_open:<souphttpsrc0> Created session 0x8306ea8
0:00:00.084049640 4761 0x838abc0 DEBUG souphttpsrc gstsouphttpsrc.c:1181:gst_soup_http_src_session_open:<souphttpsrc0> Waiting for thread to start...
0:00:00.084529470 4761 0x82e7928 DEBUG souphttpsrc gstsouphttpsrc.c:989:thread_func:<souphttpsrc0> thread start
0:00:00.086765908 4761 0x82e7928 DEBUG souphttpsrc gstsouphttpsrc.c:1087:_session_ready_cb:<souphttpsrc0> thread ready
0:00:00.087012528 4761 0x838abc0 DEBUG souphttpsrc gstsouphttpsrc.c:1184:gst_soup_http_src_session_open:<souphttpsrc0> Soup thread started
0:00:00.087141418 4761 0x838abc0 DEBUG souphttpsrc gstsouphttpsrc.c:1194:gst_soup_http_src_session_open:<GstSoupSession@0x8306ea8> Sharing session 0x8306ea8
0:00:00.087335958 4761 0x838abc0 DEBUG souphttpsrc gstsouphttpsrc.c:2395:gst_soup_http_src_set_context:<souphttpsrc0> Setting external session 0x8306ea8
0:00:00.087881797 4761 0x838abc0 DEBUG souphttpsrc gstsouphttpsrc.c:2442:gst_soup_http_src_get_size:<souphttpsrc0> get_size() = FALSE
0:00:00.088047987 4761 0x838abc0 DEBUG souphttpsrc gstsouphttpsrc.c:2488:gst_soup_http_src_do_seek:<souphttpsrc0> do_seek(0-18446744073709551615)
0:00:00.088136807 4761 0x838abc0 DEBUG souphttpsrc gstsouphttpsrc.c:2493:gst_soup_http_src_do_seek:<souphttpsrc0> Seek to current read/end position and no seek pending
Pipeline is PREROLLING ...
Got context from element 'souphttpsrc0': gst.soup.session=context, session=(GstSoupSession)NULL;
0:00:00.090859835 4761 0x82e7960 LOG souphttpsrc gstsouphttpsrc.c:1932:gst_soup_http_src_do_request:<souphttpsrc0> Running request for method: GET
/GstPipeline:pipeline0/GstFakeSink:fakesink0: last-message = event ******* (fakesink0:sink) E (type: stream-start (10254), GstEventStreamStart, stream-id=(string)bc3c4067f7177cbb9fdaceb2d86d28b714f82b1a725d6e27b5b8db0d5e17ed8d, flags=(GstStreamFlags)GST_STREAM_FLAG_NONE, group-id=(uint)1;) 0x76c02950
0:00:00.187350836 4761 0x82e7928 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<GstSoupSession@0x8306ea8> HTTP_SESSION(M): > GET /share/1G HTTP/2
0:00:00.187888396 4761 0x82e7928 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<GstSoupSession@0x8306ea8> HTTP_SESSION(M): > Soup-Debug-Timestamp: 1665162303
0:00:00.188134846 4761 0x82e7928 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<GstSoupSession@0x8306ea8> HTTP_SESSION(M): > Soup-Debug: SoupSession 1 (0x82f6330), SoupMessage 1 (0x83aaf00), GSocket 1 (0x76c140f0)
0:00:00.188320226 4761 0x82e7928 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<GstSoupSession@0x8306ea8> HTTP_SESSION(H): > User-Agent: GStreamer souphttpsrc 1.20.3.1 libsoup/3.2.0
0:00:00.188533656 4761 0x82e7928 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<GstSoupSession@0x8306ea8> HTTP_SESSION(H): > Accept-Encoding: identity
0:00:00.188650696 4761 0x82e7928 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<GstSoupSession@0x8306ea8> HTTP_SESSION(H): > icy-metadata: 1
0:00:00.188865185 4761 0x82e7928 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<GstSoupSession@0x8306ea8> HTTP_SESSION(M):
0:00:00.196704730 4761 0x82e7928 INFO souphttpsrc gstsouphttpsrc.c:1383:gst_soup_http_src_got_headers:<souphttpsrc0> got headers
0:00:00.197537039 4761 0x82e7928 DEBUG souphttpsrc gstsouphttpsrc.c:1451:gst_soup_http_src_got_headers:<souphttpsrc0> size = 1073741824
0:00:00.198081909 4761 0x82e7928 DEBUG souphttpsrc gstsouphttpsrc.c:1537:gst_soup_http_src_got_headers:<souphttpsrc0> Content-Type: application/octet-stream
0:00:00.198483569 4761 0x82e7960 DEBUG souphttpsrc gstsouphttpsrc.c:1908:gst_soup_http_src_send_message:<souphttpsrc0> Successfully got a reply
/GstPipeline:pipeline0/GstFakeSink:fakesink0: last-message = event ******* (fakesink0:sink) E (type: custom-downstream-sticky (76830), http-headers, uri=(string)https://absolut.zogzog.org/share/1G, http-status-code=(uint)200, request-headers=(structure)"request-headers\,\ User-Agent\=\(string\)\"GStreamer\\\ souphttpsrc\\\ 1.20.3.1\\\ libsoup/3.2.0\"\,\ Accept-Encoding\=\(string\)identity\,\ icy-metadata\=\(string\)1\;", response-headers=(structure)"response-headers\,\ Server\=\(string\)\"nginx/1.18.0\\\ \\\(Ubuntu\\\)\"\,\ Date\=\(string\)\"Fri\\\,\\\ 07\\\ Oct\\\ 2022\\\ 17:05:03\\\ GMT\"\,\ Content-Type\=\(string\)application/octet-stream\,\ Content-Length\=\(string\)1073741824\,\ Last-Modified\=\(string\)\"Mon\\\,\\\ 25\\\ Jan\\\ 2021\\\ 10:43:42\\\ GMT\"\,\ ETag\=\(string\)\"\\\"600ea0de-40000000\\\"\"\,\ Accept-Ranges\=\(string\)bytes\,\ strict-transport-security\=\(string\)\"max-age\\\=15768000\"\;";) 0x76c28990
/GstPipeline:pipeline0/GstFakeSink:fakesink0: last-message = event ******* (fakesink0:sink) E (type: custom-downstream-sticky (76830), http-context, user-agent=(string)"GStreamer\ souphttpsrc\ 1.20.3.1\ ";) 0x76c289d0
0:00:01.829610239 4761 0x82e7960 DEBUG souphttpsrc gstsouphttpsrc.c:2130:gst_soup_http_src_read_buffer:<souphttpsrc0> Read 4096 bytes from http input
0:00:01.829810409 4761 0x82e7960 LOG souphttpsrc gstsouphttpsrc.c:1994:gst_soup_http_src_check_update_blocksize:<souphttpsrc0> Checking to update blocksize. Read: 4096 bytes, blocksize: 4096 bytes, time since last read: 4:56:36.772971000
0:00:01.830280849 4761 0x82e7960 DEBUG souphttpsrc gstsouphttpsrc.c:2303:gst_soup_http_src_create:<souphttpsrc0> Returning 0 ok
/GstPipeline:pipeline0/GstFakeSink:fakesink0: last-message = event ******* (fakesink0:sink) E (type: segment (17934), GstEventSegment, segment=(GstSegment)"segment, flags=(GstSegmentFlags)GST_SEGMENT_FLAG_NONE, rate=(double)1, applied-rate=(double)1, format=(GstFormat)bytes, base=(guint64)0, offset=(guint64)0, start=(guint64)0, stop=(guint64)18446744073709551615, time=(guint64)0, position=(guint64)0, duration=(guint64)1073741824;";) 0x76c02b50
/GstPipeline:pipeline0/GstFakeSink:fakesink0: last-message = preroll *******
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
Redistribute latency...
New clock: GstSystemClock
/GstPipeline:pipeline0/GstFakeSink:fakesink0: last-message = chain ******* (fakesink0:sink) (4096 bytes, dts: 0:00:00.000000000, pts: none, duration: none, offset: 0, offset_end: -1, flags: 00000040 discont , meta: none) 0x7620d000
0:00:05.391198168 4761 0x82e7960 DEBUG souphttpsrc gstsouphttpsrc.c:2130:gst_soup_http_src_read_buffer:<souphttpsrc0> Read 4096 bytes from http input
0:00:05.391459388 4761 0x82e7960 LOG souphttpsrc gstsouphttpsrc.c:1994:gst_soup_http_src_check_update_blocksize:<souphttpsrc0> Checking to update blocksize. Read: 4096 bytes, blocksize: 4096 bytes, time since last read: 0:00:03.561371000
0:00:05.391656428 4761 0x82e7960 DEBUG souphttpsrc gstsouphttpsrc.c:2018:gst_soup_http_src_check_update_blocksize:<souphttpsrc0> Decreased blocksize to 4096
0:00:05.391933258 4761 0x82e7960 DEBUG souphttpsrc gstsouphttpsrc.c:2303:gst_soup_http_src_create:<souphttpsrc0> Returning 0 ok
/GstPipeline:pipeline0/GstFakeSink:fakesink0: last-message = chain ******* (fakesink0:sink) (4096 bytes, dts: none, pts: none, duration: none, offset: 4096, offset_end: -1, flags: 00000000 , meta: none) 0x7620d0a8
0:00:12.280025073 4761 0x82e7960 DEBUG souphttpsrc gstsouphttpsrc.c:2130:gst_soup_http_src_read_buffer:<souphttpsrc0> Read 4096 bytes from http input
0:00:12.280287663 4761 0x82e7960 LOG souphttpsrc gstsouphttpsrc.c:1994:gst_soup_http_src_check_update_blocksize:<souphttpsrc0> Checking to update blocksize. Read: 4096 bytes, blocksize: 4096 bytes, time since last read: 0:00:06.888288000
0:00:12.280502793 4761 0x82e7960 DEBUG souphttpsrc gstsouphttpsrc.c:2303:gst_soup_http_src_create:<souphttpsrc0> Returning 0 ok
/GstPipeline:pipeline0/GstFakeSink:fakesink0: last-message = chain ******* (fakesink0:sink) (4096 bytes, dts: none, pts: none, duration: none, offset: 8192, offset_end: -1, flags: 00000000 , meta: none) 0x7620d150
0:00:12.282089982 4761 0x82e7960 DEBUG souphttpsrc gstsouphttpsrc.c:2130:gst_soup_http_src_read_buffer:<souphttpsrc0> Read 4096 bytes from http input
0:00:12.282591361 4761 0x82e7960 LOG souphttpsrc gstsouphttpsrc.c:1994:gst_soup_http_src_check_update_blocksize:<souphttpsrc0> Checking to update blocksize. Read: 4096 bytes, blocksize: 4096 bytes, time since last read: 0:00:00.001868000
0:00:12.282830051 4761 0x82e7960 DEBUG souphttpsrc gstsouphttpsrc.c:2006:gst_soup_http_src_check_update_blocksize:<souphttpsrc0> Increased blocksize to 8192
0:00:12.283045621 4761 0x82e7960 DEBUG souphttpsrc gstsouphttpsrc.c:2303:gst_soup_http_src_create:<souphttpsrc0> Returning 0 ok
/GstPipeline:pipeline0/GstFakeSink:fakesink0: last-message = chain ******* (fakesink0:sink) (4096 bytes, dts: none, pts: none, duration: none, offset: 12288, offset_end: -1, flags: 00000000 , meta: none) 0x7620d1f8
0:00:21.581647473 4761 0x82e7960 DEBUG souphttpsrc gstsouphttpsrc.c:2130:gst_soup_http_src_read_buffer:<souphttpsrc0> Read 8192 bytes from http input
0:00:21.581960993 4761 0x82e7960 LOG souphttpsrc gstsouphttpsrc.c:1994:gst_soup_http_src_check_update_blocksize:<souphttpsrc0> Checking to update blocksize. Read: 8192 bytes, blocksize: 8192 bytes, time since last read: 0:00:09.298853000
0:00:21.582249863 4761 0x82e7960 DEBUG souphttpsrc gstsouphttpsrc.c:2303:gst_soup_http_src_create:<souphttpsrc0> Returning 0 ok
/GstPipeline:pipeline0/GstFakeSink:fakesink0: last-message = chain ******* (fakesink0:sink) (8192 bytes, dts: none, pts: none, duration: none, offset: 16384, offset_end: -1, flags: 00000000 , meta: none) 0x7620d2a0
^Chandling interrupt.
Interrupt: Stopping pipeline ...
I've tried reverting the patch (there were conflicts so I'm not confident) and this solves the problem.
FYI thread safety of SoupSession was reintroduced in 3.2.0: https://libsoup.org/libsoup-3.0/client-thread-safety.html