Possible race condition in dbus_watch_handle and dbus_message_unref
Submitted by mep..@..il.com
Assigned to Havoc Pennington
Description
Hello,
My code receives incoming messages in one thread and sends outgoing messages from other threads. It works fine, but Helgrind issues the following warning from time to time:
==5039== Possible data race during read of size 4 at 0x45CA46C by thread #4
==5039== Locks held: 1, at address 0x45CA760
==5039==    at 0x408BCB4: _dbus_counter_get_value (dbus-resources.c:170)
==5039==    by 0x4094532: check_read_watch (dbus-transport-socket.c:195)
==5039==    by 0x40949F6: do_reading.part.2 (dbus-transport-socket.c:687)
==5039==    by 0x409535E: socket_handle_watch (dbus-transport-socket.c:677)
==5039==    by 0x4093908: _dbus_transport_handle_watch (dbus-transport.c:865)
==5039==    by 0x4073F0E: _dbus_connection_handle_watch (dbus-connection.c:1447)
==5039==    by 0x40968BE: dbus_watch_handle (dbus-watch.c:669)
==5039==    by 0x404D7F6: DBus::Watch::handle(int) (in ~/gcc-4.4.5-dbus-c++-0.5.0-0.12.20090203git13281b3.fc15/lib/libdbus-c++-1.so.0.0.0)
==5039==    by 0x4055914: DBus::Glib::BusWatch::watch_handler(void*) (in ~/gcc-4.4.5-dbus-c++-0.5.0-0.12.20090203git13281b3.fc15/lib/libdbus-c++-1.so.0.0.0)
==5039==    by 0x40551CB: watch_dispatch(_GSource*, int ()(void), void*) (in ~/gcc-4.4.5-dbus-c++-0.5.0-0.12.20090203git13281b3.fc15/lib/libdbus-c++-1.so.0.0.0)
==5039==    by 0x41465E4: g_main_context_dispatch (gmain.c:1960)
==5039==    by 0x414A2D7: g_main_context_iterate (gmain.c:2591)
==5039==
==5039== This conflicts with a previous write of size 4 by thread #3
==5039== Locks held: none
==5039==    at 0x408BC6D: _dbus_counter_adjust (dbus-resources.c:146)
==5039==    by 0x4081BCE: free_size_counter (dbus-message.c:504)
==5039==    by 0x4099D10: _dbus_list_foreach (dbus-list.c:800)
==5039==    by 0x408218D: dbus_message_unref (dbus-message.c:527)
==5039==    by 0x40507AC: DBus::Message::~Message() (in ~/gcc-4.4.5-dbus-c++-0.5.0-0.12.20090203git13281b3.fc15/lib/libdbus-c++-1.so.0.0.0)
==5039==    by 0x42B19E4: rmc::ResolverC::resolve(int, rmc::ClientManager&, std::string const&) (ResolverC.cpp:41)
==5039==    by 0x42B1AFD: rmc::ResolverC::resolve(std::string const&) (ResolverC.cpp:49)
==5039==    by 0x42A9D3C: rmc::ClientManager::get(std::string const&) (ClientManager.cpp:41)
==5039==
==5039== Address 0x45CA46C is 4 bytes inside a block of size 20 alloc'd
==5039==    at 0x4026CAC: malloc (vg_replace_malloc.c:263)
==5039==    by 0x409BA10: dbus_malloc (dbus-memory.c:471)
==5039==    by 0x408BB3A: _dbus_counter_new (dbus-resources.c:81)
==5039==    by 0x40927C5: _dbus_transport_init_base (dbus-transport.c:118)
==5039==    by 0x4095595: _dbus_transport_new_for_socket (dbus-transport-socket.c:1203)
==5039==    by 0x409577D: _dbus_transport_new_for_tcp_socket (dbus-transport-socket.c:1290)
I took a look at the D-Bus code and found that DBusMessages are linked to the transport->live_messages counters, which are not thread-safe. When a DBusMessage is destroyed, the counter is updated by _dbus_counter_adjust(), and the trace above shows this happening in thread #3 with no locks held. When this coincides with watch bookkeeping in another thread, _dbus_counter_get_value() may return either the old or the new value. I'm not sure whether this can lead to failures.
However, I can imagine that if two threads send two different messages through the same connection at the same time, their unsynchronized updates could overlap and leave the counter with a wrong value.
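To illustrate the concern, here is a minimal sketch of a counter whose adjust and read operations are serialized with a mutex. It is not the libdbus implementation: sketch_counter, sketch_counter_adjust, sketch_counter_get_value and run_sketch are hypothetical names made up for this example, and the per-counter mutex is only one assumed way to remove the race.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a shared size counter; NOT libdbus's DBusCounter. */
typedef struct {
    pthread_mutex_t lock;
    long value;
} sketch_counter;

/* Every read-modify-write of value happens under the lock. */
static void sketch_counter_adjust(sketch_counter *c, long delta)
{
    pthread_mutex_lock(&c->lock);
    c->value += delta;
    pthread_mutex_unlock(&c->lock);
}

/* Reads take the same lock, so they never race with an adjust. */
static long sketch_counter_get_value(sketch_counter *c)
{
    pthread_mutex_lock(&c->lock);
    long v = c->value;
    pthread_mutex_unlock(&c->lock);
    return v;
}

typedef struct {
    sketch_counter *counter;
    int iterations;
} worker_args;

/* Each worker mimics a message being accounted for and later freed. */
static void *worker(void *p)
{
    worker_args *a = p;
    for (int i = 0; i < a->iterations; i++) {
        sketch_counter_adjust(a->counter, 64);   /* message created */
        sketch_counter_adjust(a->counter, -64);  /* message unref'd */
    }
    return NULL;
}

/* Runs nthreads workers (at most 16) and returns the final counter value. */
long run_sketch(int nthreads, int iterations)
{
    enum { MAX_THREADS = 16 };
    pthread_t threads[MAX_THREADS];
    sketch_counter counter;
    worker_args args;
    int rc;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    rc = pthread_mutex_init(&counter.lock, NULL);
    assert(rc == 0);
    counter.value = 0;
    args.counter = &counter;
    args.iterations = iterations;

    for (int i = 0; i < nthreads; i++) {
        rc = pthread_create(&threads[i], NULL, worker, &args);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++) {
        rc = pthread_join(threads[i], NULL);
        assert(rc == 0);
    }
    (void)rc;

    long result = sketch_counter_get_value(&counter);
    pthread_mutex_destroy(&counter.lock);
    return result;
}
```

Built with -pthread, run_sketch(4, 10000) returns 0 because every increment is paired with a decrement under the lock. If the lock and unlock calls are removed from sketch_counter_adjust, Helgrind should report a race of the same kind as above, and the final value is no longer guaranteed to be 0.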
Version: 1.5