    atomic: fix load and store for armv7 and higher · d4ff4adc
    Thomas Hutschenreuther authored and Tanu Kaskinen committed
    The original atomic implementation in pulseaudio based on
    libatomic stated that the intent was to use full memory barriers.
    
    According to [1], the load and store implementation based on
    gcc builtins matches sequentially consistent (i.e. full memory barrier)
    load and store ordering only for x86.
    
    I observed random crashes in client applications using memfd srbchannel
    transport on an armv8-aarch64 platform (cortex-a57).
    In all those crashes the first read on the pstream descriptor
    (the size field) was wrong and looked like it contained old data.
    I boiled the relevant parts of the srbchannel implementation down to
    a simple test case and could observe random test failures.
    So I figured that the atomic implementation was broken for armv8
    with respect to cross-cpu memory access ordering consistency.
    
    In order to come up with a minimal fix, I used the newer
    __atomic_load_n/__atomic_store_n builtins from gcc.
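
The change described above can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the type and function names (demo_atomic_t, demo_atomic_load, demo_atomic_store) are hypothetical stand-ins, and the choice of __ATOMIC_SEQ_CST is an assumption based on the stated intent of full memory barrier semantics.

```c
#include <assert.h>

/* Hypothetical stand-in for an atomic integer wrapper; not PulseAudio's real type. */
typedef struct { volatile int value; } demo_atomic_t;

/* Sequentially consistent load via the gcc __atomic builtin. */
static inline int demo_atomic_load(const demo_atomic_t *a) {
    return __atomic_load_n(&a->value, __ATOMIC_SEQ_CST);
}

/* Sequentially consistent store via the gcc __atomic builtin. */
static inline void demo_atomic_store(demo_atomic_t *a, int i) {
    __atomic_store_n(&a->value, i, __ATOMIC_SEQ_CST);
}

/* Single-threaded round trip; it checks the API shape only, not cross-CPU ordering. */
static inline void demo_atomic_roundtrip(void) {
    demo_atomic_t a = { 0 };
    demo_atomic_store(&a, 7);
    assert(demo_atomic_load(&a) == 7);
}
```

A plain volatile read or write, by contrast, gives no ordering guarantee against other memory accesses on weakly ordered CPUs, which is consistent with the stale size field described above.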
    
    With aarch64-linux-gnu-gcc (Linaro GCC 7.3-2018.05) 7.3.1 20180425,
    they compile to ldar and stlr on arm64, which is correct according
    to [1] and [2].
    
    The other atomic operations based on __sync builtins don't need
    to be touched since they already are of the full memory barrier
    variety.
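
For comparison, operations built on the legacy __sync builtins might look like the sketch below. The names (demo_counter_t, demo_counter_add, demo_counter_cmpxchg) are hypothetical and only illustrate the kind of call that is left untouched; the gcc documentation describes these builtins as a full barrier in most cases.

```c
#include <assert.h>

/* Hypothetical wrapper type, used only for this illustration. */
typedef struct { volatile int value; } demo_counter_t;

/* Atomically adds i and returns the previous value. */
static inline int demo_counter_add(demo_counter_t *c, int i) {
    return __sync_fetch_and_add(&c->value, i);
}

/* Atomically replaces old_i with new_i; returns nonzero if the swap happened. */
static inline int demo_counter_cmpxchg(demo_counter_t *c, int old_i, int new_i) {
    return __sync_bool_compare_and_swap(&c->value, old_i, new_i);
}

/* Single-threaded check of the return-value conventions. */
static inline void demo_counter_check(void) {
    demo_counter_t c = { 1 };
    assert(demo_counter_add(&c, 2) == 1);     /* previous value is returned */
    assert(demo_counter_cmpxchg(&c, 3, 10));  /* value was 3, swap succeeds */
    assert(!demo_counter_cmpxchg(&c, 3, 11)); /* value is now 10, swap fails */
}
```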
    
    [1] https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
    [2] https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-a-architecture-2016-additions