Distorted audio when mismatched devices are joined to the same virtual sink
I created a virtual sink for simultaneous output of audio to USB headphones and a PCI audio card in the manner described on the Arch Linux wiki under "Simultaneous output to transient devices" here.
When the headphones are linked to the virtual sink alongside any number of devices that share the same S16LE format, it works perfectly. However, as soon as a PCI card with the S32LE format or a USB sound output device with the S24LE format is linked, the quality of output to any of the S16LE devices becomes very poor: normally for a few seconds after the link is made, and then again at random intervals for perhaps 10 seconds out of every 60 seconds of sound output thereafter. I assumed the cause might be that the virtual sink was set to the F32P format, but forcing the virtual sink to be created in the S16LE format to match the headphones made no difference. Interestingly, the PCI card with the S32LE format works perfectly, with no crackling in any test, regardless of what other devices are linked to the virtual sink alongside it. The issue only appears for some applications: most noticeably anything running via Wine, but also Discord, which is a "native" Electron app.
I've attached a pw-dump from a correctly working config (good-sound.txt) and one where a single channel is linked to the pci card (bad-sound.txt).
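For anyone who wants to repeat the comparison per node rather than with a line-based diff, here is a rough sketch in Python. It assumes both attachments are unmodified pw-dump output (a JSON array of objects carrying "type" and "info.props"); the helper names load_dump, node_props and diff_props are made up for this example and are not part of any PipeWire tool.

```python
import json


def load_dump(path):
    """Parse a saved pw-dump file (assumed to be a plain JSON array)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def node_props(dump):
    """Map node.name -> props dict for every Node object in a parsed dump."""
    nodes = {}
    for obj in dump:
        if obj.get("type") != "PipeWire:Interface:Node":
            continue
        # "info" or "props" may be missing or null, so fall back to {}.
        props = (obj.get("info") or {}).get("props") or {}
        name = props.get("node.name", f"id-{obj.get('id')}")
        nodes[name] = props
    return nodes


def diff_props(good, bad):
    """Return {node: {key: (good_value, bad_value)}} for props that differ.

    Only nodes present in both dumps are compared; a key missing on one
    side is reported with None for that side.
    """
    good_nodes, bad_nodes = node_props(good), node_props(bad)
    changes = {}
    for name in sorted(good_nodes.keys() & bad_nodes.keys()):
        g, b = good_nodes[name], bad_nodes[name]
        delta = {
            key: (g.get(key), b.get(key))
            for key in sorted(g.keys() | b.keys())
            if g.get(key) != b.get(key)
        }
        if delta:
            changes[name] = delta
    return changes
```

Calling diff_props(load_dump("good-sound.txt"), load_dump("bad-sound.txt")) should then list the changed properties grouped by node name, which makes it easier to see which device a changed "api.alsa.*" key belongs to.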
The diff between the dump files is small, and when I had a look at it the only thing that caught my eye was this:
> "api.alsa.headroom": 0,
4536a4538,4539
> "api.alsa.period-num": 32,
> "api.alsa.period-size": 1024,
Could one of the ALSA settings above be causing the problem?
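To put those two values in perspective, here is a small back-of-the-envelope calculation of how much audio they correspond to. The 48000 Hz sample rate is an assumption on my part (I have not confirmed what the graph actually runs at), and this says nothing about whether these settings are the cause.

```python
# Assumption: the graph runs at 48 kHz; substitute the real rate if it differs.
ASSUMED_RATE_HZ = 48000


def frames_to_ms(frames, rate_hz):
    """Duration in milliseconds of `frames` audio frames at `rate_hz`."""
    return 1000.0 * frames / rate_hz


def ring_buffer_ms(period_size, period_num, rate_hz):
    """Total duration of period_num periods of period_size frames each."""
    return frames_to_ms(period_size * period_num, rate_hz)


# Values taken from the diff: period-size 1024, period-num 32.
one_period = frames_to_ms(1024, ASSUMED_RATE_HZ)
whole_buffer = ring_buffer_ms(1024, 32, ASSUMED_RATE_HZ)
print(f"one period:   {one_period:.2f} ms")
print(f"whole buffer: {whole_buffer:.2f} ms")
```

Under that assumption, one period is about 21 ms and the full set of 32 periods is about 683 ms.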
I am running libwireplumber 0.4.17 and libpipewire 1.0.2 (Included by default in Ubuntu 24.04).
I am running a completely vanilla config with no custom Lua scripts (other than the virtual sink created as per the referenced wiki page).
I've attached a visual representation of the links using qpwgraph to make it easier to see what I'm talking about.
I've also attached MP3 recordings of the good and bad configs so you can hear the type of distortion. Note that towards the end of the "bad" recording it starts to clear up. Typically it will then output good sound for around 30-60 seconds before reverting to the distortion.
Again, to be clear, I can listen to flawless audio on any of my devices for hours on end, as long as I don't link them to the same virtual sink.