- Is PipeWire Ready Yet?
- Where Is PipeWire In The Stack?
- Don't Pro-Audio and Consumer Audio Have Conflicting Requirements?
- Is PipeWire Just Another GStreamer?
- Is PipeWire Another JACK Implementation?
- Does PipeWire Replace ALSA?
- Will PipeWire Ever Be As Good As JACK?
- Are You Using A Push Or Pull Model For Scheduling?
- Isn't Format Negotiation Bad For Pro Audio?
- What About Pro Video?
- What Kind Of API Will There Be To Interface With PipeWire?
- What Audio API Do You Recommend To Use?
- Is there a native GUI tool to configure PipeWire?
- What Is Wrong With JACK + PulseAudio?
- Why Not Just Improve JACK Instead?
- Why Not Just Improve PulseAudio Instead?
- How Is PipeWire Supposed To Be A Better PulseAudio?
- How Is PipeWire Supposed To Be A Better JACK?
- How Is PipeWire Going To Avoid Xruns?
- How Is PipeWire Going To Handle Latency?
- Why Is The API So Complicated?
- PipeWire Buffering Explained
- User Questions
- Does Desktop Audio Interfere With Pro-Audio Using PipeWire?
- Bypass Mixer In Exclusive Mode?
- Developers: About a2j In PipeWire Current Midi Catch 22 With JACK
- Output To Two Devices?
- Could PipeWire Be Used To Work Around Lack of Acceleration in Xinerama?
- How to Change rtkit rt.prio?
- Is There a Way to Connect Droidcam Audio?
Is PipeWire Ready Yet?
It is ready for broader use and is scheduled to be included in Fedora 34.
The API/ABI has been declared stable since version 0.3.
The protocol can support older 0.2 version clients transparently. This means that flatpaks with older PipeWire libraries can connect to a newer daemon.
Where Is PipeWire In The Stack?
PipeWire sits right on top of the kernel drivers (or as close as possible). You can think of it as a multimedia routing layer on top of the drivers that applications and libraries can use.
Don't Pro-Audio and Consumer Audio Have Conflicting Requirements?
Pro-audio needs low and reliable latency with minimal audio over/underruns. Power usage is of little concern. Pro-audio requires flexible user-configurable routing of the signals.
Consumer audio focuses on low power usage; latency (in the case of playback) is of no concern. Consumer audio wants things to just work with minimal configuration.
Where JACK and PulseAudio were each tuned exclusively for their respective use cases, PipeWire takes a hybrid approach. PipeWire uses the scheduling and graph model of JACK but mainly uses timer-based wakeups like PulseAudio. This makes it possible to dynamically switch between small buffers with low latency/high power usage and large buffers with high latency/low power usage. It adapts based on the latency requirements of the applications in a glitch-free way. There are limits to this: PipeWire can only increase buffer sizes up to 8192 samples (roughly 180 ms at common sample rates), but coupled with much simpler code paths this should be good enough for consumer use.
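The buffer-size-to-latency trade-off above is simple arithmetic: a quantum of N samples at sample rate R adds N/R seconds of buffering. A quick sketch (the function name is illustrative, not part of any PipeWire API):

```python
def quantum_to_latency_ms(quantum: int, rate: int = 48000) -> float:
    """One graph cycle processes `quantum` samples, so the buffering
    it introduces is quantum / rate seconds."""
    return 1000.0 * quantum / rate

# Small quantum: low latency, frequent wakeups (pro audio).
assert round(quantum_to_latency_ms(128), 1) == 2.7
# Largest quantum: high latency, few wakeups (power saving).
assert round(quantum_to_latency_ms(8192), 1) == 170.7
```

At 48 kHz the 8192-sample ceiling works out to about 170 ms, matching the roughly 180 ms figure above.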
PipeWire also supports dynamically adding and removing devices, with automatic clock slaving. It handles Bluetooth devices or any other device for which a plugin can be written.
PipeWire mainly uses pro-audio formats (floating point samples) as the canonical data-format between nodes in the graph. It is also possible to negotiate other formats to support compressed formats.
The part of PulseAudio that manages the policy is implemented in a separate session manager that can be adapted and configured according to the consumer use case.
Is PipeWire Just Another GStreamer?
PipeWire is architecturally significantly different from GStreamer and is designed more like JACK. Differences include:
- The processing graphs are processed in a much more controlled fashion. This allows us to achieve much lower and more predictable latencies. All nodes in the graph are woken up from source to sink when the device wakes up for input/output. Data is processed in fixed sized chunks.
- Lock-free processing.
- More localized and lighter format negotiation. The negotiation and format description are borrowed from GStreamer.
- No dynamic buffer allocation while processing. All buffers and metadata are allocated before processing begins.
GStreamer is intended to be a Swiss Army knife of multimedia; PipeWire is meant to be much lower level, more like what alsa-lib, JACK or libv4l2 provide.
Is PipeWire Another JACK Implementation?
PipeWire has a processing model very similar to JACK's, but adds the following features:
- Extensible communication protocol that allows new interfaces on objects to be added in the future.
- Arbitrary formats can be negotiated between nodes. This allows us to handle video as well as compressed formats. This is important for sending compressed formats to the device (AC3 over HDMI or AAC over bluetooth, for example).
- Negotiation of buffers. A pool of buffers can be negotiated between instances and the memory is exchanged with fd passing. This makes it possible to share hardware surfaces and make video possible.
- Dynamic sinks and sources. Devices can be hotplugged. There is automatic slaving between devices similar to what a2j does when graphs are joined.
- Dynamic latency: it adapts the buffer period to the lowest requested latency. Smaller buffer sizes use more CPU, but larger buffer sizes add more latency.
- Synchronous clients provide data for the current processing cycle of the device(s). There is no extra period of latency.
- Dynamic device suspend and resume. Unused devices are closed to save CPU.
- Implemented with sandboxing in mind.
- Some of the limitations of JACK are fixed. PipeWire has something similar to the JACK transport that also supports looping, trick modes and lookahead of the scheduled timeline.
- PipeWire has a more generic control type that can be used to implement MIDI and OSC natively. MIDI support similar to a2jmidid is built in.
Does PipeWire Replace ALSA?
No. ALSA is an essential part of the Linux audio stack; it provides the interface to the kernel audio drivers.
That said, the ALSA user space library has a lot of stuff in it that is probably not desirable anymore these days, like effects plugins, mixing, routing, slaving, etc.
PipeWire uses a small subset of the core ALSA functionality to access the hardware (it should run with tinyalsa, for example). All of the other features should be handled by PipeWire.
Will PipeWire Ever Be As Good As JACK?
Possibly... there are some things that JACK can optimize for, like:
- It can configure the ALSA device with 2 periods of a fixed, small size. With the current ALSA driver implementations this can result in more reliable low latencies than can be achieved with a timer-based mechanism. With some tuning, latency similar to JACK's can be achieved on USB and internal audio cards.
- It does not need to care about security and can simply allocate all objects in one fixed piece of shared memory. This makes it much faster to get to the data you need and to introspect objects.
- It does not need to care about negotiation of data formats or buffers, which makes it faster to build the graph and start processing.
- It has a lot of support and history.
- We might not want to support some JACK features, like session management.
Are You Using A Push Or Pull Model For Scheduling?
PipeWire uses a pull model. This means that the device wakes up at the last possible moment to pull in more data from all the nodes in the graph. This allows for the lowest possible latency between producing the data and consuming it.
This is in contrast to GStreamer, which mostly uses the push model. In this model, data is produced independently of the device and is then queued in the device or in queues in front of it (in the case of video playback).
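The pull model can be sketched with a toy graph: the "device" initiates each cycle and pulls one quantum of samples through the nodes at the last possible moment (the names here are illustrative, not the PipeWire API):

```python
QUANTUM = 4  # samples per graph cycle

def source():
    # Generates samples only when the device asks for them.
    return [0.5] * QUANTUM

def filter_node(samples):
    # A node in the graph; runs as part of the same pull cycle.
    return [s * 2.0 for s in samples]

def device_wakeup():
    # Device interrupt fires -> walk the graph from source to sink,
    # producing exactly one quantum of data for this cycle.
    return filter_node(source())

played = device_wakeup()
assert played == [1.0, 1.0, 1.0, 1.0]
```

In a push model the source would instead run on its own schedule and queue data toward the device, adding buffering (and latency) between the two.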
Isn't Format Negotiation Bad For Pro Audio?
Yes. Format conversions are not cheap and must be avoided. For audio processing in PipeWire we have the following rules:
- Filters and real-time clients must use 32-bit floating point (float32) mono audio. The audio processing graph operates only in this format.
- Format conversions are done at the input/output nodes. This means that conversions are done to and from devices and also to and from clients that use the stream API.
- This also means that the conversion code for clients runs in the context of the client, not the server, which avoids issues with having complicated code such as decoders running in the server context.
What About Pro Video?
- Similar to audio we have one common format: RGBA float32 premultiplied linear video. This should be easy to generate and manipulate on GPU/CPU and allow for HDR and simple compositing operations.
- A splitter/converter for video devices. We need to convert the v4l2 buffers to the common format so that the filters can work on them. Likewise we need converters inside the server side stream API to send/receive video in other formats.
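Premultiplied alpha is what makes compositing simple: the "over" operation becomes one multiply-add per channel. A sketch of the math (plain Python tuples standing in for linear float pixels; this is not PipeWire code):

```python
def to_premultiplied(r, g, b, a):
    """Convert a straight-alpha RGBA pixel (linear floats) to the
    premultiplied form: color channels are scaled by alpha."""
    return (r * a, g * a, b * a, a)

def over(src, dst):
    """'Over' compositing is a single multiply-add per channel when
    both pixels are premultiplied."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    return (sr + dr * (1 - sa),
            sg + dg * (1 - sa),
            sb + db * (1 - sa),
            sa + da * (1 - sa))

half_red = to_premultiplied(1.0, 0.0, 0.0, 0.5)  # 50% transparent red
white = (1.0, 1.0, 1.0, 1.0)                     # opaque white
assert over(half_red, white) == (1.0, 0.5, 0.5, 1.0)
```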
What Kind Of API Will There Be To Interface With PipeWire?
A low-level API that allows you to create a node that you can then add to a local or remote processing graph. This API gives you full control over format and buffer negotiation, and supports multiple inputs and outputs as well as controls, commands, events and parameters. The node will be part of the real-time processing graph and provides data for the current processing cycle of the graph.
There is a filter API that can be used to make audio and video filters. It can have multiple input and output ports as well as parameters. It is like the JACK client API but more powerful.
The most used API will be the stream API. The idea is to create a stream object that allows you to play or record one stream from the server. You then receive callbacks when a buffer needs to be provided for playback or when a buffer is available for capture. The stream API has a client-side component that will do format and buffer-size conversions when requested. The stream API has simple controls for audio volume and video color balance. It can work synchronously and asynchronously. It is like the PulseAudio API but more powerful.
There is a replacement library for JACK clients so that they run on top of PipeWire. PulseAudio applications continue to use the PulseAudio client library. There is a replacement PulseAudio daemon implemented on top of PipeWire.
What Audio API Do You Recommend To Use?
The situation is a bit like GUI toolkits. There are many, each with different use cases. Nobody uses the native display server protocols directly (X11, Wayland) but always through an abstraction layer (GTK, Qt, etc).
We recommend that you continue to use PulseAudio, JACK and ALSA API's for now.
Is there a native GUI tool to configure PipeWire?
No. We recommend you use pavucontrol and carla/qjackctl. They work fine and there is no urgent need to rewrite those applications.
The PipeWire low-level API is a loose collection of objects, properties and parameters that are combined into a coherent use case by the audio toolkit in use (JACK/PulseAudio). So any GUI without a concrete use case would not make much sense.
What Is Wrong With JACK + PulseAudio?
PulseAudio has a JACK backend that sends all the mixed streams to JACK. It however has some problems:
- Smaller JACK period sizes wake up PulseAudio a lot, causing it to use massive amounts of CPU.
- Suspend of the JACK device is not implemented/possible.
- Passthrough on the JACK device is not possible.
- Individual streams in PulseAudio are not managed inside JACK.
Why Not Just Improve JACK Instead?
- JACK has no support for negotiating formats or buffers. This makes it hard to implement anything like exclusive access to devices or more complicated buffer memory. PipeWire attempts to keep the same goals as JACK while adding format and buffer negotiation.
- The JACK API has no support for fd-backed memory. For video it is important to leave the pixels on the GPU instead of touching them with the CPU. It's not clear how this can be added nicely. One option would be to embed more data into the port buffers. With an extension to the protocol we could place a data structure in a local buffer with the video fd.
- Current JACK implementations do not care about security of sandboxed clients.
Why Not Just Improve PulseAudio Instead?
- The PulseAudio design does not allow for video buffers.
- PulseAudio design is not suited for the kind of low-latency we target. There is too much logic and context switches between the client and device.
How Is PipeWire Supposed To Be A Better PulseAudio?
- PipeWire can achieve lower latency with much less CPU usage and dropouts compared to PulseAudio. This would greatly improve video conferencing apps, like WebRTC in the browser.
- PipeWire's security model can stop applications from snooping on each other's audio.
- PipeWire allows more control over how applications are linked to devices and filters.
- PipeWire uses an external policy manager that can provide better integration with the rest of the desktop system and configuration.
How Is PipeWire Supposed To Be A Better JACK?
- PipeWire is more dynamic by design. It can expose all devices and provides functionality similar to zita-a2j/j2a. Its implementation of merging devices and resampling is also a lot more efficient than what zita-a2j can provide.
- Multiple devices don't need to be resampled to a common clock when they are not in any way linked to each other.
- It handles Bluetooth devices or any device for which a plugin can be made.
- PipeWire can adapt the latency dynamically, which is important for power usage on a laptop. When low latency is required, the system can switch automatically and seamlessly to smaller buffer sizes.
- PipeWire allows arbitrary formats, which makes it possible to implement exclusive access to devices, passthrough and more. This is important if you want to send raw DTS to your amplifier or AAC to your Bluetooth headphones, potentially improving audio quality and preserving power.
- PipeWire will implement full latency compensation. This is not available in JACK and it would be hard to implement efficiently.
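The idea behind latency compensation can be sketched in a few lines: when two paths into the same mixer have different latencies, the lower-latency path is delayed (padded) so both signals line up sample-accurately. This is an illustration of the concept only, not PipeWire internals:

```python
def align(path_a, latency_a, path_b, latency_b):
    """Pad the lower-latency signal with silence so both signals
    arrive at the mix point sample-accurately aligned."""
    diff = abs(latency_a - latency_b)
    pad = [0.0] * diff
    if latency_a < latency_b:
        return pad + path_a, path_b
    return path_a, pad + path_b

# Path B goes through a chain that adds 2 samples of latency, so
# path A is delayed by 2 samples of silence to compensate.
a, b = align([1.0, 1.0], 0, [1.0, 1.0], 2)
assert a == [0.0, 0.0, 1.0, 1.0]
assert b == [1.0, 1.0]
```

The remark about memory allocation above suggests this padding need not copy data at all; shifting read offsets into the sample buffers achieves the same effect.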
How Is PipeWire Going To Avoid Xruns?
- All the regular system tuning you might do to avoid or reduce xruns still apply, for now.
- PipeWire uses a thread with real-time priority, eventfd and epoll in the data processing path. This does not by itself avoid the underrun/overrun problems, but using such simple primitives allows the processing pipeline to run on a real-time kernel subsystem like EVL (https://evlproject.org). This might be the solution in the long term to avoid xruns.
How Is PipeWire Going To Handle Latency?
- The plan is to implement full latency compensation in PipeWire. This means that streams will be sample-accurately aligned even when signals go through different paths with different latencies. Because of how PipeWire allocates memory, this can be done quite efficiently by changing offsets in the sample buffers.
- For how PipeWire handles latency for USB devices, please see api.alsa.headroom under pipewire-media-session in the Wiki.
Why Is The API So Complicated?
"Can't an audio API simply be open/read/write?" "Why do I need a mainloop and callbacks?" "Isn't doing audio essentially just copying samples to and from a buffer?"
For anything more than playing a beep, it is more complicated.
At the lowest level, the device decides when more samples need to be written or read to/from the device ring buffer. This is usually implemented with an interrupt of some sort. For optimal performance, the application needs to react directly to this signal and read/write samples to the device ring buffer. This is called the pull model.
This way, the application can wait to generate the audio data until the last possible moment, achieving the lowest possible latency. Volume updates or a synthesizer can react to GUI sliders and keyboard events with lower latency this way.
With a simple read/write model this cannot be done; you need an API to wait for the device signal, either with a poll or an event. An additional API that provides timing information can also work, but then you need to do the polling or implement the timeouts or callbacks yourself, likely with less accurate results than what the device interrupt can provide.
That said, you can always build a simple open/read/write API on top of a pull-based API, and PipeWire provides the lower-level APIs to make this possible. Look at pa_simple, which also works fine on PipeWire.
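Layering a blocking write-style API over a pull API can be sketched like this, similar in spirit to pa_simple (a toy illustration, not the real pa_simple or PipeWire API):

```python
from collections import deque

class SimplePlayback:
    """A 'simple write' facade over a pull (callback) API: write()
    just queues samples; the pull side drains the queue."""

    def __init__(self):
        self._queue = deque()

    def write(self, samples):
        # The simple API buffers; latency grows with queue depth.
        self._queue.extend(samples)

    def on_process(self, quantum):
        # Called by the pull side when the device needs `quantum`
        # samples; pad with silence on underrun.
        return [self._queue.popleft() if self._queue else 0.0
                for _ in range(quantum)]

s = SimplePlayback()
s.write([0.1, 0.2])
assert s.on_process(4) == [0.1, 0.2, 0.0, 0.0]
```

The cost of the convenience is visible in the code: the queue adds latency, and the application can no longer generate samples at the last possible moment.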
PipeWire Buffering Explained
PipeWire has two 'buffers':
- One it keeps in the hardware device, by keeping one period (quantum) of data in the sink. If there is less than a quantum of data, it runs the graph to ask all nodes to provide one new quantum of data. Presumably that can happen before the sink finishes playing the remaining data (if not: xrun). Batch devices (ALSA devices that report themselves as batch) enable IRQs at period-size/2 (by default 1024/2) and keep an extra period-size/2 of data in the device as headroom. So normal devices introduce a latency of one quantum; batch devices run at a latency of quantum + period-size/2.
- Buffers in the application. This can be anything you want it to be. JACK clients use no buffering and react to the graph wake-up immediately. Stream-based clients can do the same thing but usually do some sort of extra buffering.
This does not include extra buffering in the resamplers (~50 samples). There is a resampler in each stream and sink/source. The resampler latency is 0 when the resampler is not used.
This does also not include latency introduced by the USB subsystem or other hardware latencies. These are generally the same as with JACK and PulseAudio when the IRQ period-size is set to a sufficiently low value for batch devices.
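The device-side latency rules above can be written down as arithmetic (a sketch; the 1024-frame default period size is the one mentioned above, and the function name is illustrative):

```python
def device_latency_ms(quantum, rate=48000, batch=False, period=1024):
    """Device buffering latency: one quantum for normal devices,
    plus period-size/2 of extra headroom for batch devices."""
    frames = quantum + (period // 2 if batch else 0)
    return 1000.0 * frames / rate

# Normal device at a 1024-sample quantum:
assert round(device_latency_ms(1024), 1) == 21.3
# Batch device adds 512 frames (1024/2) of headroom:
assert round(device_latency_ms(1024, batch=True), 1) == 32.0
```

This is only the device buffer; resampler, application and hardware/USB latencies from the paragraphs above come on top.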
The quantum on the server is controlled by the clients' node.latency property. It is always set to the lowest requested latency. If you start a PipeWire app with PIPEWIRE_LATENCY=128/48000, it will use a ~2.7 ms quantum, and the latency between a client waking up and the sink read pointer will be ~2.7 ms.
With default.clock.max-quantum you should be able to configure 128 samples for a ~2.7 ms server latency. If you use pw-top you can see the selected latency between app/device. I would suggest checking what it says there after adjusting the max-quantum.
For PulseAudio clients, it is the pipewire-pulse server that is woken up every quantum, and it has an internal buffer based on what the client negotiated. Clients in general set buffering requirements based on configuration options in the application, or you can use the PULSE_LATENCY_MSEC environment variable to configure things. Other than that, it should work pretty much the same way as PulseAudio.
PIPEWIRE_LATENCY= does nothing for PulseAudio clients.
So: first use max-quantum to limit server latency, use the alsa-monitor.conf file to set the batch period size (for batch devices), then use client configuration or the environment variable to set client latency.
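As a concrete example, the max-quantum limit can go in a PipeWire configuration drop-in; a sketch (the file path and the value are examples, check your distribution's configuration layout):

```
# ~/.config/pipewire/pipewire.conf.d/99-max-quantum.conf
context.properties = {
    default.clock.max-quantum = 128
}
```

A PulseAudio client's buffering could then be requested on the client side with, for example, PULSE_LATENCY_MSEC=30 paplay test.wav.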
Does Desktop Audio Interfere With Pro-Audio Using PipeWire?
No, we're going to make sure they don't interfere. We have some options to do this:
- Tag some devices as DSP devices and make sure when in use by DSP apps, we don't route non-DSP apps to it.
- Make sure the pipewire-pulse server can't see the DSP tagged devices.
- Go into a 'DSP' mode where we simply don't service non-DSP apps.
I think in the end we will need to look at some concrete use cases and then implement the policy correctly.
Taken from Gitlab.
Bypass Mixer In Exclusive Mode?
Please see issue #126.
Developers: About a2j In PipeWire Current Midi Catch 22 With JACK
Please see issue #406.
Output To Two Devices?
Please see issue #508.
Could PipeWire Be Used To Work Around Lack of Acceleration in Xinerama?
Please see issue #522.
How to Change rtkit rt.prio?
Please see issue #685.
Is There a Way to Connect Droidcam Audio?
Please see issue #713.