Handle "default" input/output nodes ("sinks/sources" and "devices") fluidly and consistently among all access APIs
An amalgamation of ideas from:
- #428 (comment 711202) & #461 (comment 812965) - potential perfect audio hardware setup.
- #426 (comment 711771) - shared dynamic sink/source lists.
- #446 (comment 730811) - default "system" sink/source for JACK.
- #109 (comment 595133) - marking any non-shim JACK node as PA sink/source.
- #415 (comment 714954) - PA & JACK sink/source shim-node made via PA-specific utility.
- #292 (comment 627613) - how to set hardware volume by default.
- #729 (comment 800047) - hw-volume versus sw-volume.
- #461 (comment 723819) - how to handle channel mapping.
- #461 (comment 752873) - why default PA-style blind stereo duplication is bad.
- How to do stereo-to-surround "upmix" right: the various methods simulate a virtual room that is "filled" by waves from the stereo source and then mapped onto your real room with its surround outputs. The difference between even the most advanced imitations and plain stereo is still negligible and purely subjective, as only the echo is simulated and not the audio sources themselves.
- #461 (comment 723819) - ideally, only ray-tracing of audio-wave propagation gives perfect surround results, just as RT of photons does for perfect visual rendering or any other wave simulation.
- #438 (comment 716847) - save & restore routing state (session) as complex node-graph, that's made-up of per-hardware-port pipelines which have ability to intertwine at shared nodes.
- #425 (comment 709792) & #109 (comment 724954) & #515 (comment 794581) - automatically spawn/despawn intermediate nodes that are needed to restore full session for currently running edge-apps (ones that want nodes marked as sink/source); semi-arbitrary multi-stage post-processing pipelines between apps and devices.
- #698 (closed) - use manual JACK routing + virtual/null PA sink or PulseEffects for now.
- #704 (comment 802987) - about devices with multiple sub-devices/maps.
So, the goal of "proper" default input/output node handling would be:
- Expose shared node list & their properties for all APIs (naked ALSA, PA, JACK, native PW) as much as each of them allows.
- Main properties of audio nodes being volume and channel map, they should be handled especially carefully.
- They all should be saved and restored.
- Unknown devices should get hearing-safe values like -36dB for speakers and -48dB for headphones; master, capture and the undefined may be maxed.
- Set software volume to 100% and hardware volume to the safe value if hardware volume control is present; otherwise set software volume to the safe value. Use 100% sw-volume by default for null/loopback nodes.
- For dynamic sensitivity control of capture, if enabled, boost should always be minimized to avoid clipping in software (at the hardware, peaks should generally stop at -6dB to -24dB, depending on the available DR of the microphone and on how careful you want to be). This is unlike what Windows does, and PA, which parrots it. The real <0dB range of a capture port should be handled with 0 boost at first.
- It's better to use software post-processing, like DRC and complex noise cancelling, to boost volume without clipping than do so directly at hardware ports.
- Port plug/unplug events should not change volume by default (again, as PA does parroting Windows), they may influence routing but by adding connections, not switching them unless explicitly defined by user.
- Naturally, nodes should not be recreated when their properties change (including but not limited to plugged/unplugged state). Especially when apps directly interface with them.
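
The default-volume rules above could be sketched roughly like this. It is only an illustration in Python, not PW code: the function names, the port-kind strings, treating 0 dB as "100%/maxed" and the 20*log10 amplitude convention are all assumptions made for the example.

```python
# Hypothetical sketch: names and port kinds are illustrative, not a PW API.
# Hearing-safe starting values for unknown devices, as proposed in the list above.
SAFE_DB = {"speaker": -36.0, "headphone": -48.0}


def db_to_linear(db: float) -> float:
    """Convert a dB gain to a linear amplitude factor (assumes 20*log10)."""
    return 10.0 ** (db / 20.0)


def default_volumes(port_kind: str, has_hw_volume: bool, virtual: bool = False):
    """Return (software_dB, hardware_dB or None) for a node with no saved state.

    Assumption: 0.0 dB stands for "100%" / "maxed".
    """
    if virtual:
        # null/loopback nodes: 100% software volume, no hardware control
        return 0.0, None
    # master, capture and undefined kinds fall back to 0 dB ("may be maxed")
    safe = SAFE_DB.get(port_kind, 0.0)
    if has_hw_volume:
        # software at 100%, hardware at the safe value
        return 0.0, safe
    # no hardware control: software carries the safe value instead
    return safe, None
```

For instance, `default_volumes("headphone", False)` puts the whole -48 dB attenuation on the software side because there is no hardware control to carry it.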
- There should always be an I/O interface presented to apps, such as a dummy sink/source.
- Keep shared state for all nodes among all APIs (sync all changes across all APIs).
- Don't depend on PA, JACK or whatever else's apps for configuration of PW-specific features.
- Come up with common terminology, such as:
- End-Node: node closest to hardware, usually known as "device".
- Edge-Node: node closest to software, usually known as "sink" or "source", which an app uses for immediate I/O transfers; equal to the end-node if there are no intermediate processing nodes.
- Intermediate Processing Node: a node somewhere in the pipeline that is always connected to others on both I/O sides (that means that any duplex end-node can also serve as an IPN in pipelines of others).
- Inter-Node Processing: processing that PW does in between the nodes (probably should be exposed as IPN that cannot be rerouted).
- Shared Node: IPN that is used in several pipelines (of same/single or different/several apps).
- Node-Graph: full view of nodes and their routing made-up of all pipelines.
- PipeLine: a string of nodes from an edge to an end (any edge or end node can participate in an unlimited number of simultaneous pipelines).
- Channel/Port Map: a named group of ports of same type but possibly differentiated purpose ("subtype" ?) that's not limited to a single node, just as a single node is not limited to a single map.
- Port: a single I/O interface of a node.
- Session: a state of routing, particularly records about all of IPNs & INP between all end & edge nodes and their connections, and all of node properties.
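
To make the proposed terms a bit more concrete, here is a minimal Python sketch of how they relate to each other. The `Node` and `Pipeline` classes and the `shared_nodes` helper are hypothetical names invented for this illustration; nothing here reflects actual PW data structures.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str


@dataclass(frozen=True)
class Pipeline:
    """A string of nodes from an edge-node (app side) to an end-node (hardware side)."""
    nodes: tuple  # ordered: edge-node first, end-node last

    @property
    def edge(self) -> Node:
        return self.nodes[0]

    @property
    def end(self) -> Node:
        return self.nodes[-1]

    @property
    def intermediates(self) -> tuple:
        # IPNs are everything strictly between the edge and the end
        return self.nodes[1:-1]


def shared_nodes(pipelines) -> set:
    """Return the IPNs that take part in more than one pipeline."""
    counts = Counter(n for p in pipelines for n in set(p.intermediates))
    return {n for n, c in counts.items() if c > 1}
```

A single-node pipeline has `edge == end` and no intermediates, which matches the "edge-node equals end-node" case from the definitions; the node-graph is then just the union of all pipelines.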
- Save persistent priorities for each "type" of node, so dynamic lists could be made for automatic switch & substitution of absent ones with "the next best" which should mainly be used for edge-nodes (sink/source) to substitute "default".
- If nodes could be tagged with any arbitrary tag-names then more ordered groups with automatic substitution could be made than just "default".
- These groups could be used not just to define sink/source edge-nodes but also any node by their purpose and special characteristics that may require getting them by such list.
- Maybe it's worth saving priorities as float numbers, or in some other less strict way of representing their relative order, without being strictly tied to unique IDs.
- It should always be assumed that newly discovered nodes may be registered/redefined to any order position in a list.
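
The priority and tag ideas above might look something like the following sketch. It assumes plain dictionaries for tags and float priorities; `resolve` and `insert_between` are hypothetical helper names, and taking the midpoint is just one possible way to slot a new node into an existing order.

```python
from typing import Optional


def resolve(tag: str, tags: dict, priority: dict, present: set) -> Optional[str]:
    """Pick the present node with the highest priority among those carrying `tag`.

    Returns None when no tagged node is currently present.
    """
    candidates = [n for n in tags.get(tag, ()) if n in present]
    if not candidates:
        return None
    # nodes without a saved priority are treated as 0.0 (an assumption)
    return max(candidates, key=lambda n: priority.get(n, 0.0))


def insert_between(priority: dict, new: str, lower: str, higher: str) -> None:
    """Register a newly discovered node between two neighbours without renumbering."""
    priority[new] = (priority[lower] + priority[higher]) / 2.0
```

With float priorities, a new node can be placed at any position in the list without touching the saved values of the others, and unplugging the current "default" simply makes `resolve` return the next best present node.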
- For JACK-style routing apps should be provided with purely software nodes ("software default") where their pipelines start and last software node of each pipeline should be provided with end-nodes ("hardware default"), so hardware routing would be performed per-pipeline and not per-app.
- Except that JACK is limited to a single hardware interface and can't spawn its own shim-nodes that just mix and do no external processing, in PW we want fluidity among all nodes.
- What JACK nodes do have, however, is arbitrary port configuration per node. As such, PW should not be limited to a single channel map per node (especially for a hardware end-node): any ports of any node, or ports across multiple end-nodes (that's #71 (closed) and #461 (comment 731585)), should be groupable into a channel map for which a pipeline to an edge-node can be constructed.
- JACK's nodes are always arbitrary and there is no channel map handling, which is why using an OpenAL-based inter-node "spatialiser" (like how resampling is done) or self-spawned IPNs (from GStreamer or whatever) for proper channel remapping is a good idea (after all, even stereo is supposed to be mixed differently for speaker and headphone targets, let alone surround configurations).
- Likely that multi-map/layout devices with subdevices should be exposed separately, such as: "<receiver_device_name> on <sender_device_name> via <bus>" per each map/layout of each device, where "sender" is local device acting as signal proxy/preparator, "receiver" is actual sound generator and "bus" is something like HDMI/DP/USB/network_RPC/etc.
- This may be used as common convention for all device naming and will also apply to manually aggregated/subdivided devices.
- #222 and #109 (comment 724954) imply using post-processing pipelines per hardware port (actually, per channel map of hardware node) by default and for apps to connect to edge-node of each such pipeline that represents end-node instead of end-node itself.
- PW should be able to save & restore session but it shouldn't be limited to JACK conventions in that too: it should handle self-spawned plugins/filters (as INP ?), forked executables (including other LV2/whatever plugin hosts), socket-triggered systemd & session IPC-triggered services.
- That means assigning unique IDs to all node-providers, probably by assigning permanent UIDs to all nodes in general. These UIDs need to be globally unique, as in "among all potential networked PW instances of a single user or even non-repeatable planet-wide".
- Just having abstract names may not be enough, categorized multi-level names may need to exist for matching by types. Or tags could be used for that in all cases.
- Node configurations would be tied to their UIDs. This is especially important since pipelines of each end-node would be configured differently by different instances of nodes of the same type (like DR Corrector, noise/echo suppressor, etc.).
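
One possible way to get such UIDs, shown purely as a sketch, is random UUIDs from Python's standard `uuid` module, with node configuration keyed by UID rather than by type or name. The `ConfigStore` class and its methods are hypothetical and only illustrate the idea; this is not a proposal for the actual storage format.

```python
import uuid


class ConfigStore:
    """Keeps per-node settings keyed by a permanent, globally unique ID."""

    def __init__(self):
        self._by_uid = {}

    def register(self, node_type: str, settings: dict) -> str:
        # uuid4 is random, so no coordination between PW instances is needed
        uid = str(uuid.uuid4())
        self._by_uid[uid] = {"type": node_type, "settings": dict(settings)}
        return uid

    def settings(self, uid: str) -> dict:
        return self._by_uid[uid]["settings"]

    def by_type(self, node_type: str) -> list:
        """Match nodes by type; tags or multi-level names could be used the same way."""
        return [u for u, e in self._by_uid.items() if e["type"] == node_type]
```

Two instances of the same node type (say, one DR corrector per end-node pipeline) then get separate UIDs and can keep different settings.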
- Not using end-nodes/devices directly, and instead always using the highest-quality settings for them as per #453, will save users from disruptive device reinitialization triggered by apps (usually for format and buffer size changes).
- Quality-related settings for the edge-node and end-node of the same pipeline should not have to be the same. It should be possible to dip or lift quality at any point by down/up-conversion for certain IPNs where processing is too costly otherwise.
- In terms of audio it means lower latency/buffering and higher sampling frequency & depth at an end-node.
- To improve such format negotiation it may be feasible to somehow track and compare "cost of processing" for each node to permanently deny quality uplift in some places for some systems or temporarily, only under hardware load strain.
- OS might need some tweaks to kernel & user's scheduling priority defaults to achieve stability with heavy processing, despite raw hardware capabilities.
- Simultaneous connections from/to multiple nodes of same type should be supported (like outputting to stereo speakers & headphone ports on same or different DACs), maybe via equal priority in their group.
- Protection from self-amplifying feedback loops in routing is required but it should be handled on port (channel map port group ?) basis, not node, because some effects may be fed to a side-channel for desirable feedback or something.