Draft: coqui: New plugin for speech-to-text transcription
The coquistt element is an audio filter that feeds incoming audio samples to the Coqui-AI STT engine, which runs inference on them using pre-trained models.
It should be combined with an upstream VAD (voice activity detection) filter, such as webrtcdsp, so that it processes only samples containing human speech.
The resulting transcription is posted on the bus as an element message.
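
As a rough illustration of how an application might pick up those messages, here is a minimal PyGObject sketch. It rests on several assumptions that this description does not confirm: the element instance name `stt`, the caps and the `webrtcdsp` settings in `DEFAULT_PIPELINE`, and the absence of any model-related properties on `coquistt` (whatever properties the element needs to locate its model are not shown here). The structure name and fields of the message are not stated above either, so the handler only prints the serialized structure instead of reading specific fields.

```python
# Hypothetical pipeline: the caps, the webrtcdsp settings and the element name
# "stt" are illustrative assumptions, not taken from the plugin itself.
DEFAULT_PIPELINE = (
    "autoaudiosrc ! audioconvert ! audioresample ! "
    "audio/x-raw,format=S16LE,rate=16000,channels=1 ! "
    "webrtcdsp echo-cancel=false voice-detection=true ! "
    "coquistt name=stt ! fakesink"
)


def format_element_message(message, element_name="stt"):
    """Return a printable line for an element message posted by
    `element_name`, or None if the message comes from elsewhere."""
    structure = message.get_structure()
    if structure is None or message.src is None:
        return None
    if message.src.get_name() != element_name:
        return None
    # The field layout is not documented here, so dump the whole structure.
    return f"{element_name}: {structure.to_string()}"


def run(description=DEFAULT_PIPELINE):
    """Run the pipeline and print element messages until EOS or an error."""
    # Imported here so the helper above can be used without GStreamer installed.
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import GLib, Gst

    Gst.init(None)
    pipeline = Gst.parse_launch(description)
    loop = GLib.MainLoop()

    bus = pipeline.get_bus()
    bus.add_signal_watch()

    def on_element(_bus, message):
        line = format_element_message(message)
        if line is not None:
            print(line)

    def on_stop(_bus, _message):
        loop.quit()

    bus.connect("message::element", on_element)
    bus.connect("message::eos", on_stop)
    bus.connect("message::error", on_stop)

    pipeline.set_state(Gst.State.PLAYING)
    try:
        loop.run()
    finally:
        pipeline.set_state(Gst.State.NULL)
```

Calling `run()` starts the hypothetical pipeline. Filtering on the source element's name keeps unrelated element messages (for example, any posted by other elements in the pipeline) out of the output.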
Based on initial work by Mike Sheldon <elleo@gnu.org>.