Piper TTS is a [[MIT License]]-licensed text-to-speech program written in [[C++]] and [[Python]].

- [Website](https://rhasspy.github.io/piper-samples/)
- [Source](https://github.com/rhasspy/piper)
- [Voices](https://huggingface.co/rhasspy/piper-voices/tree/v1.0.0)
- https://piper.ttstool.com/

> A fast, local neural text to speech system

# Notability

See also: [[Text to Speech]]

- Very compact distribution, fast to download, doesn't require a `pip` install.
- Downloading voices is easy: just clone their repo.

Input text streaming and output audio streaming are both very fast and well optimized. As a result, it begins speaking instantly when a large document is passed in.

# Philosophy

# OS Support

- [[Linux]]
- [[MacOS]]
- [[Windows]]

# Features

## CLI

[[Piper TTS]] can do very rapid text-to-audio streaming. The command-line tool has a lot of options and capabilities, but it isn't terribly ergonomic.

The resulting audio can be good, but tends to be not quite as nice as [[Mimic3]].

## Favorite Voices

- Alan Pope
  - https://github.com/MycroftAI/mimic3-voices/blob/master/voices/en_UK/apope_low
  - `en_GB-alan-medium`
  - RP, masc, I prefer the way this voice sounds out of Piper
- Semaine
  - https://github.com/marytts/dfki-semaine-data
  - Some very dynamic performances out of these voices. It's a bit excessive, but also a nice contrast
- CMU Arctic
  - http://www.festvox.org/cmu_arctic/
  - `en_US-arctic-medium`
  - ??, masc, uncertain about the accent type, but it is intelligible while being mostly flat
- Hi-Fi Captain
  - https://ast-astrec.nict.go.jp/en/release/hi-fi-captain/
  - `en_US-hfc_male-medium`
  - GA, masc, a bit uncanny valley, but very clear
- Ryan
  - https://www.kaggle.com/datasets/roholazandie/ryanspeech
  - `en_US-ryan-high`
  - GA, masc, a bit stiff, but quite clear
- CSTR VCTK Corpus
  - https://datashare.ed.ac.uk/handle/10283/3443
  - I much prefer Mimic 3's rendering of these voices.
- LJSpeech
  - https://github.com/rhasspy/piper/discussions/202
  - I prefer Mimic 3's rendering of this voice.

# Tips

## Usage

```sh
echo "Sphinx of black quartz, judge my vow!" | \
  piper \
    -m alan/en_GB-alan-medium.onnx \
    -c alan/en_en_GB_alan_medium_en_GB-alan-medium.onnx.json \
    -f - --output_raw | \
  aplay -r 22050 -f S16_LE -t raw -
```

## Speech Dispatcher

See also: [[Mimic3#Speech Dispatcher]]

In `piper.conf`:

```sh
Debug 0

GenericExecuteSynth "printf %s \'$DATA\' | /opt/pkg/piper/piper -f - -m \'/opt/pkg/piper/models/$VOICE.onnx\' | $PLAY_COMMAND"

AddVoice "en" "MALE1" "alan/en_GB-alan-medium"
AddVoice "en" "FEMALE1" "ljspeech/ljspeech"

DefaultVoice "ljspeech/ljspeech"
```

In `speechd.conf`:

```ini
AddModule "piper" "sd_generic" "piper.conf"
DefaultModule piper
```

# Resources

- https://github.com/Elleo/pied - tool for installing and managing Piper and voices with [[Speech Dispatcher]]

# References

- https://github.com/rhasspy/piper/discussions/328
- https://gist.github.com/alexkuz/f24f93245ff80458c9b6ec93c644c40b
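
# Appendix

The pipeline under Usage hard-codes `aplay -r 22050`, which matches the medium voices listed here but may not match every voice. A rough sketch of a wrapper that reads the rate from the voice's config instead is below. The function names `voice_sample_rate` and `speak` are made up for this note. It assumes the config JSON contains a `"sample_rate": <number>` entry and that, when no config path is given, the config sits next to the model as `<model>.json`. Both are assumptions about the voice files, so check them against the files you actually downloaded.

```sh
#!/bin/sh

# Hypothetical helper: print the first "sample_rate" value found in a
# Piper voice config. Uses sed rather than a JSON parser, so it only
# assumes the key is followed by a plain integer.
voice_sample_rate() {
    sed -n 's/.*"sample_rate"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" |
        head -n 1
}

# Hypothetical wrapper: speak stdin through piper and aplay.
#   $1 = path to the .onnx model
#   $2 = path to the config (assumed default: "$1.json")
speak() {
    model=$1
    config=${2:-$model.json}
    rate=$(voice_sample_rate "$config")
    # Fall back to 22050, the rate used in the Usage example, if the
    # config did not yield a number.
    piper -m "$model" -c "$config" --output_raw |
        aplay -r "${rate:-22050}" -f S16_LE -t raw -
}
```

With these functions sourced, the Usage example would become something like `echo "Sphinx of black quartz, judge my vow!" | speak alan/en_GB-alan-medium.onnx alan/en_en_GB_alan_medium_en_GB-alan-medium.onnx.json`.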