Piper TTS is an [[MIT License]]-licensed text-to-speech program written in [[C++]] and [[Python]].
- [Website](https://rhasspy.github.io/piper-samples/)
- [Source](https://github.com/rhasspy/piper)
- [Voices](https://huggingface.co/rhasspy/piper-voices/tree/v1.0.0)
- https://piper.ttstool.com/
> A fast, local neural text to speech system
# Notability
See also: [[Text to Speech]]
- Very compact distribution: fast to download, and it doesn't require a `pip` install.
- Downloading voices is easy: just clone their repo.
- Input text streaming and output audio streaming are both very fast and well optimized. As a result, it begins speaking instantly when a large document is passed in.
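
Cloning the whole voices repo pulls every language. When only one voice is wanted, a single-file download may be enough. The sketch below is a hypothetical helper, not part of Piper: `piper_voice_url` is a made-up name, and the directory layout it encodes (`<lang>/<locale>/<name>/<quality>/`) is an assumption based on how the Hugging Face repo linked above appears to be organized under the `v1.0.0` tag.

```sh
# Build the download URL for one Piper voice model.
# ASSUMPTION: files live at
#   <lang>/<locale>/<name>/<quality>/<locale>-<name>-<quality>.onnx
# Check the repo listing if a download returns 404.
piper_voice_url() {
    voice=$1                # e.g. en_GB-alan-medium
    locale=${voice%%-*}     # en_GB
    rest=${voice#*-}        # alan-medium
    name=${rest%-*}         # alan
    quality=${rest##*-}     # medium
    lang=${locale%%_*}      # en
    printf '%s/%s/%s/%s/%s/%s.onnx\n' \
        "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0" \
        "$lang" "$locale" "$name" "$quality" "$voice"
}

# Usage (needs network access; the config sits next to the model):
#   url=$(piper_voice_url en_GB-alan-medium)
#   curl -L -O "$url" && curl -L -O "$url.json"
```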
# Philosophy
# OS Support
- [[Linux]]
- [[MacOS]]
- [[Windows]]
# Features
## CLI
[[Piper TTS]] can do very rapid text-to-audio streaming.
The command-line tool has a lot of options and capabilities, but it isn't terribly ergonomic. The resulting audio can be good, but it tends to be not quite as nice as that of [[Mimic3]].
## Favorite Voices
- Alan Pope - https://github.com/MycroftAI/mimic3-voices/blob/master/voices/en_UK/apope_low
    - `en_GB-alan-medium` - RP, masc, I prefer the way this voice sounds out of Piper
- Semaine - https://github.com/marytts/dfki-semaine-data
    - Some very dynamic performances out of these voices, it's a bit excessive, but also a nice contrast
- CMU Arctic - http://www.festvox.org/cmu_arctic/
    - `en_US-arctic-medium` - ??, masc, uncertain about the accent type, but it is intelligible while being mostly flat
- Hi-Fi Captain - https://ast-astrec.nict.go.jp/en/release/hi-fi-captain/
    - `en_US-hfc_male-medium` - GA, masc, a bit uncanny valley, but very clear
- Ryan - https://www.kaggle.com/datasets/roholazandie/ryanspeech
    - `en_US-ryan-high` - GA, masc, a bit stiff, but quite clear
- CSTR VCTK Corpus - https://datashare.ed.ac.uk/handle/10283/3443
    - I much prefer Mimic 3's rendering of these voices.
- LJSpeech - https://github.com/rhasspy/piper/discussions/202
    - I prefer Mimic 3's rendering of this voice.
# Tips
## Usage
```sh
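# Pipe text into piper and stream raw 16-bit PCM straight to aplay.
# The aplay rate must match the voice's sample rate (22050 Hz for this one).
echo "Sphinx of black quartz, judge my vow!" | \
  piper \
    -m alan/en_GB-alan-medium.onnx \
    -c alan/en_en_GB_alan_medium_en_GB-alan-medium.onnx.json \
    -f - --output_raw | \
  aplay -r 22050 -f S16_LE -t raw -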
```
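
The pipeline above hard-codes the sample rate, which will play at the wrong speed if the voice uses a different one. A rough sketch of a wrapper that reads the rate from the voice config instead is below. Both function names are hypothetical. It assumes the config sits next to the model as `<model>.json` (not the longer file name used above) and contains a numeric `sample_rate` field.

```sh
# Print the sample rate found in a voice's JSON config.
# ASSUMPTION: the config has a numeric "sample_rate" field.
piper_sample_rate() {
    sed -n 's/.*"sample_rate"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" |
        head -n 1
}

# Hypothetical wrapper: speak whatever arrives on stdin with the given model.
# Falls back to 22050 Hz if no rate could be read from the config.
speak() {
    model=$1
    rate=$(piper_sample_rate "$model.json")
    piper -m "$model" -c "$model.json" --output_raw |
        aplay -r "${rate:-22050}" -f S16_LE -t raw -
}

# Usage:
#   speak alan/en_GB-alan-medium.onnx < chapter.txt
```

The `sed` pattern is a shortcut that avoids a `jq` dependency. It is not a real JSON parser, so it is only as reliable as the config's formatting.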
## Speech Dispatcher
See also: [[Mimic3#Speech Dispatcher]]
In `piper.conf`:
```sh
Debug 0
GenericExecuteSynth "printf %s \'$DATA\' | /opt/pkg/piper/piper -f - -m \'/opt/pkg/piper/models/$VOICE.onnx\' | $PLAY_COMMAND"
AddVoice "en" "MALE1" "alan/en_GB-alan-medium"
AddVoice "en" "FEMALE1" "ljspeech/ljspeech"
DefaultVoice "ljspeech/ljspeech"
```
In `speechd.conf`:
```ini
AddModule "piper" "sd_generic" "piper.conf"
DefaultModule piper
```
# Resources
- https://github.com/Elleo/pied - a tool for installing and managing Piper and its voices for use with [[Speech Dispatcher]]
# References
- https://github.com/rhasspy/piper/discussions/328
- https://gist.github.com/alexkuz/f24f93245ff80458c9b6ec93c644c40b