Piper TTS is an [[MIT License]]-licensed [[Text to Speech]] program written in [[C++]] and [[Python]].
- [Website](https://rhasspy.github.io/piper-samples/)
- [Source](https://github.com/rhasspy/piper)
- [Voices](https://huggingface.co/rhasspy/piper-voices/tree/v1.0.0)
- https://piper.ttstool.com/
> A fast, local neural text to speech system
# Notability
See also: [[Text to Speech]], [[Speech Dispatcher]]
- Very compact distribution: fast to download and does not require a `pip` install.
- Downloading voices is easy: just clone the voices repo.
- Very fast, optimized streaming of input text and output audio, so it begins speaking almost immediately even when a large document is passed in.
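
Cloning the whole voices repository fetches every voice, so grabbing a single voice can be lighter. A minimal sketch, assuming the repository keeps the `<lang>/<locale>/<name>/<quality>/` layout visible in the voice links on this page; `voice_path` and `fetch_voice` are hypothetical helper names, and the `resolve/v1.0.0` URL form is an assumption based on how Hugging Face usually serves raw files:

```sh
# Hypothetical helper: map a voice ID such as "en_GB-alan-medium" to its
# assumed path inside the piper-voices repository.
voice_path() (
    id="$1"
    locale="${id%%-*}"     # text before the first "-", e.g. en_GB
    rest="${id#*-}"        # text after the first "-", e.g. alan-medium
    name="${rest%-*}"      # drop the trailing "-<quality>", e.g. alan
    quality="${rest##*-}"  # text after the last "-", e.g. medium
    lang="${locale%%_*}"   # text before the first "_", e.g. en
    printf '%s/%s/%s/%s/%s.onnx\n' "$lang" "$locale" "$name" "$quality" "$id"
)

# Hypothetical helper: download the model and its JSON config with curl.
# Assumption: the base URL below serves raw files for the v1.0.0 revision.
fetch_voice() {
    base="https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0"
    path="$(voice_path "$1")"
    curl -L -O "$base/$path" -O "$base/$path.json"
}
```

With these defined, `fetch_voice en_GB-alan-medium` would save `en_GB-alan-medium.onnx` and `en_GB-alan-medium.onnx.json` into the current directory.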
# Philosophy
# OS Support
- [[Linux]]
- [[MacOS]]
- [[Windows]]
# Features
## CLI
[[Piper TTS]] streams text to audio very quickly.
The command-line tool has many options and capabilities, but it isn't terribly ergonomic. The resulting audio can be good, but tends to be not quite as nice as what [[Mimic3]] produces.
# Tips
## Usage
### Direct Audio Output
Piping the raw audio output to `aplay`:
```sh
# Synthesize to stdout as raw 16-bit PCM and play it with aplay.
# 22050 Hz matches this medium-quality voice.
echo "Sphinx of black quartz, judge my vow!" | \
piper \
  -m alan/en_GB-alan-medium.onnx \
  -c alan/en_en_GB_alan_medium_en_GB-alan-medium.onnx.json \
  -f - --output_raw | \
aplay -r 22050 -f S16_LE -t raw -
```
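
The `-r 22050` passed to `aplay` has to match the voice's sample rate, or playback comes out at the wrong pitch and speed. A minimal sketch, assuming Piper's usual quality tiers (16000 Hz for `x_low` and `low`, 22050 Hz for `medium` and `high`); `rate_for` is a hypothetical helper name, and the `audio.sample_rate` field in the voice's `.onnx.json` config remains the authoritative value:

```sh
# Hypothetical helper: guess the sample rate from the quality suffix of a
# voice ID. Assumption: x_low/low voices are 16000 Hz, medium/high 22050 Hz.
rate_for() {
    case "${1##*-}" in
        x_low|low)   echo 16000 ;;
        medium|high) echo 22050 ;;
        *)           echo "unknown quality in voice ID: $1" >&2; return 1 ;;
    esac
}
```

For example, `aplay -r "$(rate_for en_GB-alan-medium)" -f S16_LE -t raw -` keeps the player in step with the voice.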
## Speech Dispatcher
See also: [[Mimic3#Speech Dispatcher]]
In `piper.conf`:
```sh
Debug 0
GenericExecuteSynth "printf %s \'$DATA\' | /opt/pkg/piper/piper -f - -m \'/opt/pkg/piper/models/$VOICE.onnx\' | $PLAY_COMMAND"
AddVoice "en" "MALE1" "alan/en_GB-alan-medium"
AddVoice "en" "FEMALE1" "ljspeech/ljspeech"
DefaultVoice "ljspeech/ljspeech"
```
In `speechd.conf`:
```ini
AddModule "piper" "sd_generic" "piper.conf"
DefaultModule piper
```
# Resources
- https://github.com/Elleo/pied - tool for installing and managing Piper and voices with [[Speech Dispatcher]]
## Additional Voices
- https://brycebeattie.com/files/tts/
## Favorite Voices
- CMU Arctic - http://www.festvox.org/cmu_arctic/
    - `en_US-arctic-medium` - Scottish?, masc; I'm unsure of the accent, but it is intelligible, if mostly flat
- Hi-Fi Captain - https://ast-astrec.nict.go.jp/en/release/hi-fi-captain/
    - `en_US-hfc_male-medium` - GA, masc; a bit uncanny valley, but very clear
### EN_US
- Ryan - https://www.kaggle.com/datasets/roholazandie/ryanspeech
    - `en_US-ryan-high` - GA, masc; a bit stiff, but quite clear
- libritts_r - https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/libritts_r/medium
    - femme
### EN_GB
- Alan Pope - https://github.com/MycroftAI/mimic3-voices/blob/master/voices/en_UK/apope_low
    - `en_GB-alan-medium` - RP, masc; I prefer how this voice sounds coming out of Piper
- Semaine - https://github.com/marytts/dfki-semaine-data
    - These voices give some very dynamic performances; a bit excessive, but also a nice contrast
    - `poppy` - femme
    - `prudence` - femme
- Jenny (Dioco) - https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/jenny_dioco/medium
- Southern English Female - https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/southern_english_female/low
    - `en-gb-southern_english_female-low` - femme
- Cori
    - `en_GB-cori-high` - femme, with clear, unrushed, deliberate enunciation
### Better in Mimic
See [[Mimic3#Favorite Voices]]
- CSTR VCTK Corpus
- LJSpeech
# References
- https://github.com/rhasspy/piper/discussions/328
- https://gist.github.com/alexkuz/f24f93245ff80458c9b6ec93c644c40b