See also: [[Personal Assistant]], [[Speech to Text]]
# Generators
Output quality often depends more on the voice models than on the software itself, but I still judge each project by its demos.
## Espeak
Including Espeak-NG.
- https://github.com/espeak-ng/espeak-ng/
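A rough sketch of driving it from [[Python]]: the snippet builds an `espeak-ng` command line and only runs it if the binary is on the PATH. The helper names (`build_espeak_command`, `speak`) and the default values are made up for illustration; `-v`, `-s`, and `-w` are the CLI's voice, speed (words per minute), and WAV-output options.

```python
import shutil
import subprocess
from typing import List, Optional


def build_espeak_command(text: str, voice: str = "en", wpm: int = 175,
                         wav_path: Optional[str] = None) -> List[str]:
    """Build the argument list for an espeak-ng call (hypothetical helper)."""
    cmd = ["espeak-ng", "-v", voice, "-s", str(wpm)]
    if wav_path is not None:
        cmd += ["-w", wav_path]  # write a WAV file instead of playing audio
    cmd.append(text)
    return cmd


def speak(text: str, **options) -> bool:
    """Run espeak-ng if it is installed; return False when the binary is missing."""
    if shutil.which("espeak-ng") is None:
        return False
    subprocess.run(build_espeak_command(text, **options), check=True)
    return True
```

Calling `speak("Hello from espeak", wpm=140)` should play audio on a machine with `espeak-ng` installed, and passing `wav_path="hello.wav"` should write a file instead.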
## Piper
**Quality**: Good
**Install**: Painless
See also: [[Piper TTS]]
Used by [[Personal Assistant#Rhasspy]]
## Mimic 3
**Quality**: Good
**Install**: Meh
See also: [[Mimic3]]
## Festival
**Quality**: Medium
Quality varies a lot. Some voices are loud and others quiet. Some are distorted and others are grainy. Some are pretty okay.
- https://www.cstr.ed.ac.uk/projects/festival/
- https://www.cstr.ed.ac.uk/projects/festival/onlinedemo.html
## Tortoise TTS
- https://nonint.com/static/tortoise_v2_examples.html
- https://github.com/neonbjb/tortoise-tts
Feels like a smaller project. The results seem pretty good, but the site feels kind of janky.
## RHVoice
- https://rhvoice.org/
- https://github.com/RHVoice/RHVoice
Documentation is spotty. Supposedly supports Linux, but the docs give no details. Probably needs to be compiled from the C++ source.
## CMU Flite
> CMU Flite (festival-lite) is a small, fast run-time open source text to speech synthesis engine developed at CMU and primarily designed for small embedded machines and/or large servers. Flite is designed as an alternative text to speech synthesis engine to [Festival](http://festvox.org/festival) for voices built using the [FestVox](http://festvox.org) suite of voice building tools.
- http://www.festvox.org/flite/
- https://github.com/festvox/flite
## Larynx
Predecessor to [[#Piper]].
## Mary TTS
- ! Written in Java
- https://github.com/marytts/marytts
- https://marytts.github.io/
## Coqui TTS
- https://github.com/coqui-ai/TTS
- https://docs.coqui.ai/en/latest/
I'm not really clear on what it is capable of. Their demos are weird, and there are a lot of emojis and marketing speak.
## gTTS
- ! Sends text to Google
- https://github.com/pndurette/gTTS
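A minimal sketch of what using it looks like, assuming the `gTTS` package is installed (`pip install gTTS`). The function name `synthesize_with_gtts` and the default output path are hypothetical. The `save` call is where the text leaves the machine, so this needs network access and is a poor fit for anything private.

```python
from pathlib import Path


def synthesize_with_gtts(text: str, out_path: str = "speech.mp3",
                         lang: str = "en") -> Path:
    """Send text to Google's TTS endpoint via gTTS and save the MP3 it returns."""
    if not text.strip():
        raise ValueError("text must not be empty")

    # Imported lazily so the module loads even when gTTS is not installed.
    from gtts import gTTS  # third-party package

    tts = gTTS(text=text, lang=lang)
    tts.save(out_path)  # performs the network request and writes the file
    return Path(out_path)
```

For example, `synthesize_with_gtts("Hello world")` should leave a `speech.mp3` in the working directory when the request succeeds.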
## StyleTTS2
Primarily a research project, but it seems to do a good job of reading text. It can also replicate the prosody of arbitrary speakers. Written in [[Python]].
- https://github.com/yl4579/StyleTTS2
- https://styletts2.github.io/
# Utilities
## SpeechD
- https://freebsoft.org/speechd
## Obsidian TTS
- https://github.com/joethei/obsidian-tts/issues/9
## Gruut
> A tokenizer, text cleaner, and [IPA](https://en.wikipedia.org/wiki/International_Phonetic_Alphabet) phonemizer for several human languages that supports [SSML](https://github.com/rhasspy/gruut/#ssml).
- https://github.com/rhasspy/gruut/
## PyTTSx3
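Seems to be a wrapper library around platform speech engines (SAPI5 on Windows, NSSpeechSynthesizer on macOS, eSpeak elsewhere), so it overlaps with [[#Espeak]] here.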
- https://www.geeksforgeeks.org/python-text-to-speech-by-using-pyttsx3/
- https://pypi.org/project/pyttsx3/
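A minimal sketch of the wrapper in use, assuming `pyttsx3` is installed and a backend engine is available on the system. The function name `speak` and its defaults are made up for illustration; `init`, `setProperty`, `say`, and `runAndWait` are the library's calls.

```python
def speak(text: str, rate: int = 150, volume: float = 1.0) -> None:
    """Speak text through whichever engine pyttsx3 picks for this platform."""
    if not 0.0 <= volume <= 1.0:
        raise ValueError("volume must be between 0.0 and 1.0")

    # Imported lazily so the module loads even when pyttsx3 is not installed.
    import pyttsx3  # third-party package

    engine = pyttsx3.init()               # selects the platform driver
    engine.setProperty("rate", rate)      # speaking rate in words per minute
    engine.setProperty("volume", volume)  # 0.0 (silent) to 1.0 (full)
    engine.say(text)                      # queue the utterance
    engine.runAndWait()                   # block until the queue has been spoken
```

Everything runs locally, so unlike [[#gTTS]] no text is sent to a remote service.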
# Other
## Bark
Takes prompts to generate audio. Not really TTS, more of a DALL-E/GPT-style generator for voice and audio. Results may deviate from the prompt, but they can also sound very lifelike.
- https://github.com/suno-ai/bark
# Subfolders
```dataview
LIST
FROM #foldernote
WHERE contains(file.folder, this.file.folder)
AND file != this.file
SORT file.name ASC
```
# Notes in this Folder
```dataview
LIST
FROM -#foldernote
WHERE file.folder = this.file.folder
AND database-plugin != "basic"
SORT file.name ASC
```
# References
- https://askubuntu.com/questions/53896/natural-sounding-text-to-speech