
PyTorch implementation of "Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention", based partially on the following projects:

https://github.com/Kyubyong/dc_tts (audio preprocessing)
https://github.com/r9y9/deepvoice3_pytorch (data loader sampler)

Online Text-To-Speech Demo

The following notebooks are executable on https://colab.research.google.com:

For audio samples and pretrained models, visit the above notebook links.

Training/Synthesizing English Text-To-Speech

The English TTS uses the LJ-Speech dataset.

1. Download the dataset: python dl_and_preprop_dataset.py --dataset=ljspeech
2. Train the Text2Mel model: python train-text2mel.py --dataset=ljspeech
3. Train the SSRN model: python train-ssrn.py --dataset=ljspeech
4. Synthesize sentences: python synthesize.py --dataset=ljspeech
   - The WAV files are saved in the samples folder.

Training/Synthesizing Mongolian Text-To-Speech

The Mongolian TTS uses 5 hours of audio from the Mongolian Bible.

1. Download the dataset: python dl_and_preprop_dataset.py --dataset=mbspeech
2. Train the Text2Mel model: python train-text2mel.py --dataset=mbspeech
3. Train the SSRN model: python train-ssrn.py --dataset=mbspeech
4. Synthesize sentences: python synthesize.py --dataset=mbspeech
   - The WAV files are saved in the samples folder.
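
The English and Mongolian pipelines run the same four scripts and differ only in the --dataset value. The sketch below is one possible way to build those commands in order from a single dataset name. The script names and the --dataset flag come from the steps above; the helper names pipeline_commands and run_pipeline are hypothetical and not part of this project. It also assumes each script exits on its own when it finishes, which the steps above do not state, so the default is a dry run that only prints the commands.

```python
import subprocess
import sys

# Script names and the --dataset flag are taken from the steps above.
# The helper functions below are hypothetical, not part of the project.
PIPELINE_SCRIPTS = [
    "dl_and_preprop_dataset.py",  # download and preprocess the dataset
    "train-text2mel.py",          # train the Text2Mel model
    "train-ssrn.py",              # train the SSRN model
    "synthesize.py",              # synthesize sentences (WAVs go to samples)
]
SUPPORTED_DATASETS = ("ljspeech", "mbspeech")


def pipeline_commands(dataset, python=sys.executable):
    """Return the four commands, in order, for the given dataset."""
    if dataset not in SUPPORTED_DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [[python, script, f"--dataset={dataset}"]
            for script in PIPELINE_SCRIPTS]


def run_pipeline(dataset, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in pipeline_commands(dataset):
        print(" ".join(cmd))
        if not dry_run:
            # Assumption: the script terminates by itself.
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)
```

Calling run_pipeline("mbspeech") prints the four Mongolian commands without running them; passing dry_run=False executes them one after another from the repository root.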
