tugstugi / Pytorch Dc Tts

Licence: MIT
Text to Speech with PyTorch (English and Mongolian)

Programming Languages

python

Projects that are alternatives to or similar to Pytorch Dc Tts

Parallelwavegan
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with Pytorch
Stars: ✭ 682 (+459.02%)
Mutual labels:  jupyter-notebook, speech-synthesis, text-to-speech, tts
Cs224n Gpu That Talks
Attention, I'm Trying to Speak: End-to-end speech synthesis (CS224n '18)
Stars: ✭ 52 (-57.38%)
Mutual labels:  jupyter-notebook, speech-synthesis, text-to-speech, tts
Wavegrad
Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.
Stars: ✭ 245 (+100.82%)
Mutual labels:  jupyter-notebook, speech-synthesis, text-to-speech, tts
Wavernn
WaveRNN Vocoder + TTS
Stars: ✭ 1,636 (+1240.98%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Cognitive Speech Tts
Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.
Stars: ✭ 312 (+155.74%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Hifi Gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Stars: ✭ 325 (+166.39%)
Mutual labels:  speech-synthesis, text-to-speech, tts
esp32-flite
Speech synthesis running on ESP32 based on Flite engine.
Stars: ✭ 28 (-77.05%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Crystal
Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.
Stars: ✭ 108 (-11.48%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Multilingual text to speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
Stars: ✭ 324 (+165.57%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Voice Builder
An opensource text-to-speech (TTS) voice building tool
Stars: ✭ 362 (+196.72%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Jsut Lab
HTS-style full-context labels for JSUT v1.1
Stars: ✭ 28 (-77.05%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Durian
Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.
Stars: ✭ 111 (-9.02%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Glow Tts
A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Stars: ✭ 284 (+132.79%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Parakeet
PAddle PARAllel text-to-speech toolKIT (supporting WaveFlow, WaveNet, Transformer TTS and Tacotron2)
Stars: ✭ 279 (+128.69%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Spokestack Python
Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application.
Stars: ✭ 103 (-15.57%)
Mutual labels:  speech-synthesis, text-to-speech, tts
Comprehensive-Tacotron2
PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single-, multi-speaker TTS and several techniques to enforce the robustness and efficiency of the model.
Stars: ✭ 22 (-81.97%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Tts
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Stars: ✭ 5,427 (+4348.36%)
Mutual labels:  jupyter-notebook, text-to-speech, tts
Fre-GAN-pytorch
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
Stars: ✭ 73 (-40.16%)
Mutual labels:  text-to-speech, tts, speech-synthesis
editts
Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech
Stars: ✭ 74 (-39.34%)
Mutual labels:  text-to-speech, tts, speech-synthesis
Tts
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Stars: ✭ 305 (+150%)
Mutual labels:  jupyter-notebook, text-to-speech, tts

PyTorch implementation of "Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention", based partially on the following projects:

Online Text-To-Speech Demo

The following notebooks are executable on https://colab.research.google.com :

For audio samples and pretrained models, visit the above notebook links.

Training/Synthesizing English Text-To-Speech

The English TTS uses the LJ-Speech dataset.

  1. Download the dataset: python dl_and_preprop_dataset.py --dataset=ljspeech
  2. Train the Text2Mel model: python train-text2mel.py --dataset=ljspeech
  3. Train the SSRN model: python train-ssrn.py --dataset=ljspeech
  4. Synthesize sentences: python synthesize.py --dataset=ljspeech
    • The WAV files are saved in the samples folder.
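
The four steps above can be chained from a single script. The following is a minimal sketch, not part of the project: build_commands and run_pipeline are hypothetical helper names, and only the script names and the --dataset flag come from the steps listed above. It assumes each script exits on its own when it finishes; if a training script runs until interrupted, run the steps by hand instead.

```python
import subprocess
import sys

# Script names and the --dataset flag are taken from the steps above.
SCRIPTS = [
    "dl_and_preprop_dataset.py",  # 1. download and preprocess the dataset
    "train-text2mel.py",          # 2. train the Text2Mel model
    "train-ssrn.py",              # 3. train the SSRN model
    "synthesize.py",              # 4. synthesize sentences
]


def build_commands(dataset, python=sys.executable):
    """Return one argv list per step, in order (hypothetical helper)."""
    return [[python, script, f"--dataset={dataset}"] for script in SCRIPTS]


def run_pipeline(dataset, dry_run=True):
    """Print each command; also execute it when dry_run is False (hypothetical helper)."""
    for cmd in build_commands(dataset):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: only prints the commands that would be executed.
    run_pipeline("ljspeech", dry_run=True)
```

Calling run_pipeline("ljspeech", dry_run=False) from the repository root would execute the steps in order. The dry-run default is a deliberate choice, since training can take a long time.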

Training/Synthesizing Mongolian Text-To-Speech

The Mongolian text-to-speech uses 5 hours of audio from the Mongolian Bible.

  1. Download the dataset: python dl_and_preprop_dataset.py --dataset=mbspeech
  2. Train the Text2Mel model: python train-text2mel.py --dataset=mbspeech
  3. Train the SSRN model: python train-ssrn.py --dataset=mbspeech
  4. Synthesize sentences: python synthesize.py --dataset=mbspeech
    • The WAV files are saved in the samples folder.
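
Both pipelines write their output to the samples folder, so a quick check of what was produced can be useful. The sketch below lists each WAV file with its duration using only the Python standard library. The helper names are hypothetical, and it assumes the files are plain PCM WAV, which is the only encoding the wave module reads.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds (hypothetical helper)."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def list_samples(folder="samples"):
    """Map each .wav file name in the folder to its duration in seconds."""
    return {
        path.name: wav_duration_seconds(path)
        for path in sorted(Path(folder).glob("*.wav"))
    }


if __name__ == "__main__":
    for name, seconds in list_samples().items():
        print(f"{name}: {seconds:.2f} s")
```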