soobinseo / Tacotron Pytorch
Licence: apache-2.0
Pytorch implementation of Tacotron
Stars: ✭ 189
Programming Languages
python
Projects that are alternatives of or similar to Tacotron Pytorch
Comprehensive-Tacotron2
PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single-, multi-speaker TTS and several techniques to enforce the robustness and efficiency of the model.
Stars: ✭ 22 (-88.36%)
Mutual labels: text-to-speech, tts, tacotron
TTS tf
WIP Tensorflow implementation of https://github.com/mozilla/TTS
Stars: ✭ 14 (-92.59%)
Mutual labels: text-to-speech, tts, tacotron
Tts
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Stars: ✭ 5,427 (+2771.43%)
Mutual labels: tacotron, text-to-speech, tts
Tacotron2-PyTorch
Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed.
Stars: ✭ 118 (-37.57%)
Mutual labels: text-to-speech, tts, tacotron
Tts
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Stars: ✭ 305 (+61.38%)
Mutual labels: tacotron, text-to-speech, tts
Spokestack Python
Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application.
Stars: ✭ 103 (-45.5%)
Mutual labels: text-to-speech, tts
Tacotron Pytorch
A Pytorch Implementation of Tacotron: End-to-end Text-to-speech Deep-Learning Model
Stars: ✭ 104 (-44.97%)
Mutual labels: tacotron, text-to-speech
Tensorflowtts
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)
Stars: ✭ 2,382 (+1160.32%)
Mutual labels: text-to-speech, tts
Crystal
Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.
Stars: ✭ 108 (-42.86%)
Mutual labels: text-to-speech, tts
Durian
Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.
Stars: ✭ 111 (-41.27%)
Mutual labels: text-to-speech, tts
Aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Stars: ✭ 1,942 (+927.51%)
Mutual labels: text-to-speech, tts
Zerospeech Tts Without T
A Pytorch implementation for the ZeroSpeech 2019 challenge.
Stars: ✭ 100 (-47.09%)
Mutual labels: text-to-speech, tts
Gst Tacotron
A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Stars: ✭ 175 (-7.41%)
Mutual labels: tacotron, tts
Joytan
Creative Audio/Textbook Maker 🎵 📖 See our YouTube channel
Stars: ✭ 91 (-51.85%)
Mutual labels: text-to-speech, tts
Gtts
Python library and CLI tool to interface with Google Translate's text-to-speech API
Stars: ✭ 1,303 (+589.42%)
Mutual labels: text-to-speech, tts
Marytts
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure java
Stars: ✭ 1,699 (+798.94%)
Mutual labels: text-to-speech, tts
Tacotron 2
DeepMind's Tacotron-2 Tensorflow implementation
Stars: ✭ 1,968 (+941.27%)
Mutual labels: tacotron, text-to-speech
Tacotron-pytorch
A pytorch implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model.
Requirements
- Install Python 3
- Install PyTorch == 0.2.0
- Install requirements:
pip install -r requirements.txt
Data
I used the LJSpeech dataset, which consists of pairs of text transcripts and wav files. The complete dataset (13,100 pairs) can be downloaded here. I referred to https://github.com/keithito/tacotron for the preprocessing code.
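As a rough illustration of what "pairs of text and wav files" looks like on disk, the sketch below reads the pairs from an extracted LJSpeech directory. It assumes the standard LJSpeech layout (a pipe-separated `metadata.csv` with id, transcript and normalized transcript, plus a `wavs/` folder). The function name `load_ljspeech_metadata` is hypothetical and is not taken from this repository's data.py.

```python
import csv
from pathlib import Path


def load_ljspeech_metadata(data_path):
    """Return a list of (wav_path, text) pairs read from metadata.csv.

    Assumes the usual LJSpeech layout: <data_path>/metadata.csv and
    <data_path>/wavs/<id>.wav. This is an illustrative helper, not the
    loader used by this repository.
    """
    root = Path(data_path)
    pairs = []
    with open(root / "metadata.csv", encoding="utf-8", newline="") as f:
        # Fields are separated by '|'. Quoting is disabled because the
        # transcripts themselves contain double-quote characters.
        reader = csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or malformed lines
            file_id = row[0]
            # Prefer the normalized transcript (third field) when it is present.
            text = row[2] if len(row) > 2 and row[2] else row[1]
            pairs.append((root / "wavs" / f"{file_id}.wav", text))
    return pairs
```

Calling `load_ljspeech_metadata("/path/to/LJSpeech-1.1")` on the full dataset should yield one pair per utterance.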
File description
- hyperparams.py: includes all the hyperparameters that are needed.
- data.py: loads the training data and preprocesses text into indices and wav files into spectrograms. The preprocessing code for text is in the text/ directory.
- module.py: contains all the modules, including CBHG, highway, prenet, and so on.
- network.py: contains the networks, including the encoder, decoder and post-processing network.
- train.py: is for training.
- synthesis.py: is for generating TTS samples.
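To make the "text to index" step concrete, here is a minimal sketch of character-level encoding with padding. The symbol table, the `PAD`/`EOS` markers and the function names are assumptions chosen for illustration. The actual code in text/ (adapted from keithito/tacotron) uses its own symbol set and text cleaners.

```python
# Hypothetical symbol table: padding first (id 0), end-of-sequence second,
# then the characters the model is allowed to see.
PAD = "_"
EOS = "~"
SYMBOLS = [PAD, EOS] + list("abcdefghijklmnopqrstuvwxyz !',.?-")
SYMBOL_TO_ID = {symbol: index for index, symbol in enumerate(SYMBOLS)}


def text_to_sequence(text):
    """Lowercase the text, drop unknown characters and append the EOS id."""
    ids = [
        SYMBOL_TO_ID[ch]
        for ch in text.lower()
        if ch in SYMBOL_TO_ID and ch not in (PAD, EOS)
    ]
    ids.append(SYMBOL_TO_ID[EOS])
    return ids


def pad_batch(sequences):
    """Right-pad every sequence with the PAD id to the longest length."""
    max_len = max(len(seq) for seq in sequences)
    pad_id = SYMBOL_TO_ID[PAD]
    return [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
```

Padding to a common length is what lets sentences of different lengths be stacked into one batch tensor before they are fed to the encoder.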
Training the network
- STEP 1. Download and extract the LJSpeech data to any directory you want.
- STEP 2. Adjust the hyperparameters in hyperparams.py, especially 'data_path', which is the directory where you extracted the files, and the others if necessary.
- STEP 3. Run train.py.
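Before starting a long training run, it can help to confirm that 'data_path' really points at an extracted dataset. The sketch below is one way to do that. It assumes the standard LJSpeech layout (`metadata.csv` and a `wavs/` folder), and `check_data_path` is a hypothetical helper that is not part of this repository.

```python
from pathlib import Path


def check_data_path(data_path):
    """Return a list of problems found under data_path (empty if none).

    Assumes the standard LJSpeech layout; illustrative helper only.
    """
    root = Path(data_path)
    if not root.is_dir():
        return [f"{root} is not a directory"]

    problems = []
    if not (root / "metadata.csv").is_file():
        problems.append("metadata.csv is missing")

    wav_dir = root / "wavs"
    if not wav_dir.is_dir():
        problems.append("wavs/ directory is missing")
    elif not any(wav_dir.glob("*.wav")):
        problems.append("wavs/ contains no .wav files")
    return problems
```

An empty list means the directory looks usable. Anything else names what to fix before running train.py.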
Generate TTS wav file
- STEP 1. Run synthesis.py. Make sure the restore step points to a checkpoint that exists.
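One way to check the restore step before synthesis is to list which steps actually have a saved checkpoint. The sketch below assumes checkpoints are named `checkpoint_<step>.pth.tar`. That naming scheme, the directory argument and both function names are assumptions, so adjust the pattern to whatever train.py writes.

```python
import re
from pathlib import Path

# Assumed naming scheme; change this to match the files train.py produces.
CHECKPOINT_PATTERN = re.compile(r"checkpoint_(\d+)\.pth\.tar")


def available_steps(checkpoint_dir):
    """Return the sorted training steps that have a checkpoint file."""
    steps = []
    for path in Path(checkpoint_dir).iterdir():
        match = CHECKPOINT_PATTERN.fullmatch(path.name)
        if match:
            steps.append(int(match.group(1)))
    return sorted(steps)


def resolve_restore_step(checkpoint_dir, restore_step=None):
    """Validate restore_step, or pick the latest step when it is None."""
    steps = available_steps(checkpoint_dir)
    if not steps:
        raise FileNotFoundError(f"no checkpoints found in {checkpoint_dir}")
    if restore_step is None:
        return steps[-1]
    if restore_step not in steps:
        raise ValueError(
            f"no checkpoint for step {restore_step}; available steps: {steps}"
        )
    return restore_step
```

Failing early with the list of available steps is usually easier to debug than a file-not-found error raised in the middle of model loading.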
Samples
- You can check the generated samples in the 'samples/' directory. The model was trained for only 60K steps, so the quality is not good yet.
Reference
- Keith Ito: https://github.com/keithito/tacotron
Comments
- Any comments on the code are always welcome.