
soobinseo / Tacotron Pytorch

Licence: apache-2.0
PyTorch implementation of Tacotron

Programming Languages

python
Projects that are alternatives of or similar to Tacotron Pytorch

Comprehensive-Tacotron2
PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single-, multi-speaker TTS and several techniques to enforce the robustness and efficiency of the model.
Stars: ✭ 22 (-88.36%)
Mutual labels:  text-to-speech, tts, tacotron
Wavernn
WaveRNN Vocoder + TTS
Stars: ✭ 1,636 (+765.61%)
Mutual labels:  tacotron, text-to-speech, tts
TTS tf
WIP Tensorflow implementation of https://github.com/mozilla/TTS
Stars: ✭ 14 (-92.59%)
Mutual labels:  text-to-speech, tts, tacotron
Tts
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Stars: ✭ 5,427 (+2771.43%)
Mutual labels:  tacotron, text-to-speech, tts
Tacotron2-PyTorch
Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed.
Stars: ✭ 118 (-37.57%)
Mutual labels:  text-to-speech, tts, tacotron
Tts
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Stars: ✭ 305 (+61.38%)
Mutual labels:  tacotron, text-to-speech, tts
Spokestack Python
Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application.
Stars: ✭ 103 (-45.5%)
Mutual labels:  text-to-speech, tts
Tacotron Pytorch
A Pytorch Implementation of Tacotron: End-to-end Text-to-speech Deep-Learning Model
Stars: ✭ 104 (-44.97%)
Mutual labels:  tacotron, text-to-speech
Tensorflowtts
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)
Stars: ✭ 2,382 (+1160.32%)
Mutual labels:  text-to-speech, tts
Crystal
Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.
Stars: ✭ 108 (-42.86%)
Mutual labels:  text-to-speech, tts
Durian
Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.
Stars: ✭ 111 (-41.27%)
Mutual labels:  text-to-speech, tts
Tts
Text-to-Speech for Arduino
Stars: ✭ 118 (-37.57%)
Mutual labels:  text-to-speech, tts
Aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Stars: ✭ 1,942 (+927.51%)
Mutual labels:  text-to-speech, tts
Zerospeech Tts Without T
A Pytorch implementation for the ZeroSpeech 2019 challenge.
Stars: ✭ 100 (-47.09%)
Mutual labels:  text-to-speech, tts
Gst Tacotron
A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Stars: ✭ 175 (-7.41%)
Mutual labels:  tacotron, tts
Joytan
Creative Audio/Textbook Maker 🎵 📖 See our YouTube channel
Stars: ✭ 91 (-51.85%)
Mutual labels:  text-to-speech, tts
Gtts
Python library and CLI tool to interface with Google Translate's text-to-speech API
Stars: ✭ 1,303 (+589.42%)
Mutual labels:  text-to-speech, tts
Talkify
Javascript Text to speech library
Stars: ✭ 132 (-30.16%)
Mutual labels:  text-to-speech, tts
Marytts
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure java
Stars: ✭ 1,699 (+798.94%)
Mutual labels:  text-to-speech, tts
Tacotron 2
DeepMind's Tacotron-2 Tensorflow implementation
Stars: ✭ 1,968 (+941.27%)
Mutual labels:  tacotron, text-to-speech

Tacotron-pytorch

A PyTorch implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model.

Requirements

  • Install Python 3
  • Install PyTorch == 0.2.0
  • Install the remaining requirements:
    pip install -r requirements.txt

Data

I used the LJSpeech dataset, which consists of pairs of text transcripts and wav files. The complete dataset (13,100 pairs) can be downloaded here. I referred to https://github.com/keithito/tacotron for the preprocessing code.
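
As a rough illustration of what "pairs of text and wav files" means in practice, the sketch below reads the dataset index into (wav path, text) pairs. It assumes the standard LJSpeech-1.1 layout (a pipe-separated metadata.csv next to a wavs/ directory); the function name load_ljspeech_pairs is hypothetical and is not taken from this repository's data.py.

```python
import csv
from pathlib import Path


def load_ljspeech_pairs(data_path):
    """Return (wav_path, text) pairs listed in <data_path>/metadata.csv."""
    root = Path(data_path)
    pairs = []
    with open(root / "metadata.csv", encoding="utf-8", newline="") as f:
        # Fields are pipe-separated; quoting is disabled because transcripts
        # may contain literal double quotes.
        for row in csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            file_id = row[0]
            # Prefer the normalized transcript (third field) when it is present.
            text = row[2] if len(row) > 2 and row[2] else row[1]
            pairs.append((root / "wavs" / f"{file_id}.wav", text))
    return pairs
```

Calling load_ljspeech_pairs on the extracted dataset directory should give one entry per line of metadata.csv.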

File description

  • hyperparams.py contains all the hyperparameters.
  • data.py loads the training data and preprocesses text into index sequences and wav files into spectrograms. The text preprocessing code is in the text/ directory.
  • module.py contains the building blocks, including CBHG, highway, prenet, and so on.
  • network.py contains the networks: encoder, decoder, and post-processing network.
  • train.py trains the model.
  • synthesis.py generates TTS samples.
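
To make the "text to index" step described for data.py more concrete, here is a minimal sketch of character-level encoding with padding. The symbol table, the PAD and EOS characters, and both function names are assumptions chosen for illustration; the actual code in the text/ directory may use a different symbol set and cleaning rules.

```python
# Hypothetical symbol table: padding, end-of-sequence, then plain characters.
PAD, EOS = "_", "~"
SYMBOLS = [PAD, EOS] + list("abcdefghijklmnopqrstuvwxyz !',.?-")
SYMBOL_TO_ID = {s: i for i, s in enumerate(SYMBOLS)}


def text_to_sequence(text):
    """Lowercase the text, drop unknown characters, and append the EOS id."""
    ids = [
        SYMBOL_TO_ID[ch]
        for ch in text.lower()
        if ch in SYMBOL_TO_ID and ch not in (PAD, EOS)
    ]
    ids.append(SYMBOL_TO_ID[EOS])
    return ids


def pad_batch(sequences):
    """Right-pad every sequence with the PAD id up to the longest length."""
    max_len = max(len(seq) for seq in sequences)
    pad_id = SYMBOL_TO_ID[PAD]
    return [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
```

Padding to a common length is what lets variable-length sentences be stacked into a single batch tensor before they reach the encoder.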

Training the network

  • STEP 1. Download and extract the LJSpeech data into any directory you want.
  • STEP 2. Adjust the hyperparameters in hyperparams.py, especially 'data_path', which is the directory where you extracted the files, and the others if necessary.
  • STEP 3. Run train.py.
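
A wrong 'data_path' is an easy mistake to make in STEP 2, so a quick check before starting a long training run can help. The sketch below is not part of this repository: check_data_path is a hypothetical helper, the data_path value is a placeholder, and the expected layout (metadata.csv plus a wavs/ directory) is the standard LJSpeech-1.1 one.

```python
from pathlib import Path

# Hypothetical stand-in for the 'data_path' entry in hyperparams.py.
data_path = "./data/LJSpeech-1.1"


def check_data_path(path):
    """Return a list of problems found under `path`; an empty list means it looks usable."""
    root = Path(path)
    problems = []
    if not (root / "metadata.csv").is_file():
        problems.append(f"missing {root / 'metadata.csv'}")
    wav_dir = root / "wavs"
    if not wav_dir.is_dir():
        problems.append(f"missing directory {wav_dir}")
    elif not any(wav_dir.glob("*.wav")):
        problems.append(f"no .wav files in {wav_dir}")
    return problems


if __name__ == "__main__":
    for problem in check_data_path(data_path):
        print("data_path problem:", problem)
```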

Generate TTS wav file

  • STEP 1. Run synthesis.py. Make sure the restore step matches a checkpoint saved during training.
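
One way to avoid pointing synthesis at a step that was never saved is to list the checkpoints on disk first. The sketch below assumes checkpoint files are named like checkpoint_<step>.pth.tar; that pattern, and the helper names, are assumptions for illustration, so adjust the regular expression to whatever train.py actually writes.

```python
import re
from pathlib import Path

# Assumed file-name pattern; adjust it to match how train.py names its checkpoints.
CHECKPOINT_PATTERN = re.compile(r"checkpoint_(\d+)\.pth\.tar$")


def available_steps(checkpoint_dir):
    """Return the sorted training steps that have a checkpoint file in the directory."""
    steps = []
    for path in Path(checkpoint_dir).iterdir():
        match = CHECKPOINT_PATTERN.search(path.name)
        if match:
            steps.append(int(match.group(1)))
    return sorted(steps)


def resolve_restore_step(checkpoint_dir, restore_step=None):
    """Return `restore_step` if it exists on disk, or the latest step when it is None."""
    steps = available_steps(checkpoint_dir)
    if not steps:
        raise FileNotFoundError(f"no checkpoints found in {checkpoint_dir}")
    if restore_step is None:
        return steps[-1]
    if restore_step not in steps:
        raise ValueError(f"no checkpoint for step {restore_step}; available: {steps}")
    return restore_step
```

The returned step can then be used as the restore step when running synthesis.py.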

Samples

  • You can check the generated samples in the 'samples/' directory. Training ran for only 60K steps, so the quality is not good yet.

Comments

  • Any comments on the code are always welcome.