
r9y9 / Tacotron_pytorch

Licence: other
PyTorch implementation of Tacotron speech synthesis model.


Projects that are alternatives to, or similar to, Tacotron_pytorch

Wavegrad
Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.
Stars: ✭ 245 (+1.24%)
Mutual labels:  jupyter-notebook, speech, speech-synthesis
Tts
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Stars: ✭ 5,427 (+2142.56%)
Mutual labels:  jupyter-notebook, speech, tacotron
Tts
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Stars: ✭ 305 (+26.03%)
Mutual labels:  jupyter-notebook, speech, tacotron
Xva Synth
Machine learning based speech synthesis Electron app, with voices from specific characters from video games
Stars: ✭ 136 (-43.8%)
Mutual labels:  speech-synthesis, tacotron
Wavernn
WaveRNN Vocoder + TTS
Stars: ✭ 1,636 (+576.03%)
Mutual labels:  speech-synthesis, tacotron
Durian
Implementation of "Duration Informed Attention Network for Multimodal Synthesis" (https://arxiv.org/pdf/1909.01700.pdf) paper.
Stars: ✭ 111 (-54.13%)
Mutual labels:  speech, speech-synthesis
Cs224n Gpu That Talks
Attention, I'm Trying to Speak: End-to-end speech synthesis (CS224n '18)
Stars: ✭ 52 (-78.51%)
Mutual labels:  jupyter-notebook, speech-synthesis
Wavenet vocoder
WaveNet vocoder
Stars: ✭ 1,926 (+695.87%)
Mutual labels:  speech, speech-synthesis
Diffwave
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
Stars: ✭ 139 (-42.56%)
Mutual labels:  speech, speech-synthesis
Tacotron 2
DeepMind's Tacotron-2 Tensorflow implementation
Stars: ✭ 1,968 (+713.22%)
Mutual labels:  speech-synthesis, tacotron
Lingvo
Lingvo
Stars: ✭ 2,361 (+875.62%)
Mutual labels:  speech, speech-synthesis
Tacotron Pytorch
A Pytorch Implementation of Tacotron: End-to-end Text-to-speech Deep-Learning Model
Stars: ✭ 104 (-57.02%)
Mutual labels:  speech-synthesis, tacotron
Waveflow
A PyTorch implementation of "WaveFlow: A Compact Flow-based Model for Raw Audio"
Stars: ✭ 95 (-60.74%)
Mutual labels:  jupyter-notebook, speech-synthesis
Pytorch Dc Tts
Text to Speech with PyTorch (English and Mongolian)
Stars: ✭ 122 (-49.59%)
Mutual labels:  jupyter-notebook, speech-synthesis
Tf Wavenet vocoder
Wavenet and its applications with Tensorflow
Stars: ✭ 58 (-76.03%)
Mutual labels:  jupyter-notebook, speech-synthesis
Wavegrad
A fast, high-quality neural vocoder.
Stars: ✭ 138 (-42.98%)
Mutual labels:  speech, speech-synthesis
Expressive tacotron
Tensorflow Implementation of Expressive Tacotron
Stars: ✭ 192 (-20.66%)
Mutual labels:  speech-synthesis, tacotron
Neural Voice Cloning With Few Samples
Implementation of Neural Voice Cloning with Few Samples Research Paper by Baidu
Stars: ✭ 211 (-12.81%)
Mutual labels:  speech, speech-synthesis
Source separation
Deep learning based speech source separation using Pytorch
Stars: ✭ 226 (-6.61%)
Mutual labels:  jupyter-notebook, speech
Wsay
Windows "say"
Stars: ✭ 36 (-85.12%)
Mutual labels:  speech, speech-synthesis

tacotron_pytorch

PyTorch implementation of Tacotron speech synthesis model.

Inspired by keithito/tacotron. The speech quality is currently not as good as what keithito/tacotron can generate, but the model seems to be basically working. Some generated speech examples, trained on the LJ Speech Dataset, are available here.

If you are comfortable working with TensorFlow, I'd recommend trying https://github.com/keithito/tacotron instead. The reason for rewriting it in PyTorch is that PyTorch is easier to debug and extend (multi-speaker architecture, etc.), at least for me.

Requirements

  • PyTorch
  • TensorFlow (needed only to run the training script; this dependency could be made optional, but it is required for now)

Installation

git clone --recursive https://github.com/r9y9/tacotron_pytorch
cd tacotron_pytorch
pip install -e . # or python setup.py develop

If you want to run the training script, you also need to install additional dependencies:

pip install -e ".[train]"

Training

The package relies on keithito/tacotron for text processing, audio preprocessing and audio reconstruction (added as a submodule). Please follow the quick start section at https://github.com/keithito/tacotron and prepare your dataset accordingly.
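
Before starting a long training run, it can help to confirm that the prepared dataset looks complete. The following is a minimal sketch, not part of this package. It assumes that preprocessing wrote a metadata file named train.txt into the training directory, with one line per utterance of the form "spectrogram file|mel file|number of frames|text". Both the file name and the field layout are assumptions about the preprocessing output, and the helper name summarize_training_dir is hypothetical.

```python
import os


def summarize_training_dir(data_dir="~/tacotron/training", metadata_name="train.txt"):
    """Return (num_utterances, total_frames) for a preprocessed dataset.

    Assumes each metadata line looks like "spec.npy|mel.npy|n_frames|text".
    This layout is an assumption for illustration; adjust it if your
    preprocessing output differs.
    """
    data_dir = os.path.expanduser(data_dir)
    metadata_path = os.path.join(data_dir, metadata_name)
    if not os.path.isfile(metadata_path):
        raise FileNotFoundError("metadata file not found: " + metadata_path)

    num_utterances = 0
    total_frames = 0
    with open(metadata_path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            # Split at most 3 times so that "|" inside the text is preserved.
            fields = line.split("|", 3)
            if len(fields) != 4:
                raise ValueError(
                    "line {}: expected 4 fields, got {}".format(line_number, len(fields))
                )
            spec_file, mel_file, n_frames, _text = fields
            # Every feature file named in the metadata should exist on disk.
            for name in (spec_file, mel_file):
                if not os.path.isfile(os.path.join(data_dir, name)):
                    raise FileNotFoundError(
                        "line {}: missing {}".format(line_number, name)
                    )
            num_utterances += 1
            total_frames += int(n_frames)
    return num_utterances, total_frames
```

Calling summarize_training_dir() with no arguments checks the default location and raises an error that names the first missing file or malformed line.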

Once your data is prepared, and assuming it is in "~/tacotron/training" (the default), you can train your model with:

python train.py

Alignments, predicted spectrograms, target spectrograms, predicted waveforms and checkpoints (model and optimizer states) are saved every 1000 global steps in the checkpoints directory. Training progress can be monitored with:

tensorboard --logdir=log

Testing model

Open the notebook in the notebooks directory and set checkpoint_path to the path of your model checkpoint.
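
Since checkpoints accumulate in the checkpoints directory during training, a small helper can pick the most recent one to use as checkpoint_path. This is only a sketch: the file name pattern "checkpoint_step*.pth" is an assumption for illustration and the helper name find_latest_checkpoint is hypothetical, so adjust both to match the files actually written on your machine. Step numbers are compared as integers, because a plain string sort would rank step 9000 above step 10000.

```python
import glob
import os
import re


def find_latest_checkpoint(checkpoint_dir="checkpoints", pattern="checkpoint_step*.pth"):
    """Return the checkpoint path with the highest step number, or None.

    The file name pattern is an assumption for illustration; change it to
    match the files present in your checkpoints directory.
    """
    best_step, best_path = -1, None
    for path in glob.glob(os.path.join(checkpoint_dir, pattern)):
        # Take the first run of digits in the file name as the step number.
        match = re.search(r"(\d+)", os.path.basename(path))
        if match is None:
            continue
        step = int(match.group(1))
        if step > best_step:
            best_step, best_path = step, path
    return best_path
```

In the notebook, the result could then be assigned with checkpoint_path = find_latest_checkpoint("checkpoints"), after checking that it is not None.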
