NVIDIA / Tacotron2

Licence: bsd-3-clause
Tacotron 2 - PyTorch implementation with faster-than-realtime inference

Tacotron 2 (without wavenet)

PyTorch implementation of Natural TTS Synthesis By Conditioning Wavenet On Mel Spectrogram Predictions.

This implementation includes distributed and automatic mixed precision support and uses the LJSpeech dataset.

Distributed and Automatic Mixed Precision support relies on NVIDIA's Apex and AMP.

Visit our website for audio samples using our published Tacotron 2 and WaveGlow models.

Figure: alignment, predicted mel spectrogram, and target mel spectrogram.

Pre-requisites

  1. NVIDIA GPU + CUDA cuDNN

Setup

  1. Download and extract the LJ Speech dataset
  2. Clone this repo: git clone https://github.com/NVIDIA/tacotron2.git
  3. CD into this repo: cd tacotron2
  4. Initialize submodule: git submodule init; git submodule update
  5. Update .wav paths: sed -i -- 's,DUMMY,ljs_dataset_folder/wavs,g' filelists/*.txt
    • Alternatively, set load_mel_from_disk=True in hparams.py and update mel-spectrogram paths
  6. Install PyTorch 1.0
  7. Install Apex
  8. Install Python requirements or build the Docker image
    • Install Python requirements: pip install -r requirements.txt
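
For platforms where sed -i is unavailable or behaves differently, step 5 can be approximated in Python. This is a minimal sketch, not part of the repo: the function name point_filelists_at_dataset is hypothetical, and it assumes each filelist line starts with the DUMMY placeholder followed by a wav file name, as the sed command implies.

```python
from pathlib import Path


def point_filelists_at_dataset(filelist_dir, wav_dir, placeholder="DUMMY"):
    """Replace the placeholder in every *.txt filelist, in place.

    Mirrors `sed -i 's,DUMMY,<wav_dir>,g' filelists/*.txt`.
    Returns {file name: number of lines changed}.
    """
    changed = {}
    for path in sorted(Path(filelist_dir).glob("*.txt")):
        lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
        # Like sed's "g" flag, replace every occurrence on each line.
        new_lines = [line.replace(placeholder, str(wav_dir)) for line in lines]
        changed[path.name] = sum(a != b for a, b in zip(lines, new_lines))
        path.write_text("".join(new_lines), encoding="utf-8")
    return changed
```

Calling point_filelists_at_dataset("filelists", "ljs_dataset_folder/wavs") from the repo root should have the same effect as the sed command above.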

Training

  1. python train.py --output_directory=outdir --log_directory=logdir
  2. (OPTIONAL) tensorboard --logdir=outdir/logdir

Training using a pre-trained model

Training from a pre-trained model can lead to faster convergence.
By default, the dataset-dependent text embedding layers are ignored when the checkpoint is loaded.

  1. Download our published Tacotron 2 model
  2. python train.py --output_directory=outdir --log_directory=logdir -c tacotron2_statedict.pt --warm_start
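
The idea behind a warm start can be sketched as a merge of two weight dictionaries in which the ignored layers keep their fresh initialization. This is an illustration under assumptions, not the repo's actual code: the helper name warm_start_state is hypothetical, and the ignored layer name "embedding.weight" is assumed.

```python
def warm_start_state(checkpoint_state, model_state,
                     ignore_layers=("embedding.weight",)):
    """Merge pretrained weights into a model state, skipping ignored layers.

    checkpoint_state: mapping of parameter name -> pretrained value
    model_state:      mapping of parameter name -> freshly initialized value
    ignore_layers:    names that must keep their fresh values (assumed names)
    """
    kept = {name: value for name, value in checkpoint_state.items()
            if name not in ignore_layers}
    merged = dict(model_state)  # start from the fresh weights
    merged.update(kept)         # overwrite everything that was not ignored
    return merged
```

With PyTorch, model_state would typically come from model.state_dict() and the merged result would be passed to model.load_state_dict(). Where the checkpoint file stores its weights (for example under a "state_dict" key) is an assumption to verify against the downloaded file.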

Multi-GPU (distributed) and Automatic Mixed Precision Training

  1. python -m multiproc train.py --output_directory=outdir --log_directory=logdir --hparams=distributed_run=True,fp16_run=True
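
The --hparams flag takes comma-separated name=value pairs. As a rough illustration of how such a string maps onto typed settings, here is a minimal sketch. It is not the parser the repo uses; parse_overrides is a hypothetical helper, and it handles only booleans, numbers, and plain strings.

```python
def _coerce(raw):
    """Turn a raw string into bool, int, or float where possible."""
    if raw in ("True", "False"):
        return raw == "True"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw  # fall back to the plain string


def parse_overrides(spec):
    """Parse 'a=1,b=True' into {'a': 1, 'b': True}."""
    overrides = {}
    for item in (part.strip() for part in spec.split(",")):
        if not item:
            continue  # tolerate empty segments such as a trailing comma
        name, sep, raw = item.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got {item!r}")
        overrides[name.strip()] = _coerce(raw.strip())
    return overrides
```

For the command above, parse_overrides("distributed_run=True,fp16_run=True") yields both settings as Python booleans.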

Inference demo

  1. Download our published Tacotron 2 model
  2. Download our published WaveGlow model
  3. jupyter notebook --ip=127.0.0.1 --port=31337
  4. Load inference.ipynb

N.B. When performing mel-spectrogram-to-audio synthesis, make sure Tacotron 2 and the mel decoder were trained on the same mel-spectrogram representation.
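
One way to catch a mismatch early is to compare the audio settings of the two models before synthesis. The sketch below is a hedged example: mel_config_mismatches is a hypothetical helper, and the key names in MEL_KEYS are assumptions that should be checked against the configuration each model was actually trained with.

```python
# Settings that together define a mel-spectrogram representation (assumed names).
MEL_KEYS = ("sampling_rate", "filter_length", "hop_length", "win_length",
            "n_mel_channels", "mel_fmin", "mel_fmax")


def mel_config_mismatches(tacotron_cfg, vocoder_cfg, keys=MEL_KEYS):
    """Return {key: (tacotron_value, vocoder_value)} for each differing setting.

    A key missing from one config is reported with None on that side.
    """
    return {key: (tacotron_cfg.get(key), vocoder_cfg.get(key))
            for key in keys
            if tacotron_cfg.get(key) != vocoder_cfg.get(key)}
```

An empty result means the compared settings agree. Any entry in the result names a setting to reconcile before passing Tacotron 2 output to the decoder.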

Related repos

WaveGlow: faster-than-real-time flow-based generative network for speech synthesis.

nv-wavenet: faster-than-real-time WaveNet.

Acknowledgements

This implementation uses code from repos by Keith Ito and Prem Seetharaman, as described in our code.

We are inspired by Ryuichi Yamamoto's Tacotron PyTorch implementation.

We are thankful to the Tacotron 2 paper authors, especially Jonathan Shen, Yuxuan Wang and Zongheng Yang.
