A-Jacobson / Tacotron2

PyTorch implementation of Tacotron 2: https://arxiv.org/pdf/1712.05884.pdf

Projects that are alternatives to or similar to Tacotron2

Parallelwavegan
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with Pytorch
Stars: ✭ 682 (+1382.61%)
Mutual labels:  jupyter-notebook, wavenet, text-to-speech
Tf Wavenet vocoder
Wavenet and its applications with Tensorflow
Stars: ✭ 58 (+26.09%)
Mutual labels:  jupyter-notebook, wavenet
Cs224n Gpu That Talks
Attention, I'm Trying to Speak: End-to-end speech synthesis (CS224n '18)
Stars: ✭ 52 (+13.04%)
Mutual labels:  jupyter-notebook, text-to-speech
Interspeech2019 Tutorial
INTERSPEECH 2019 Tutorial Materials
Stars: ✭ 160 (+247.83%)
Mutual labels:  jupyter-notebook, text-to-speech
Tacotron 2
DeepMind's Tacotron-2 Tensorflow implementation
Stars: ✭ 1,968 (+4178.26%)
Mutual labels:  wavenet, text-to-speech
Cross Lingual Voice Cloning
Tacotron 2 - PyTorch implementation with faster-than-realtime inference modified to enable cross lingual voice cloning.
Stars: ✭ 106 (+130.43%)
Mutual labels:  jupyter-notebook, text-to-speech
Pytorch Dc Tts
Text to Speech with PyTorch (English and Mongolian)
Stars: ✭ 122 (+165.22%)
Mutual labels:  jupyter-notebook, text-to-speech
Nemo
NeMo: a toolkit for conversational AI
Stars: ✭ 3,685 (+7910.87%)
Mutual labels:  jupyter-notebook, text-to-speech
Wavegrad
Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.
Stars: ✭ 245 (+432.61%)
Mutual labels:  jupyter-notebook, text-to-speech
ttslearn
ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
Stars: ✭ 158 (+243.48%)
Mutual labels:  text-to-speech, wavenet
Tts
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Stars: ✭ 5,427 (+11697.83%)
Mutual labels:  jupyter-notebook, text-to-speech
Tts
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Stars: ✭ 305 (+563.04%)
Mutual labels:  jupyter-notebook, text-to-speech
Asrgen
Attacking Speaker Recognition with Deep Generative Models
Stars: ✭ 31 (-32.61%)
Mutual labels:  jupyter-notebook, text-to-speech
Practical Deep Learning For Coders
Material for my run of Fast.AI
Stars: ✭ 46 (+0%)
Mutual labels:  jupyter-notebook
Deeplearner
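AI精研社 (original content): Learn Python and Deep Learning from scratch. A complete learning solution for Python plus AI, machine learning, and deep learning algorithms that anyone who can use the Sogou input method and the Chrome browser can follow.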
Stars: ✭ 46 (+0%)
Mutual labels:  jupyter-notebook
Regression Lineaire Numpy
Code from my YouTube videos: https://www.youtube.com/channel/UCmpptkXu8iIFe6kfDK5o7VQ
Stars: ✭ 46 (+0%)
Mutual labels:  jupyter-notebook
Nagisa Tutorial Pycon2019
Code for PyCon JP 2019 talk "Japanese Natural Language Processing with Python: Real-World Text Analysis with Sequence Labeling"
Stars: ✭ 46 (+0%)
Mutual labels:  jupyter-notebook
Generativegraph
Implementation For the paper from DeepMind
Stars: ✭ 46 (+0%)
Mutual labels:  jupyter-notebook
Computing Density Maps
Fast computing density maps for ShanghaiTech and other datasets
Stars: ✭ 46 (+0%)
Mutual labels:  jupyter-notebook
Probabilisticdeeplearningtensorflow
Material for ODSC Europe presentation -- Probabilistic Deep Learning in TensorFlow, the why and the how
Stars: ✭ 46 (+0%)
Mutual labels:  jupyter-notebook

Tacotron2

Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions https://arxiv.org/pdf/1712.05884.pdf

WaveNet: A Generative Model for Raw Audio https://arxiv.org/abs/1609.03499

Contents

  • Simple LJ Speech DataLoader
  • Mel Spectrogram Prediction network (text to Spectrogram)
  • [TODO] WaveNet Vocoder (Spectrogram to raw audio)
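
The repository's own data loader is not reproduced here. As a rough illustration of what a simple LJ Speech loader has to do, the sketch below reads the dataset's pipe-delimited metadata.csv (clip id, raw transcription, normalized transcription) and pairs each transcription with its clip under wavs/. The names Utterance and load_ljspeech_metadata are hypothetical and chosen only for this example; audio decoding and mel spectrogram extraction are left out.

```python
import csv
from pathlib import Path
from typing import List, NamedTuple


class Utterance(NamedTuple):
    """One (audio file, transcription) pair. Hypothetical helper type."""
    wav_path: Path  # <root>/wavs/<clip_id>.wav
    text: str       # normalized transcription, or the raw one as a fallback


def load_ljspeech_metadata(root: str) -> List[Utterance]:
    """Parse <root>/metadata.csv, assuming the standard LJ Speech layout."""
    root_path = Path(root)
    items: List[Utterance] = []
    with open(root_path / "metadata.csv", encoding="utf-8", newline="") as f:
        # Fields are separated by "|". Quoting is disabled because the
        # transcriptions contain literal double-quote characters.
        reader = csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or malformed lines
            clip_id = row[0]
            # Prefer the normalized transcription (third column) if present.
            text = row[2] if len(row) > 2 and row[2] else row[1]
            wav_path = root_path / "wavs" / f"{clip_id}.wav"
            items.append(Utterance(wav_path, text))
    return items
```

Calling load_ljspeech_metadata on the extracted dataset directory returns a plain list, which could then be wrapped in a PyTorch Dataset that loads each wav file and computes its mel spectrogram on access.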

Status

  • The spectrogram network is functional but not fully trained. Training takes ~3 hours per epoch on an M6000 GPU.

Setup

  1. Install PyTorch and torchvision:
conda install pytorch -c pytorch
  2. Install other requirements:
pip install -r requirements.txt

Usage

Train the spectrogram prediction network:

python train.py

View logs in TensorBoard:

tensorboard --logdir runs


Wavenet Resources

  • https://r9y9.github.io/wavenet_vocoder/
  • https://twitter.com/heiga_zen/status/832145314559750145
  • http://musyoku.github.io/2016/09/18/wavenet-a-generative-model-for-raw-audio/
  • https://www.slideshare.net/danilosoba1/generative-model-based-texttospeech
