L0SG / Waveflow

Licence: bsd-3-clause
A PyTorch implementation of "WaveFlow: A Compact Flow-based Model for Raw Audio"

WaveFlow: A Compact Flow-based Model for Raw Audio

This is an unofficial PyTorch implementation of the WaveFlow model (Ping et al., ICML 2020).

This repo aims to provide an easy-to-use PyTorch version of WaveFlow as a drop-in alternative to the various neural vocoder models used with NVIDIA's Tacotron2 audio processing backend.

Please refer to the official implementation written in PaddlePaddle for the official results.

Setup

  1. Clone this repo and install requirements

    git clone https://github.com/L0SG/WaveFlow.git
    cd WaveFlow
    pip install -r requirements.txt
    
  2. Install Apex for mixed-precision training
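Before training, it can help to confirm that the packages are visible to the Python interpreter you will use. The snippet below is a minimal sketch using only the standard library; the helper name `has_module` is made up for this example and is not part of the repo. It assumes the packages are importable as `torch` and `apex`.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "apex" is only needed when mixed-precision training is enabled.
    for name in ("torch", "apex"):
        status = "found" if has_module(name) else "missing"
        print(f"{name}: {status}")
```

If `apex` is reported as missing, training in full precision should still be possible as long as mixed precision is left disabled in the config.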

Train your model

  1. Download the LJ Speech dataset. In this example it is stored in data/

  2. Make a list of the file names to use for training/testing.

    ls data/*.wav | tail -n+11 > train_files.txt
    ls data/*.wav | head -n10 > test_files.txt
    

    -n+11 and -n10 mean that this example reserves the first 10 audio clips for model testing and uses the rest, starting from the 11th clip, for training.
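The same split can be sketched in Python for systems without `ls`, `head`, and `tail`. This is an illustration only, not a script shipped with the repo: the function names are hypothetical, and `sorted()` orders names by code point, which may differ slightly from the locale-dependent order of `ls`.

```python
from pathlib import Path


def split_file_list(data_dir, n_test=10):
    """Sort the .wav files in data_dir; return (train, test) with the first n_test in test."""
    wavs = sorted(str(p) for p in Path(data_dir).glob("*.wav"))
    return wavs[n_test:], wavs[:n_test]


def write_list(paths, out_file):
    """Write one path per line, matching the format of the shell redirection."""
    Path(out_file).write_text("".join(p + "\n" for p in paths))


if __name__ == "__main__":
    train, test = split_file_list("data", n_test=10)
    write_list(train, "train_files.txt")
    write_list(test, "test_files.txt")
```

Because both lists come from one sorted sequence sliced at the same index, no clip can end up in both files.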

  3. Edit the configuration file and train the model.

    Below are the example commands using waveflow-h16-r64-bipartize.json

    nano configs/waveflow-h16-r64-bipartize.json
    python train.py -c configs/waveflow-h16-r64-bipartize.json
    

    Single-node multi-GPU training is automatically enabled with DataParallel (instead of DistributedDataParallel for simplicity).

    For mixed-precision training, set "fp16_run": true in the configuration file.

    You can load trained weights from a saved checkpoint by setting the checkpoint_path variable in the config file to the checkpoint's path.

    checkpoint_path accepts either an explicit checkpoint path, or the parent directory when resuming from weights averaged over multiple checkpoints.
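If you prefer to script these config changes rather than edit the file by hand, something like the following could work. It is a sketch under assumptions: where `fp16_run` and `checkpoint_path` sit inside the repo's JSON files is not shown here, so the hypothetical helper `update_config` searches nested objects for an existing key and refuses to add a key that is not already present.

```python
import json


def set_existing_key(node, key, value):
    """Set every occurrence of `key` in nested dicts; return how many were updated."""
    count = 0
    if isinstance(node, dict):
        for k in node:
            if k == key:
                node[k] = value
                count += 1
            else:
                count += set_existing_key(node[k], key, value)
    return count


def update_config(path, **overrides):
    """Rewrite the JSON file at `path` with the given keys changed in place."""
    with open(path) as f:
        config = json.load(f)
    for key, value in overrides.items():
        if set_existing_key(config, key, value) == 0:
            raise KeyError(f"{key!r} not found in {path}")
    with open(path, "w") as f:
        json.dump(config, f, indent=4)
        f.write("\n")
    return config


# Example call (paths taken from the commands in this README):
# update_config("configs/waveflow-h16-r64-bipartize.json",
#               fp16_run=True,
#               checkpoint_path="experiments/waveflow-h16-r64-bipartize/waveflow_5000")
```

Raising on an unknown key is a deliberate choice: a misspelled option would otherwise be written to the file and silently ignored by training.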

    Examples

    To resume from a specific checkpoint, insert checkpoint_path: "experiments/waveflow-h16-r64-bipartize/waveflow_5000" in the config file, then run

    python train.py -c configs/waveflow-h16-r64-bipartize.json
    

    To load weights averaged over the 10 most recent checkpoints, insert checkpoint_path: "experiments/waveflow-h16-r64-bipartize" in the config file, then run

    python train.py -a 10 -c configs/waveflow-h16-r64-bipartize.json
    

    You can reset the optimizer and training scheduler (while keeping the weights) by passing --warm_start

    python train.py --warm_start -c configs/waveflow-h16-r64-bipartize.json
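To make the idea of averaging weights over recent checkpoints concrete, here is a small sketch. It is not the repo's implementation: both function names are hypothetical, the `waveflow_<step>` file naming is assumed from the example path above, and plain floats stand in for tensors (the same arithmetic applies to any values that support `+` and `/`).

```python
import re
from pathlib import Path


def recent_checkpoints(directory, n, prefix="waveflow_"):
    """Return the n checkpoint paths with the largest step numbers, oldest first."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)$")
    found = []
    for path in Path(directory).iterdir():
        match = pattern.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    found.sort()  # sort numerically by step, not alphabetically by name
    return [path for _, path in found[-n:]] if n > 0 else []


def average_weights(state_dicts):
    """Element-wise mean of parameter dicts that all share the same keys."""
    n = len(state_dicts)
    return {k: sum(sd[k] for sd in state_dicts) / n for k in state_dicts[0]}
```

Sorting on the parsed step number matters because a plain string sort would place `waveflow_10000` before `waveflow_5000`.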
    
  4. Synthesize waveform from the trained model.

    Set checkpoint_path in the config file and pass --synthesize to train.py. The model generates waveforms by looping over the files listed in test_files.txt.

    python train.py --synthesize -c configs/waveflow-h16-r64-bipartize.json
    

    If "fp16_run": true is set, the model uses FP16 (half-precision) arithmetic, which is faster on GPUs equipped with Tensor Cores.
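Since synthesis loops over the entries in test_files.txt, a quick check that every listed file still exists can save a failed run. This is an optional sketch, not part of the repo; `missing_files` is a made-up helper and it assumes one path per line, as produced by the file-list step.

```python
from pathlib import Path


def missing_files(list_path):
    """Return the non-empty entries of a file list that do not exist on disk."""
    entries = (line.strip() for line in Path(list_path).read_text().splitlines())
    return [entry for entry in entries if entry and not Path(entry).is_file()]


if __name__ == "__main__":
    if Path("test_files.txt").is_file():
        for entry in missing_files("test_files.txt"):
            print(f"missing: {entry}")
```

Relative paths in the list are resolved against the current working directory, so run the check from the same directory you run train.py from.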

Reference

NVIDIA Tacotron2: https://github.com/NVIDIA/tacotron2

NVIDIA WaveGlow: https://github.com/NVIDIA/waveglow

r9y9 wavenet-vocoder: https://github.com/r9y9/wavenet_vocoder

FloWaveNet: https://github.com/ksw0306/FloWaveNet

Parakeet: https://github.com/PaddlePaddle/Parakeet
