ksw0306 / Flowavenet

License: MIT
A PyTorch implementation of "FloWaveNet: A Generative Flow for Raw Audio"

Projects that are alternatives of or similar to Flowavenet

Tacotron 2
DeepMind's Tacotron-2 Tensorflow implementation
Stars: ✭ 1,968 (+317.83%)
Mutual labels:  wavenet
Music-Style-Transfer
Source code for "Transferring the Style of Homophonic Music Using Recurrent Neural Networks and Autoregressive Model"
Stars: ✭ 16 (-96.6%)
Mutual labels:  wavenet
hifigan-denoiser
HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Stars: ✭ 88 (-81.32%)
Mutual labels:  wavenet
Source Separation Wavenet
A neural network for end-to-end music source separation
Stars: ✭ 185 (-60.72%)
Mutual labels:  wavenet
birdsong-generation-project
Generating birdsong with WaveNet
Stars: ✭ 26 (-94.48%)
Mutual labels:  wavenet
constant-memory-waveglow
PyTorch implementation of NVIDIA WaveGlow with constant memory cost.
Stars: ✭ 36 (-92.36%)
Mutual labels:  wavenet
Fast Wavenet
Speedy Wavenet generation using dynamic programming ⚡
Stars: ✭ 1,705 (+262%)
Mutual labels:  wavenet
Time Series Prediction
A collection of time series prediction methods: rnn, seq2seq, cnn, wavenet, transformer, unet, n-beats, gan, kalman-filter
Stars: ✭ 351 (-25.48%)
Mutual labels:  wavenet
wavenet
Audio source separation (mixture to vocal) using the Wavenet
Stars: ✭ 20 (-95.75%)
Mutual labels:  wavenet
ttslearn
ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)
Stars: ✭ 158 (-66.45%)
Mutual labels:  wavenet
Seriesnet
Time series prediction using dilated causal convolutional neural nets (temporal CNN)
Stars: ✭ 185 (-60.72%)
Mutual labels:  wavenet
wavenet-like-vocoder
Basic wavenet and fftnet vocoder model.
Stars: ✭ 20 (-95.75%)
Mutual labels:  wavenet
chainer-ClariNet
A Chainer implementation of ClariNet.
Stars: ✭ 45 (-90.45%)
Mutual labels:  wavenet
Deep Time Series Prediction
Seq2Seq, Bert, Transformer, WaveNet for time series prediction.
Stars: ✭ 183 (-61.15%)
Mutual labels:  wavenet
Pytorchwavenetvocoder
WaveNet-Vocoder implementation with pytorch.
Stars: ✭ 269 (-42.89%)
Mutual labels:  wavenet
Wavenet vocoder
WaveNet vocoder
Stars: ✭ 1,926 (+308.92%)
Mutual labels:  wavenet
QPPWG
Quasi-Periodic Parallel WaveGAN Pytorch implementation
Stars: ✭ 41 (-91.3%)
Mutual labels:  wavenet
Pycadl
Python package with source code from the course "Creative Applications of Deep Learning w/ TensorFlow"
Stars: ✭ 356 (-24.42%)
Mutual labels:  wavenet
Clarinet
A Pytorch Implementation of ClariNet
Stars: ✭ 273 (-42.04%)
Mutual labels:  wavenet
chainer-Fast-WaveNet
A Chainer implementation of Fast WaveNet(mel-spectrogram vocoder).
Stars: ✭ 33 (-92.99%)
Mutual labels:  wavenet

FloWaveNet : A Generative Flow for Raw Audio

This is a PyTorch implementation of our work "FloWaveNet : A Generative Flow for Raw Audio". (We'll update soon.)

To enable parallel sampling, we propose FloWaveNet, a flow-based generative model for raw audio synthesis. FloWaveNet can generate audio samples as fast as ClariNet and Parallel WaveNet, while its training procedure is simple and stable because it uses a single-stage pipeline. Our generated audio samples are available at https://ksw0306.github.io/flowavenet-demo/. Our implementation of ClariNet (Gaussian WaveNet and Gaussian IAF) is also available at https://github.com/ksw0306/ClariNet
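As a rough illustration of why a flow-based model allows parallel sampling, the sketch below implements one affine coupling layer in plain Python. It is not the repository's code: `toy_conditioner` is a hypothetical stand-in for the WaveNet that predicts the scale and shift, and the function names and even-length input are assumptions made for the example. Half of the input passes through unchanged and parameterizes an invertible affine transform of the other half, so both directions cost a single pass.

```python
import math


def toy_conditioner(x_a):
    """Hypothetical stand-in for the network that predicts (log_s, t) from x_a."""
    log_s = [0.5 * math.tanh(v) for v in x_a]
    t = [0.1 * v for v in x_a]
    return log_s, t


def coupling_forward(x):
    """Training direction (data -> latent); also returns log|det Jacobian|."""
    if len(x) % 2:
        raise ValueError("this sketch assumes an even-length input")
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    log_s, t = toy_conditioner(x_a)
    # Only x_b is transformed; x_a is copied through unchanged.
    z_b = [(b - ti) * math.exp(-ls) for b, ti, ls in zip(x_b, t, log_s)]
    log_det = -sum(log_s)
    return x_a + z_b, log_det


def coupling_inverse(z):
    """Sampling direction (latent -> data); no sequential dependency."""
    if len(z) % 2:
        raise ValueError("this sketch assumes an even-length input")
    half = len(z) // 2
    z_a, z_b = z[:half], z[half:]
    # z_a equals x_a, so the same (log_s, t) can be recomputed exactly.
    log_s, t = toy_conditioner(z_a)
    x_b = [b * math.exp(ls) + ti for b, ti, ls in zip(z_b, t, log_s)]
    return z_a + x_b


x = [0.3, -1.2, 0.7, 2.0]
z, log_det = coupling_forward(x)
x_rec = coupling_inverse(z)  # recovers x up to floating-point error
```

Because the log-determinant is just a sum over the predicted log-scales, the exact likelihood can be maximized directly, which is one way to read the "single-stage" training claim above.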

Requirements

  • PyTorch 0.4.1
  • Python 3.6
  • Librosa

Examples

Step 1. Download Dataset

  • LJSpeech : https://keithito.com/LJ-Speech-Dataset/

Step 2. Preprocessing (Preparing Mel Spectrogram)

python preprocessing.py --in_dir ljspeech --out_dir DATASETS/ljspeech

Step 3. Train

Single-GPU training

python train.py --model_name flowavenet --batch_size 2 --n_block 8 --n_flow 6 --n_layer 2 --block_per_split 4

Multi-GPU training

python train.py --model_name flowavenet --batch_size 8 --n_block 8 --n_flow 6 --n_layer 2 --block_per_split 4 --num_gpu 4

NVIDIA TITAN V (12GB VRAM) : batch size 2 per GPU

NVIDIA Tesla V100 (32GB VRAM) : batch size 8 per GPU
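The two commands above are consistent with `--batch_size` being the total batch shared across all GPUs (2 on one GPU, 8 on four GPUs at 2 each). Assuming that reading is right, a small helper can derive the value to pass for a given card. The function name is hypothetical and not part of the repository.

```python
def total_batch_size(per_gpu_batch_size, num_gpu):
    """Return the value to pass as --batch_size, assuming it is the total
    batch that gets split evenly across num_gpu devices."""
    if per_gpu_batch_size < 1 or num_gpu < 1:
        raise ValueError("both arguments must be positive integers")
    return per_gpu_batch_size * num_gpu


# TITAN V figures from above: 2 samples per GPU on 4 GPUs.
titan_v_batch = total_batch_size(2, 4)
# V100 figures from above: 8 samples per GPU on 4 GPUs.
v100_batch = total_batch_size(8, 4)
```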

Step 4. Synthesize

--load_step CHECKPOINT : the global training step of the pre-trained model to load (the step number is also indicated in the trained weight file)

--temp TEMPERATURE : sampling temperature, i.e. the standard deviation of the latent prior, implemented as z ~ N(0, TEMPERATURE^2)

Example:

python synthesize.py --model_name flowavenet --n_block 8 --n_flow 6 --n_layer 2 --load_step 100000 --temp 0.8 --num_samples 10 --block_per_split 4
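The effect of `--temp` can be sketched with the standard library alone: drawing z from a Gaussian whose variance is the squared temperature is the same as scaling a standard normal draw by the temperature. The helper name `sample_latent` and the `seed` parameter are illustrative assumptions, not names from the repository.

```python
import random
import statistics


def sample_latent(num_samples, temperature, seed=None):
    """Draw z ~ N(0, temperature^2) by scaling standard normal samples."""
    rng = random.Random(seed)
    return [temperature * rng.gauss(0.0, 1.0) for _ in range(num_samples)]


z = sample_latent(20000, temperature=0.8, seed=0)
print(statistics.pstdev(z))  # empirical standard deviation, close to 0.8

z_zero = sample_latent(5, temperature=0.0, seed=0)  # every latent is zero
```

A temperature of 0.0 collapses every latent to zero, and 1.0 samples from the unscaled prior, which matches the 0.0 ~ 1.0 range explored in Results 2.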

Sample Link

Sample Link : https://ksw0306.github.io/flowavenet-demo/

Our implementation of ClariNet (Gaussian WaveNet, Gaussian IAF) : https://github.com/ksw0306/ClariNet

  • Results 1 : Model Comparisons (WaveNet (MoL, Gaussian), ClariNet and FloWaveNet)

  • Results 2 : Temperature effect on Audio Quality Trade-off (Temperature T : 0.0 ~ 1.0, Model : FloWaveNet)

  • Results 3 : Analysis of ClariNet Loss Terms (Loss functions : 1. Only KL 2. KL + Frame 3. Only Frame)

  • Results 4 : Causality of WaveNet Dilated Convolutions (FloWaveNet : Non-causal WaveNet Affine Coupling Layers, FloWaveNet_causal : Causal WaveNet Affine Coupling Layers)
