dhgrs / chainer-Fast-WaveNet

Licence: other
A Chainer implementation of Fast WaveNet (mel-spectrogram vocoder).

Projects that are alternatives of or similar to chainer-Fast-WaveNet

chainer-ClariNet
A Chainer implementation of ClariNet.
Stars: ✭ 45 (+36.36%)
Mutual labels:  chainer, wavenet
Wavenet
WaveNet implementation with chainer
Stars: ✭ 53 (+60.61%)
Mutual labels:  chainer, wavenet
Chainer Vq Vae
A Chainer implementation of VQ-VAE.
Stars: ✭ 77 (+133.33%)
Mutual labels:  chainer, wavenet
deep-learning-tutorial-with-chainer
Deep learning tutorial with Chainer
Stars: ✭ 25 (-24.24%)
Mutual labels:  chainer
chainer-fcis
[This project has moved to ChainerCV] Chainer Implementation of Fully Convolutional Instance-aware Semantic Segmentation
Stars: ✭ 45 (+36.36%)
Mutual labels:  chainer
superresolution gan
Chainer implementation of Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
Stars: ✭ 50 (+51.52%)
Mutual labels:  chainer
char-rnn-text-generation
Character Embeddings Recurrent Neural Network Text Generation Models
Stars: ✭ 64 (+93.94%)
Mutual labels:  chainer
chainer-pix2pix
Chainer implementation for Image-to-Image Translation Using Conditional Adversarial Networks
Stars: ✭ 40 (+21.21%)
Mutual labels:  chainer
Multi-task-Conditional-Attention-Networks
A prototype version of our submitted paper: Conversion Prediction Using Multi-task Conditional Attention Networks to Support the Creation of Effective Ad Creatives.
Stars: ✭ 21 (-36.36%)
Mutual labels:  chainer
Music-Style-Transfer
Source code for "Transferring the Style of Homophonic Music Using Recurrent Neural Networks and Autoregressive Model"
Stars: ✭ 16 (-51.52%)
Mutual labels:  wavenet
chainer-graph-cnn
Chainer implementation of 'Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering' (https://arxiv.org/abs/1606.09375)
Stars: ✭ 67 (+103.03%)
Mutual labels:  chainer
chainer-param-monitor
Monitor parameter and gradient statistics during neural network training with Chainer
Stars: ✭ 13 (-60.61%)
Mutual labels:  chainer
constant-memory-waveglow
PyTorch implementation of NVIDIA WaveGlow with constant memory cost.
Stars: ✭ 36 (+9.09%)
Mutual labels:  wavenet
convolutional seq2seq
fairseq: Convolutional Sequence to Sequence Learning (Gehring et al. 2017) by Chainer
Stars: ✭ 63 (+90.91%)
Mutual labels:  chainer
tutorials
Introduction to Deep Learning: Chainer Tutorials
Stars: ✭ 68 (+106.06%)
Mutual labels:  chainer
wavenet
Audio source separation (mixture to vocal) using the Wavenet
Stars: ✭ 20 (-39.39%)
Mutual labels:  wavenet
BMI219-2017-ProteinFolding
UCSF BMI219 Deep Learning (2017), Coding example (Prediction of protein folding with RNN and CNN)
Stars: ✭ 14 (-57.58%)
Mutual labels:  chainer
chainer-sort
Simple, Online, Realtime Tracking of Multiple Objects (SORT) implementation for Chainer and ChainerCV.
Stars: ✭ 20 (-39.39%)
Mutual labels:  chainer
QPPWG
Quasi-Periodic Parallel WaveGAN Pytorch implementation
Stars: ✭ 41 (+24.24%)
Mutual labels:  wavenet
deep-learning-platforms
Deep learning platforms, frameworks, and resources
Stars: ✭ 17 (-48.48%)
Mutual labels:  chainer

chainer-Fast-WaveNet

A Chainer implementation of WaveNet.

Requirements

I trained and generated with

  • python (3.5.2)
  • chainer (4.0.0b3)
  • librosa (0.5.1)

Usage

download dataset

You can download VCTK-Corpus (English) from here. You can also download CMU-ARCTIC (English) and voice-statistics-corpus (Japanese) easily via my repository.

set parameters

parameters of training

  • batchsize
    • Batch size.
  • lr
    • Learning rate.
  • ema_mu
    • The decay rate of the exponential moving average. If it is greater than 1, EMA is not applied.
  • trigger
    • How long to train the model. You can set this parameter as (<int>, 'iteration') or (<int>, 'epoch').
  • evaluate_interval
    • The interval at which the validation dataset is evaluated. It takes the same format as trigger.
  • snapshot_interval
    • The interval at which a snapshot is saved. It takes the same format as trigger.
  • report_interval
    • The interval at which the loss log is written. It takes the same format as trigger. A Chainer Trainer sketch showing how these parameters fit together follows this list.
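
For reference, here is a minimal, self-contained Chainer sketch of how batchsize, lr, trigger, evaluate_interval, snapshot_interval, and report_interval map onto a Chainer Trainer. The model and data are toy placeholders, not this repository's network:

import numpy as np
import chainer
import chainer.functions as F
import chainer.links as L
from chainer import training
from chainer.training import extensions

# Toy regression data so the trigger mechanics can run end to end.
x = np.random.randn(64, 4).astype(np.float32)
y = np.random.randn(64, 1).astype(np.float32)
train = chainer.datasets.TupleDataset(x[:48], y[:48])
valid = chainer.datasets.TupleDataset(x[48:], y[48:])

model = L.Classifier(L.Linear(4, 1), lossfun=F.mean_squared_error)
model.compute_accuracy = False  # regression, so no accuracy metric

optimizer = chainer.optimizers.Adam(alpha=1e-4)  # lr
optimizer.setup(model)

train_iter = chainer.iterators.SerialIterator(train, batch_size=8)  # batchsize
valid_iter = chainer.iterators.SerialIterator(
    valid, batch_size=8, repeat=False, shuffle=False)

updater = training.StandardUpdater(train_iter, optimizer)
trainer = training.Trainer(updater, stop_trigger=(100, 'iteration'))  # trigger

# The three intervals take the same (<int>, 'iteration') or (<int>, 'epoch')
# format as trigger.
trainer.extend(extensions.Evaluator(valid_iter, model),
               trigger=(50, 'iteration'))                         # evaluate_interval
trainer.extend(extensions.snapshot(), trigger=(50, 'iteration'))  # snapshot_interval
trainer.extend(extensions.LogReport(trigger=(10, 'iteration')))   # report_interval
trainer.run()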

parameters of dataset

  • root
    • The root directory of training dataset.
  • dataset_type
    • The directory layout of the training dataset. Currently this parameter supports VCTK, ARCTIC, and vs.

parameters of preprocessing

  • sr
    • Sampling rate. If it differs from the input file's rate, the audio is resampled by librosa.
  • n_fft
    • The window size of the FFT.
  • hop_length
    • The hop length of the FFT.
  • n_mels
    • The number of mel frequency bins.
  • top_db
    • The threshold in dB for trimming silence.
  • input_dim
    • The input dimension of the audio waveform. It should be 1 or the same as quantize.
  • quantize
    • If use_logistic is True, it should be 2 ** 16; if False, it should be 256.
  • length
    • The number of waveform samples used for each training example.
  • use_logistic
    • If True, use a mixture of logistics as the output distribution. A preprocessing sketch using these parameters follows this list.
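
The following sketch shows how these preprocessing parameters are typically used with librosa and NumPy, assuming use_logistic is False. It is illustrative, not this repository's actual code, and example.wav is a placeholder file name:

import numpy as np
import librosa

sr, n_fft, hop_length, n_mels, top_db = 16000, 1024, 256, 80, 20
quantize, length = 256, 7680  # use_logistic = False

wave, _ = librosa.load('example.wav', sr=sr)         # resampled to sr if needed
wave, _ = librosa.effects.trim(wave, top_db=top_db)  # trim leading/trailing silence

# Mel spectrogram used as the local condition for the decoder.
mel = librosa.feature.melspectrogram(
    y=wave, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)

# Mu-law quantization to `quantize` levels, written out with NumPy
# (librosa 0.5.x has no mu-law helper).
mu = quantize - 1
compressed = np.sign(wave) * np.log1p(mu * np.abs(wave)) / np.log1p(mu)
quantized = ((compressed + 1) / 2 * mu + 0.5).astype(np.int32)

# A random crop of `length` samples becomes one training example.
start = np.random.randint(0, max(1, len(quantized) - length))
crop = quantized[start:start + length]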

parameters of Encoder (deconvolution network)

  • channels
    • The output channels of each deconvolution layer in the encoder. The number of elements must be the same as in upsample_factors.
  • upsample_factors
    • The upsampling factor of each deconvolution layer. The number of elements must be the same as in channels, and the product of the elements must equal hop_length, as the sketch after this list illustrates.
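
The sketch below illustrates the constraint between upsample_factors, channels, and hop_length: stacked 1-D deconvolutions stretch the mel spectrogram from frame rate up to sample rate. The values are illustrative, not taken from this repository:

import numpy as np
import chainer
import chainer.links as L

hop_length = 256
upsample_factors = [16, 16]  # product must equal hop_length
channels = [128, 128]        # one entry per deconvolution layer

assert int(np.prod(upsample_factors)) == hop_length

layers = []
in_ch = 80  # n_mels
for out_ch, factor in zip(channels, upsample_factors):
    # ksize == stride upsamples the time axis by exactly `factor`.
    layers.append(L.DeconvolutionND(1, in_ch, out_ch, ksize=factor, stride=factor))
    in_ch = out_ch

mel = np.zeros((1, 80, 100), dtype=np.float32)  # (batch, n_mels, frames)
h = chainer.Variable(mel)
for layer in layers:
    h = layer(h)
print(h.shape)  # (1, 128, 25600): 100 frames * hop_length samples
# condition_dim in the decoder must match the last element of channels (128).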

parameters of Decoder (WaveNet)

  • n_loop
    • To build a network with dilations like [1, 2, 4, 1, 2, 4], set n_loop to 2.
  • n_layer
    • To build a network with dilations like [1, 2, 4, 1, 2, 4], set n_layer to 3.
  • filter_size
    • The filter size of each dilated convolution.
  • residual_channels
    • The number of input/output channels of the residual blocks.
  • dilated_channels
    • The number of output channels of the causal dilated convolution layers. This is split into tanh and sigmoid gates, so the number of hidden units is half this number.
  • skip_channels
    • The number of channels of the skip connections and the last projection layer.
  • n_mixture
    • The number of logistic distributions in the mixture. Used only when use_logistic is True.
  • log_scale_min
    • A lower bound on the log scale, for numerical stability. Used only when use_logistic is True.
  • condition_dim
    • The dimension of the condition. It must equal the last element of channels.
  • dropout_zero_rate
    • The probability that a unit is zeroed by dropout. If 0, dropout is not applied. A helper that derives the dilation pattern and receptive field from n_loop, n_layer, and filter_size follows this list.
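
The small helper below (illustrative, not from this repository) shows how n_loop, n_layer, and filter_size determine the dilation pattern and the receptive field of the decoder:

def dilations(n_loop, n_layer):
    # e.g. n_loop=2, n_layer=3 -> [1, 2, 4, 1, 2, 4]
    return [2 ** i for i in range(n_layer)] * n_loop

def receptive_field(n_loop, n_layer, filter_size):
    # Each dilated causal convolution adds (filter_size - 1) * dilation samples.
    return sum((filter_size - 1) * d for d in dilations(n_loop, n_layer)) + 1

print(dilations(2, 3))           # [1, 2, 4, 1, 2, 4]
print(receptive_field(2, 3, 2))  # 15 samples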

parameters of generating

  • use_ema
    • If True, use the exponential moving average of the weights.
  • apply_dropout
    • If True, apply dropout during generation.

training

(without GPU)
python train.py

(with GPU #n)
python train.py -g n

If you want to use multiple GPUs, add their IDs like below.

python train.py -g 0 1 2

You can resume from a snapshot and restart training like below.

python train.py -r snapshot_iter_100000

The other arguments -f and -p control multiprocessing during preprocessing: -f sets the number of prefetched samples and -p sets the number of worker processes.
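
For example, training on GPU 0 with illustrative prefetch and process counts:

python train.py -g 0 -f 8 -p 4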

generating

python generate.py -i <input file> -o <output file> -m <trained model>
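
For example, with hypothetical file and snapshot names:

python generate.py -i input.wav -o generated.wav -m snapshot_iter_500000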