
soobinseo / wavenet

Licence: other
Audio source separation (mixture to vocals) using WaveNet

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to wavenet

Vq Vae Wavenet
TensorFlow implementation of VQ-VAE with WaveNet decoder, based on https://arxiv.org/abs/1711.00937 and https://arxiv.org/abs/1901.08810
Stars: ✭ 40 (+100%)
Mutual labels:  wavenet
Pytorch Gan Timeseries
GANs for time series generation in pytorch
Stars: ✭ 109 (+445%)
Mutual labels:  wavenet
Seriesnet
Time series prediction using dilated causal convolutional neural nets (temporal CNN)
Stars: ✭ 185 (+825%)
Mutual labels:  wavenet
Wavenet
WaveNet implementation with chainer
Stars: ✭ 53 (+165%)
Mutual labels:  wavenet
Nsynth wavenet
parallel wavenet based on nsynth
Stars: ✭ 100 (+400%)
Mutual labels:  wavenet
Wavenet vocoder
WaveNet vocoder
Stars: ✭ 1,926 (+9530%)
Mutual labels:  wavenet
Wavenet Stt
An end-to-end speech recognition system with Wavenet. Built using C++ and python.
Stars: ✭ 18 (-10%)
Mutual labels:  wavenet
birdsong-generation-project
Generating birdsong with WaveNet
Stars: ✭ 26 (+30%)
Mutual labels:  wavenet
Numpy Ml
Machine learning, in numpy
Stars: ✭ 11,100 (+55400%)
Mutual labels:  wavenet
Source Separation Wavenet
A neural network for end-to-end music source separation
Stars: ✭ 185 (+825%)
Mutual labels:  wavenet
Tf Wavenet vocoder
Wavenet and its applications with Tensorflow
Stars: ✭ 58 (+190%)
Mutual labels:  wavenet
Wavenet Enhancement
Speech Enhancement using Bayesian WaveNet
Stars: ✭ 86 (+330%)
Mutual labels:  wavenet
Tacotron 2
DeepMind's Tacotron-2 Tensorflow implementation
Stars: ✭ 1,968 (+9740%)
Mutual labels:  wavenet
Tacotron2
pytorch tacotron2 https://arxiv.org/pdf/1712.05884.pdf
Stars: ✭ 46 (+130%)
Mutual labels:  wavenet
Vq Vae Speech
PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]
Stars: ✭ 187 (+835%)
Mutual labels:  wavenet
Pytorch Uniwavenet
Stars: ✭ 30 (+50%)
Mutual labels:  wavenet
Fast Wavenet
Speedy Wavenet generation using dynamic programming ⚡
Stars: ✭ 1,705 (+8425%)
Mutual labels:  wavenet
SpleeterRT
Real-time monaural source separation based on a fully convolutional neural network operating in the time-frequency domain.
Stars: ✭ 111 (+455%)
Mutual labels:  source-separation
wavenet-like-vocoder
Basic wavenet and fftnet vocoder model.
Stars: ✭ 20 (+0%)
Mutual labels:  wavenet
Deep Time Series Prediction
Seq2Seq, Bert, Transformer, WaveNet for time series prediction.
Stars: ✭ 183 (+815%)
Mutual labels:  wavenet

wavenet

Description

  • This is a TensorFlow implementation of audio source separation (mixture to vocals) using WaveNet. The original WaveNet uses causal convolutions, so that only past samples can inform the prediction of the next sample; since this task is separation rather than generation, later samples may be used for training as well. I therefore used ordinary (non-causal) dilated 1-D convolutions, as sketched below. Apart from this, the network structure is the same as in the paper. See the file hyperparams.py for the detailed hyperparameters.
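
A minimal sketch of such a non-causal dilated block with WaveNet's gated activation; the function and variable names are mine, not the repo's modules.py, and the input is assumed to already have `channels` channels:

```python
import tensorflow as tf

def residual_block(x, channels, dilation):
    """One WaveNet-style residual block with a NON-causal dilated conv.

    padding='same' lets each output look at past AND future samples;
    this is fine for separation (the whole mixture is available),
    whereas generation would require causal padding.
    """
    filt = tf.layers.conv1d(x, channels, kernel_size=2,
                            dilation_rate=dilation, padding='same')
    gate = tf.layers.conv1d(x, channels, kernel_size=2,
                            dilation_rate=dilation, padding='same')
    out = tf.tanh(filt) * tf.sigmoid(gate)   # gated activation unit
    out = tf.layers.conv1d(out, channels, kernel_size=1)
    return x + out, out                      # residual output, skip branch

# Usage: x has shape [batch, timesteps, channels]; stacking blocks with
# dilations 1, 2, 4, ..., 512 grows the receptive field exponentially.
```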

Requirements

  • NumPy >= 1.11.1
  • TensorFlow >= 1.0.0
  • librosa

Data

I used the DSD100 dataset, which consists of pairs of mixture audio files and vocal audio files. The complete dataset (~14 GB) can be downloaded here. The data was pre-processed with sample_rate=16000 and divided into 380 ms units, so each network input is 6080 raw samples (16000 Hz × 0.38 s).
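
A minimal loading sketch under those numbers; the file path and function name are illustrative, not the repo's data_utils.py:

```python
import librosa

SAMPLE_RATE = 16000
UNIT_SECONDS = 0.38                              # 380 ms per unit
UNIT_SAMPLES = int(SAMPLE_RATE * UNIT_SECONDS)   # 16000 * 0.38 = 6080

def load_units(path):
    """Load an audio file at 16 kHz and split it into 6080-sample units."""
    y, _ = librosa.load(path, sr=SAMPLE_RATE, mono=True)
    n_units = len(y) // UNIT_SAMPLES             # drop the trailing remainder
    return y[:n_units * UNIT_SAMPLES].reshape(n_units, UNIT_SAMPLES)

# Illustrative path; DSD100's actual directory layout may differ.
mixture_units = load_units('data/mixture.wav')   # shape: (n_units, 6080)
```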

File description

  • hyperparams.py includes all required hyperparameters; a hypothetical sketch follows this list.
  • data_utils.py loads the training data and preprocesses it into units of raw sample sequences.
  • modules.py contains all methods, building blocks, and skip connections for the networks.
  • networks.py builds the networks.
  • train.py is for training.
  • eval.py generates a separated vocal sample.
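
A hypothetical sketch of what hyperparams.py might contain. Only the sample rate, unit length, and timestep count are stated in this README; every other name and value below is an illustrative placeholder, not the repo's:

```python
class Hyperparams:
    sample_rate = 16000      # Hz (see the Data section)
    unit_seconds = 0.38      # 380 ms per training unit
    timesteps = 6080         # sample_rate * unit_seconds
    data_dir = 'data'        # where DSD100 is extracted (placeholder)
    test_file = 'test.wav'   # evaluation input name (placeholder)
    batch_size = 8           # placeholder
    lr = 1e-3                # placeholder learning rate
```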

Training the network

  • STEP 1. Adjust hyperparameters in hyperparams.py if necessary.
  • STEP 2. Download and extract the DSD100 data into the 'data' directory as described above, then run data_utils.py.
  • STEP 3. Run train.py.

Generate separated vocal audio

  • Prepare a test file (its name should be defined in hyperparams.py), place it in the 'data' directory, and run eval.py.

Notes

  • I applied an L1 loss instead of the NLL loss over mu-law companded values used in the original WaveNet; see the sketch below.
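
For reference, a sketch of the two loss options in TensorFlow. The helper and placeholders are mine, not the repo's code; whether the repo takes the L1 on raw or companded samples is not stated, so this sketch uses the raw waveform:

```python
import tensorflow as tf

MU = 255.0  # 8-bit mu-law, as in the WaveNet paper

def mu_law(x):
    """Mu-law compand a waveform in [-1, 1]."""
    return tf.sign(x) * tf.log1p(MU * tf.abs(x)) / tf.log1p(MU)

# Placeholders just to make the sketch self-contained.
target = tf.placeholder(tf.float32, [None, 6080])      # clean vocal
prediction = tf.placeholder(tf.float32, [None, 6080])  # network output

# Original WaveNet: quantize mu_law(target) into 256 classes and train
# with a categorical NLL (softmax cross-entropy). Here an L1 regression
# replaces it:
l1_loss = tf.reduce_mean(tf.abs(target - prediction))
```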