
soobinseo / wavenet

Licence: other
Audio source separation (mixture to vocals) using WaveNet

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to wavenet

Vq Vae Wavenet
TensorFlow implementation of VQ-VAE with WaveNet decoder, based on https://arxiv.org/abs/1711.00937 and https://arxiv.org/abs/1901.08810
Stars: ✭ 40 (+100%)
Mutual labels:  wavenet
Pytorch Gan Timeseries
GANs for time series generation in pytorch
Stars: ✭ 109 (+445%)
Mutual labels:  wavenet
Seriesnet
Time series prediction using dilated causal convolutional neural nets (temporal CNN)
Stars: ✭ 185 (+825%)
Mutual labels:  wavenet
Wavenet
WaveNet implementation with chainer
Stars: ✭ 53 (+165%)
Mutual labels:  wavenet
Nsynth wavenet
parallel wavenet based on nsynth
Stars: ✭ 100 (+400%)
Mutual labels:  wavenet
Wavenet vocoder
WaveNet vocoder
Stars: ✭ 1,926 (+9530%)
Mutual labels:  wavenet
Wavenet Stt
An end-to-end speech recognition system with Wavenet. Built using C++ and python.
Stars: ✭ 18 (-10%)
Mutual labels:  wavenet
birdsong-generation-project
Generating birdsong with WaveNet
Stars: ✭ 26 (+30%)
Mutual labels:  wavenet
Numpy Ml
Machine learning, in numpy
Stars: ✭ 11,100 (+55400%)
Mutual labels:  wavenet
Source Separation Wavenet
A neural network for end-to-end music source separation
Stars: ✭ 185 (+825%)
Mutual labels:  wavenet
Tf Wavenet vocoder
Wavenet and its applications with Tensorflow
Stars: ✭ 58 (+190%)
Mutual labels:  wavenet
Wavenet Enhancement
Speech Enhancement using Bayesian WaveNet
Stars: ✭ 86 (+330%)
Mutual labels:  wavenet
Tacotron 2
DeepMind's Tacotron-2 Tensorflow implementation
Stars: ✭ 1,968 (+9740%)
Mutual labels:  wavenet
Tacotron2
pytorch tacotron2 https://arxiv.org/pdf/1712.05884.pdf
Stars: ✭ 46 (+130%)
Mutual labels:  wavenet
Vq Vae Speech
PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]
Stars: ✭ 187 (+835%)
Mutual labels:  wavenet
Pytorch Uniwavenet
Stars: ✭ 30 (+50%)
Mutual labels:  wavenet
Fast Wavenet
Speedy Wavenet generation using dynamic programming ⚡
Stars: ✭ 1,705 (+8425%)
Mutual labels:  wavenet
SpleeterRT
Real-time monaural source separation based on a fully convolutional neural network operating in the time-frequency domain.
Stars: ✭ 111 (+455%)
Mutual labels:  source-separation
wavenet-like-vocoder
Basic wavenet and fftnet vocoder model.
Stars: ✭ 20 (+0%)
Mutual labels:  wavenet
Deep Time Series Prediction
Seq2Seq, Bert, Transformer, WaveNet for time series prediction.
Stars: ✭ 183 (+815%)
Mutual labels:  wavenet

wavenet

Description

  • This is a TensorFlow implementation of audio source separation (mixture to vocals) using WaveNet. The original WaveNet uses causal convolutions, so that only past samples can inform the prediction of the next sample; since this task is separation rather than generation, later samples may be used for training as well. I therefore used ordinary (non-causal) dilated 1-D convolutions, as sketched below. Apart from this, the network structure is the same as in the paper. See the file hyperparams.py for the detailed hyperparameters.
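
A minimal sketch of such a non-causal dilated block with WaveNet's gated activation; the function and variable names are mine, not the repo's modules.py, and the input is assumed to already have `channels` channels:

```python
import tensorflow as tf

def residual_block(x, channels, dilation):
    """One WaveNet-style residual block with a NON-causal dilated conv.

    padding='same' lets each output look at past AND future samples;
    this is fine for separation (the whole mixture is available),
    whereas generation would require causal padding.
    """
    filt = tf.layers.conv1d(x, channels, kernel_size=2,
                            dilation_rate=dilation, padding='same')
    gate = tf.layers.conv1d(x, channels, kernel_size=2,
                            dilation_rate=dilation, padding='same')
    out = tf.tanh(filt) * tf.sigmoid(gate)   # gated activation unit
    out = tf.layers.conv1d(out, channels, kernel_size=1)
    return x + out, out                      # residual output, skip branch

# Usage: x has shape [batch, timesteps, channels]; stacking blocks with
# dilations 1, 2, 4, ..., 512 grows the receptive field exponentially.
```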

Requirements

  • NumPy >= 1.11.1
  • TensorFlow >= 1.0.0
  • librosa

Data

I used the DSD100 dataset, which consists of pairs of mixture audio files and vocal audio files. The complete dataset (~14 GB) can be downloaded here. The data was pre-processed with sample_rate=16000 and divided into 380 ms units, so each network input is 6080 raw samples (16000 Hz × 0.38 s).
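
A minimal loading sketch under those numbers; the file path and function name are illustrative, not the repo's data_utils.py:

```python
import librosa

SAMPLE_RATE = 16000
UNIT_SECONDS = 0.38                              # 380 ms per unit
UNIT_SAMPLES = int(SAMPLE_RATE * UNIT_SECONDS)   # 16000 * 0.38 = 6080

def load_units(path):
    """Load an audio file at 16 kHz and split it into 6080-sample units."""
    y, _ = librosa.load(path, sr=SAMPLE_RATE, mono=True)
    n_units = len(y) // UNIT_SAMPLES             # drop the trailing remainder
    return y[:n_units * UNIT_SAMPLES].reshape(n_units, UNIT_SAMPLES)

# Illustrative path; DSD100's actual directory layout may differ.
mixture_units = load_units('data/mixture.wav')   # shape: (n_units, 6080)
```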

File description

  • hyperparams.py includes all required hyperparameters; a hypothetical sketch follows this list.
  • data_utils.py loads the training data and preprocesses it into units of raw sample sequences.
  • modules.py contains all methods, building blocks, and skip connections for the networks.
  • networks.py builds the networks.
  • train.py is for training.
  • eval.py generates a separated vocal sample.
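
A hypothetical sketch of what hyperparams.py might contain. Only the sample rate, unit length, and timestep count are stated in this README; every other name and value below is an illustrative placeholder, not the repo's:

```python
class Hyperparams:
    sample_rate = 16000      # Hz (see the Data section)
    unit_seconds = 0.38      # 380 ms per training unit
    timesteps = 6080         # sample_rate * unit_seconds
    data_dir = 'data'        # where DSD100 is extracted (placeholder)
    test_file = 'test.wav'   # evaluation input name (placeholder)
    batch_size = 8           # placeholder
    lr = 1e-3                # placeholder learning rate
```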

Training the network

  • STEP 1. Adjust hyperparameters in hyperparams.py if necessary.
  • STEP 2. Download and extract the DSD100 data into the 'data' directory as described above, then run data_utils.py.
  • STEP 3. Run train.py.

Generate separated vocal audio

  • Prepare a test file (its name should be defined in hyperparams.py), place it in the 'data' directory, and run eval.py.

Notes

  • I applied an L1 loss instead of the NLL loss over mu-law companded values used in the original WaveNet; see the sketch below.
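
For reference, a sketch of the two loss options in TensorFlow. The helper and placeholders are mine, not the repo's code; whether the repo takes the L1 on raw or companded samples is not stated, so this sketch uses the raw waveform:

```python
import tensorflow as tf

MU = 255.0  # 8-bit mu-law, as in the WaveNet paper

def mu_law(x):
    """Mu-law compand a waveform in [-1, 1]."""
    return tf.sign(x) * tf.log1p(MU * tf.abs(x)) / tf.log1p(MU)

# Placeholders just to make the sketch self-contained.
target = tf.placeholder(tf.float32, [None, 6080])      # clean vocal
prediction = tf.placeholder(tf.float32, [None, 6080])  # network output

# Original WaveNet: quantize mu_law(target) into 256 classes and train
# with a categorical NLL (softmax cross-entropy). Here an L1 regression
# replaces it:
l1_loss = tf.reduce_mean(tf.abs(target - prediction))
```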