leimao / Singing-Voice-Separation-RNN

License: MIT License
Singing-Voice Separation From Monaural Recordings Using Deep Recurrent Neural Networks

Programming Languages

Python

Projects that are alternatives to or similar to Singing-Voice-Separation-RNN

DeepSeparation
Keras Implementation and Experiments with Deep Recurrent Neural Networks for Source Separation
Stars: ✭ 19 (-56.82%)
Mutual labels:  recurrent-neural-networks, source-separation
STORN-keras
This is a STORN (Stochastic Recurrent Neural Network) implementation for Keras!
Stars: ✭ 23 (-47.73%)
Mutual labels:  recurrent-neural-networks
imessage-chatbot
💬 Recurrent neural network -- generates messages in your style of speech! Trained on imessage data. Sqlite3, TensorFlow, Flask, Twilio SMS, AWS.
Stars: ✭ 33 (-25%)
Mutual labels:  recurrent-neural-networks
course-content-dl
NMA deep learning course
Stars: ✭ 537 (+1120.45%)
Mutual labels:  recurrent-neural-networks
Conversational-AI-Chatbot-using-Practical-Seq2Seq
A simple open domain generative based chatbot based on Recurrent Neural Networks
Stars: ✭ 17 (-61.36%)
Mutual labels:  recurrent-neural-networks
LSM
Liquid State Machines in Python and NEST
Stars: ✭ 39 (-11.36%)
Mutual labels:  recurrent-neural-networks
AMSS-Net
A PyTorch implementation of the paper: "AMSS-Net: Audio Manipulation on User-Specified Sources with Textual Queries" (ACM Multimedia 2021)
Stars: ✭ 19 (-56.82%)
Mutual labels:  source-separation
Deep-Learning
This repo provides projects on deep-learning mainly using Tensorflow 2.0
Stars: ✭ 22 (-50%)
Mutual labels:  recurrent-neural-networks
automatic-personality-prediction
[AAAI 2020] Modeling Personality with Attentive Networks and Contextual Embeddings
Stars: ✭ 43 (-2.27%)
Mutual labels:  recurrent-neural-networks
CS231n
PyTorch/Tensorflow solutions for Stanford's CS231n: "CNNs for Visual Recognition"
Stars: ✭ 47 (+6.82%)
Mutual labels:  recurrent-neural-networks
recsys2019
The complete code and notebooks used for the ACM Recommender Systems Challenge 2019
Stars: ✭ 26 (-40.91%)
Mutual labels:  recurrent-neural-networks
classifying-cancer
A Python-Tensorflow neural network for classifying cancer data
Stars: ✭ 30 (-31.82%)
Mutual labels:  recurrent-neural-networks
entailment-neural-attention-lstm-tf
(arXiv:1509.06664) Reasoning about Entailment with Neural Attention.
Stars: ✭ 43 (-2.27%)
Mutual labels:  recurrent-neural-networks
Meetup-Content
Entirety.ai Intuition to Implementation Meetup Content.
Stars: ✭ 33 (-25%)
Mutual labels:  recurrent-neural-networks
TasNet
A PyTorch implementation of Time-domain Audio Separation Network (TasNet) with Permutation Invariant Training (PIT) for speech separation.
Stars: ✭ 81 (+84.09%)
Mutual labels:  source-separation
rnn darts fastai
Implement Differentiable Architecture Search (DARTS) for RNN with fastai
Stars: ✭ 21 (-52.27%)
Mutual labels:  recurrent-neural-networks
lyrics-generator
Generating lyrics with a recurrent neural network
Stars: ✭ 36 (-18.18%)
Mutual labels:  recurrent-neural-networks
dts
A Keras library for multi-step time-series forecasting.
Stars: ✭ 130 (+195.45%)
Mutual labels:  recurrent-neural-networks
cunet
Control mechanisms for the U-Net architecture for multi-instrument source separation
Stars: ✭ 36 (-18.18%)
Mutual labels:  source-separation
spikeRNN
No description or website provided.
Stars: ✭ 28 (-36.36%)
Mutual labels:  recurrent-neural-networks

Singing Voice Separation RNN

Lei Mao

University of Chicago

Introduction

This is a singing voice separation tool developed using a recurrent neural network (RNN). It separates the singing voice and the background music from the original song. The tool is still under development and the separation is not yet perfect. Please check the demo for the current performance.
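The approach follows the masking-based separation described in the referenced Huang et al. papers: the RNN predicts magnitude spectrograms for the two sources, soft time-frequency masks are built from those estimates, and the masks are applied to the mixture STFT before inverting back to waveforms. The snippet below is a minimal sketch of that masking step using LibROSA; the function name, mask formula, and STFT parameters are illustrative assumptions, not the exact code in model.py.

import librosa

def soft_mask_separate(mono_audio, est_mag_vocal, est_mag_bgm,
                       n_fft=1024, hop_length=256):
    # Mixture STFT; the estimated magnitudes are assumed to have the same shape.
    stft_mix = librosa.stft(mono_audio, n_fft=n_fft, hop_length=hop_length)
    # Wiener-like soft masks from the two estimated magnitude spectrograms.
    denom = est_mag_vocal + est_mag_bgm + 1e-10
    mask_vocal = est_mag_vocal / denom
    mask_bgm = est_mag_bgm / denom
    # The masks re-weight magnitudes only; the mixture phase is reused.
    vocal = librosa.istft(mask_vocal * stft_mix, hop_length=hop_length)
    bgm = librosa.istft(mask_bgm * stft_mix, hop_length=hop_length)
    return vocal, bgm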

Dependencies

  • Python 3.5
  • Numpy 1.14
  • TensorFlow 1.8
  • RarFile 3.0
  • ProgressBar2 3.37.1
  • LibROSA 0.6
  • FFmpeg 4.0
  • Matplotlib 2.1.1
  • MIR_Eval 0.4

Files

.
├── demo
├── download.py
├── evaluate.py
├── figures
├── LICENSE.md
├── main.py
├── model
├── model.py
├── preprocess.py
├── README.md
├── songs
├── statistics
├── train.py
└── utils.py

Dataset

MIR-1K Dataset

MIR-1K (Multimedia Information Retrieval, 1000 song clips) is a dataset for singing voice separation.

To download the whole dataset and split it into train, validation, and test sets, run the following in the terminal:

$ python download.py 
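A minimal sketch of the split step, assuming the clips are shuffled and partitioned by fixed fractions (the actual proportions and file handling in download.py may differ):

import random

def split_dataset(filenames, valid_fraction=0.1, test_fraction=0.1, seed=0):
    # Shuffle deterministically, then carve off validation and test subsets.
    files = list(filenames)
    random.Random(seed).shuffle(files)
    n_valid = int(len(files) * valid_fraction)
    n_test = int(len(files) * test_fraction)
    valid = files[:n_valid]
    test = files[n_valid:n_valid + n_test]
    train = files[n_valid + n_test:]
    return train, valid, test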

Usage

Train Model

To train the model, run the following in the terminal:

$ python train.py

The training took roughly 45 minutes for 50,000 iterations on the train set of the MIR-1K dataset using an NVIDIA GTX TITAN X graphics card.

The program loads the entire MIR-1K dataset into memory and keeps the processed data there to accelerate data sampling during training. However, this may consume more than 10 GB of memory.
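A rough sketch of this preload-then-sample pattern is shown below; the sample rate, channel handling, and batch logic are illustrative assumptions rather than the exact code in preprocess.py.

import random
import librosa

def preload_wavs(filenames, sr=16000):
    # Decode every clip once and keep the waveforms in memory.
    return [librosa.load(f, sr=sr, mono=False)[0] for f in filenames]

def sample_batch(wavs, batch_size=64):
    # Batches are drawn from the in-memory cache, avoiding repeated disk I/O.
    return random.sample(wavs, min(batch_size, len(wavs)))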

The trained model is saved to the model directory.

Evaluate Model

To evaluate the model, run the following in the terminal:

$ python evaluate.py

The evaluation took roughly 1 minute on the test set of the MIR-1K dataset using an NVIDIA GTX TITAN X graphics card. The separated sources, together with the monaural source, are saved to the demo directory.

          GNSDR    GSIR    GSAR
Vocal      7.40   12.75    9.34
BGM        7.45   13.17    9.25
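GNSDR, GSIR, and GSAR are the length-weighted averages over the test clips of the normalized source-to-distortion ratio (the SDR of the estimate minus the SDR obtained by treating the raw mixture as the estimate), the source-to-interference ratio, and the source-to-artifacts ratio, following the Huang et al. papers. A rough sketch of the per-clip computation with MIR_Eval is shown below; the helper name and stacking order are illustrative, not the exact code in evaluate.py.

import numpy as np
import mir_eval

def clip_metrics(ref_vocal, ref_bgm, est_vocal, est_bgm, mixture):
    references = np.vstack([ref_vocal, ref_bgm])
    estimates = np.vstack([est_vocal, est_bgm])
    sdr, sir, sar, _ = mir_eval.separation.bss_eval_sources(references, estimates)
    # NSDR: improvement in SDR over using the raw mixture as the estimate.
    sdr_mix, _, _, _ = mir_eval.separation.bss_eval_sources(
        references, np.vstack([mixture, mixture]))
    nsdr = sdr - sdr_mix
    # Length-weight and average nsdr, sir, sar across clips for GNSDR, GSIR, GSAR.
    return nsdr, sir, sar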

To do: save the evaluation statistics to a file.

Separate Sources for Customized Songs

To separate sources for customized songs, put the MP3-formatted songs into the songs directory and run the following in the terminal:

$ python main.py

The separated sources, together with the monaural source, are saved to the demo directory.
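Under the hood, each MP3 is decoded to a mono waveform (LibROSA relies on FFmpeg for MP3 decoding), the waveform is passed through the trained model, and the separated sources are written out. A rough sketch of that pipeline is shown below, assuming a hypothetical separate() wrapper around the trained RNN and WAV output via LibROSA 0.6; the actual I/O and output format in main.py may differ.

import os
import librosa

def separate_song(mp3_path, separate, out_dir='demo', sr=16000):
    # Decode the MP3 to a mono waveform (FFmpeg handles the MP3 decoding).
    mono, _ = librosa.load(mp3_path, sr=sr, mono=True)
    # separate() is a hypothetical wrapper around the trained separation model.
    vocal, bgm = separate(mono)
    name = os.path.splitext(os.path.basename(mp3_path))[0]
    librosa.output.write_wav(os.path.join(out_dir, name + '_mono.wav'), mono, sr)
    librosa.output.write_wav(os.path.join(out_dir, name + '_src1.wav'), vocal, sr)
    librosa.output.write_wav(os.path.join(out_dir, name + '_src2.wav'), bgm, sr)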

Demo

The MP3 of "Backstreet Boys - I Want It That Way", backstreet_boys-i_want_it_that_way.mp3, was placed in the songs directory. Using the pre-trained model in the model directory, run the following in the terminal:

$ python main.py

The separated sources, backstreet_boys-i_want_it_that_way_src1.mp3 and backstreet_boys-i_want_it_that_way_src2.mp3, together with the monaural source, backstreet_boys-i_want_it_that_way_mono.mp3, were saved to the demo directory.

References

  • Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, Paris Smaragdis, Singing-Voice Separation From Monaural Recordings Using Deep Recurrent Neural Networks. 2014.
  • Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, Paris Smaragdis, Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation. 2015.
  • Dabi Ahn's Music Source Separation Repository

To-Do List

  • Evaluation metrics
  • Hyperparameter tuning
  • Argparse