pritishyuvraj / Voice Conversion Gan

Licence: unlicense
Voice conversion using CycleGAN for non-parallel data

Voice-Conversion-GAN

Voice conversion using CycleGAN (PyTorch implementation).

[Figure: CycleGAN architecture]
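
For orientation, the training objective described in the referenced CycleGAN-VC paper can be sketched as below. The notation is introduced here for illustration (G are the two generators, D the two discriminators, X and Y the source and target speakers). Whether this repository uses exactly these terms and weights is an assumption, so treat the formula as a summary of the paper rather than of this code.

```latex
\begin{aligned}
\mathcal{L}_{\mathrm{full}} &= \mathcal{L}_{\mathrm{adv}}(G_{X \to Y}, D_Y)
  + \mathcal{L}_{\mathrm{adv}}(G_{Y \to X}, D_X)
  + \lambda_{\mathrm{cyc}}\, \mathcal{L}_{\mathrm{cyc}}
  + \lambda_{\mathrm{id}}\, \mathcal{L}_{\mathrm{id}} \\
\mathcal{L}_{\mathrm{cyc}} &= \mathbb{E}_{x}\!\left[ \lVert G_{Y \to X}(G_{X \to Y}(x)) - x \rVert_1 \right]
  + \mathbb{E}_{y}\!\left[ \lVert G_{X \to Y}(G_{Y \to X}(y)) - y \rVert_1 \right] \\
\mathcal{L}_{\mathrm{id}} &= \mathbb{E}_{y}\!\left[ \lVert G_{X \to Y}(y) - y \rVert_1 \right]
  + \mathbb{E}_{x}\!\left[ \lVert G_{Y \to X}(x) - x \rVert_1 \right]
\end{aligned}
```

The cycle-consistency term is what makes training on non-parallel data possible: a sample converted to the other speaker and back should match the original, so no time-aligned sentence pairs are needed.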

Dependencies

  • Python 3.5
  • NumPy 1.14
  • PyTorch 0.4.1
  • ProgressBar2 3.37.1
  • LibROSA 0.6
  • FFmpeg 4.0
  • PyWorld

Usage

Download Dataset

Download and unzip the VCC2016 dataset into the designated directories.

$ python download.py --help
usage: download.py [-h] [--download_dir DOWNLOAD_DIR] [--data_dir DATA_DIR]
                   [--datasets DATASETS]

optional arguments:
  -h, --help            show this help message and exit
  --download_dir DOWNLOAD_DIR
                        Download directory for zipped data
  --data_dir DATA_DIR   Data directory for unzipped data
  --datasets DATASETS   Datasets available: vcc2016

For example, to download the dataset into ./download and extract it into ./data:

$ python download.py --download_dir ./download --data_dir ./data --datasets vcc2016
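
After extraction, a quick sanity check can confirm that each speaker folder actually contains audio. The sketch below is not part of the repository: the helper name `count_wavs_per_speaker` is hypothetical, and it assumes the layout implied by the later examples (one sub-directory per speaker, such as SF1 or TM1, holding .wav files).

```python
from pathlib import Path


def count_wavs_per_speaker(training_root):
    """Map each speaker sub-directory name to its number of .wav files.

    Assumes one sub-directory per speaker directly under training_root.
    """
    root = Path(training_root)
    counts = {}
    for speaker_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Only files ending in .wav are counted; other files are ignored.
        counts[speaker_dir.name] = sum(1 for _ in speaker_dir.glob("*.wav"))
    return counts
```

For instance, `count_wavs_per_speaker("./data/vcc2016_training")` returns a dictionary keyed by speaker name. A speaker with a count of 0 usually means the archive was not unpacked where expected.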

Preprocessing for Training

Preprocess the voice data and store it in NumPy format in the ../cache folder.

$ python preprocess_training.py --help
usage: preprocess_training.py [-h] [--train_A_dir TRAIN_A_DIR]
                              [--train_B_dir TRAIN_B_DIR]
                              [--cache_folder CACHE_FOLDER]

Prepare data for training Cycle GAN using PyTorch

optional arguments:
  -h, --help            show this help message and exit
  --train_A_dir TRAIN_A_DIR
                        Directory for source voice sample
  --train_B_dir TRAIN_B_DIR
                        Directory for target voice sample
  --cache_folder CACHE_FOLDER
                        Store preprocessed data in cache folders

For example, to preprocess the training data for voice conversion between SF1 and TM1:

$ python preprocess_training.py --train_A_dir ../data/vcc2016_training/SF1 \
                                --train_B_dir ../data/vcc2016_training/TM1 \
                                --cache_folder ../cache/
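
To see what preprocessing wrote to the cache, the .npz archives can be inspected without knowing their contents in advance. This is a minimal sketch, not repository code: `describe_npz` is a hypothetical helper, and the array names stored in the files are deliberately not assumed, only listed.

```python
import numpy as np


def describe_npz(path):
    """Return {array_name: shape} for every array stored in an .npz archive."""
    # np.load on an .npz file returns a lazy archive; .files lists its keys.
    with np.load(path) as archive:
        return {name: archive[name].shape for name in archive.files}
```

For example, `describe_npz("../cache/logf0s_normalization.npz")` lists the arrays in the log-F0 statistics file that the training example passes to `--logf0s_normalization`.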

Train Model

$ python train.py --help
usage: train.py [-h] [--logf0s_normalization LOGF0S_NORMALIZATION]
                [--mcep_normalization MCEP_NORMALIZATION]
                [--coded_sps_A_norm CODED_SPS_A_NORM]
                [--coded_sps_B_norm CODED_SPS_B_NORM]
                [--model_checkpoint MODEL_CHECKPOINT]
                [--resume_training_at RESUME_TRAINING_AT]
                [--validation_A_dir VALIDATION_A_DIR]
                [--output_A_dir OUTPUT_A_DIR]
                [--validation_B_dir VALIDATION_B_DIR]
                [--output_B_dir OUTPUT_B_DIR]

Train CycleGAN using source dataset and target dataset

optional arguments:
  -h, --help            show this help message and exit
  --logf0s_normalization LOGF0S_NORMALIZATION
                        Cached location for log f0s normalized
  --mcep_normalization MCEP_NORMALIZATION
                        Cached location for mcep normalization
  --coded_sps_A_norm CODED_SPS_A_NORM
                        mcep norm for data A
  --coded_sps_B_norm CODED_SPS_B_NORM
                        mcep norm for data B
  --model_checkpoint MODEL_CHECKPOINT
                        location where you want to save the model
  --resume_training_at RESUME_TRAINING_AT
                        Location of the pre-trained model to resume training
  --validation_A_dir VALIDATION_A_DIR
                        validation set for sound source A
  --output_A_dir OUTPUT_A_DIR
                        output for converted Sound Source A
  --validation_B_dir VALIDATION_B_DIR
                        Validation set for sound source B
  --output_B_dir OUTPUT_B_DIR
                        Output for converted sound Source B
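
The log-F0 statistics are typically used to map the source speaker's pitch onto the target speaker's range; the referenced paper does this with a logarithm Gaussian normalized transformation. The sketch below illustrates that transformation in plain Python. The function name and argument names are hypothetical, and it is an assumption that this repository applies the statistics in exactly this way.

```python
import math


def convert_f0(f0_frames, mean_log_src, std_log_src, mean_log_tgt, std_log_tgt):
    """Log-Gaussian normalized F0 transformation, frame by frame.

    Unvoiced frames (f0 <= 0) stay at 0. Voiced frames are standardized in
    the log domain with the source statistics, then rescaled with the
    target statistics.
    """
    converted = []
    for f0 in f0_frames:
        if f0 <= 0.0:
            converted.append(0.0)
        else:
            z = (math.log(f0) - mean_log_src) / std_log_src
            converted.append(math.exp(z * std_log_tgt + mean_log_tgt))
    return converted
```

A frame sitting exactly at the source speaker's mean log-F0 lands on the target speaker's mean, and deviations are scaled by the ratio of the two standard deviations.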

For example, to train a CycleGAN model for voice conversion between SF1 and TF2:

$ python train.py --logf0s_normalization ../cache/logf0s_normalization.npz \
                  --mcep_normalization ../cache/mcep_normalization.npz \
                  --coded_sps_A_norm coded_sps_A_norm \
                  --coded_sps_B_norm coded_sps_B_norm \
                  --resume_training_at ../cache/model_checkpoint/_CycleGAN_CheckPoint \
                  --validation_A_dir ../data/vcc2016_training/evaluation_all/SF1/ \
                  --output_A_dir ../data/vcc2016_training/converted_sound/SF1 \
                  --validation_B_dir ../data/vcc2016_training/evaluation_all/TF2/ \
                  --output_B_dir ../data/vcc2016_training/converted_sound/TF2/
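
Because the training command takes many path arguments, it can be easier to assemble it from Python when running several speaker pairs. The following is a sketch only: `build_train_command` is a hypothetical helper, not part of the repository, and it uses only flags that appear in the help output above.

```python
import shlex
import sys


def build_train_command(options, script="train.py"):
    """Turn a list of (flag_name, value) pairs into an argv list."""
    argv = [sys.executable, script]
    for flag, value in options:
        argv += ["--" + flag, str(value)]
    return argv


options = [
    ("logf0s_normalization", "../cache/logf0s_normalization.npz"),
    ("mcep_normalization", "../cache/mcep_normalization.npz"),
    ("validation_A_dir", "../data/vcc2016_training/evaluation_all/SF1/"),
    ("output_A_dir", "../data/vcc2016_training/converted_sound/SF1"),
]
command = build_train_command(options)

# Print the shell-quoted command for inspection before launching it.
print(" ".join(shlex.quote(arg) for arg in command))
# To launch: import subprocess; subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.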

Reference

  • Takuhiro Kaneko, Hirokazu Kameoka. Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks. 2017. (Voice Conversion CycleGAN)
  • TensorFlow implementation

To-Do List

  • [x] CPU compatible
  • [ ] Sample Outputs
  • [ ] Evaluation Metrics

Useful Tutorials

PyTorch Tutorial: https://github.com/yunjey/pytorch-tutorial

Gaussian Mixture Model Voice Conversion: https://r9y9.github.io/nnmnkwii/latest/nnmnkwii_gallery/notebooks/vc/01-GMM%20voice%20conversion%20(en).html

CycleGAN - VC: http://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/cyclegan-vc/

Voice Conversion using Variational Auto-Encoder: https://github.com/JeremyCCHsu/vae-npvc
