
Tinglok / CVC

Licence: MIT license
CVC: Contrastive Learning for Non-parallel Voice Conversion (INTERSPEECH 2021, in PyTorch)

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to CVC

Voice Converter Cyclegan
Voice Converter Using CycleGAN and Non-Parallel Data
Stars: ✭ 384 (+753.33%)
Mutual labels:  speech, cyclegan
Shifter
Pitch shifter using WSOLA and resampling implemented by Python3
Stars: ✭ 22 (-51.11%)
Mutual labels:  speech, voice-conversion
VQMIVC
Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo!
Stars: ✭ 278 (+517.78%)
Mutual labels:  speech, voice-conversion
JD-NMF
Joint Dictionary Learning-based Non-Negative Matrix Factorization for Voice Conversion (TBME 2016)
Stars: ✭ 20 (-55.56%)
Mutual labels:  speech, voice-conversion
Phomeme
Simple sentence mixing tool (work in progress)
Stars: ✭ 18 (-60%)
Mutual labels:  speech, voice-conversion
DisCont
Code for the paper "DisCont: Self-Supervised Visual Attribute Disentanglement using Context Vectors".
Stars: ✭ 13 (-71.11%)
Mutual labels:  contrastive-learning
TCE
This repository contains the code implementation used in the paper Temporally Coherent Embeddings for Self-Supervised Video Representation Learning (TCE).
Stars: ✭ 51 (+13.33%)
Mutual labels:  contrastive-learning
audio noise clustering
https://dodiku.github.io/audio_noise_clustering/results/ ==> An experiment with a variety of clustering (and clustering-like) techniques to reduce noise on an audio speech recording.
Stars: ✭ 24 (-46.67%)
Mutual labels:  speech
simple-obs-stt
Speech-to-text and keyboard input captions for OBS.
Stars: ✭ 89 (+97.78%)
Mutual labels:  speech
UPIT
A fastai/PyTorch package for unpaired image-to-image translation.
Stars: ✭ 94 (+108.89%)
Mutual labels:  cyclegan
info-nce-pytorch
PyTorch implementation of the InfoNCE loss for self-supervised learning.
Stars: ✭ 160 (+255.56%)
Mutual labels:  contrastive-learning
Generative-Model
Repository for implementation of generative models with Tensorflow 1.x
Stars: ✭ 66 (+46.67%)
Mutual labels:  cyclegan
CLMR
Official PyTorch implementation of Contrastive Learning of Musical Representations
Stars: ✭ 216 (+380%)
Mutual labels:  contrastive-learning
kaldi ag training
Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-grammar.
Stars: ✭ 14 (-68.89%)
Mutual labels:  speech
ViCC
[WACV'22] Code repository for the paper "Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting", https://arxiv.org/abs/2106.10137.
Stars: ✭ 33 (-26.67%)
Mutual labels:  contrastive-learning
awesome-graph-self-supervised-learning-based-recommendation
A curated list of awesome graph & self-supervised-learning-based recommendation.
Stars: ✭ 37 (-17.78%)
Mutual labels:  contrastive-learning
MediumVC
Any-to-any voice conversion using synthetic specific-speaker speeches as intermedium features
Stars: ✭ 46 (+2.22%)
Mutual labels:  voice-conversion
GeDML
Generalized Deep Metric Learning.
Stars: ✭ 30 (-33.33%)
Mutual labels:  contrastive-learning
Zero-Shot-TTS
Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
Stars: ✭ 33 (-26.67%)
Mutual labels:  speech
day2night
Image2Image Translation Research
Stars: ✭ 46 (+2.22%)
Mutual labels:  cyclegan

Contrastive Voice Conversion (CVC)

Video (3m) | Website | Paper




This implementation is based on CUT; thanks to Taesung and Junyan for sharing their code.

We provide a PyTorch implementation of non-parallel voice conversion based on patch-wise contrastive learning and adversarial learning. Compared to the CycleGAN-VC baseline, CVC requires only one-way GAN training for non-parallel one-to-one voice conversion, while improving speech quality and reducing training time.
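
To make the "patch-wise contrastive learning" idea concrete, here is a minimal, dependency-free sketch of an InfoNCE-style patch loss of the kind CUT popularised. It is an illustration under stated assumptions, not the repository's code: the function name patch_nce_loss, the plain-list feature format, and the temperature of 0.07 are all assumptions made for this example. Each query patch is pulled towards the key at the same location (the positive) and pushed away from keys at the other locations (the negatives).

```python
import math


def patch_nce_loss(queries, keys, temperature=0.07):
    """Average InfoNCE loss over patches (illustrative sketch only).

    queries[i] and keys[i] are feature vectors for the same patch location
    (the positive pair); keys[j] with j != i act as negatives.
    The temperature value is an assumption, not taken from the repository.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    q = [normalize(v) for v in queries]
    k = [normalize(v) for v in keys]
    total = 0.0
    for i, qi in enumerate(q):
        # Cosine similarity between patch i and every key, scaled by temperature.
        logits = [sum(a * b for a, b in zip(qi, kj)) / temperature for kj in k]
        # Cross-entropy with the matching location as the target class,
        # computed via log-sum-exp with the max subtracted for stability.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(q)


if __name__ == "__main__":
    aligned = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    # Matching locations should give a much lower loss than mismatched ones.
    print(patch_nce_loss(aligned, aligned) < patch_nce_loss(aligned, swapped))
```

In a real model the vectors would come from an encoder applied to the input and the converted output, and this term would be combined with the adversarial loss; how the two are weighted is not specified here.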

Prerequisites

  • Linux or macOS
  • Python 3
  • CPU or NVIDIA GPU + CUDA CuDNN

Kick Start

  • Clone this repo:
git clone https://github.com/Tinglok/CVC
cd CVC
  • Install PyTorch 1.6 and other dependencies.

    For pip users, please type the command pip install -r requirements.txt.

    For Conda users, you can create a new Conda environment using conda env create -f environment.yaml.

  • Download the pre-trained Parallel WaveGAN vocoder to ./checkpoints/vocoder.

CVC Training and Test

  • Download the VCTK dataset
cd dataset
wget http://datashare.is.ed.ac.uk/download/DS_10283_2651.zip
unzip DS_10283_2651.zip
unzip VCTK-Corpus.zip
cp -r ./VCTK-Corpus/wav48/p* ./voice/trainA
cp -r ./VCTK-Corpus/wav48/p* ./voice/trainB

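Here p* is a placeholder for a speaker folder: copy one speaker into trainA and another into trainB (any two speakers work, e.g. p256 and p270). The training and test commands below read the data from ./datasets/voice, so make sure the trainA and trainB folders end up under that path.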
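
The copy step above can also be scripted, which avoids accidentally copying every speaker with the p* glob. The following is a minimal sketch, not part of the repository: the helper name copy_speaker is hypothetical, and both the speaker-to-folder assignment (p270 into trainA, p256 into trainB) and the resulting layout (trainA/p270, trainB/p256) are assumptions that may need adjusting to what the data loader expects.

```python
import shutil
from pathlib import Path


def copy_speaker(corpus_wav_dir, speaker, dest_dir):
    """Copy one speaker folder (e.g. 'p256') into dest_dir and return the new path.

    Hypothetical helper for illustration; it mirrors `cp -r <speaker> <dest_dir>`
    when dest_dir already exists.
    """
    src = Path(corpus_wav_dir) / speaker
    if not src.is_dir():
        raise FileNotFoundError(f"speaker folder not found: {src}")
    dest = Path(dest_dir) / speaker
    dest.parent.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok allows re-running the script without errors (Python 3.8+).
    shutil.copytree(src, dest, dirs_exist_ok=True)
    return dest


if __name__ == "__main__":
    wav_root = "./VCTK-Corpus/wav48"
    # Assumed assignment: one speaker per domain folder.
    copy_speaker(wav_root, "p270", "./voice/trainA")
    copy_speaker(wav_root, "p256", "./voice/trainB")
```

Run it from the directory that contains VCTK-Corpus, in the same place as the cp commands.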

  • Train the CVC model:
python train.py --dataroot ./datasets/voice --name CVC

The checkpoints will be stored at ./checkpoints/CVC/.

  • Test the CVC model:
python test.py --dataroot ./datasets/voice --validation_A_dir ./datasets/voice/trainA --output_A_dir ./checkpoints/CVC/converted_sound

The converted utterances are saved to ./checkpoints/CVC/converted_sound.

Baseline CycleGAN-VC Training and Test

  • Train the CycleGAN-VC model:
python train.py --dataroot ./datasets/voice --name CycleGAN --model cycle_gan
  • Test the CycleGAN-VC model:
python test.py --dataroot ./datasets/voice --validation_A_dir ./datasets/voice/trainA --output_A_dir ./checkpoints/CycleGAN/converted_sound --model cycle_gan

The converted utterances are saved to ./checkpoints/CycleGAN/converted_sound.
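
The CVC and CycleGAN-VC commands differ only in the experiment name and the --model flag, so they can be generated from one place. The sketch below is a convenience wrapper, not part of the repository: build_commands is a hypothetical helper, and it uses only the flags shown in the commands above.

```python
import shlex


def build_commands(name, model=None, dataroot="./datasets/voice"):
    """Return (train_cmd, test_cmd) argument lists for one experiment.

    Hypothetical helper; the flags are copied from the README commands.
    """
    extra = ["--model", model] if model else []
    train = ["python", "train.py", "--dataroot", dataroot, "--name", name] + extra
    test = [
        "python", "test.py",
        "--dataroot", dataroot,
        "--validation_A_dir", f"{dataroot}/trainA",
        "--output_A_dir", f"./checkpoints/{name}/converted_sound",
    ] + extra
    return train, test


if __name__ == "__main__":
    for name, model in [("CVC", None), ("CycleGAN", "cycle_gan")]:
        for cmd in build_commands(name, model):
            # Print the command; pass the list to subprocess.run to execute it.
            print(shlex.join(cmd))
```

Printing the commands first makes it easy to check them against the ones listed above before launching a long training run.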

Pre-trained CVC Model

Pre-trained models for p270-to-p256 and many-to-p249 are available at this URL.

TensorBoard Visualization

To view loss plots, run tensorboard --logdir=./checkpoints and open http://localhost:6006/ in a browser.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{li2021cvc,
  author={Tingle Li and Yichen Liu and Chenxu Hu and Hang Zhao},
  title={{CVC: Contrastive Learning for Non-Parallel Voice Conversion}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={1324--1328}
}