This repository contains the code implementation used in the paper Temporally Coherent Embeddings for Self-Supervised Video Representation Learning (TCE).

Stars: ✭ 51 (+13.33%)

Mutual labels: contrastive-learning

audio noise clustering

https://dodiku.github.io/audio_noise_clustering/results/ ==> An experiment with a variety of clustering (and clustering-like) techniques to reduce noise on an audio speech recording.

Stars: ✭ 24 (-46.67%)

Mutual labels: speech

simple-obs-stt

Speech-to-text and keyboard input captions for OBS.

Stars: ✭ 89 (+97.78%)

Mutual labels: speech

UPIT

A fastai/PyTorch package for unpaired image-to-image translation.

Stars: ✭ 94 (+108.89%)

Mutual labels: cyclegan

info-nce-pytorch

PyTorch implementation of the InfoNCE loss for self-supervised learning.

Stars: ✭ 160 (+255.56%)

Mutual labels: contrastive-learning

Generative-Model

Repository for implementation of generative models with Tensorflow 1.x

Stars: ✭ 66 (+46.67%)

Mutual labels: cyclegan

CLMR

Official PyTorch implementation of Contrastive Learning of Musical Representations

Stars: ✭ 216 (+380%)

Mutual labels: contrastive-learning

kaldi ag training

Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-grammar.

Stars: ✭ 14 (-68.89%)

Mutual labels: speech

ViCC

[WACV'22] Code repository for the paper "Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting", https://arxiv.org/abs/2106.10137.

Stars: ✭ 33 (-26.67%)

Mutual labels: contrastive-learning

awesome-graph-self-supervised-learning-based-recommendation

A curated list of awesome graph & self-supervised-learning-based recommendation.

Stars: ✭ 37 (-17.78%)

Mutual labels: contrastive-learning

MediumVC

Any-to-any voice conversion using synthetic specific-speaker speeches as intermedium features

Stars: ✭ 46 (+2.22%)

Mutual labels: voice-conversion

GeDML

Generalized Deep Metric Learning.

Stars: ✭ 30 (-33.33%)

Mutual labels: contrastive-learning

Zero-Shot-TTS

Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration

Stars: ✭ 33 (-26.67%)

Mutual labels: speech

day2night

Image2Image Translation Research

Stars: ✭ 46 (+2.22%)

Mutual labels: cyclegan

Contrastive Voice Conversion (CVC)

Video (3m) | Website | Paper

This implementation is based on CUT. Thanks to Taesung and Junyan for sharing their code.

We provide a PyTorch implementation of non-parallel voice conversion based on patch-wise contrastive learning and adversarial learning. Compared to the baseline CycleGAN-VC, CVC requires only one-way GAN training for non-parallel one-to-one voice conversion, while improving speech quality and reducing training time.
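To make the contrastive term concrete, here is a minimal sketch of an InfoNCE-style loss for a single query patch, written in plain Python. It is not the code from this repository: the function name `info_nce`, the cosine-similarity scoring, and the temperature value 0.07 are assumptions chosen for illustration. The idea is that a patch of the converted output should score higher against the patch at the same location in the input (the positive) than against patches from other locations (the negatives).

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss for one query: cross-entropy with the positive as the target.

    All names and the default temperature are illustrative assumptions.
    """
    q = _normalize(query)
    logits = [_dot(q, _normalize(positive)) / temperature]
    logits += [_dot(q, _normalize(n)) / temperature for n in negatives]
    # Numerically stable log-sum-exp over the positive and negative logits.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


# Hypothetical 3-dimensional patch features.
query = [1.0, 0.0, 0.0]
negatives = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
loss_matched = info_nce(query, [0.9, 0.1, 0.0], negatives)
loss_mismatched = info_nce(query, [0.0, 1.0, 0.1], negatives)
print(loss_matched < loss_mismatched)
```

The loss is small when the positive is close to the query and grows when the positive is no more similar than the negatives. A real implementation would compute this over batches of feature-map patches on the GPU.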

Prerequisites

Linux or macOS
Python 3
CPU or NVIDIA GPU + CUDA CuDNN

Kick Start

Clone this repo:

git clone https://github.com/Tinglok/CVC
cd CVC

Install PyTorch 1.6 and other dependencies.

For pip users, run pip install -r requirements.txt.

For Conda users, create a new Conda environment with conda env create -f environment.yaml.

Download the pre-trained Parallel WaveGAN vocoder to ./checkpoints/vocoder.
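Before training, it can help to confirm that the setup steps above took effect. The following is a minimal sketch using only the Python standard library. The function name `check_setup` and the dictionary keys are hypothetical and not part of this repository. It only checks that the files and directories named above exist and that `torch` can be found, not that the versions are correct.

```python
import importlib.util
from pathlib import Path


def check_setup(repo_root="."):
    """Report which pieces of the setup described above appear to be in place."""
    root = Path(repo_root)
    return {
        "requirements": (root / "requirements.txt").is_file(),
        "torch": importlib.util.find_spec("torch") is not None,
        "vocoder": (root / "checkpoints" / "vocoder").is_dir(),
    }


# Run from the repository root to get a quick status listing.
for item, ok in check_setup().items():
    print(f"{item}: {'ok' if ok else 'missing'}")
```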

CVC Training and Test

Download the VCTK dataset:

cd datasets
wget http://datashare.is.ed.ac.uk/download/DS_10283_2651.zip
unzip DS_10283_2651.zip
unzip VCTK-Corpus.zip
cp -r ./VCTK-Corpus/wav48/p* ./voice/trainA
cp -r ./VCTK-Corpus/wav48/p* ./voice/trainB

Here p* stands for a speaker folder. Any speakers can be used (e.g. p256 and p270).
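The two copy commands can also be written with explicit speaker IDs. The following is a minimal sketch, assuming the corpus has already been unzipped. The function `prepare_speakers` and its parameters are hypothetical and not part of this repository. It places each speaker in a subfolder of trainA or trainB, which is what `cp -r` does when the destination already exists. Whether the data loader expects that layout is an assumption.

```python
import shutil
from pathlib import Path


def prepare_speakers(corpus_wav_dir, voice_dir, speaker_a, speaker_b):
    """Copy one speaker folder into trainA and another into trainB."""
    corpus_wav_dir, voice_dir = Path(corpus_wav_dir), Path(voice_dir)
    for speaker, split in ((speaker_a, "trainA"), (speaker_b, "trainB")):
        src = corpus_wav_dir / speaker
        if not src.is_dir():
            raise FileNotFoundError(f"speaker folder not found: {src}")
        dst = voice_dir / split / speaker
        # dirs_exist_ok lets the copy be re-run without failing (Python 3.8+).
        shutil.copytree(src, dst, dirs_exist_ok=True)
    return voice_dir
```

For example, `prepare_speakers("./VCTK-Corpus/wav48", "./voice", "p256", "p270")` run from the dataset folder would mirror the commands above for those two speakers.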

Train the CVC model:

python train.py --dataroot ./datasets/voice --name CVC

The checkpoints will be stored at ./checkpoints/CVC/.

Test the CVC model:

python test.py --dataroot ./datasets/voice --validation_A_dir ./datasets/voice/trainA --output_A_dir ./checkpoints/CVC/converted_sound

The converted utterances will be saved to ./checkpoints/CVC/converted_sound.

Baseline CycleGAN-VC Training and Test

Train the CycleGAN-VC model:

python train.py --dataroot ./datasets/voice --name CycleGAN --model cycle_gan

Test the CycleGAN-VC model:

python test.py --dataroot ./datasets/voice --validation_A_dir ./datasets/voice/trainA --output_A_dir ./checkpoints/CycleGAN/converted_sound --model cycle_gan

The converted utterances will be saved to ./checkpoints/CycleGAN/converted_sound.
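The CVC and CycleGAN-VC commands differ only in the experiment name and the `--model` flag, so they can be generated from one place. The following is a minimal sketch that rebuilds the four commands shown above using only the flags that appear in them. The helper names `build_train_command` and `build_test_command` are hypothetical.

```python
import shlex

DATAROOT = "./datasets/voice"


def build_train_command(name, model=None):
    """Argument list for train.py, using only flags shown in this README."""
    cmd = ["python", "train.py", "--dataroot", DATAROOT, "--name", name]
    if model is not None:
        cmd += ["--model", model]
    return cmd


def build_test_command(name, model=None):
    """Argument list for test.py; output goes under ./checkpoints/<name>/."""
    cmd = [
        "python", "test.py", "--dataroot", DATAROOT,
        "--validation_A_dir", f"{DATAROOT}/trainA",
        "--output_A_dir", f"./checkpoints/{name}/converted_sound",
    ]
    if model is not None:
        cmd += ["--model", model]
    return cmd


for name, model in (("CVC", None), ("CycleGAN", "cycle_gan")):
    print(shlex.join(build_train_command(name, model)))
    print(shlex.join(build_test_command(name, model)))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the corresponding step.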

Pre-trained CVC Model

Pre-trained models for p270-to-p256 and many-to-p249 conversion are available at this URL.

TensorBoard Visualization

To view loss plots, run tensorboard --logdir=./checkpoints and open http://localhost:6006/ in a browser.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{li2021cvc,
  author={Tingle Li and Yichen Liu and Chenxu Hu and Hang Zhao},
  title={{CVC: Contrastive Learning for Non-Parallel Voice Conversion}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={1324--1328}
}
