bigpon / vcc20_baseline_cyclevae

Licence: MIT license
Voice Conversion Challenge 2020 CycleVAE baseline system

Voice Conversion Challenge 2020 baseline: CycleVAE w/ PWG vocoder

Official homepage: http://www.vc-challenge.org/

News

  • 2020/10/18 Updated the paper information.

  • 2020/4/17 Uploaded the missing SEF2-TEM1 conversion pair of reference_v.10.

  • 2020/3/18 Released the generated samples of reference_v.10.

  • 2020/3/11 Released the first version of the repo and the generated samples of the development set (dv50_vcc2020_24kHz).

Introduction

This repo provides a cyclic variational autoencoder (CycleVAE)-based voice conversion (VC) system with a parallel WaveGAN (PWG)-based vocoder for the Voice Conversion Challenge 2020 (VCC2020). VCC2020 consists of an intra-lingual VC task (Task1) and a cross-lingual VC task (Task2). Task1 includes four English source speakers and four English target speakers. Task2 uses the same four English source speakers with six non-English (German/Finnish/Mandarin) target speakers. The goal is to convert the speaker identity of the source speech to that of the target speaker while preserving the English linguistic content.
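
As a quick check of the task sizes described above, the sketch below enumerates the source-target conversion pairs. The speaker IDs are assumptions based on the VCC2020 naming scheme (for example, the SEF2-TEM1 pair mentioned in the News section) and should be verified against the downloaded corpus; `conversion_pairs` is a hypothetical helper, not part of this repo.

```python
from itertools import product

# Assumed speaker IDs (verify against the VCC2020 corpus).
SOURCE_SPEAKERS = ["SEF1", "SEF2", "SEM1", "SEM2"]                 # English sources
TASK1_TARGETS = ["TEF1", "TEF2", "TEM1", "TEM2"]                   # English targets
TASK2_TARGETS = ["TFF1", "TFM1", "TGF1", "TGM1", "TMF1", "TMM1"]   # Finnish/German/Mandarin targets


def conversion_pairs(sources, targets):
    """Return every 'source-target' pair name for one task."""
    return [f"{src}-{trg}" for src, trg in product(sources, targets)]


task1_pairs = conversion_pairs(SOURCE_SPEAKERS, TASK1_TARGETS)
task2_pairs = conversion_pairs(SOURCE_SPEAKERS, TASK2_TARGETS)
print(len(task1_pairs), len(task2_pairs))  # 16 24
```

Under these assumptions, Task1 has 4 x 4 = 16 conversion pairs and Task2 has 4 x 6 = 24.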

CycleVAE w/ PWG vocoder

This baseline VC system uses WORLD-based acoustic features: spectral features (further parameterized as mel-cepstrum, mcep), pitch (f0), and aperiodicity (ap). The CycleVAE model converts only the spectral features. The log-scaled f0 is converted with a linear transformation, and ap is copied unchanged from the source speaker.
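
The linear log-f0 conversion can be sketched as follows. This is a minimal illustration of the commonly used mean-variance transformation in the log domain, assuming unvoiced frames are marked with f0 = 0; the function names are hypothetical and the repo's actual implementation may differ.

```python
import math
from statistics import mean, pstdev


def logf0_stats(f0_frames):
    """Mean and standard deviation of log-f0 over voiced frames (f0 > 0)."""
    logf0 = [math.log(f) for f in f0_frames if f > 0]
    return mean(logf0), pstdev(logf0)


def convert_f0(src_f0, src_stats, trg_stats):
    """Map source f0 to the target speaker's log-f0 mean and variance."""
    src_mean, src_std = src_stats
    trg_mean, trg_std = trg_stats
    if src_std == 0:
        raise ValueError("source log-f0 has zero variance")
    converted = []
    for f in src_f0:
        if f > 0:
            # Normalize with source statistics, then rescale with target statistics.
            z = (math.log(f) - src_mean) / src_std
            converted.append(math.exp(z * trg_std + trg_mean))
        else:
            converted.append(0.0)  # unvoiced frames stay unvoiced
    return converted


# Toy statistics: the target speaker is exactly one octave above the source.
src_stats = logf0_stats([100.0, 110.0, 0.0, 120.0, 130.0])
trg_stats = logf0_stats([200.0, 220.0, 0.0, 240.0, 260.0])
converted = convert_f0([0.0, 110.0], src_stats, trg_stats)
# The unvoiced frame stays 0.0 and 110 Hz maps to roughly 220 Hz.
```

In practice the statistics would be computed over each speaker's full training set rather than a handful of frames.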

This repo provides two training procedures for the PWG vocoder. The first PWG vocoder is trained with natural acoustic features and natural waveforms. The second PWG vocoder is trained with both artificial and natural acoustic features, paired with natural waveforms. Specifically, the artificial acoustic features are the self-reconstructed and pseudo-converted (target->source->target) features generated by the CycleVAE, which remain temporally aligned with the natural waveforms. Because this reduces the mismatch between training and testing data, the second PWG vocoder achieves higher speech quality when its input is converted acoustic features.
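
The difference between the two vocoder training sets can be sketched as below. The tags, the function name, and the utterance IDs are hypothetical and only illustrate which feature types are paired with the natural waveforms; see baseline/README.md for the actual training recipe.

```python
def build_pwg_training_list(utt_ids, target, sources, natural_only=False):
    """Return (feature_tag, utterance_id, waveform_speaker) entries.

    Every entry pairs one kind of acoustic feature with the natural
    waveform of the target speaker for the same utterance.
    """
    # First vocoder: natural features only.
    entries = [("natural", utt, target) for utt in utt_ids]
    if natural_only:
        return entries
    # Second vocoder: also add CycleVAE-generated features.
    for utt in utt_ids:
        # Self-reconstructed features (target -> target).
        entries.append((f"recon:{target}", utt, target))
        # Pseudo-converted (cyclic) features (target -> source -> target).
        for src in sources:
            entries.append((f"cyclic:{target}->{src}->{target}", utt, target))
    return entries


first = build_pwg_training_list(["utt1", "utt2"], "TEM1", ["SEF1", "SEF2"], natural_only=True)
second = build_pwg_training_list(["utt1", "utt2"], "TEM1", ["SEF1", "SEF2"])
print(len(first), len(second))  # 2 8
```

All artificial features originate from a target-speaker utterance, which is why each of them can reuse that utterance's natural waveform as the training reference.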

Model and demo

The trained CycleVAE and PWG models can be accessed here.
The generated samples can be accessed here.

Corpus

Only the VCC2020 corpus is used to train both the CycleVAE and the PWG models.

  • VCC2020 contains all the training data of the challenge. Please follow the organizers' instructions to download it into the desired directory (the default is baseline/egs/cyclevae/wav_24kHz/).
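
After downloading, a small check such as the following can confirm that the corpus is in place. It is a sketch that assumes one sub-directory per speaker containing `.wav` files, which is not confirmed by this README; `count_wavs_per_speaker` is a hypothetical helper.

```python
from pathlib import Path


def count_wavs_per_speaker(wav_root):
    """Count .wav files in each speaker sub-directory of wav_root."""
    root = Path(wav_root)
    if not root.is_dir():
        raise FileNotFoundError(f"corpus directory not found: {root}")
    return {
        spk_dir.name: len(list(spk_dir.glob("*.wav")))
        for spk_dir in sorted(root.iterdir())
        if spk_dir.is_dir()
    }


# Example call with the default corpus location:
# print(count_wavs_per_speaker("baseline/egs/cyclevae/wav_24kHz"))
```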

Usage and requirements

Please check baseline/README.md.


References


Citation

If you find this code helpful, please cite the following paper.

@InProceedings{vcc20vaebaseline,
author={Tobing, Patrick Lumban and Wu, Yi-Chiao and Toda, Tomoki},
title={Baseline System of Voice Conversion Challenge 2020 with Cyclic Variational Autoencoder and Parallel WaveGAN},
booktitle={Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020},
year={2020},
month={Oct.},
}

Authors

Development:
Patrick Lumban Tobing @ Nagoya University (@patrickltobing)
Yi-Chiao Wu @ Nagoya University (@bigpon)

Advisor:
Tomoki Toda @ Nagoya University

