
auspicious3000 / Autovc

License: MIT
AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

Programming Languages

Python
139,335 projects; #7 most used programming language

Projects that are alternatives of or similar to Autovc

VAENAR-TTS
PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.
Stars: ✭ 66 (-86.39%)
Mutual labels:  speech-synthesis, unsupervised-learning
Athena
An open-source implementation of a sequence-to-sequence based speech processing engine
Stars: ✭ 542 (+11.75%)
Mutual labels:  unsupervised-learning, speech-synthesis
Espeak
eSpeak NG is an open source speech synthesizer that supports 101 languages and accents.
Stars: ✭ 339 (-30.1%)
Mutual labels:  speech-synthesis
Enlightengan
[IEEE TIP'2021] "EnlightenGAN: Deep Light Enhancement without Paired Supervision" by Yifan Jiang, Xinyu Gong, Ding Liu, Yu Cheng, Chen Fang, Xiaohui Shen, Jianchao Yang, Pan Zhou, Zhangyang Wang
Stars: ✭ 434 (-10.52%)
Mutual labels:  unsupervised-learning
Recycle Gan
Unsupervised Video Retargeting (e.g. face to face, flower to flower, clouds and winds, sunrise and sunset)
Stars: ✭ 367 (-24.33%)
Mutual labels:  unsupervised-learning
Pytorch Cortexnet
PyTorch implementation of the CortexNet predictive model
Stars: ✭ 349 (-28.04%)
Mutual labels:  unsupervised-learning
Disentangling Vae
Experiments for understanding disentanglement in VAE latent representations
Stars: ✭ 398 (-17.94%)
Mutual labels:  unsupervised-learning
Mlxtend
A library of extension and helper modules for Python's data analysis and machine learning libraries.
Stars: ✭ 3,729 (+668.87%)
Mutual labels:  unsupervised-learning
Gantts
PyTorch implementation of GAN-based text-to-speech synthesis and voice conversion (VC)
Stars: ✭ 460 (-5.15%)
Mutual labels:  speech-synthesis
Espnet
End-to-End Speech Processing Toolkit
Stars: ✭ 4,533 (+834.64%)
Mutual labels:  speech-synthesis
Sprocket
Voice Conversion Tool Kit
Stars: ✭ 425 (-12.37%)
Mutual labels:  speech-synthesis
Voice Builder
An opensource text-to-speech (TTS) voice building tool
Stars: ✭ 362 (-25.36%)
Mutual labels:  speech-synthesis
Pase
Problem Agnostic Speech Encoder
Stars: ✭ 348 (-28.25%)
Mutual labels:  unsupervised-learning
Awesome Vaes
A curated list of awesome work on VAEs, disentanglement, representation learning, and generative models.
Stars: ✭ 418 (-13.81%)
Mutual labels:  unsupervised-learning
Mmt
[ICLR-2020] Mutual Mean-Teaching: Pseudo Label Refinery for Unsupervised Domain Adaptation on Person Re-identification.
Stars: ✭ 345 (-28.87%)
Mutual labels:  unsupervised-learning
Corex topic
Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx
Stars: ✭ 439 (-9.48%)
Mutual labels:  unsupervised-learning
Paragraph Vectors
📄 A PyTorch implementation of Paragraph Vectors (doc2vec).
Stars: ✭ 337 (-30.52%)
Mutual labels:  unsupervised-learning
Libfaceid
libfaceid is a research framework for prototyping of face recognition solutions. It seamlessly integrates multiple detection, recognition and liveness models w/ speech synthesis and speech recognition.
Stars: ✭ 354 (-27.01%)
Mutual labels:  speech-synthesis
Contrastive Predictive Coding
Keras implementation of Representation Learning with Contrastive Predictive Coding
Stars: ✭ 369 (-23.92%)
Mutual labels:  unsupervised-learning
Sc Sfmlearner Release
Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video (NeurIPS 2019)
Stars: ✭ 468 (-3.51%)
Mutual labels:  unsupervised-learning

AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

Check out our new project: Unsupervised Speech Decomposition for Rhythm, Pitch, and Timbre Conversion https://github.com/auspicious3000/SpeechSplit

This repository provides a PyTorch implementation of AUTOVC.

AUTOVC is a many-to-many non-parallel voice conversion framework.

If you find this work useful and use it in your research, please consider citing our paper.

@InProceedings{pmlr-v97-qian19c,
  title     = {{A}uto{VC}: Zero-Shot Voice Style Transfer with Only Autoencoder Loss},
  author    = {Qian, Kaizhi and Zhang, Yang and Chang, Shiyu and Yang, Xuesong and Hasegawa-Johnson, Mark},
  pages     = {5210--5219},
  year      = {2019},
  editor    = {Kamalika Chaudhuri and Ruslan Salakhutdinov},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  address   = {Long Beach, California, USA},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/qian19c/qian19c.pdf},
  url       = {http://proceedings.mlr.press/v97/qian19c.html}
}

Audio Demo

The audio demo for AUTOVC can be found here

Dependencies

  • Python 3
  • Numpy
  • PyTorch >= v0.4.1
  • TensorFlow >= v1.3 (only for TensorBoard)
  • librosa
  • tqdm
  • wavenet_vocoder (pip install wavenet_vocoder); for more information, please refer to https://github.com/r9y9/wavenet_vocoder

Pre-trained models

AUTOVC: link
Speaker Encoder: link
WaveNet Vocoder: link

0. Convert Mel-Spectrograms

Download the pre-trained AUTOVC model and run conversion.ipynb in the same directory.
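Conceptually, the conversion follows the autoencoder pipeline from the paper: a content encoder squeezes the source mel-spectrogram through a narrow, temporally downsampled bottleneck, and a decoder reconstructs mel frames conditioned on the target speaker's embedding. A minimal NumPy sketch of that information flow (function and variable names, dimensions, and the random projections are illustrative only, not the repo's actual API):

```python
import numpy as np

def convert_sketch(mel_source, emb_target, bottleneck=16, downsample=32):
    """Toy illustration of the AutoVC bottleneck: compress the content,
    then decode it conditioned on the target speaker embedding."""
    T, n_mels = mel_source.shape
    # "Content encoder": project each frame to a narrow code, then keep
    # only every `downsample`-th code (the temporal bottleneck).
    W_enc = np.random.randn(n_mels, bottleneck) * 0.01
    codes = (mel_source @ W_enc)[::downsample]
    # Upsample the codes back to frame rate by repetition.
    codes_up = np.repeat(codes, downsample, axis=0)[:T]
    # "Decoder": every frame sees its content code plus the target
    # speaker embedding, and is mapped back to mel dimensions.
    dec_in = np.concatenate([codes_up, np.tile(emb_target, (T, 1))], axis=1)
    W_dec = np.random.randn(dec_in.shape[1], n_mels) * 0.01
    return dec_in @ W_dec  # converted mel-spectrogram, shape (T, n_mels)

mel = np.random.randn(128, 80)  # fake source utterance: 128 frames, 80 mel bands
emb = np.random.randn(256)      # fake target speaker embedding (GE2E-sized)
out = convert_sketch(mel, emb)
print(out.shape)                # (128, 80)
```

The bottleneck is the key design choice: it is too narrow to carry speaker identity, so the decoder is forced to take timbre from the supplied embedding, which is what makes zero-shot conversion possible.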

1. Mel-Spectrograms to Waveform

Download the pre-trained WaveNet Vocoder model and run vocoder.ipynb in the same directory.

Please note that the training metadata and the testing metadata have different formats.
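The two formats differ because they serve different jobs: training metadata only needs a speaker embedding plus paths to that speaker's spectrogram files, while conversion-time metadata carries the actual spectrogram of each test utterance. A hedged sketch of the two shapes (the exact field layout here is illustrative; check make_metadata.py and the demo notebooks for the format this repo actually uses):

```python
import pickle
import numpy as np

# Training metadata: one entry per speaker,
# roughly [speaker_id, speaker_embedding, spectrogram_path, ...]
train_meta = [
    ["p225", np.random.randn(256).astype(np.float32),
     "p225/p225_001.npy", "p225/p225_002.npy"],
]

# Conversion (test) metadata: one entry per utterance,
# with the mel-spectrogram stored inline instead of a path
test_meta = [
    ["p225", np.random.randn(256).astype(np.float32),
     np.random.randn(128, 80).astype(np.float32)],
]

# Both are typically pickled; round-trip one to show the structure survives
restored = pickle.loads(pickle.dumps(test_meta))
print(restored[0][0], restored[0][2].shape)  # p225 (128, 80)
```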

2. Train Model

We have included a small set of training audio files in the wav folder. However, this data is very small and is intended for code verification purposes only. Please prepare your own dataset for training.

1. Generate spectrogram data from the wav files: python make_spect.py

2. Generate training metadata, including the GE2E speaker embeddings (use one-hot embeddings instead if you are not doing zero-shot conversion): python make_metadata.py

3. Run the main training script: python main.py
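Step 1 boils down to turning each wav file into a log-mel-spectrogram. The same idea can be sketched from scratch in NumPy; the parameters below (n_fft=1024, hop=256, 80 mel bands at 16 kHz) are typical choices for this kind of pipeline, but verify them against make_spect.py before relying on them:

```python
import numpy as np

def stft_mag(x, n_fft=1024, hop=256):
    # Frame the signal with a Hann window and take the magnitude FFT
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1))

def mel_filterbank(sr=16000, n_fft=1024, n_mels=80):
    # Triangular filters spaced evenly on the mel scale
    def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    pts = mel_to_hz(np.linspace(0, hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fb[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising edge
        fb[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling edge
    return fb

x = np.random.randn(16000)                    # stand-in for 1 s of audio at 16 kHz
S = stft_mag(x)                               # (frames, n_fft // 2 + 1)
mel = np.log(S @ mel_filterbank().T + 1e-6)   # log-mel, (frames, 80)
print(mel.shape)
```

In practice the repo relies on librosa for loading and signal processing; this sketch only shows what the saved .npy spectrograms represent.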

The model converges when the reconstruction loss falls to around 0.0001.
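That 0.0001 figure refers to the self-reconstruction term. The full objective in the paper combines mel reconstruction (before and after the post-net) with a content-code consistency term; a condensed NumPy version of that combination (the post-net term is folded into the reconstruction term here, and the weighting is illustrative):

```python
import numpy as np

def autovc_loss(mel, mel_hat, codes, codes_hat, lam=1.0):
    # Self-reconstruction: the converted-to-self output should match the input
    l_recon = np.mean((mel - mel_hat) ** 2)
    # Content consistency: re-encoding the output should reproduce the codes
    l_content = np.mean(np.abs(codes - codes_hat))
    return l_recon + lam * l_content

mel = np.random.randn(128, 80)
codes = np.random.randn(4, 16)
print(autovc_loss(mel, mel, codes, codes))  # 0.0 for a perfect reconstruction
```

Watching the reconstruction component alone is a reasonable convergence signal, which is why the README quotes a single loss value.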

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].