
Voice Conversion with GANs

Programming Languages

python

Projects that are alternatives to or similar to Vc With Gan

Segan
A PyTorch implementation of SEGAN based on INTERSPEECH 2017 paper "SEGAN: Speech Enhancement Generative Adversarial Network"
Stars: ✭ 82 (+530.77%)
Mutual labels:  gan, voice
Inverse Style Gan
Looking up generative latent vectors from (face) reference images.
Stars: ✭ 26 (+100%)
Mutual labels:  gan
Musegan
An AI for Music Generation
Stars: ✭ 794 (+6007.69%)
Mutual labels:  gan
Csmri Refinement
Code for "Adversarial and Perceptual Refinement Compressed Sensing MRI Reconstruction"
Stars: ✭ 21 (+61.54%)
Mutual labels:  gan
Utox
µTox the lightest and fluffiest Tox client
Stars: ✭ 820 (+6207.69%)
Mutual labels:  voice
Unsupnts
Unsupervised Neural Text Simplification
Stars: ✭ 23 (+76.92%)
Mutual labels:  gan
Instagan
InstaGAN: Instance-aware Image Translation (ICLR 2019)
Stars: ✭ 761 (+5753.85%)
Mutual labels:  gan
Transgan
[Preprint] "TransGAN: Two Transformers Can Make One Strong GAN", Yifan Jiang, Shiyu Chang, Zhangyang Wang
Stars: ✭ 864 (+6546.15%)
Mutual labels:  gan
Say
Convert text to audible speech. Play it or save it to an audio file.
Stars: ✭ 24 (+84.62%)
Mutual labels:  voice
Began Tensorflow
Tensorflow implementation of "BEGAN: Boundary Equilibrium Generative Adversarial Networks"
Stars: ✭ 904 (+6853.85%)
Mutual labels:  gan
Xdf Gan
A GAN for the generation of mock astronomical surveys
Stars: ✭ 17 (+30.77%)
Mutual labels:  gan
Tensorlayer
Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥
Stars: ✭ 6,796 (+52176.92%)
Mutual labels:  gan
Bernard
Bernard is a voice assistant developed with gTTS. It can fulfill basic and simple tasks you give.
Stars: ✭ 24 (+84.62%)
Mutual labels:  voice
Generative Models
Collection of generative models, e.g. GAN, VAE in Pytorch and Tensorflow.
Stars: ✭ 6,701 (+51446.15%)
Mutual labels:  gan
Xunfei Clj
A Clojure wrapper for the iFlytek (Xunfei) speech SDK, usable from Emacs/Vim or the command line; supports voice reminders, speech recognition, and speech-to-command.
Stars: ✭ 26 (+100%)
Mutual labels:  voice
Pytorch Pretrained Biggan
🦋A PyTorch implementation of BigGAN with pretrained weights and conversion scripts.
Stars: ✭ 779 (+5892.31%)
Mutual labels:  gan
Lightning Bolts
Toolbox of models, callbacks, and datasets for AI/ML researchers.
Stars: ✭ 829 (+6276.92%)
Mutual labels:  gan
Advanced Deep Learning With Keras
Advanced Deep Learning with Keras, published by Packt
Stars: ✭ 917 (+6953.85%)
Mutual labels:  gan
St Cgan
Dataset and Code for our CVPR'18 paper ST-CGAN: "Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal"
Stars: ✭ 13 (+0%)
Mutual labels:  gan
Vonage Php Sdk Core
Vonage REST API client for PHP. API support for SMS, Voice, Text-to-Speech, Numbers, Verify (2FA) and more.
Stars: ✭ 849 (+6430.77%)
Mutual labels:  voice

VC-with-GAN

CS 753 ASR project

Usage Steps:

  1. Run bash download.sh to prepare the VCC2018 dataset.
  2. Run analyzer.py to extract features and write features into binary files. (This takes a few minutes.)
  3. Run build.py to record some stats, such as spectral extrema and pitch.
  4. To train a VAWGAN, for example, run
python main.py \
--model VAWGAN \
--trainer VAWGANTrainer \
--architecture architecture-vawgan-vcc2016.json
  5. You can find your models in ./logdir/train/[timestamp]
  6. To convert the voice, run
python convert.py \
--src VCC2SF1 \
--trg VCC2TM1 \
--model VAWGAN \
--checkpoint logdir/train/[timestamp]/[model.ckpt-[id]] \
--file_pattern "./dataset/vcc2018/bin/Training Set/{}/[0-9]*.bin"
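The --file_pattern argument is a template whose {} slot is filled with the speaker id and then treated as a glob. As a hypothetical illustration (this helper is not part of the repo), the expansion amounts to:

```python
import glob


def list_speaker_files(file_pattern, speaker):
    """Expand a --file_pattern template for one speaker.

    `{}` is replaced with the speaker id and the result is treated as a
    glob, e.g. './dataset/vcc2018/bin/Training Set/VCC2SF1/[0-9]*.bin'.
    """
    return sorted(glob.glob(file_pattern.format(speaker)))
```

Sorting keeps the utterance files in a deterministic order across runs.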

(Please fill in the timestamp and the model id.)
  7. You can find the converted wav files in ./logdir/output/[timestamp]
  8. If you want to convert all the voices, run

./convert_all.sh \
--model VAWGAN \
--checkpoint logdir/train/[timestamp]/[model.ckpt-[id]] \
--output_dir [directory to store converted audio]
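Since the commands above require filling in the run timestamp and checkpoint id by hand, a small hypothetical helper (not part of this repo) could locate the newest checkpoint automatically, assuming TensorFlow's usual model.ckpt-[id].index naming:

```python
import os
import re


def latest_checkpoint(train_dir="logdir/train"):
    """Return the checkpoint prefix of the newest run under train_dir.

    Hypothetical helper: picks the most recently modified timestamp
    directory, then the highest checkpoint id found in it, and returns
    a prefix like 'logdir/train/<timestamp>/model.ckpt-<id>'.
    """
    runs = [os.path.join(train_dir, d) for d in os.listdir(train_dir)]
    runs = [r for r in runs if os.path.isdir(r)]
    if not runs:
        raise FileNotFoundError("no runs under %s" % train_dir)
    run = max(runs, key=os.path.getmtime)  # newest timestamp directory
    ids = [int(m.group(1))
           for name in os.listdir(run)
           for m in [re.match(r"model\.ckpt-(\d+)\.index$", name)] if m]
    if not ids:
        raise FileNotFoundError("no checkpoints in %s" % run)
    return os.path.join(run, "model.ckpt-%d" % max(ids))
```

The returned prefix can be passed directly as the --checkpoint argument.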

Usage for Sentence Embeddings:

  1. Ensure you have w_prob_dict.pkl and w_vec_dict.pkl in the data directory.

    1. For w_prob_dict.pkl you have two options: either use get_word_prob_from_corpus, which requires a corpus as input (we used WikiText), or obtain a CSV file with unigram probabilities (the source is mentioned in the report: http://norvig.com/ngrams/) and use get_w_prob_from_csv.
    2. For w_vec_dict.pkl, initialize a Sentence_Embedding object and call prune_word_vec. This keeps only the embeddings that appear in the transcriptions, since parsing the whole fastText data takes far more time (and RAM).
    3. All pickle files are shared here https://drive.google.com/drive/folders/1FWGGEQ9wTUewBDFq5ssT4BP4cMyt8lh1
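If you go the CSV route, the loading step presumably amounts to normalising word counts into probabilities. A minimal sketch, assuming a two-column word/count format like the Norvig ngram files (the repo's actual get_w_prob_from_csv may differ):

```python
import csv


def load_word_probs(path, delimiter="\t"):
    """Sketch of building a unigram-probability dict from a count file.

    Assumes each row is `word<delimiter>count`; probabilities are the
    counts normalised by the total count.
    """
    counts = {}
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter=delimiter):
            if len(row) >= 2:
                counts[row[0]] = counts.get(row[0], 0.0) + float(row[1])
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}
```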
  2. Download the dataset using bash download.sh

  3. Run python sentence_embedding.py. This should create sent_emb.pkl inside the data directory.

  4. Run analyzer.py to extract features and store them along with the sentence embeddings.

  5. Run build.py to compute statistics about the features.

  6. To train with sentence embedding, run

python main.py \
--model VAWGAN_S \
--trainer VAWGAN_S \
--architecture architecture-vawgan-sent.json
  7. For conversion, run
python convert.py \
--src VCC2SF1 \
--trg VCC2TM1 \
--model VAWGAN_S \
--checkpoint logdir/train/[timestamp]/[model.ckpt-[id]] \
--file_pattern "./dataset/vcc2018/bin/Training Set/{}/[0-9]*.bin"

or

./convert_all.sh \
--model VAWGAN_S \
--checkpoint logdir/train/[timestamp]/[model.ckpt-[id]] \
--output_dir [directory to store converted audio]
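For intuition, a sentence embedding built from w_prob_dict.pkl and w_vec_dict.pkl is plausibly a probability-weighted average of word vectors (in the spirit of SIF weighting, a / (a + p(w))). A sketch under that assumption; the repo's Sentence_Embedding class may compute this differently:

```python
import numpy as np


def sentence_embedding(words, w_vec, w_prob, a=1e-3):
    """Weighted average of word vectors, with weight a / (a + p(w)).

    Sketch only: rare words get weights near 1, frequent words are
    down-weighted. Words missing from either dictionary are skipped.
    """
    vecs = [a / (a + w_prob[w]) * np.asarray(w_vec[w])
            for w in words if w in w_vec and w in w_prob]
    if not vecs:
        raise ValueError("no known words in sentence")
    return np.mean(vecs, axis=0)
```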