
Voice Conversion with GANs

Programming Languages

python

Projects that are alternatives to or similar to Vc With Gan

Segan
A PyTorch implementation of SEGAN based on INTERSPEECH 2017 paper "SEGAN: Speech Enhancement Generative Adversarial Network"
Stars: ✭ 82 (+530.77%)
Mutual labels:  gan, voice
Inverse Style Gan
Looking up generative latent vectors from (face) reference images.
Stars: ✭ 26 (+100%)
Mutual labels:  gan
Musegan
An AI for Music Generation
Stars: ✭ 794 (+6007.69%)
Mutual labels:  gan
Csmri Refinement
Code for "Adversarial and Perceptual Refinement Compressed Sensing MRI Reconstruction"
Stars: ✭ 21 (+61.54%)
Mutual labels:  gan
Utox
µTox the lightest and fluffiest Tox client
Stars: ✭ 820 (+6207.69%)
Mutual labels:  voice
Unsupnts
Unsupervised Neural Text Simplification
Stars: ✭ 23 (+76.92%)
Mutual labels:  gan
Instagan
InstaGAN: Instance-aware Image Translation (ICLR 2019)
Stars: ✭ 761 (+5753.85%)
Mutual labels:  gan
Transgan
[Preprint] "TransGAN: Two Transformers Can Make One Strong GAN", Yifan Jiang, Shiyu Chang, Zhangyang Wang
Stars: ✭ 864 (+6546.15%)
Mutual labels:  gan
Say
Convert text to audible speech. Play it or save it to an audio file.
Stars: ✭ 24 (+84.62%)
Mutual labels:  voice
Began Tensorflow
Tensorflow implementation of "BEGAN: Boundary Equilibrium Generative Adversarial Networks"
Stars: ✭ 904 (+6853.85%)
Mutual labels:  gan
Xdf Gan
A GAN for the generation of mock astronomical surveys
Stars: ✭ 17 (+30.77%)
Mutual labels:  gan
Tensorlayer
Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥
Stars: ✭ 6,796 (+52176.92%)
Mutual labels:  gan
Bernard
Bernard is a voice assistant developed with gTTS. It can fulfill basic and simple tasks you give.
Stars: ✭ 24 (+84.62%)
Mutual labels:  voice
Generative Models
Collection of generative models, e.g. GAN, VAE in Pytorch and Tensorflow.
Stars: ✭ 6,701 (+51446.15%)
Mutual labels:  gan
Xunfei Clj
A Clojure wrapper for the iFlytek (Xunfei) speech SDK, usable from Emacs/Vim or the command line; supports voice reminders, speech recognition, and speech-to-command.
Stars: ✭ 26 (+100%)
Mutual labels:  voice
Pytorch Pretrained Biggan
🦋A PyTorch implementation of BigGAN with pretrained weights and conversion scripts.
Stars: ✭ 779 (+5892.31%)
Mutual labels:  gan
Lightning Bolts
Toolbox of models, callbacks, and datasets for AI/ML researchers.
Stars: ✭ 829 (+6276.92%)
Mutual labels:  gan
Advanced Deep Learning With Keras
Advanced Deep Learning with Keras, published by Packt
Stars: ✭ 917 (+6953.85%)
Mutual labels:  gan
St Cgan
Dataset and Code for our CVPR'18 paper ST-CGAN: "Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal"
Stars: ✭ 13 (+0%)
Mutual labels:  gan
Vonage Php Sdk Core
Vonage REST API client for PHP. API support for SMS, Voice, Text-to-Speech, Numbers, Verify (2FA) and more.
Stars: ✭ 849 (+6430.77%)
Mutual labels:  voice

VC-with-GAN

CS 753 ASR project

Usage Steps:

  1. Run bash download.sh to prepare the VCC2018 dataset.
  2. Run analyzer.py to extract features and write features into binary files. (This takes a few minutes.)
  3. Run build.py to record some stats, such as spectral extrema and pitch.
  4. To train a VAWGAN, for example, run
python main.py \
--model VAWGAN \
--trainer VAWGANTrainer \
--architecture architecture-vawgan-vcc2016.json
  5. You can find your models in ./logdir/train/[timestamp]
  6. To convert the voice, run
python convert.py \
--src VCC2SF1 \
--trg VCC2TM1 \
--model VAWGAN \
--checkpoint logdir/train/[timestamp]/[model.ckpt-[id]] \
--file_pattern "./dataset/vcc2018/bin/Training Set/{}/[0-9]*.bin"
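The --file_pattern argument is a template whose {} slot is filled with the speaker id and then treated as a glob. As a hypothetical illustration (this helper is not part of the repo), the expansion amounts to:

```python
import glob


def list_speaker_files(file_pattern, speaker):
    """Expand a --file_pattern template for one speaker.

    `{}` is replaced with the speaker id and the result is treated as a
    glob, e.g. './dataset/vcc2018/bin/Training Set/VCC2SF1/[0-9]*.bin'.
    """
    return sorted(glob.glob(file_pattern.format(speaker)))
```

Sorting keeps the utterance files in a deterministic order across runs.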

(Please fill in the timestamp and the model id.)
  7. You can find the converted wav files in ./logdir/output/[timestamp]
  8. If you want to convert all the voices, run

./convert_all.sh \
--model VAWGAN \
--checkpoint logdir/train/[timestamp]/[model.ckpt-[id]] \
--output_dir [directory to store converted audio]
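Since the commands above require filling in the run timestamp and checkpoint id by hand, a small hypothetical helper (not part of this repo) could locate the newest checkpoint automatically, assuming TensorFlow's usual model.ckpt-[id].index naming:

```python
import os
import re


def latest_checkpoint(train_dir="logdir/train"):
    """Return the checkpoint prefix of the newest run under train_dir.

    Hypothetical helper: picks the most recently modified timestamp
    directory, then the highest checkpoint id found in it, and returns
    a prefix like 'logdir/train/<timestamp>/model.ckpt-<id>'.
    """
    runs = [os.path.join(train_dir, d) for d in os.listdir(train_dir)]
    runs = [r for r in runs if os.path.isdir(r)]
    if not runs:
        raise FileNotFoundError("no runs under %s" % train_dir)
    run = max(runs, key=os.path.getmtime)  # newest timestamp directory
    ids = [int(m.group(1))
           for name in os.listdir(run)
           for m in [re.match(r"model\.ckpt-(\d+)\.index$", name)] if m]
    if not ids:
        raise FileNotFoundError("no checkpoints in %s" % run)
    return os.path.join(run, "model.ckpt-%d" % max(ids))
```

The returned prefix can be passed directly as the --checkpoint argument.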

Usage for Sentence Embeddings:

  1. Ensure you have w_prob_dict.pkl and w_vec_dict.pkl in the data directory.

    1. For w_prob_dict.pkl you have two options: either use get_word_prob_from_corpus, which requires a corpus as input (we used WikiText), or obtain a CSV file with unigram probabilities (the source is mentioned in the report: http://norvig.com/ngrams/) and use get_w_prob_from_csv.
    2. For w_vec_dict.pkl, initialize a Sentence_Embedding object and call prune_word_vec. This keeps only the embeddings that appear in the transcriptions, since parsing the whole fastText data takes far more time (and RAM).
    3. All pickle files are shared here https://drive.google.com/drive/folders/1FWGGEQ9wTUewBDFq5ssT4BP4cMyt8lh1
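If you go the CSV route, the loading step presumably amounts to normalising word counts into probabilities. A minimal sketch, assuming a two-column word/count format like the Norvig ngram files (the repo's actual get_w_prob_from_csv may differ):

```python
import csv


def load_word_probs(path, delimiter="\t"):
    """Sketch of building a unigram-probability dict from a count file.

    Assumes each row is `word<delimiter>count`; probabilities are the
    counts normalised by the total count.
    """
    counts = {}
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter=delimiter):
            if len(row) >= 2:
                counts[row[0]] = counts.get(row[0], 0.0) + float(row[1])
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}
```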
  2. Download the dataset using bash download.sh

  3. Run python sentence_embedding.py. This should create sent_emb.pkl inside the data directory.

  4. Run analyzer.py to extract features and store them along with the sentence embeddings.

  5. Run build.py to compute statistics about the features.

  6. To train with sentence embedding, run

python main.py \
--model VAWGAN_S \
--trainer VAWGAN_S \
--architecture architecture-vawgan-sent.json
  7. For conversion, run
python convert.py \
--src VCC2SF1 \
--trg VCC2TM1 \
--model VAWGAN_S \
--checkpoint logdir/train/[timestamp]/[model.ckpt-[id]] \
--file_pattern "./dataset/vcc2018/bin/Training Set/{}/[0-9]*.bin"

or

./convert_all.sh \
--model VAWGAN_S \
--checkpoint logdir/train/[timestamp]/[model.ckpt-[id]] \
--output_dir [directory to store converted audio]
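For intuition, a sentence embedding built from w_prob_dict.pkl and w_vec_dict.pkl is plausibly a probability-weighted average of word vectors (in the spirit of SIF weighting, a / (a + p(w))). A sketch under that assumption; the repo's Sentence_Embedding class may compute this differently:

```python
import numpy as np


def sentence_embedding(words, w_vec, w_prob, a=1e-3):
    """Weighted average of word vectors, with weight a / (a + p(w)).

    Sketch only: rare words get weights near 1, frequent words are
    down-weighted. Words missing from either dictionary are skipped.
    """
    vecs = [a / (a + w_prob[w]) * np.asarray(w_vec[w])
            for w in words if w in w_vec and w in w_prob]
    if not vecs:
        raise ValueError("no known words in sentence")
    return np.mean(vecs, axis=0)
```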