
cmu-mlsp / Reconstructing_faces_from_voices

License: GPL-3.0
An example implementation of the paper "Reconstructing faces from voices"

Programming Languages

python
139335 projects - #7 most used programming language

Labels

Projects that are alternatives to or similar to Reconstructing faces from voices

Segan
Speech Enhancement Generative Adversarial Network in TensorFlow
Stars: ✭ 661 (+420.47%)
Mutual labels:  gan, speech
hifigan-denoiser
HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Stars: ✭ 88 (-30.71%)
Mutual labels:  speech, gan
Tfg Voice Conversion
Deep Learning-based Voice Conversion system
Stars: ✭ 115 (-9.45%)
Mutual labels:  speech
Tensorflow Mnist Cgan Cdcgan
Tensorflow implementation of conditional Generative Adversarial Networks (cGAN) and conditional Deep Convolutional Adversarial Networks (cDCGAN) for the MNIST dataset.
Stars: ✭ 122 (-3.94%)
Mutual labels:  gan
Tts
Text-to-Speech for Arduino
Stars: ✭ 118 (-7.09%)
Mutual labels:  speech
Msg Gan V1
MSG-GAN: Multi-Scale Gradients GAN (Architecture inspired from ProGAN but doesn't use layer-wise growing)
Stars: ✭ 116 (-8.66%)
Mutual labels:  gan
Generate to adapt
Implementation of "Generate To Adapt: Aligning Domains using Generative Adversarial Networks"
Stars: ✭ 120 (-5.51%)
Mutual labels:  gan
Sketch To Art
🖼 Create artwork from your casual sketch with GAN and style transfer
Stars: ✭ 115 (-9.45%)
Mutual labels:  gan
Kaldi
kaldi-asr/kaldi is the official location of the Kaldi project.
Stars: ✭ 11,151 (+8680.31%)
Mutual labels:  speech
Pi Rec
🔥 PI-REC: Progressive Image Reconstruction Network With Edge and Color Domain. 🔥 Image translation, conditional GAN, AI painting
Stars: ✭ 1,619 (+1174.8%)
Mutual labels:  gan
Code Switching Papers
A curated list of research papers and resources on code-switching
Stars: ✭ 122 (-3.94%)
Mutual labels:  speech
Speech And Text Unity Ios Android
Speech to text in Unity iOS using native speech recognition
Stars: ✭ 117 (-7.87%)
Mutual labels:  speech
Vae Gan Tensorflow
Tensorflow code of "autoencoding beyond pixels using a learned similarity metric"
Stars: ✭ 116 (-8.66%)
Mutual labels:  gan
Capsule Gan
Code for my Master thesis on "Capsule Architecture as a Discriminator in Generative Adversarial Networks".
Stars: ✭ 120 (-5.51%)
Mutual labels:  gan
Impersonator
PyTorch implementation of our ICCV 2019 paper: Liquid Warping GAN: A Unified Framework for Human Motion Imitation, Appearance Transfer and Novel View Synthesis
Stars: ✭ 1,605 (+1163.78%)
Mutual labels:  gan
Mlds2018spring
Machine Learning and having it Deep and Structured (MLDS) in 2018 spring
Stars: ✭ 124 (-2.36%)
Mutual labels:  gan
Hccg Cyclegan
Handwritten Chinese Characters Generation
Stars: ✭ 115 (-9.45%)
Mutual labels:  gan
O Gan
O-GAN: Extremely Concise Approach for Auto-Encoding Generative Adversarial Networks
Stars: ✭ 117 (-7.87%)
Mutual labels:  gan
Nucleisegmentation
cGAN-based Multi Organ Nuclei Segmentation
Stars: ✭ 120 (-5.51%)
Mutual labels:  gan
Cyclegan
Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more.
Stars: ✭ 10,933 (+8508.66%)
Mutual labels:  gan

Reconstructing faces from voices

Implementation of the paper "Reconstructing faces from voices"

Yandong Wen, Rita Singh, and Bhiksha Raj

Machine Learning for Signal Processing Group

Carnegie Mellon University

Requirements

This implementation is based on Python 3.7 and PyTorch 1.1.

We recommend using conda to install the dependencies. All requirements are listed in requirements.txt. Run the following command to create a new conda environment with all the dependencies:

$ ./install.sh

After running the script, you need to activate the environment in which the packages have been installed. The environment is called voice2face and can be activated with:

$ source activate voice2face

NOTE: If you get an error complaining that "webrtcvad" cannot be found, make sure the pip in your PATH is the one inside the environment. This can happen if you have multiple pip installations (inside and outside the environment).
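
A quick way to confirm that the active interpreter (and hence pip) comes from the voice2face environment is to check it from Python. This is only a minimal sketch; the exact paths depend on where conda is installed on your machine:

import sys

# Both paths should point inside the .../envs/voice2face/ directory.
print(sys.executable)  # path of the Python interpreter currently running
print(sys.prefix)      # prefix of the active environment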

Processed data

The following are the processed training data we used for this paper. Please feel free to download them.

Voice data (log mel-spectrograms): google drive

Face data (aligned face images): google drive

Once downloaded, update the voice_dir and face_dir variables (see Configurations below) with the corresponding paths.

Configurations

See config.py for how to change the configuration.
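
As a rough illustration, the configuration variables referenced in this README (voice_dir, face_dir, test_data, and model_path) could be set along the following lines. This is only a sketch; the actual layout of config.py may differ:

# Sketch only; the real config.py may organize these differently.
voice_dir = '/path/to/voice_data'       # downloaded log mel-spectrograms
face_dir = '/path/to/face_data'         # downloaded aligned face images
test_data = '/path/to/my_recordings'    # folder with your own mono 16 kHz .wav files
model_path = 'models/generator.pth'     # generator to use at test time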

Train

We provide pretrained models, including a voice embedding network and a trained generator, in pretrained_models/. Alternatively, you can train your own generator by running the training script:

$ python gan_train.py

The trained model is saved to models/generator.pth.
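
If you want to inspect the saved generator outside of the provided scripts, a PyTorch checkpoint can typically be loaded as follows. This is a sketch; whether you get a full module or a state_dict depends on how the repository serializes the model:

import torch

# Load the checkpoint on CPU; models/generator.pth is produced by gan_train.py.
checkpoint = torch.load('models/generator.pth', map_location='cpu')
print(type(checkpoint))  # nn.Module if the whole model was saved, otherwise a state_dict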

Test

We provide some examples of faces (in data/example_data/) generated with the model in pretrained_models/. To generate faces for your own voice recordings with the trained model, set the test_data variable (the folder containing your voice recordings) and the model_path variable (the path of the generator) in config.py, then run:

$ python gan_test.py

Results are written to the test_data folder. For each voice recording named <filename>.wav, a face image named <filename>.png is generated.

Note: currently, only single-channel (mono) voice recordings at a 16 kHz sample rate are supported. Voice and face files whose names start with A-E belong to the validation/testing set, while those starting with F-Z belong to the training set.
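
If your recordings are stereo or use a different sample rate, they can be converted before testing. The sketch below assumes librosa and soundfile are available, which are not necessarily part of this repository's requirements:

import librosa
import soundfile as sf

# Downmix to mono and resample to 16 kHz, the only format gan_test.py accepts.
audio, sr = librosa.load('my_recording.wav', sr=16000, mono=True)
sf.write('my_recording_16k.wav', audio, 16000)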

Citation

@article{wen2019reconstructing,
  title={Reconstructing faces from voices},
  author={Wen, Yandong and Singh, Rita and Raj, Bhiksha},
  journal={arXiv preprint arXiv:1905.10604},
  year={2019}
}

Contribution

We welcome contributions from everyone and are always working to make this project better. Please open a pull request or raise an issue, and we will be happy to help.

License

This repository is licensed under GNU GPL-3.0. Please refer to LICENSE.md.
