
glam-imperial / EmotionalConversionStarGAN

Licence: other

Programming Languages

  • python
  • shell

Projects that are alternatives to or similar to EmotionalConversionStarGAN

Data Augmentation Review
A list of useful data augmentation resources, including some less common techniques, libraries, links to GitHub repos, papers, and more.
Stars: ✭ 785 (+753.26%)
Mutual labels:  generative-adversarial-network, data-augmentation
coursera-gan-specialization
Programming assignments and quizzes from all courses within the GANs specialization offered by deeplearning.ai
Stars: ✭ 277 (+201.09%)
Mutual labels:  generative-adversarial-network, data-augmentation
Interaction-Aware-Attention-Network
[ICASSP19] An Interaction-aware Attention Network for Speech Emotion Recognition in Spoken Dialogs
Stars: ✭ 32 (-65.22%)
Mutual labels:  emotion-recognition, icassp
OpenVINO-EmotionRecognition
OpenVINO+NCS2/NCS+MultiModel(FaceDetection, EmotionRecognition)+MultiStick+MultiProcess+MultiThread+USB Camera/PiCamera. RaspberryPi 3 compatible. Async.
Stars: ✭ 51 (-44.57%)
Mutual labels:  emotion-recognition
polyagamma
An efficient and flexible sampler of the Pólya-Gamma distribution with a NumPy/SciPy compatible interface.
Stars: ✭ 15 (-83.7%)
Mutual labels:  data-augmentation
FaceRecognition
Face Recognition in real-world images [ICASSP 2017]
Stars: ✭ 36 (-60.87%)
Mutual labels:  icassp
RecycleGAN
The simplest implementation of the Recycle-GAN idea
Stars: ✭ 68 (-26.09%)
Mutual labels:  generative-adversarial-network
DeepSentiPers
Repository for the experiments described in the paper named "DeepSentiPers: Novel Deep Learning Models Trained Over Proposed Augmented Persian Sentiment Corpus"
Stars: ✭ 17 (-81.52%)
Mutual labels:  data-augmentation
private-data-generation
A toolbox for differentially private data generation
Stars: ✭ 80 (-13.04%)
Mutual labels:  generative-adversarial-network
keras-3dgan
Keras implementation of 3D Generative Adversarial Network.
Stars: ✭ 20 (-78.26%)
Mutual labels:  generative-adversarial-network
DeepFlow
PyTorch implementation of "DeepFlow: History Matching in the Space of Deep Generative Models"
Stars: ✭ 24 (-73.91%)
Mutual labels:  generative-adversarial-network
tt-vae-gan
Timbre transfer with variational autoencoding and cycle-consistent adversarial networks. Able to transfer the timbre of an audio source to that of another.
Stars: ✭ 37 (-59.78%)
Mutual labels:  generative-adversarial-network
ezgan
An extremely simple generative adversarial network, built with TensorFlow
Stars: ✭ 36 (-60.87%)
Mutual labels:  generative-adversarial-network
projects
things I help(ed) to build
Stars: ✭ 47 (-48.91%)
Mutual labels:  generative-adversarial-network
GAN-Ensemble-for-Anomaly-Detection
This repository is the PyTorch implementation of GAN Ensemble for Anomaly Detection.
Stars: ✭ 26 (-71.74%)
Mutual labels:  generative-adversarial-network
ADL2019
Applied Deep Learning (2019 Spring) @ NTU
Stars: ✭ 20 (-78.26%)
Mutual labels:  generative-adversarial-network
celeba-gan-pytorch
Generative Adversarial Networks in PyTorch
Stars: ✭ 35 (-61.96%)
Mutual labels:  generative-adversarial-network
TextBoxGAN
Generate text boxes from input words with a GAN.
Stars: ✭ 50 (-45.65%)
Mutual labels:  generative-adversarial-network
voicekit-examples
Examples on how to use Tinkoff Voicekit
Stars: ✭ 35 (-61.96%)
Mutual labels:  speech-synthesis
Tacotron pytorch
A Tacotron implementation in PyTorch
Stars: ✭ 12 (-86.96%)
Mutual labels:  speech-synthesis

EmotionalConversionStarGAN

This repository contains code to replicate results from the ICASSP 2020 paper "StarGAN for Emotional Speech Conversion: Validated by Data Augmentation of End-to-End Emotion Recognition".

  • stargan: code for training the Emotional StarGAN and performing emotional generation (originally hosted at https://github.com/max-elliott/StarGAN-Emotional-VC).

  • aug_evaluation: code for performing the data augmentation experiments (coming soon)

  • samples: selected converted samples (coming soon; we are checking with the IEMOCAP team whether these can be shared publicly under GDPR)

The IEMOCAP database requires signing an EULA; please contact its maintainers: https://sail.usc.edu/iemocap/

Preparing

- Requirements:

  • python>3.7.0
  • pytorch
  • numpy
  • argparse
  • librosa
  • scikit-learn
  • tensorflow < 2.0
  • pyworld
  • matplotlib
  • yaml

- Clone repository:

git clone https://github.com/glam-imperial/EmotionalConversionStarGAN.git
cd EmotionalConversionStarGAN

- Download IEMOCAP dataset from https://sail.usc.edu/iemocap/

IEMOCAP Preprocessing

Running the script run_preprocessing.py prepares IEMOCAP as needed for training the model. It assumes that IEMOCAP has already been downloaded and is stored in an arbitrary directory <DIR> with this file structure:

<DIR>
  |- Session1  
  |     |- Annotations  
  |     |- Ses01F_impro01  
  |     |- Ses01F_impro02  
  |     |- ...  
  |- ...
  |- Session5
        |- Annotations
        |- Ses05F_impro01
        |- Ses05F_impro02
        |- ...

where Annotations is a directory holding the label .txt files for the whole session (Ses01F_impro01.txt etc.), and each of the other directories (Ses01F_impro01, Ses01F_impro02, etc.) holds the .wav files for one scene in the session.
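
For reference, label files of this kind can be read with a small parser along these lines (a sketch assuming IEMOCAP's standard EmoEvaluation line format; the repository's own loading code may differ):

import re

# Matches EmoEvaluation-style lines such as:
# [6.2901 - 8.2357]	Ses01F_impro01_F000	neu	[2.5000, 2.5000, 2.5000]
# (format assumed from the public IEMOCAP release, not taken from this repo)
LINE_RE = re.compile(r"^\[([\d.]+)\s*-\s*([\d.]+)\]\s+(\S+)\s+(\w+)")

def read_labels(path):
    """Return a dict mapping utterance IDs to categorical emotion labels."""
    labels = {}
    with open(path) as f:
        for line in f:
            match = LINE_RE.match(line)
            if match:
                _start, _end, utt_id, emotion = match.groups()
                labels[utt_id] = emotion
    return labels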

To preprocess, run

python run_preprocessing.py --iemocap_dir <DIR> 

which will move all audio files to ./processed_data/audio and extract the WORLD features and labels needed for training. Features are extracted only for samples of the target emotions (angry, sad, happy) that fall under a hardcoded length threshold (to keep training time down). The script also creates dictionaries of F0 statistics, which are used to alter the F0 of a sample during conversion; a sketch of the feature extraction follows the directory listing below. After running you should have this file structure:

./processed_data
 |- annotations
 |- audio
 |- f0
 |- labels
 |- world
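
For orientation, the WORLD feature extraction and F0 statistics gathered in this step look roughly like the following (a minimal sketch using pyworld and librosa; the file name and the 16 kHz sample rate are assumptions, not the repository's exact code):

import librosa
import numpy as np
import pyworld

# Load a mono waveform; pyworld expects float64 samples.
wav, fs = librosa.load("Ses01F_impro01_F000.wav", sr=16000)
wav = wav.astype(np.float64)

# Decompose into WORLD features: F0 contour, spectral envelope, aperiodicity.
f0, sp, ap = pyworld.wav2world(wav, fs)

# Log-F0 mean/std over voiced frames only - the kind of per-speaker,
# per-emotion statistics stored in ./processed_data/f0 and used to
# shift F0 during conversion.
voiced = f0[f0 > 0]
log_f0_mean = float(np.log(voiced).mean())
log_f0_std = float(np.log(voiced).std())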

Training EmotionStarGAN

The main training script is train_main.py. However, to automatically train a three-emotion model (angry, sad, happy) as it was trained for "StarGAN for Emotional Speech Conversion: Validated by Data Augmentation of End-to-End Emotion Recognition", simply call:

./full_training_script.sh

This script runs three steps:

  1. Runs classifier_train.py - pretrains an auxiliary emotion classifier and saves the best checkpoint to ./checkpoints/cls_checkpoint.ckpt.
  2. Runs the main training for 200k iterations in --recon_only mode, meaning the model simply learns to reconstruct the input audio.
  3. Trains the model for a further 100k steps, introducing the pre-trained classifier.

A full training run takes roughly 24 hours on a decent GPU. The auxiliary emotion classifier can also be trained independently using classifier_train.py.
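
Conceptually, the two-phase schedule in steps 2 and 3 amounts to the following (a self-contained toy sketch in PyTorch; the modules, losses, and step counts are illustrative stand-ins, not the repository's actual training code):

import torch
import torch.nn as nn

# Toy stand-ins for the real generator and auxiliary emotion classifier.
generator = nn.Linear(80, 80)
classifier = nn.Linear(80, 3)              # 3 emotions: angry, sad, happy
optimizer = torch.optim.Adam(generator.parameters(), lr=1e-4)
recon_loss = nn.L1Loss()
cls_loss = nn.CrossEntropyLoss()

RECON_ONLY_STEPS = 200                     # 200k in the real run
TOTAL_STEPS = 300                          # 300k in the real run

for step in range(TOTAL_STEPS):
    x = torch.randn(8, 80)                 # dummy acoustic feature batch
    y = torch.randint(0, 3, (8,))          # dummy emotion labels
    x_fake = generator(x)
    loss = recon_loss(x_fake, x)           # phase 1: reconstruction only
    if step >= RECON_ONLY_STEPS:           # phase 2: add the classifier term
        loss = loss + cls_loss(classifier(x_fake), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()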

Sample Conversion

Once a model is trained, you can convert IEMOCAP audio samples using convert.py. Running

python convert.py --checkpoint <path/to/model_checkpoint.ckpt> -o ./processed_data/converted

will load a model checkpoint and convert 10 random samples from the test set into each emotion, saving the converted samples in ./processed_data/converted (this mode is currently bugged; use the directory-based invocation below instead). Specifying an input directory will convert all the audio files in that directory:

python convert.py --checkpoint <path/to/model_checkpoint.ckpt> -i <path/to/wavs> -o ./processed_data/converted

The input files must currently be existing files from the IEMOCAP dataset; the code will be updated later to convert arbitrary samples.
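
The F0 dictionaries computed during preprocessing are typically applied with logarithm-Gaussian normalisation when converting a sample between emotions; a sketch of that transform (an assumption about how the stored statistics are used, not code taken from the repository):

import numpy as np

def convert_f0(f0, src_mean, src_std, trg_mean, trg_std):
    """Map a source F0 contour onto target log-F0 statistics.

    The means and stds are log-F0 statistics like those stored in
    ./processed_data/f0. Unvoiced frames (f0 == 0) are left untouched.
    """
    f0_out = f0.copy()
    voiced = f0 > 0
    f0_out[voiced] = np.exp(
        (np.log(f0[voiced]) - src_mean) / src_std * trg_std + trg_mean
    )
    return f0_out

For example, converting a sad utterance to angry would use the sad statistics as the source and the angry statistics as the target.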
