
XiaoyuBIE1994 / DVAE

Licence: other
Official implementation of Dynamical VAEs

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to DVAE

GDPP
Generator loss to reduce mode-collapse and to improve the generated samples quality.
Stars: ✭ 32 (-57.33%)
Mutual labels:  generative-model, variational-autoencoders
generative deep learning
Generative Deep Learning Sessions led by Anugraha Sinha (Machine Learning Tokyo)
Stars: ✭ 24 (-68%)
Mutual labels:  generative-model
MidiTok
A convenient MIDI / symbolic music tokenizer for Deep Learning networks, with multiple strategies 🎶
Stars: ✭ 180 (+140%)
Mutual labels:  generative-model
3DCSGNet
CSGNet for voxel based input
Stars: ✭ 34 (-54.67%)
Mutual labels:  generative-model
py-msa-kdenlive
Python script to load a Kdenlive (OSS NLE video editor) project file, and conform the edit on video or numpy arrays.
Stars: ✭ 25 (-66.67%)
Mutual labels:  generative-model
shoe-design-using-generative-adversarial-networks
No description or website provided.
Stars: ✭ 18 (-76%)
Mutual labels:  generative-model
Generalization-Causality
Reading notes on all kinds of research on domain generalization, domain adaptation, causality, robustness, prompt, optimization, and generative models
Stars: ✭ 482 (+542.67%)
Mutual labels:  generative-model
TriangleGAN
TriangleGAN, ACM MM 2019.
Stars: ✭ 28 (-62.67%)
Mutual labels:  generative-model
graph-nvp
GraphNVP: An Invertible Flow Model for Generating Molecular Graphs
Stars: ✭ 69 (-8%)
Mutual labels:  generative-model
gans-in-action
"GAN 인 액션"(한빛미디어, 2020)의 코드 저장소입니다.
Stars: ✭ 29 (-61.33%)
Mutual labels:  generative-model
atomai
Deep and Machine Learning for Microscopy
Stars: ✭ 77 (+2.67%)
Mutual labels:  variational-autoencoders
GrabNet
GrabNet: A Generative model to generate realistic 3D hands grasping unseen objects (ECCV2020)
Stars: ✭ 146 (+94.67%)
Mutual labels:  generative-model
char-VAE
Inspired by the neural style algorithm in the computer vision field, we propose a high-level language model with the aim of adapting the linguistic style.
Stars: ✭ 18 (-76%)
Mutual labels:  generative-model
pytorch-CycleGAN
Pytorch implementation of CycleGAN.
Stars: ✭ 39 (-48%)
Mutual labels:  generative-model
deep-active-inference-mc
Deep active inference agents using Monte-Carlo methods
Stars: ✭ 41 (-45.33%)
Mutual labels:  variational-autoencoders
continuous Bernoulli
C programs for the simulator, transformation, and test statistic of the continuous Bernoulli distribution; the accompanying material also covers the continuous Binomial and continuous Trinomial distributions.
Stars: ✭ 22 (-70.67%)
Mutual labels:  variational-autoencoders
cvaecaposr
Code for the Paper: "Conditional Variational Capsule Network for Open Set Recognition", Y. Guo, G. Camporese, W. Yang, A. Sperduti, L. Ballan, arXiv:2104.09159, 2021.
Stars: ✭ 29 (-61.33%)
Mutual labels:  variational-autoencoders
materials-synthesis-generative-models
Public release of data and code for materials synthesis generation
Stars: ✭ 47 (-37.33%)
Mutual labels:  generative-model
timbre painting
Hierarchical fast and high-fidelity audio generation
Stars: ✭ 67 (-10.67%)
Mutual labels:  generative-model
Generalized-PixelVAE
PixelVAE with or without regularization
Stars: ✭ 64 (-14.67%)
Mutual labels:  generative-model

Dynamical Variational Autoencoders: A Comprehensive Review

This repository contains the code for:
Dynamical Variational Autoencoders: A Comprehensive Review, Foundations and Trends in Machine Learning, 2021.
Laurent Girin, Simon Leglaive, Xiaoyu Bie, Julien Diard, Thomas Hueber, Xavier Alameda-Pineda
[arXiv] [Paper] [Project] [Tutorial]

More precisely, this repo is a re-implementation of the following models in PyTorch (a toy sketch of the structure they share is given after the list):

  • VAE, Kingma et al., ICLR 2014
  • DKF, Krishnan et al., AAAI 2017
  • KVAE, Fraccaro et al., NeurIPS 2017
  • STORN, Bayer et al., arXiv 2014
  • VRNN, Chung et al., NeurIPS 2015
  • SRNN, Fraccaro et al., NeurIPS 2016
  • RVAE, Leglaive et al., ICASSP 2020
  • DSAE, Li et al., ICML 2018
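
These models share a common backbone: a VAE whose latent variables and/or observations are linked over time by a temporal model (an RNN or a state-space model). Purely as an illustration, and not the repo's code, a heavily simplified VRNN-flavoured model could look like the sketch below; all class names, dimensions, and design choices here are made up.

# Purely illustrative sketch, NOT the repo's code: a heavily simplified VRNN-style DVAE.
# All names and dimensions below are made up for the example.
import torch
import torch.nn as nn

class ToyDVAE(nn.Module):
    """An RNN state h_t conditions the prior p(z_t|h_t), the encoder q(z_t|x_t,h_t)
    and the decoder p(x_t|z_t,h_t); h_t is updated from (x_t, z_t)."""

    def __init__(self, x_dim=64, z_dim=16, h_dim=128):
        super().__init__()
        self.h_dim = h_dim
        self.rnn = nn.GRUCell(x_dim + z_dim, h_dim)
        self.enc = nn.Linear(x_dim + h_dim, 2 * z_dim)   # -> mean and log-variance of q
        self.prior = nn.Linear(h_dim, 2 * z_dim)         # -> mean and log-variance of p
        self.dec = nn.Linear(z_dim + h_dim, x_dim)       # -> mean of p(x_t|z_t,h_t)

    def forward(self, x):                                # x: (seq_len, batch_size, x_dim)
        seq_len, bs, _ = x.shape
        h = x.new_zeros(bs, self.h_dim)
        recon, kl = 0.0, 0.0
        for t in range(seq_len):
            mu_q, logvar_q = self.enc(torch.cat([x[t], h], -1)).chunk(2, -1)
            mu_p, logvar_p = self.prior(h).chunk(2, -1)
            z = mu_q + torch.randn_like(mu_q) * (0.5 * logvar_q).exp()  # reparameterisation trick
            x_hat = self.dec(torch.cat([z, h], -1))
            recon = recon + ((x[t] - x_hat) ** 2).sum(-1).mean()        # Gaussian NLL up to constants
            kl = kl + 0.5 * (logvar_p - logvar_q - 1
                             + ((mu_q - mu_p) ** 2 + logvar_q.exp()) / logvar_p.exp()).sum(-1).mean()
            h = self.rnn(torch.cat([x[t], z], -1), h)
        return recon + kl                                               # negative ELBO, to be minimised

# loss = ToyDVAE()(torch.randn(50, 8, 64)); loss.backward()

The actual models in this repo differ in their dependency structures, inference networks, and losses; this sketch only illustrates the overall recurrent VAE pattern and the expected (seq_len, batch_size, x_dim) input layout.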

For the results we report at Interspeech 2021, please visit the interspeech branch

We do not report results for KVAE since we have not managed to make it work in our experiments; we nevertheless provide the code for research purposes.

Prerequisites

The PESQ values we report in our paper are narrow-band PESQ values provided by the pypesq package. If you want wide-band PESQ values, please use the pesq package instead.
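
For reference, a minimal sketch of the two packages (the file names are hypothetical; ref and deg are the clean and processed 16 kHz mono signals):

# Hypothetical example files; both signals are 1-D numpy arrays at 16 kHz
import soundfile as sf
from pypesq import pesq as nb_pesq   # narrow-band PESQ, as reported in the paper
from pesq import pesq as p862_pesq   # alternative package with 'nb' / 'wb' modes

ref, fs = sf.read("clean.wav")
deg, _ = sf.read("enhanced.wav")

print("narrow-band PESQ:", nb_pesq(ref, deg, fs))        # pypesq signature: pesq(ref, deg, fs)
print("wide-band PESQ:", p862_pesq(fs, ref, deg, "wb"))  # pesq signature: pesq(fs, ref, deg, mode)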

Dataset

In this version, the DVAE models support two different types of data:

  • WSJ0, an audio speech dataset; we use the ChiME2-WSJ0 subset from the CHiME Challenge
  • Human3.6M, a 3D human motion dataset available under license here; the exponential map version can be downloaded here

If you want to use our models on other datasets, you can simply modify or rewrite the dataloader and make minor changes in the training steps. Please note that the DVAE models expect data in the format (seq_len, batch_size, x_dim), as illustrated below.
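
For example, a standard PyTorch DataLoader usually yields batches shaped (batch_size, seq_len, x_dim), so a simple permutation is enough; a minimal sketch with made-up dimensions:

# Illustrative only: a random "dataset" of 100 sequences, 50 frames each, feature dimension 64
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(100, 50, 64))   # (n_samples, seq_len, x_dim)
loader = DataLoader(dataset, batch_size=8, shuffle=True)

for (batch,) in loader:                              # batch: (batch_size, seq_len, x_dim)
    batch = batch.permute(1, 0, 2)                   # -> (seq_len, batch_size, x_dim) as the models expect
    # model(batch)
    break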

Train

We provide configuration examples for all of the above models in ./config

# Train a DVAE model (for example)
python train_model.py --cfg ./config/speech/cfg_rvae_Causal.ini
python train_model.py --cfg ./config/motion/cfg_srnn.ini

# Train DVAE with scheduled sampling, w/ pretrained model
python train_model.py --ss --cfg ./config/speech/cfg_srnn_ss.ini --use_pretrain --pretrain_dict /PATH_PRETRAIN_DIR

# Train DVAE with scheduled sampling, w/o pretrained model
python train_model.py --ss --cfg ./config/speech/cfg_srnn_ss.ini

# Resume training
python train_model.py --cfg ./config/speech/cfg_rvae_Causal.ini --reload --model_dir /PATH_RELOAD_DIR

Evaluation

# Evaluation on speech data
python eval_wsj.py --cfg PATH_TO_CONFIG --saved_dict PATH_TO_PRETRAINED_DICT
python eval_wsj.py --ss --cfg PATH_TO_CONFIG --saved_dict PATH_TO_PRETRAINED_DICT # with scheduled sampling

# Evaluation on human motion data
python eval_h36m.py --cfg PATH_TO_CONFIG --saved_dict PATH_TO_PRETRAINED_DICT
python eval_h36m.py --ss --cfg PATH_TO_CONFIG --saved_dict PATH_TO_PRETRAINED_DICT # with scheduled sampling

Bibtex

If you find this code useful, please star the project and consider citing:

@article{dvae2021,
  title = {Dynamical Variational Autoencoders: A Comprehensive Review},
  author = {Girin, Laurent and Leglaive, Simon and Bie, Xiaoyu and Diard, Julien and Hueber, Thomas and Alameda-Pineda, Xavier},
  journal = {Foundations and Trends® in Machine Learning},
  year = {2021},
  volume = {15},
  number = {1-2},
  pages = {1--175},
  issn = {1935-8237},
  doi = {10.1561/2200000089}
}
@inproceedings{bie21_interspeech,
  title = {{A Benchmark of Dynamical Variational Autoencoders Applied to Speech Spectrogram Modeling}},
  author = {Bie, Xiaoyu and Girin, Laurent and Leglaive, Simon and Hueber, Thomas and Alameda-Pineda, Xavier},
  booktitle = {Proc. Interspeech 2021},
  year = {2021},
  pages = {46--50},
  doi = {10.21437/Interspeech.2021-256}
}

Main results

For speech data, using:

  • training dataset: wsj0_si_tr_s
  • validation dataset: wsj0_si_dt_05
  • test dataset: wsj0_si_et_05
Model SI-SDR (dB) PESQ ESTOI
VAE 5.3 2.97 0.83
DKF 9.3 3.53 0.91
STORN 6.9 3.42 0.90
VRNN 10.0 3.61 0.92
SRNN 11.0 3.68 0.93
RVAE-Causal 9.0 3.49 0.90
RVAE-NonCausal 8.9 3.58 0.91
DSAE 9.2 3.55 0.91
SRNN-TF-GM -1.0 1.93 0.64
SRNN-GM 7.8 3.37 0.88

For human motion data, using:

  • training dataset: S1, S6, S7, S8, S9
  • validation dataset: S5
  • test dataset: S11
Model MPJPE (mm)
VAE 48.69
DKF 42.21
STORN 9.47
VRNN 9.22
SRNN 7.86
RVAE-Causal 31.09
RVAE-NonCausal 28.59
DSAE 28.61
SRNN-TF-GM 221.87
SRNN-GM 43.98
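
For context, MPJPE (mean per joint position error) is the average Euclidean distance, in millimetres, between predicted and ground-truth 3D joint positions; a minimal sketch with made-up shapes:

# Illustrative only: pred and gt hold 3D joint positions in mm, shape (n_frames, n_joints, 3)
import numpy as np

def mpjpe(pred, gt):
    """Mean Euclidean distance over joints and frames, in mm."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

pred = np.random.rand(100, 17, 3) * 1000
gt = np.random.rand(100, 17, 3) * 1000
print(mpjpe(pred, gt))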

More results can be found in Chapter 13 (Experiments) of our article.

Contact

For any further questions, you can drop me an email via xiaoyu[dot]bie[at]inria[dot]fr
