hiwonjoon / Tf Vqvae

Tensorflow Implementation of the paper [Neural Discrete Representation Learning](https://arxiv.org/abs/1711.00937) (VQ-VAE).

Projects that are alternatives to or similar to Tf Vqvae

Pytorch Mnist Vae
Stars: ✭ 32 (-85.84%)
Mutual labels:  jupyter-notebook, generative-model, mnist, vae
Generative adversarial networks 101
Keras implementations of Generative Adversarial Networks. GANs, DCGAN, CGAN, CCGAN, WGAN and LSGAN models with MNIST and CIFAR-10 datasets.
Stars: ✭ 138 (-38.94%)
Mutual labels:  jupyter-notebook, mnist, cifar10
srVAE
VAE with RealNVP prior and Super-Resolution VAE in PyTorch. Code release for https://arxiv.org/abs/2006.05218.
Stars: ✭ 56 (-75.22%)
Mutual labels:  generative-model, vae, cifar10
Spectralnormalizationkeras
Spectral Normalization for Keras Dense and Convolution Layers
Stars: ✭ 100 (-55.75%)
Mutual labels:  jupyter-notebook, generative-model, cifar10
Tensorflow Generative Model Collections
Collection of generative models in Tensorflow
Stars: ✭ 3,785 (+1574.78%)
Mutual labels:  generative-model, mnist, vae
Relativistic Average Gan Keras
The implementation of Relativistic average GAN with Keras
Stars: ✭ 36 (-84.07%)
Mutual labels:  jupyter-notebook, mnist, cifar10
Vae protein function
Protein function prediction using a variational autoencoder
Stars: ✭ 57 (-74.78%)
Mutual labels:  jupyter-notebook, generative-model, vae
Psgan
Periodic Spatial Generative Adversarial Networks
Stars: ✭ 108 (-52.21%)
Mutual labels:  jupyter-notebook, generative-model
Vae Tensorflow
A Tensorflow implementation of a Variational Autoencoder for the deep learning course at the University of Southern California (USC).
Stars: ✭ 117 (-48.23%)
Mutual labels:  jupyter-notebook, vae
First Order Model
This repository contains the source code for the paper First Order Motion Model for Image Animation
Stars: ✭ 11,964 (+5193.81%)
Mutual labels:  jupyter-notebook, generative-model
Mnist draw
This is a sample project demonstrating the use of Keras (Tensorflow) for the training of a MNIST model for handwriting recognition using CoreML on iOS 11 for inference.
Stars: ✭ 139 (-38.5%)
Mutual labels:  jupyter-notebook, mnist
Cross Lingual Voice Cloning
Tacotron 2 - PyTorch implementation with faster-than-realtime inference modified to enable cross lingual voice cloning.
Stars: ✭ 106 (-53.1%)
Mutual labels:  jupyter-notebook, vae
Dmm
Deep Markov Models
Stars: ✭ 103 (-54.42%)
Mutual labels:  jupyter-notebook, generative-model
Vq Vae
Minimalist implementation of VQ-VAE in Pytorch
Stars: ✭ 224 (-0.88%)
Mutual labels:  mnist, vae
Generative Models
Comparison of Generative Models in Tensorflow
Stars: ✭ 96 (-57.52%)
Mutual labels:  mnist, vae
Deep Learning With Python
Example projects I completed to understand Deep Learning techniques with Tensorflow. Please note that I no longer maintain this repository.
Stars: ✭ 134 (-40.71%)
Mutual labels:  jupyter-notebook, vae
Tensorflow Mnist Cvae
Tensorflow implementation of conditional variational auto-encoder for MNIST
Stars: ✭ 139 (-38.5%)
Mutual labels:  mnist, vae
Nnpulearning
Non-negative Positive-Unlabeled (nnPU) and unbiased Positive-Unlabeled (uPU) learning reproductive code on MNIST and CIFAR10
Stars: ✭ 181 (-19.91%)
Mutual labels:  mnist, cifar10
Pytorch Vae
A CNN Variational Autoencoder (CNN-VAE) implemented in PyTorch
Stars: ✭ 181 (-19.91%)
Mutual labels:  jupyter-notebook, vae
Dragan
A stable algorithm for GAN training
Stars: ✭ 189 (-16.37%)
Mutual labels:  jupyter-notebook, generative-model

VQ-VAE (Neural Discrete Representation Learning) Tensorflow

Intro

This repository implements the paper, Neural Discrete Representation Learning (VQ-VAE) in Tensorflow.

⚠️ This is not an official implementation, and it might have some glitches (or even a major defect).
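The core operation of the paper is a nearest-neighbor lookup into a learned codebook, with gradients copied straight through the quantization step during training. A minimal numpy sketch of the forward lookup (names like `quantize` and `codebook` are illustrative, not identifiers from this repository; the gradient-copy part only matters inside an autodiff framework and is omitted here):

```python
import numpy as np

def quantize(z_e, codebook):
    """Map each D-dim encoder output to its nearest codebook entry.

    z_e: (N, D) encoder outputs; codebook: (K, D) embedding vectors.
    Returns (indices, z_q), where z_q[i] is the codebook row closest
    to z_e[i] in Euclidean distance.
    """
    # Squared Euclidean distance from every vector to every code: (N, K)
    dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    indices = dists.argmin(axis=1)   # nearest code per vector
    z_q = codebook[indices]          # quantized vectors
    return indices, z_q

# Tiny example: K=3 codes in D=2.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 0.5]])
z_e = np.array([[0.9, 1.1], [0.1, -0.1]])
idx, z_q = quantize(z_e, codebook)
```

In a Tensorflow graph, the straight-through gradient copy is typically written as `z_q = z_e + tf.stop_gradient(z_q - z_e)`, so the decoder gradient flows to the encoder unchanged.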

Requirements

  • Python 3.5
  • Tensorflow (v1.3 or higher)
  • numpy, better_exceptions, tqdm, etc.
  • ffmpeg

Updated Result: ImageNet

  • [x] ImageNet

    Validation Set Images Reconstructed Images
    Imagenet original images Imagenet Reconstructed Images
    • Class-conditioned sampled images (not cherry-picked, just random samples)

      alp

      admiral

      coral reef

      gray_whale

      brown bear

      pickup truck

    • I could not reproduce images as sharp as those the authors produced.

    • But some of the results seem reasonable.

    • Natural-scene images with consistent spatial structure, such as the Alps or coral reefs, usually show better results.

    • More tuning might produce better results.

Updated Result: Sampling with PixelCNN

  • [x] Pixel CNN

    • MNIST Sampled Image (Conditioned on class labels)

      MNIST Sampled Images

    • Cifar10 Sampled Image (Conditioned on class labels)

      Cifar10 Sampled Images

      From top row to bottom, the sampled images for the classes (airplane, auto, bird, cat, deer, dog, frog, horse, ship, truck)

      Not that satisfying so far; I guess the hyperparameters for VQ-VAE should be tuned first to generate sharper results.
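PixelCNN gets its autoregressive property from masked convolutions: each output pixel may only depend on pixels above it and to its left. A minimal sketch of the standard mask construction (type 'A' for the first layer additionally hides the center pixel, type 'B' for later layers allows it; `causal_mask` is an illustrative name, not this repository's code):

```python
import numpy as np

def causal_mask(kernel_size, mask_type):
    """Build a (k, k) 0/1 mask for a PixelCNN convolution kernel.

    mask_type 'A' (first layer) hides the center pixel itself;
    'B' (later layers) allows it. Everything below the center row,
    and everything to the right of the center, is always hidden.
    """
    k = kernel_size
    mask = np.ones((k, k), dtype=np.float32)
    center = k // 2
    # Zero out the center column onward ('A') or just right of it ('B').
    mask[center, center + (1 if mask_type == 'B' else 0):] = 0.0
    mask[center + 1:, :] = 0.0  # rows below the center
    return mask

mask_a = causal_mask(3, 'A')
mask_b = causal_mask(3, 'B')
```

Multiplying the kernel weights by this mask before each convolution keeps sampling consistent with the raster-scan pixel ordering.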

Results

All training is done with a Quadro M4000 GPU. Training on MNIST takes less than 10 minutes.

  • [x] MNIST

    Original Images Reconstructed Images
    MNIST original images MNIST Reconstructed Images

    The results on the MNIST test dataset (K=20, D=64, latent space = 3 by 3).

    I also explored the latent space by changing a single value of the latent code, starting from an observed code. The result is shown below. MNIST Latent Observation

    It seems that the spatial location of a latent code is important: changing the code at a specific location disturbs the pixels that correspond to that location.

    MNIST Latent Observation - Random Walk

    These results show 1000 generated images, starting from a known latent code and changing a single latent value at a random location by +1 or -1. Most of the images are redundant (unrealistic), which indicates that there is much room for compression.

    If you want to explore the latent space further, try playing with the notebook files I provided.
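The random-walk experiment above can be sketched roughly as follows (a hypothetical numpy reimplementation, not the notebook's actual code; `latent_random_walk`, `z0`, and `K` are illustrative names, with K the codebook size):

```python
import numpy as np

def latent_random_walk(z0, K, steps, seed=0):
    """Yield `steps` latent grids, each differing from the previous one
    at a single randomly chosen position by +1 or -1, clipped to [0, K-1]."""
    rng = np.random.default_rng(seed)
    z = z0.copy()
    out = []
    for _ in range(steps):
        r = rng.integers(z.shape[0])
        c = rng.integers(z.shape[1])
        z[r, c] = np.clip(z[r, c] + rng.choice([-1, 1]), 0, K - 1)
        out.append(z.copy())
    return out

# 1000 perturbed codes from a 3x3 grid with K=20, as in the MNIST setup above.
walk = latent_random_walk(np.full((3, 3), 10), K=20, steps=1000)
```

Each grid in `walk` would then be decoded to an image; the clipping keeps every index a valid codebook entry.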

  • [x] CIFAR 10

    Original Images Reconstructed Images
    CIFAR-10 original images CIFAR-10 Reconstructed Images

    I was able to get 4.65 bits/dim (K=10, D=256, latent space = 8 by 8).
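For context, bits/dim is the negative log-likelihood averaged over all pixel dimensions and converted from nats to bits. A small sketch of the conversion, assuming an NLL in nats summed over the 32×32×3 dimensions of a CIFAR-10 image (`bits_per_dim` is an illustrative helper, not part of this repository):

```python
import math

def bits_per_dim(nll_nats, num_dims=32 * 32 * 3):
    """Convert a per-image NLL in nats to bits per dimension."""
    return nll_nats / (num_dims * math.log(2))

# A per-image NLL of roughly 9900 nats corresponds to about 4.65 bits/dim.
result = bits_per_dim(9900.0)
```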

Training

The scripts download the required datasets into the directory ./datasets/{mnist,cifar10} by themselves, so simply running the code will do the trick.

Run train

  • Run mnist: python mnist.py
  • Run cifar10: python cifar10.py

Change the hyperparameters as you want; they are defined at the bottom of each script.

Evaluation

I provide the models and the code for generating (or reconstructing) images as Jupyter notebooks. Start a Jupyter notebook server, then run the notebooks to see more results with the provided models.

If you want to test NLL, run the test() function in cifar10.py by uncommenting the corresponding line at the bottom of the file.

TODO

  • [ ] WaveNet?

Contributions are welcome!

Thoughts and Help request

  • The results seem correct, but there is a chance that the implementation is not perfectly correct (especially the gradient copying...). If you find any glitches (or a major defect), please let me know!
  • I am currently not sure how exactly the NLL should be computed. Can anyone give me a proper explanation of this?
