
snakers4 / playing_with_vae

Licence: other
Comparing FC VAE / FCN VAE / PCA / UMAP on MNIST / FMNIST


Projects that are alternatives to or similar to playing_with_vae

AnnA Anki neuronal Appendix
Using machine learning on your anki collection to enhance the scheduling via semantic clustering and semantic similarity
Stars: ✭ 39 (-26.42%)
Mutual labels:  pca, embedding, umap
Tensorflow Generative Model Collections
Collection of generative models in Tensorflow
Stars: ✭ 3,785 (+7041.51%)
Mutual labels:  mnist, variational-autoencoder, fashion-mnist
haskell-vae
Learning about Haskell with Variational Autoencoders
Stars: ✭ 18 (-66.04%)
Mutual labels:  mnist, variational-autoencoder
Disentangling Vae
Experiments for understanding disentanglement in VAE latent representations
Stars: ✭ 398 (+650.94%)
Mutual labels:  mnist, variational-autoencoder
Ml code
A repository for recording machine learning code
Stars: ✭ 75 (+41.51%)
Mutual labels:  mnist, pca
MNIST-multitask
6️⃣6️⃣6️⃣ Reproduce the ICLR '18 under-review paper "MULTI-TASK LEARNING ON MNIST IMAGE DATASETS"
Stars: ✭ 34 (-35.85%)
Mutual labels:  mnist, fashion-mnist
VAE-Gumbel-Softmax
An implementation of a Variational Autoencoder using the Gumbel-Softmax reparametrization trick (ICLR 2017) in TensorFlow (tested on r1.5, CPU and GPU).
Stars: ✭ 66 (+24.53%)
Mutual labels:  mnist, variational-autoencoder
Deep Generative Models
Deep generative models implemented with TensorFlow 2.0, e.g. Restricted Boltzmann Machine (RBM), Deep Belief Network (DBN), Deep Boltzmann Machine (DBM), Convolutional Variational Auto-Encoder (CVAE), Convolutional Generative Adversarial Network (CGAN)
Stars: ✭ 34 (-35.85%)
Mutual labels:  mnist, variational-autoencoder
Tensorflow Mnist Cvae
Tensorflow implementation of conditional variational auto-encoder for MNIST
Stars: ✭ 139 (+162.26%)
Mutual labels:  mnist, variational-autoencoder
Pytorch Generative Model Collections
Collection of generative models in PyTorch.
Stars: ✭ 2,296 (+4232.08%)
Mutual labels:  mnist, fashion-mnist
Vae Cvae Mnist
Variational Autoencoder and Conditional Variational Autoencoder on MNIST in PyTorch
Stars: ✭ 229 (+332.08%)
Mutual labels:  mnist, variational-autoencoder
mnist-challenge
My solution to TUM's Machine Learning MNIST challenge 2016-2017 [winner]
Stars: ✭ 68 (+28.3%)
Mutual labels:  mnist, pca
NMFADMM
A sparsity aware implementation of "Alternating Direction Method of Multipliers for Non-Negative Matrix Factorization with the Beta-Divergence" (ICASSP 2014).
Stars: ✭ 39 (-26.42%)
Mutual labels:  pca, embedding
FSCNMF
An implementation of "Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks".
Stars: ✭ 16 (-69.81%)
Mutual labels:  pca, embedding
VAE-Latent-Space-Explorer
Interactive exploration of MNIST variational autoencoder latent space with React and tensorflow.js.
Stars: ✭ 30 (-43.4%)
Mutual labels:  mnist, variational-autoencoder
Tensorflow Mnist Vae
Tensorflow implementation of variational auto-encoder for MNIST
Stars: ✭ 422 (+696.23%)
Mutual labels:  mnist, variational-autoencoder
Fashion Mnist
A MNIST-like fashion product database and benchmark.
Stars: ✭ 9,675 (+18154.72%)
Mutual labels:  mnist, fashion-mnist
Fun-with-MNIST
Playing with MNIST. Machine Learning. Generative Models.
Stars: ✭ 23 (-56.6%)
Mutual labels:  mnist, pca
gans-2.0
Generative Adversarial Networks in TensorFlow 2.0
Stars: ✭ 76 (+43.4%)
Mutual labels:  mnist, fashion-mnist
CVAE-AnomalyDetection-PyTorch
Example of Anomaly Detection using Convolutional Variational Auto-Encoder (CVAE)
Stars: ✭ 23 (-56.6%)
Mutual labels:  variational-autoencoder

(Figure: latent vector spaces)

(Figure: FMNIST reconstructions)

Intro

This is a test task I did for some reason. It contains an evaluation of:

  • FC VAE / FCN VAE on MNIST / FMNIST for image reconstruction;
  • Comparison of embeddings produced by VAE / PCA / UMAP for classification;

TLDR

What you can find here:

  • A working VAE example in PyTorch with a lot of flags (both FC and FCN, as well as a number of failed experiments);
  • Some experiment boilerplate code;
  • A comparison between embeddings produced by PCA / UMAP / VAEs (spoiler - VAEs win);
  • A step-by-step walkthrough of what I did in main.ipynb.

Docker environment

To build the Docker image from the Dockerfile located in the dockerfile folder, run:

cd dockerfile
docker build -t vae_docker .

(you can replace the public SSH key with yours, of course)

Also please make sure that nvidia-docker2 and the proper NVIDIA drivers are installed.

To test the installation, run:

docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi

Then launch the container as follows:

docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 -it -v /your/folder/:/home/keras/notebook/your_folder -p 8888:8888 -p 6006:6006 --name vae --shm-size 16G vae_docker

Please note that without --shm-size 16G the PyTorch DataLoader classes will not work. The above command will start a container with a Jupyter notebook server available via port 8888. Port 6006 is for TensorBoard, if necessary.

Then you can exec into the container like this. All the scripts were run as root, but they should also work under the user keras:

docker exec -it --user root REPLACE_WITH_CONTAINER_ID /bin/bash

or

docker exec -it --user keras REPLACE_WITH_CONTAINER_ID /bin/bash

To find out the container ID, run:

docker container ls

Most important dependencies (if you do not want docker)

These are the most important dependencies (the others you can just install as you go):

Ubuntu 16.04
cuda 9.0
cudnn 7
python 3.6
pip
PIL
tensorflow-gpu (for tensorboard)
pandas
numpy
matplotlib
seaborn
tqdm
scikit-learn
pytorch 0.4.0 (cuda90)
torchvision 0.2.0
datashader
umap (the umap-learn package)

If you have trouble with these, look up how I install them in the Dockerfile / jupyter notebook.
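
To sanity-check the GPU part of the environment from inside the container, something like this should work (a minimal check, assuming the versions listed above):

import torch

print(torch.__version__)               # expect 0.4.0 per the list above
print(torch.cuda.is_available())       # True if the drivers and nvidia-docker2 are set up
print(torch.backends.cudnn.version())  # cudnn 7.x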

Results

VAE

The best model can be trained as follows:

python3 train.py \
	--epochs 30 --batch-size 512 --seed 42 \
	--model_type fc_conv --dataset_type fmnist --latent_space_size 10 \
	--do_augs False \
	--lr 1e-3 --m1 40 --m2 50 \
	--optimizer adam \
	--do_running_mean False --img_loss_weight 1.0 --kl_loss_weight 1.0 \
	--image_loss_type bce --ssim_window_size 5 \
	--print-freq 10 \
	--lognumber fmnist_fc_conv_l10_rebalance_no_norm \
	--tensorboard True --tensorboard_images True

If you launch this code, a copy of the FMNIST dataset will be downloaded automatically.
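
Under the hood the download presumably goes through torchvision's dataset helpers; a hypothetical, minimal equivalent of that step (the ./data path is an assumption, not necessarily what the script uses):

from torchvision import datasets, transforms

# downloads FMNIST to ./data on first use
train_dataset = datasets.FashionMNIST(
    './data', train=True, download=True,
    transform=transforms.ToTensor())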

Suggested alternative values for the flags, if you want to play with them:

  • dataset_type - can be set to mnist or fmnist; in each case the necessary dataset will be downloaded
  • latent_space_size - affects the latent space in combination with model_type fc_conv or fc. Other model types do not work properly
  • m1 and m2 control the LR decay milestones, but LR decay did not really help here (see the scheduler sketch below)
  • image_loss_type - can be set to bce, mse or ssim. In practice bce works best and mse is worse. I suppose that proper scaling is required to make it work with ssim (it does not train now)
  • tensorboard and tensorboard_images - can also be set to False, but they just write logs, so you may simply not bother

The flags --tensorboard True --tensorboard_images True are optional; in order to use them, you have to

  • install tensorboard (installs with tensorflow)
  • launch tensorboard with the following command tensorboard --logdir='path/to/tb_logs' --port=6006
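
As for the m1 / m2 milestones mentioned above: I would guess they map onto a standard milestone-based schedule; a minimal sketch of how such a decay is typically wired in PyTorch (the names here are illustrative, not the script's actual internals):

import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import MultiStepLR

model = nn.Linear(784, 10)  # stand-in for the actual VAE
optimizer = optim.Adam(model.parameters(), lr=1e-3)
# decay the LR 10x at epochs m1=40 and m2=50
scheduler = MultiStepLR(optimizer, milestones=[40, 50], gamma=0.1)

for epoch in range(60):
    # ... one training epoch would go here ...
    scheduler.step()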

You can also resume from the best checkpoint using these flags:

python3 train.py \
	--resume weights/fmnist_fc_conv_l10_rebalance_no_norm_best.pth.tar \
	--epochs 60 --batch-size 512 --seed 42 \
	--model_type fc_conv --dataset_type fmnist --latent_space_size 10 \
	--do_augs False \
	--lr 1e-3 --m1 50 --m2 100 \
	--optimizer adam \
	--do_running_mean False --img_loss_weight 1.0 --kl_loss_weight 1.0 \
	--image_loss_type bce --ssim_window_size 5 \
	--print-freq 10 \
	--lognumber fmnist_resume \
	--tensorboard True --tensorboard_images True

The best reconstructions are supposed to look like this (top row - original images, bottom row - reconstructions):

(Figure: FMNIST reconstructions)

Brief ablation analysis of the results

✓ What worked

  1. Using BCE loss + KLD loss (a sketch of this objective is shown after this list)
  2. Converting a plain FC model into a conv model in the most straightforward fashion possible, i.e. replacing this
        # encoder
        self.fc1 = nn.Linear(784, 400)
        self.fc21 = nn.Linear(400, latent_space_size)  # mu
        self.fc22 = nn.Linear(400, latent_space_size)  # logvar
        # decoder
        self.fc3 = nn.Linear(latent_space_size, 400)
        self.fc4 = nn.Linear(400, 784)

with this

        # encoder: a 28x28 convolution acts as a fully-connected layer over the whole image
        self.fc1 = nn.Conv2d(1, 32, kernel_size=(28, 28), stride=1, padding=0)
        self.fc21 = nn.Conv2d(32, latent_space_size, kernel_size=(1, 1), stride=1, padding=0)  # mu
        self.fc22 = nn.Conv2d(32, latent_space_size, kernel_size=(1, 1), stride=1, padding=0)  # logvar

        # decoder mirrors the encoder with transposed convolutions
        self.fc3 = nn.ConvTranspose2d(latent_space_size, 118, kernel_size=(1, 1), stride=1, padding=0)
        self.fc4 = nn.ConvTranspose2d(118, 1, kernel_size=(28, 28), stride=1, padding=0)
  3. Using SSIM as a visualization metric. It correlates awesomely with the perceived visual similarity of the image and its reconstruction (see the SSIM snippet after the next list)
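
For reference, the BCE + KLD objective from item 1 is usually implemented along these lines (a minimal sketch following the canonical PyTorch VAE example, not a verbatim copy of this repo's code; recon_x, mu and logvar are assumed to come from the model's forward pass):

import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, logvar):
    # pixel-wise reconstruction term, summed over the batch
    bce = F.binary_cross_entropy(recon_x.view(-1, 784), x.view(-1, 784),
                                 size_average=False)
    # KL divergence between q(z|x) = N(mu, sigma^2) and the N(0, I) prior
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld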

✗ What did not work

  1. Extracting the mean and std from images for normalization - removing this feature boosted SSIM on FMNIST 4-5x
  2. Doing any simple augmentations (unsurprisingly - it adds a complexity level to a simple task)
  3. Any architectures beyond the most obvious ones:
    • UNet-inspired architectures (my speculation - this is because the image size is very small and very global features work best, i.e. a feature-extraction cascade is overkill)
    • Various combinations of convolution weights - none of them worked
    • 1xN convolutions
  4. MSE loss performed poorly, and SSIM loss did not work at all
  5. LR decay, as well as any LR besides 1e-3 (with adam), did not really help
  6. Increasing the latent space size to 20 or 100 did not really change much
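
Since SSIM comes up both as a metric that worked (item 3 above) and as a loss that did not (item 4), here is how the metric itself can be computed with scikit-image (a minimal sketch; compare_ssim was later renamed to skimage.metrics.structural_similarity):

import numpy as np
from skimage.measure import compare_ssim

# stand-in arrays; in practice these are a 28x28 image in [0, 1]
# and its reconstruction
img = np.random.rand(28, 28)
recon = np.random.rand(28, 28)
print(compare_ssim(img, recon, data_range=1.0))  # 1.0 = identical images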

¯\_(ツ)_/¯ What I did not try

  1. Ensembling or building meta-architectures
  2. Conditional VAEs
  3. Increasing network capacity

PCA vs. UMAP vs. VAE

Please refer to section 5 of main.ipynb.

It is notable that:

  • VAEs visually worked better than PCA;
  • Using the VAE embedding for classification produces higher accuracy (~80% vs. 73%);
  • A similar accuracy on train/val can be obtained using UMAP (a sketch of such a comparison is shown after this list);
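
For illustration, the PCA / UMAP part of such a comparison can be reproduced roughly as follows (a minimal sketch, not the notebook's exact code; sklearn's digits dataset is used here as a small stand-in for MNIST / FMNIST):

import umap
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# digits is a small stand-in; the notebook uses MNIST / FMNIST
X, y = load_digits(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=42)

for name, reducer in [('PCA', PCA(n_components=10)),
                      ('UMAP', umap.UMAP(n_components=10))]:
    emb_train = reducer.fit_transform(X_train)
    emb_val = reducer.transform(X_val)
    clf = LogisticRegression(max_iter=1000).fit(emb_train, y_train)
    print(name, clf.score(emb_val, y_val))

# a VAE embedding plugs into the same pipeline: use the encoder's
# mu vector as the low-dimensional representation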

The Jupyter notebook (.ipynb file) is best viewed using these Jupyter notebook extensions (installed with the command below, then turned on in the Jupyter GUI):

pip install git+https://github.com/ipython-contrib/jupyter_contrib_nbextensions
# conda install html5lib==0.9999999
jupyter contrib nbextension install --system

Sometimes there is an html5lib conflict, which is why the extensions are excluded from the Dockerfile (the conflict occurs intermittently).
