
aleju / gan-reverser

License: MIT
Reversing GAN image generation for similarity search and error/artifact fixing

Programming language: Lua


About

This project introduces a new network to the GAN architecture, called the "Reverser" (R). The Reverser receives images generated by the Generator (G) and tries to recover the noise vectors that were originally fed into G. This seems to have some potential for unsupervised learning, as well as for fixing errors in the images generated by G.

Classic architecture of GANs:

Classic GAN architecture

Changed architecture of GANs with a Reverser (R):

GAN architecture with R

R is a standard convolutional neural network, similar to D.

Results

The following images are generated in grayscale mode to reduce computational complexity. The results should be similar for RGB images.

Embedding learned by G

The following image shows, roughly, the low-dimensional representation of faces learned by G (R is not yet used here). To generate the images, a random noise vector with 32 normally distributed components (mean 0, variance 1.0) was first generated. Then each component of the noise vector was picked (one by one) and set to values between -3.0 and +3.0 (in 16 equally spaced steps). That led to 32*16 = 512 noise vectors. For each noise vector one face was generated.
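The sweep described above can be sketched as follows. This is an illustrative Python sketch, not the project's actual Lua/Torch code; the constants (32 components, 16 steps, the [-3, +3] range) come from the text above.

```python
import random

NB_COMPONENTS = 32  # dimensionality of the noise vector
NB_STEPS = 16       # equally spaced values between -3.0 and +3.0 per component

# Base vector: 32 components drawn from N(0, 1)
base = [random.gauss(0.0, 1.0) for _ in range(NB_COMPONENTS)]

# For each component, sweep it through 16 equally spaced values in [-3, +3],
# keeping all other components fixed
sweep_vectors = []
for comp in range(NB_COMPONENTS):
    for step in range(NB_STEPS):
        value = -3.0 + 6.0 * step / (NB_STEPS - 1)
        vec = list(base)
        vec[comp] = value
        sweep_vectors.append(vec)

print(len(sweep_vectors))  # 32 * 16 = 512 noise vectors, one face per vector
```

Each of the 512 vectors would then be fed through G to render one face of the grid.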

Varying single components of a noise vector

Apparently G does not map single features to single components (e.g. one component for the gender, one for the size of the nose, ...). Instead, changing a single component morphs the whole face. Note that increasing the dimensionality of the noise vectors (100 and 256 components instead of 32) did not seem to change this behavior.

Sorting by similarity / search by example

The following images show some of 10,000 faces that were randomly generated with G. All faces were reversed to noise vectors using R, resulting in one noise vector per face. Then 4 faces were selected as search examples. These faces' recovered noise vectors were compared to all other recovered noise vectors using cosine similarity, and the most similar faces were added to the result. The same process was repeated using cosine similarity on the raw pixel values instead of the recovered noise vectors (i.e. the left column shows "with R", the right column shows "without R").
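The ranking step can be sketched like this. It is an illustrative Python sketch (the project itself is written in Lua/Torch); `most_similar` and `top_k` are hypothetical names for the procedure the text describes.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equally sized vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def most_similar(query, vectors, top_k=5):
    # Rank all vectors by cosine similarity to the query, highest first
    scored = sorted(range(len(vectors)),
                    key=lambda i: cosine_similarity(query, vectors[i]),
                    reverse=True)
    return scored[:top_k]
```

With R, `vectors` holds the recovered noise vectors; without R, it holds the raw pixel values of each face.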

Searching for similar images

The images indicate that R's features are on a higher level than the raw pixel values. The pair of G and R might have some potential for unsupervised learning.

Clustering

Just like in the previous section ("Sorting by similarity"), 10,000 faces were generated with G. For each face the noise vector was recovered using R. These 10,000 noise vectors were then grouped into 20 clusters using k-means. For each cluster, the 71 images closest to that cluster's centroid were picked (resulting in one block in the image below). Additionally, for each cluster the average of all 71 faces was calculated (sum the pixel values, divide by 71) and added at the start of the block.
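The per-cluster post-processing can be sketched as follows. This is an illustrative Python sketch; the k-means centroids are assumed to come from a separate clustering run (the Torch requirements list the `unsup` package, which provides k-means), and `closest_to_centroid` is a hypothetical name for the step the text describes.

```python
import math

def euclidean(a, b):
    # Euclidean distance between two equally sized vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_to_centroid(vectors, images, centroid, n=71):
    # Pick the n faces whose recovered noise vectors lie closest to the centroid
    order = sorted(range(len(vectors)),
                   key=lambda i: euclidean(vectors[i], centroid))
    picked = [images[i] for i in order[:n]]
    # Average face: sum the pixel values position-wise, divide by n
    avg = [sum(px) / len(picked) for px in zip(*picked)]
    return picked, avg
```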

8 clusters of images

The clusters do contain faces that have some similarity. However, the similarity is rather low and seems to be on a broader level than desired. Maybe other clustering methods would produce better results.

Fixing errors in generated images

R can be used to fix errors/artifacts in images generated by G. The method is straightforward:

  1. Generate an image using a random noise vector,
  2. apply R to the image to recover the noise vector,
  3. feed the recovered noise vector back into G to generate a new image.
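The three steps above can be sketched as a simple pipeline. `G` and `R` are placeholders for the trained networks (hypothetical callables here, not the project's actual Torch models):

```python
def fix_image(G, R, noise):
    # G maps a noise vector to an image; R maps an image back to an
    # (approximate) noise vector.
    image = G(noise)      # 1. generate an image from a random noise vector
    recovered = R(image)  # 2. apply R to the image to recover the noise vector
    fixed = G(recovered)  # 3. feed the recovered vector back into G
    return fixed
```

Because R only approximates the original noise vector, the second pass through G tends toward more "average" faces, which is exactly the regression-to-the-mean effect described below.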

The new image tends to be a regression to the mean, i.e. many errors are fixed and some characteristic details get lost or are replaced by more common details. The error fixing effect seems to be overall more prominent (compared to the removal of details), making this process useful. Note that R usually does not fix catastrophic failures of G (e.g. mostly white/black images, completely distorted faces).

The following image shows faces before and after fixing.

Images before and after R

To get the best possible effects, it seemed to be a good choice to add a 50% dropout layer at the very start of R (between input and the first layer). For technical reasons this layer was kept active after training (deactivating it produced broken images).
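The always-active dropout at R's input can be sketched as follows. This is an illustrative Python sketch of inverted dropout; whether the original Torch `nn.Dropout` layer used exactly this scaling is an assumption.

```python
import random

def dropout_active(x, p=0.5):
    # Dropout that stays active after training: each element is zeroed with
    # probability p; survivors are scaled by 1/(1-p) so the expected
    # activation stays constant (inverted dropout).
    return [0.0 if random.random() < p else v / (1.0 - p) for v in x]
```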

Anomaly detection

The method of fixing images can also be used to detect anomalous images. To do that, one picks a generated image, fixes it with R and G and then measures the Euclidean distance between the original image and the fixed one. If the distance is above a threshold, the image can be considered an anomaly.
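The threshold test can be sketched like this (an illustrative Python sketch; `is_anomaly` and the flat-list image representation are assumptions for the example):

```python
import math

def is_anomaly(original, fixed, threshold):
    # Flag an image as anomalous if it moved far under the R+G fix step.
    # Images are represented as flat lists of pixel values.
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(original, fixed)))
    return dist > threshold
```

The threshold itself has to be tuned, e.g. from the distance distribution over many generated images.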

The following image shows some generated faces. Detected anomalies are marked with a red border.

Images and detected anomalies

Usage

Requirements are:

  • Torch
    • Required packages (most of them should be part of the default Torch install; install missing ones with luarocks install packageName): cudnn, nn, pl, paths, image, optim, cutorch, cunn, dpnn, display, unsup
  • Dataset from face-generator (requires LFW and Python to generate)
  • NVIDIA GPU with cudnn3 and 4GB or more memory

To train a network:

  • th -ldisplay.start - This will start display, which is used to plot results in the browser
  • Open http://localhost:8000/ in your browser (display interface)
  • Train a pair of G and D with th train.lua --dataset="/path/to/your/images" . (Add --colorSpace="y" to generate grayscale images instead of RGB ones.) Manually stop the script (ctrl+c) when you like the results; this might take a few hundred epochs.
  • Train R using th train_r.lua --dataset="/path/to/your/images" --nbBatches=2000 . (Runs for 2000 batches. Fewer batches result in more "average" faces.)
  • Train R for fixing images using th train_r.lua --dataset="/path/to/your/images" --nbBatches=2000 --fixer .
  • Apply R using th apply_r.lua --dataset="/path/to/your/images" .

For train_r.lua and apply_r.lua you can change the networks used with --G, --R and --R_fixer, e.g. --G="logs/foo.net". You will have to do that if you chose grayscale images (instead of RGB), other image heights/widths or other noise vector dimensions.

Possible further research

  • Compare to Autoencoders, especially VAEs.
  • Run experiments with more images in the training set. Maybe components would then become more associated with specific features.
  • Analyze results for RGB images.