YadiraF / Gan_theories

Resources and implementations of Generative Adversarial Nets that focus on stabilizing the training process and generating high-quality images: DCGAN, WGAN, EBGAN, BEGAN, etc.

All implementations have been tested with Python 2.7+ and TensorFlow 1.0+ on Linux.

  • Samples: stores the generated data; each folder contains a figure showing the results.
  • utils: contains 2 files
    • data.py: data preprocessing.
    • nets.py: the Generator and Discriminator are defined here.

For research purposes, all models share the following settings:
Network architecture: all GANs use the same network architecture (the Discriminator of EBGAN and BEGAN is a combination of the traditional D and G).
Learning rate: initialized to 1e-4 for all models and halved every 5000 epochs (this may be unfair to some GANs, but the influence is small, so I ignored it).
Dataset: celebA, cropped to 128 and resized to 64. Users should copy all celebA images to ./Datas/celebA for training.
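The learning-rate rule above can be written as a small function. This is only a sketch: it assumes a staircase schedule (the rate is halved exactly at every 5000th step and constant in between), and the function and parameter names are hypothetical, not taken from the repository.

```python
def learning_rate(epoch, base_lr=1e-4, decay_factor=2.0, decay_every=5000):
    """Staircase decay: divide base_lr by decay_factor once per decay_every epochs.

    `epoch` follows the wording of this README; whether the training script
    counts epochs or iterations is an assumption left open here.
    """
    return base_lr / (decay_factor ** (epoch // decay_every))


# Rate at the start, just before the first decay, and after one and two decays.
schedule = [learning_rate(e) for e in (0, 4999, 5000, 10000)]
```

With the defaults, the rate stays at 1e-4 up to epoch 4999, drops to half of that at epoch 5000, and to a quarter at epoch 10000.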

  • [x] DCGAN
  • [x] EBGAN
  • [x] WGAN
  • [x] BEGAN
    And for comparison, I added VAE here.
  • [x] VAE

The generated results are shown at the end of this page.


Theories

✨DCGAN

Main idea: Architectural techniques to stabilize GAN training
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks [2015]

Loss Function (the same as Vanilla GAN)

DCGAN_loss
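
Since DCGAN keeps the vanilla GAN loss, the objective can be written out explicitly. The form below is the standard minimax objective from the original GAN paper; the notation in the figure may differ slightly.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```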

Architecture guidelines for stable Deep Convolutional GANs

  • Replace any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator).
  • Use batchnorm in both the generator and the discriminator
  • Remove fully connected hidden layers for deeper architectures. Just use average pooling at the end.
  • Use ReLU activation in generator for all layers except for the output, which uses Tanh.
  • Use LeakyReLU activation in the discriminator for all layers.
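
To see how the first guideline plays out for the 64x64 images used here, the sketch below traces feature-map sizes through a stack of strided and fractional-strided convolutions. The 4x4 kernel, stride 2, padding 1 and four layers are assumptions chosen for illustration, and the helper names are hypothetical; none of them are taken from nets.py.

```python
def conv_out(size, kernel=4, stride=2, pad=1):
    """Spatial output size of a strided convolution (downsampling in D)."""
    return (size + 2 * pad - kernel) // stride + 1


def deconv_out(size, kernel=4, stride=2, pad=1):
    """Spatial output size of a fractional-strided (transposed) convolution (upsampling in G)."""
    return (size - 1) * stride - 2 * pad + kernel


def trace(size, layer, n_layers):
    """Apply `layer` n_layers times and record every intermediate size."""
    sizes = [size]
    for _ in range(n_layers):
        sizes.append(layer(sizes[-1]))
    return sizes


d_sizes = trace(64, conv_out, 4)   # discriminator: 64 -> 32 -> 16 -> 8 -> 4
g_sizes = trace(4, deconv_out, 4)  # generator:      4 ->  8 -> 16 -> 32 -> 64
```

Each stride-2 layer halves (or doubles) the resolution, so no pooling or unpooling layer is needed to move between 64x64 images and a 4x4 feature map.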

✨EBGAN

Main idea: View the discriminator as an energy function
Energy-based Generative Adversarial Network [2016]
(EBGAN is introduced here mainly as background for BEGAN, since they use the same network structure.)

What is an energy function?
EBGAN_structure
The figure is from LeCun, Yann, et al. "A tutorial on energy-based learning."

In EBGAN, we want the Discriminator to distinguish real images from generated (fake) images. How? A simple idea is to set X as the real image and Y as the reconstructed image, and then minimize the energy of X and Y. So we need an auto-encoder to get Y from X, and a measure to calculate the energy (here it is simply the MSE).
This gives the Discriminator structure shown below.

EBGAN_structure

So the task of D is to minimize the MSE between a real image and its reconstruction, and to maximize the MSE between a fake image from G and its reconstruction. G does the adversarial task: minimize the MSE of fake images...
The loss function can then be written as:
EBGAN_loss

For comparison with BEGAN, we can treat D purely as the auto-encoder and write L(*) for its MSE reconstruction loss.
Loss Function
EBGAN_loss

Here m is a positive margin. When L(G(z)) is close to zero, L_D is approximately L(x) + m, which means D is trained more heavily; on the contrary, when L(G(z)) > m, L_D reduces to L(x), which means D loosens its judgement of the fake images.
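
The margin behaviour described above can be checked with a few lines of plain Python. This is a sketch that assumes the hinge form from the EBGAN paper, L_D = L(x) + max(0, m - L(G(z))) and L_G = L(G(z)); the function names are hypothetical, and flat lists stand in for image tensors.

```python
def mse(a, b):
    """Mean squared error between two equally sized flat sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def ebgan_losses(l_real, l_fake, margin):
    """Return (L_D, L_G) from the reconstruction errors L(x) and L(G(z))."""
    d_loss = l_real + max(0.0, margin - l_fake)  # hinge: fakes only count below the margin
    g_loss = l_fake                              # G wants its samples to reconstruct well
    return d_loss, g_loss


# L(x) for a toy "image" and its auto-encoder reconstruction.
l_real = mse([0.0, 1.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.0])

near_zero = ebgan_losses(l_real, l_fake=0.0, margin=1.0)     # L_D = L(x) + m
above_margin = ebgan_losses(l_real, l_fake=2.0, margin=1.0)  # L_D = L(x)
```

Once L(G(z)) exceeds m, the hinge term vanishes and D receives no further gradient from the fake samples.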

Finally, there is a question about EBGAN: why use an auto-encoder in D instead of the traditional one? What are the benefits?
I have not read the paper carefully, but one reason, I think, is that (as stated in the paper) auto-encoders have the ability to learn an energy manifold without supervision or negative examples. So, rather than simply judging whether images are real or fake, the new D can capture the underlying distribution of the data and then distinguish them. The generated results of EBGAN also illustrate this (my understanding): the celebA images generated by DCGAN can hardly separate the face from the complex background, while the images from EBGAN focus more heavily on generating faces.


✨Wasserstein GAN

Main idea: Stabilize training by using the Wasserstein-1 distance instead of the Jensen-Shannon (JS) divergence
Earlier GANs, which rely on the JS divergence, run into trouble when the real and generated distributions do not overlap, which leads to mode collapse and convergence difficulty.
By using the Earth-Mover (EM) distance, also called the Wasserstein-1 distance, a GAN can address both problems without a particular architecture (like DCGAN).
Wasserstein GAN [2017]

Mathematical Analysis
Why does the JS divergence have problems? Please see Towards Principled Methods for Training Generative Adversarial Networks.

Anyway, this highlights the fact that the KL, JS, and TV distances are not sensible cost functions when learning distributions supported by low dimensional manifolds.

So the authors use the Wasserstein distance:
WGAN_loss
Here D (the critic) is trained to maximize this estimate of the distance, while G is trained to minimize it.

However, the original formula is difficult to compute directly, because the constraint ||f||_L <= 1 is hard to enforce. After some mathematical analysis, the authors replace it with clipping the variables (weights) of D, so the Wasserstein version of the GAN loss function becomes:
Loss Function
WGAN_loss
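
For reference, the Kantorovich-Rubinstein dual form of the Wasserstein-1 distance and the resulting losses can be written as below. The signs are chosen so that both losses are minimized, matching the TensorFlow snippet in the guidelines that follow; the figures may use a different but equivalent notation.

```latex
W(P_r, P_g) = \sup_{\|f\|_L \le 1}
  \mathbb{E}_{x \sim P_r}\big[f(x)\big] - \mathbb{E}_{x \sim P_g}\big[f(x)\big]

L_D = \mathbb{E}_{z \sim p_z}\big[D(G(z))\big] - \mathbb{E}_{x \sim P_r}\big[D(x)\big],
\qquad
L_G = -\,\mathbb{E}_{z \sim p_z}\big[D(G(z))\big]
```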

Algorithm guidelines for stable GANs

  • No log in the loss. The output of D is no longer a probability, so we do not apply a sigmoid at the output of D.
	# D_real, D_fake: raw (unbounded) outputs of D on real and generated batches
	G_loss = -tf.reduce_mean(D_fake)
	D_loss = tf.reduce_mean(D_fake) - tf.reduce_mean(D_real)
  • Clip the weights of D to [-0.01, 0.01]
	# ops that clip every variable of D; run them after each D update
	self.clip_D = [var.assign(tf.clip_by_value(var, -0.01, 0.01)) for var in self.discriminator.vars]
  • Train D more than G (5:1)
  • Use RMSProp instead of Adam
  • Lower learning rate (0.00005)
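
The guidelines above can be illustrated without any framework. The sketch below uses plain Python lists in place of tensors; the helper names and the toy numbers are hypothetical, and it only mirrors the loss, the clipping and the 5:1 update ratio, not an actual training run.

```python
def wgan_losses(d_real, d_fake):
    """Return (D_loss, G_loss) from raw critic outputs: no log, no sigmoid."""
    mean = lambda xs: sum(xs) / len(xs)
    d_loss = mean(d_fake) - mean(d_real)
    g_loss = -mean(d_fake)
    return d_loss, g_loss


def clip_weights(weights, c=0.01):
    """Clamp every weight of D into [-c, c]."""
    return [max(-c, min(c, w)) for w in weights]


def update_schedule(n_generator_steps, n_critic=5):
    """Yield 'D' n_critic times followed by one 'G', for each generator step."""
    for _ in range(n_generator_steps):
        for _ in range(n_critic):
            yield "D"
        yield "G"


d_loss, g_loss = wgan_losses(d_real=[1.0, 3.0], d_fake=[0.0, 1.0])
clipped = clip_weights([0.5, -0.5, 0.005])
order = list(update_schedule(2))
```

In a real loop, the clipping would be applied right after every "D" step in this schedule.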

✨ BEGAN

Main idea: Match auto-encoder loss distributions using a loss derived from the Wasserstein distance
BEGAN: Boundary Equilibrium Generative Adversarial Networks[2017]

Mathematical Analysis
We have already introduced the structure of EBGAN, which is also used in BEGAN.
Then, instead of computing the Wasserstein distance between the sample distributions as in WGAN, BEGAN computes the Wasserstein distance between the distributions of the auto-encoder loss.
(I think the mathematical analysis in BEGAN is clearer and more intuitive than in WGAN.)
So, simply taking the expectations over the loss L instead of over the samples, we get the loss function:
BEGAN_loss

Then comes the most interesting part:
a new hyper-parameter γ to control the trade-off between image diversity and visual quality.
BEGAN_loss
Lower values of γ lead to lower image diversity because the discriminator focuses more heavily on auto-encoding real images.

The final loss function is:
Loss Function
BEGAN_loss
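
For readers without the figures, the BEGAN objective as I recall it from the paper is reproduced below. Here L(·) is the auto-encoder reconstruction loss of D, λ_k is the "learning rate of k" used later on this page, and k_0 = 0; the notation in the figures may differ.

```latex
\gamma = \frac{\mathbb{E}\big[L(G(z))\big]}{\mathbb{E}\big[L(x)\big]}

\begin{aligned}
L_D &= L(x) - k_t \, L(G(z_D)) \\
L_G &= L(G(z_G)) \\
k_{t+1} &= k_t + \lambda_k \big(\gamma L(x) - L(G(z_G))\big)
\end{aligned}
```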

The intuition behind the function is easy to understand:
(Here I describe my understanding roughly...)
(1). In the beginning, G and D are initialized randomly and k_0 = 0, so L_real is larger than L_fake, leading to a brief increase of k.
(2). After several iterations, D easily learns to reconstruct the real data, so gamma x L_real - L_fake becomes negative and k decreases to 0. Now D only reconstructs the real data, and G learns the real data distribution so as to minimize the reconstruction error in D.
(3). As G gets better at generating images like the real data, L_fake becomes smaller and k becomes larger, so D focuses more on discriminating between real and fake data, and G is then trained further in response.
(4). In the end, k becomes a constant, which means gamma x L_real - L_fake = 0, so the optimization is done.
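
The role of k in the four steps above can be made concrete with a small sketch of its update rule. Clamping k to [0, 1] is an assumption here (it follows the range given for k in the BEGAN paper), and the function name and toy loss values are hypothetical.

```python
def update_k(k, l_real, l_fake, gamma=0.75, lambda_k=0.001):
    """One step of k_{t+1} = k_t + lambda_k * (gamma * L_real - L_fake), kept in [0, 1]."""
    k_new = k + lambda_k * (gamma * l_real - l_fake)
    return min(max(k_new, 0.0), 1.0)  # assumed clamp to [0, 1]


k_up = update_k(0.0, l_real=1.0, l_fake=0.5)       # gamma*L_real > L_fake: k increases
k_floor = update_k(0.0, l_real=0.1, l_fake=0.5)    # gamma*L_real < L_fake: k stays at 0
k_steady = update_k(0.3, l_real=1.0, l_fake=0.75)  # balanced: k stays constant
```

The third call corresponds to step (4): when gamma x L_real equals L_fake, k no longer moves.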

The global loss (a measure of convergence) is defined as the sum of L_real (how well D learns the distribution of real data) and |gamma*L_real - L_fake| (how close the generated data from G is to the real data).
BEGAN_loss
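
As a sketch, the global measure described above is a one-liner; the function name and the toy loss values are hypothetical.

```python
def global_measure(l_real, l_fake, gamma=0.75):
    """M = L_real + |gamma * L_real - L_fake|; lower means closer to convergence."""
    return l_real + abs(gamma * l_real - l_fake)


balanced = global_measure(l_real=0.2, l_fake=0.15)    # equilibrium term is (almost exactly) zero
unbalanced = global_measure(l_real=0.2, l_fake=0.05)  # G is far from the target ratio gamma
```

Because the second term vanishes only when L_fake / L_real equals gamma, the measure can decrease only if D reconstructs real data well and G stays at the equilibrium.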

I set gamma = 0.75 and the learning rate of k to 0.001; the learning curves of the loss and k are shown below.
BEGAN_loss

Results

DCGAN
DCGAN_samples

EBGAN (not trained enough)
EBGAN_samples

WGAN (not trained enough)
WGAN_samples

BEGAN: gamma = 0.75, learning rate of k = 0.001
BEGAN_samples

BEGAN: gamma = 0.5, learning rate of k = 0.002
BEGAN_samples

VAE
VAE_samples

References

http://wiseodd.github.io/techblog/2016/12/10/variational-autoencoder/ (a good blog to introduce VAE)
https://github.com/wiseodd/generative-models/tree/master/GAN
https://github.com/artcg/BEGAN

Others

Tensorflow style: https://www.tensorflow.org/community/style_guide

A good website for converting LaTeX equations to images (to insert into the README): http://www.sciweavers.org/free-online-latex-equation-editor
