
minhnhat93 / videoDCGAN

Licence: other
Implementation of a GAN that generates video using LSTM and ConvNet in Tensorflow

Programming Languages: python, shell

Projects that are alternatives of or similar to videoDCGAN

ganbert
Enhancing the BERT training with Semi-supervised Generative Adversarial Networks
Stars: ✭ 205 (+1364.29%)
Mutual labels:  generative-adversarial-network
GraphCNN-GAN
Graph-convolutional GAN for point cloud generation. Code from ICLR 2019 paper Learning Localized Generative Models for 3D Point Clouds via Graph Convolution
Stars: ✭ 50 (+257.14%)
Mutual labels:  generative-adversarial-network
Articles-Bookmarked
No description or website provided.
Stars: ✭ 30 (+114.29%)
Mutual labels:  generative-adversarial-network
pytorch-GAN
My pytorch implementation for GAN
Stars: ✭ 12 (-14.29%)
Mutual labels:  generative-adversarial-network
SDGym
Benchmarking synthetic data generation methods.
Stars: ✭ 177 (+1164.29%)
Mutual labels:  generative-adversarial-network
SSVEP-Neural-Generative-Models
Code to accompany our International Joint Conference on Neural Networks (IJCNN) paper entitled - Simulating Brain Signals: Creating Synthetic EEG Data via Neural-Based Generative Models for Improved SSVEP Classification
Stars: ✭ 37 (+164.29%)
Mutual labels:  generative-adversarial-network
gan deeplearning4j
Automatic feature engineering using Generative Adversarial Networks using Deeplearning4j and Apache Spark.
Stars: ✭ 19 (+35.71%)
Mutual labels:  generative-adversarial-network
gan-error-avoidance
Learning to Avoid Errors in GANs by Input Space Manipulation (Code for paper)
Stars: ✭ 23 (+64.29%)
Mutual labels:  generative-adversarial-network
gzsl-od
Out-of-Distribution Detection for Generalized Zero-Shot Action Recognition
Stars: ✭ 47 (+235.71%)
Mutual labels:  generative-adversarial-network
Cross-Domain-Image-Translation-Using-CycleGAN
CycleGAN based neural network architecture to change the gender of a person’s face
Stars: ✭ 15 (+7.14%)
Mutual labels:  generative-adversarial-network
DiscoGAN-TF
Tensorflow Implementation of DiscoGAN
Stars: ✭ 57 (+307.14%)
Mutual labels:  generative-adversarial-network
BPPNet-Back-Projected-Pyramid-Network
This is the official GitHub repository for ECCV 2020 Workshop paper "Single image dehazing for a variety of haze scenarios using back projected pyramid network"
Stars: ✭ 35 (+150%)
Mutual labels:  generative-adversarial-network
Easter-Bootcamp-2018
Designed to take you from zero experience to GANs within a week.
Stars: ✭ 24 (+71.43%)
Mutual labels:  generative-adversarial-network
DCGAN-Pytorch
A Pytorch implementation of "Deep Convolutional Generative Adversarial Networks"
Stars: ✭ 23 (+64.29%)
Mutual labels:  generative-adversarial-network
AGD
[ICML2020] "AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks" by Yonggan Fu, Wuyang Chen, Haotao Wang, Haoran Li, Yingyan Lin, Zhangyang Wang
Stars: ✭ 98 (+600%)
Mutual labels:  generative-adversarial-network
DeepSIM
Official PyTorch implementation of the paper: "DeepSIM: Image Shape Manipulation from a Single Augmented Training Sample" (ICCV 2021 Oral)
Stars: ✭ 389 (+2678.57%)
Mutual labels:  generative-adversarial-network
DeepEcho
Synthetic Data Generation for mixed-type, multivariate time series.
Stars: ✭ 44 (+214.29%)
Mutual labels:  generative-adversarial-network
WGAN-GP-tensorflow
Tensorflow Implementation of Paper "Improved Training of Wasserstein GANs"
Stars: ✭ 23 (+64.29%)
Mutual labels:  generative-adversarial-network
Awesome-GAN-Resources
🤖A list of resources to help anyone getting started with GANs 🤖
Stars: ✭ 90 (+542.86%)
Mutual labels:  generative-adversarial-network
adversarial-networks
Material from the talk "The bad guys in AI - atacando sistemas de machine learning"
Stars: ✭ 15 (+7.14%)
Mutual labels:  generative-adversarial-network

videoGAN

My small project implementing a GAN that generates video, with an LSTM that encodes the transformation between video frames in noise space.

Datasets:

Tasks:

  • Generator
    • Cleanup
    • Tested
    • The recurrent part of the generator has trouble picking up gradients
    • Use Adam
  • Discriminator
    • Cleaned up
    • Tested
    • Too strong compared to the generator
    • Use plain SGD; not noticeably better than Adam
  • Add training ops
    • Gradient clipping: clipping by GLOBAL norm drives the gradient of the recurrent unit to 0 as the gradient of the video-output part grows. Clip by individual norm instead (see the sketch after this list)
    • Add supervised pretraining?
    • Added additional white Gaussian noise to the input of the generator. Will this help make the generator weaker? (http://www.inference.vc/instance-noise-a-trick-for-stabilising-gan-training/)
    • Currently it seems that not enough gradient information is passed to the recurrent module of the generator
    • Try training each network until its loss falls under a threshold: implemented
  • Add Glorot Initialization
  • Training Pipeline
    • Add command line
    • Add Save/Load
  • Add visualization
    • Tensorboard Visualization
    • GIFs
  • Parameter Search
  • Evaluation
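
A minimal sketch of the clip-by-individual-norm idea from the gradient-clipping item above; the function name and the clip value are illustrative, not the project's actual code:

```python
import tensorflow as tf

# Clip each gradient by its own norm instead of the global norm, so a large
# gradient in the video-output layers cannot drive the recurrent unit's
# (relatively small) gradient toward 0. clip_norm=5.0 is an assumed value.
def clip_by_individual_norm(grads_and_vars, clip_norm=5.0):
    return [(tf.clip_by_norm(g, clip_norm), v)
            for g, v in grads_and_vars if g is not None]

# Usage (illustrative): compute gradients, clip per variable, then apply.
# grads = tape.gradient(loss, model.trainable_variables)
# optimizer.apply_gradients(
#     clip_by_individual_norm(zip(grads, model.trainable_variables)))
```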

Network structure:

The generator consists of two parts:

  • An LSTM that transforms the noise of the current frame into the noise of the next frame
  • A DCGAN generator that generates the frame from the noise
  • Layer Normalization is used in place of Batch Normalization
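
A minimal sketch of this two-part generator, assuming 32x32 single-channel frames and a 100-dim noise vector; the repository targets an older TensorFlow, so this uses the tf.keras API with illustrative layer sizes rather than the project's actual code:

```python
import tensorflow as tf

NOISE_DIM = 100  # assumed noise size

class VideoGenerator(tf.keras.Model):
    def __init__(self, num_frames=16):
        super().__init__()
        self.num_frames = num_frames
        # LSTM that transforms the current frame's noise into the next frame's.
        self.lstm = tf.keras.layers.LSTMCell(NOISE_DIM)
        # DCGAN-style frame generator: project, then upsample with transposed
        # convolutions; Layer Normalization replaces Batch Normalization.
        self.project = tf.keras.layers.Dense(4 * 4 * 256)
        self.deconv1 = tf.keras.layers.Conv2DTranspose(128, 4, strides=2, padding="same")
        self.deconv2 = tf.keras.layers.Conv2DTranspose(64, 4, strides=2, padding="same")
        self.to_frame = tf.keras.layers.Conv2DTranspose(1, 4, strides=2, padding="same",
                                                        activation="tanh")
        self.norm1 = tf.keras.layers.LayerNormalization()
        self.norm2 = tf.keras.layers.LayerNormalization()

    def frame_from_noise(self, z):
        x = tf.nn.relu(self.project(z))
        x = tf.reshape(x, [-1, 4, 4, 256])
        x = tf.nn.relu(self.norm1(self.deconv1(x)))
        x = tf.nn.relu(self.norm2(self.deconv2(x)))
        return self.to_frame(x)  # (batch, 32, 32, 1)

    def call(self, z0):
        batch = tf.shape(z0)[0]
        state = [tf.zeros([batch, NOISE_DIM]), tf.zeros([batch, NOISE_DIM])]
        z, frames = z0, []
        for _ in range(self.num_frames):
            frames.append(self.frame_from_noise(z))
            z, state = self.lstm(z, state)  # noise for the next frame
        return tf.stack(frames, axis=1)     # (batch, time, 32, 32, 1)
```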

The discriminator is from the paper "Generating Videos with Scene Dynamics" (http://web.mit.edu/vondrick/tinyvideo/paper.pdf). No Batch Normalization is used in the discriminator.
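
A rough sketch of a spatio-temporal (3D) convolutional discriminator in the spirit of the cited paper; the filter counts and strides here are illustrative, not the paper's exact architecture:

```python
import tensorflow as tf

def make_video_discriminator():
    """3D-conv critic over (batch, time, H, W, C) videos; no batch
    normalization, per the note above. Outputs one unbounded score."""
    layers = []
    for filters in (64, 128, 256, 512):
        layers += [tf.keras.layers.Conv3D(filters, 4, strides=2, padding="same"),
                   tf.keras.layers.LeakyReLU(0.2)]
    layers += [tf.keras.layers.Flatten(),
               tf.keras.layers.Dense(1)]  # linear score, usable with WGAN-GP
    return tf.keras.Sequential(layers)
```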

GAN training:

Training is done using Wasserstein GAN with Gradient Penalty (WGAN-GP), lambda = 200.0. The basic loss and the alternative loss for GAN training are also available.
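
A sketch of the WGAN-GP objective with the lambda above, following the standard formulation from "Improved Training of Wasserstein GANs"; the discriminator is assumed to return one unbounded score per video:

```python
import tensorflow as tf

LAMBDA = 200.0  # gradient-penalty weight, as noted above

def wgan_gp_losses(discriminator, real_videos, fake_videos):
    d_real = discriminator(real_videos)
    d_fake = discriminator(fake_videos)
    # Gradient penalty on random interpolates between real and fake samples.
    ndims = real_videos.shape.rank
    eps = tf.random.uniform([tf.shape(real_videos)[0]] + [1] * (ndims - 1))
    interp = eps * real_videos + (1.0 - eps) * fake_videos
    with tf.GradientTape() as tape:
        tape.watch(interp)
        d_interp = discriminator(interp)
    grads = tape.gradient(d_interp, interp)
    slopes = tf.sqrt(tf.reduce_sum(tf.square(grads),
                                   axis=list(range(1, ndims))) + 1e-12)
    penalty = tf.reduce_mean(tf.square(slopes - 1.0))
    d_loss = tf.reduce_mean(d_fake) - tf.reduce_mean(d_real) + LAMBDA * penalty
    g_loss = -tf.reduce_mean(d_fake)
    return d_loss, g_loss
```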

The Adam Optimizer is used for the discriminator and the DCGAN part of the generator.

Stochastic Gradient Descent with momentum is used for the LSTM part of the generator.
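
How this split might look in code (a sketch; the Adam learning rates and the selection of LSTM variables by name are assumptions, while the SGD settings follow the note at the end of this README):

```python
import tensorflow as tf

d_opt      = tf.keras.optimizers.Adam(1e-4)  # discriminator (rate assumed)
g_cnn_opt  = tf.keras.optimizers.Adam(1e-4)  # DCGAN part of the generator
g_lstm_opt = tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.95)  # LSTM part

def apply_generator_grads(grads, variables):
    # Route each variable's gradient to the optimizer for its sub-network.
    pairs = list(zip(grads, variables))
    lstm_gv = [(g, v) for g, v in pairs if "lstm" in v.name and g is not None]
    cnn_gv  = [(g, v) for g, v in pairs if "lstm" not in v.name and g is not None]
    g_lstm_opt.apply_gradients(lstm_gv)
    g_cnn_opt.apply_gradients(cnn_gv)
```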

Things I learned from training with Wasserstein GAN with Gradient Penalty (WGAN-GP):

About the three losses I tried:

  • The basic version of the GAN loss doesn't work for this model: the LSTM part of the generator doesn't receive much gradient to learn from
  • The alternative -log(D(G(noise))) version of the GAN loss provides large but unstable gradients to the LSTM part and learns quickly at first, but it cannot get past learning some of the moving dynamics and some vague digit shapes. After that it diverges and video quality worsens over time. It also suffers extreme mode collapse
  • WGAN-GP learns quicker than the alternative loss and its gradients are stable. At first the discriminator loss goes down, but after a few thousand iterations it rises steadily, and video quality improves as it does. The generator loss doesn't mean much. No mode collapse was observed
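
For reference, the three generator losses compared above written as logits expressions (a sketch; d_fake stands for the discriminator's output on generated videos: logits for the first two losses, an unbounded score for WGAN-GP):

```python
import tensorflow as tf

d_fake = tf.random.normal([8, 1])  # stand-in for discriminator(fake_videos)
bce = tf.nn.sigmoid_cross_entropy_with_logits

# Basic (minimax) loss: minimize log(1 - D(G(noise))); saturates when the
# discriminator confidently rejects fakes, hence the weak gradients above.
g_loss_basic = -tf.reduce_mean(bce(labels=tf.zeros_like(d_fake), logits=d_fake))
# Alternative loss: minimize -log(D(G(noise))); larger but less stable gradients.
g_loss_alt = tf.reduce_mean(bce(labels=tf.ones_like(d_fake), logits=d_fake))
# WGAN-GP generator loss: maximize the critic's score on fakes.
g_loss_wgan = -tf.reduce_mean(d_fake)
```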

Some notes about training WGAN-GP:

  • Without gradient clipping, the gradients of some parameters can go up to 25-30. Those large values don't seem to have any effect, though
  • Using layer normalization in the discriminator seems to make it harder for the discriminator to learn (seen as the generator's gradients dropping to 0 for some iterations) and makes everything worse (confirmed on a second try)
  • The discriminator should be trained more than the generator to reach its optimum. Following the original Wasserstein-GAN GitHub repository, for the first few iterations the discriminator should be trained for 100 iterations per generator iteration instead of 5 (see the schedule sketch after this list)
  • For the LSTM one should not use Adam; use SGD with momentum instead (learning rate 0.001 and momentum 0.95 seem to work well)
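
A sketch of that discriminator schedule; the warm-up length is illustrative, and train_discriminator_once / train_generator_once are hypothetical helpers:

```python
def critic_iters(gen_step, warmup_steps=25):
    # 100 discriminator iterations per generator step early on, then 5,
    # following the original WGAN repository's heuristic noted above.
    return 100 if gen_step < warmup_steps else 5

# Illustrative outer loop:
# for gen_step in range(num_generator_steps):
#     for _ in range(critic_iters(gen_step)):
#         train_discriminator_once()
#     train_generator_once()
```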