Justin-Tan / Generative Compression

Licence: MIT
TensorFlow Implementation of Generative Adversarial Networks for Extreme Learned Image Compression

generative-compression

TensorFlow implementation for learned compression of images using Generative Adversarial Networks. The method was developed by Agustsson et al. in Generative Adversarial Networks for Extreme Learned Image Compression. The proposed idea is very interesting and their approach is well described.

Results from authors using C=4 bottleneck channels, global compression without semantic maps on the Kodak dataset


Usage

The code depends on TensorFlow 1.8.

# Clone
$ git clone https://github.com/Justin-Tan/generative-compression.git
$ cd generative-compression

# To train, check command line arguments
$ python3 train.py -h
# Run
$ python3 train.py -opt momentum --name my_network

Training is conducted with batch size 1. Reconstructed samples and TensorBoard summaries are written periodically (every 128 steps by default), and checkpoints are saved every 10 epochs.

To compress a single image:

# Compress
$ python3 compress.py -r /path/to/model/checkpoint -i /path/to/image -o /path/to/output/image

The compressed image will be saved as a side-by-side comparison with the original image under the path specified in directories.samples in config.py. If you are using the provided pretrained model with noise sampling, retain the hyperparameters under config_test in config.py; otherwise, the test-time parameters should match those set during training.

Note: If you're willing to pay higher bitrates in exchange for much higher perceptual quality, you may want to check out this implementation of "High-Fidelity Generative Image Compression", which is in the same vein but operates in higher bitrate regimes. Furthermore, it is capable of working with images of arbitrary size and resolution.

Results

These globally compressed images are from the test split of the Cityscapes leftImg8bit dataset. The decoder seems to hallucinate greenery in buildings, and vice versa.

Global conditional compression: multiscale discriminator + feature-matching losses, C=8 channels (compression to 0.072 bpp)

[Images: reconstructions at epochs 38, 44, 47, and 48]

[Figures: quantized C=4, 8, 16 channel image comparison; generator loss and discriminator loss curves]

Pretrained Model

You can find the pretrained model for global compression with a channel bottleneck of C = 8 (corresponding to a 0.072 bpp representation) below. The model was subject to the multiscale discriminator and feature matching losses. Noise is sampled from a 128-dim normal distribution, passed through a DCGAN-like generator and concatenated to the quantized image representation. The model was trained for 55 epochs on the train split of the Cityscapes leftImg8bit dataset for the images and used the gtFine dataset for the corresponding semantic maps. This should work with the default settings under config_test in config.py.
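The 0.072 bpp figure can be sanity-checked with a little arithmetic. The sketch below assumes the setup described in the original paper as best recalled here: the encoder downsamples each spatial dimension by 16 and every bottleneck value is quantized to one of L=5 levels. Neither number is stated in this README, so treat both defaults as assumptions and check them against network.py and config.py. The function name is illustrative only.

```python
import math


def bitrate_upper_bound_bpp(channels, levels=5, downsample=16):
    """Upper bound on bits per pixel for a C-channel quantized bottleneck.

    The bottleneck holds (H / downsample) * (W / downsample) * channels
    symbols, each costing at most log2(levels) bits. Dividing by the
    H * W input pixels cancels the image size, so only C, L and the
    downsampling factor remain.

    levels=5 and downsample=16 are assumptions, not values from this README.
    """
    bits_per_symbol = math.log2(levels)
    return channels * bits_per_symbol / downsample ** 2


for c in (4, 8, 16):
    print("C=%d: %.4f bpp" % (c, bitrate_upper_bound_bpp(c)))
```

Under these assumptions C=8 gives roughly 0.0726 bpp, which is consistent with the 0.072 bpp quoted above, and the bound scales linearly with the number of bottleneck channels.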

A pretrained model for global conditional compression with a C=8 bottleneck is also included. This model was trained for 50 epochs with the same losses as above. Reconstruction is conditioned on semantic label maps (see the cGAN/ folder and 'Conditional GAN usage').

Warning: TensorFlow 1.3 was used to train the models, but they appear to load without problems on TensorFlow 1.8. Please raise an issue if you have any problems.

Details / extensions

The network architectures are based on the description in the appendix of the original paper, which is in turn based on the paper Perceptual Losses for Real-Time Style Transfer and Super-Resolution. The multiscale discriminator loss was originally proposed in the project High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs; consult network.py for the implementation. To add an extension, create a new method under the Network class, e.g.

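@staticmethod
def my_generator(z, height, width, channels, **kwargs):
    """
    Inputs:
    z: sampled noise
    height, width, channels: dimensions of the output image

    Returns:
    upsampled image
    """
    # Placeholder body: returns random values of the output shape.
    # Replace with your own upsampling architecture.
    return tf.random_normal([z.get_shape()[0], height, width, channels], seed=42)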

To change hyperparameters or toggle features, use the knobs in config.py. (Bad form, maybe, but I find it easier than a 20-line argparse specification.)

Data / Setup

Training was done using the ADE20K dataset and the Cityscapes leftImg8bit dataset. In the former case, images are rescaled to a width of 512 px; in the latter, images are resampled to [512 x 1024] prior to training. An example script for resampling using ImageMagick is provided under data/. In each case, you will need to create a Pandas dataframe containing a single column, path, which holds the absolute or relative path to each image. Save this as an HDF5 file and provide its path under the directories class in config.py. Examples for the Cityscapes dataset are provided in the data directory.
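A minimal sketch of building that dataframe is shown below. The function names, the accepted file extensions, and the HDF5 key "df" are all assumptions made for illustration; check the data-loading code in the repository for the key it actually passes to pd.read_hdf. Writing HDF5 from pandas requires the PyTables package.

```python
import os


def collect_image_paths(root, extensions=(".png", ".jpg", ".jpeg")):
    """Recursively gather image file paths under `root`, sorted for reproducibility."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # str.endswith accepts a tuple of suffixes
            if name.lower().endswith(extensions):
                paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def save_path_frame(paths, out_file, key="df"):
    """Write a one-column ('path') dataframe to an HDF5 file.

    The key "df" is an assumption; use whatever key the data loader expects.
    """
    # Imported here so that path collection works without pandas installed.
    import pandas as pd

    frame = pd.DataFrame({"path": paths})
    frame.to_hdf(out_file, key=key)
    return frame
```

For example, `save_path_frame(collect_image_paths("/path/to/leftImg8bit/train"), "cityscapes_paths_train.h5")` would write the training split, where the output file name is hypothetical.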

Conditional GAN usage

The conditional GAN implementation for global compression is in the cGAN directory. It appears to yield the highest image quality, but remains experimental. Here, generation is conditioned on the information in the semantic label map of the selected image. You will need to download the gtFine dataset of annotation maps and append a separate column, semantic_map_paths, to the Pandas dataframe, pointing to the corresponding images from the gtFine dataset.
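One way to derive the semantic_map_paths column is to rewrite each image path, since Cityscapes mirrors the split/city directory layout across the leftImg8bit and gtFine packages and only the file suffix differs. The sketch below is an illustration under assumptions: the helper name is hypothetical, and which gtFine variant the cGAN code expects (label IDs, as assumed by the default suffix here, or the colour-coded maps) should be confirmed against the cGAN directory.

```python
import os


def semantic_map_path(image_path, image_root, label_root,
                      label_suffix="_gtFine_labelIds.png"):
    """Map a leftImg8bit image path to its gtFine annotation path.

    Keeps the <split>/<city>/ part of the path relative to `image_root`
    and swaps the '_leftImg8bit.png' suffix for `label_suffix`.
    The default suffix is an assumption about which annotation is wanted.
    """
    image_suffix = "_leftImg8bit.png"
    if not image_path.endswith(image_suffix):
        raise ValueError("not a leftImg8bit image: %s" % image_path)
    relative = os.path.relpath(image_path, image_root)
    stem = relative[: -len(image_suffix)]
    return os.path.join(label_root, stem + label_suffix)
```

Given a dataframe `df` with a path column, the new column could then be filled with `df["semantic_map_paths"] = [semantic_map_path(p, image_root, label_root) for p in df["path"]]` before saving the frame back to HDF5.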

Dependencies

Todo:

  • Incorporate GAN noise sampling into the reconstructed image. The authors state that this step is optional and that the sampled noise is combined with the quantized representation, but don't provide further details. Currently the model samples from a normal distribution and upsamples this using a DCGAN-like generator (see network.py) to be concatenated with the quantized image representation w_hat, but this appears to substantially increase the 'hallucination factor' in the reconstructed images.
  • Integrate VGG loss.
  • Experiment with WGAN-GP.
  • Experiment with spectral normalization.
  • Experiment with different generator architectures with noise sampling.
  • Extend to selective compression using semantic maps (contributions welcome).

Resources

More Results

Global compression: noise sampling, multiscale discriminator + feature-matching losses, C=8 channels (compression to 0.072 bpp)

[Images: reconstructions at epochs 45, 47, 51, 53, 54, 55, and 56]
