USTC-JialunPeng / Diverse-Structure-Inpainting

License: MIT
CVPR 2021: "Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE"

Programming Languages

Python

Projects that are alternatives to or similar to Diverse-Structure-Inpainting

Gansformer
Generative Adversarial Transformers
Stars: ✭ 421 (+221.37%)
Mutual labels:  attention, generative-adversarial-networks
attention-target-detection
[CVPR2020] "Detecting Attended Visual Targets in Video"
Stars: ✭ 105 (-19.85%)
Mutual labels:  attention
shoe-design-using-generative-adversarial-networks
No description or website provided.
Stars: ✭ 18 (-86.26%)
Mutual labels:  generative-adversarial-networks
RSTNet
RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words (CVPR 2021)
Stars: ✭ 71 (-45.8%)
Mutual labels:  multimodal
ntua-slp-semeval2018
Deep-learning models of NTUA-SLP team submitted in SemEval 2018 tasks 1, 2 and 3.
Stars: ✭ 79 (-39.69%)
Mutual labels:  attention
Base-On-Relation-Method-Extract-News-DA-RNN-Model-For-Stock-Prediction--Pytorch
A dual-stage attention mechanism model based on a relational news extraction method, for stock prediction
Stars: ✭ 33 (-74.81%)
Mutual labels:  attention
visualization
a collection of visualization function
Stars: ✭ 189 (+44.27%)
Mutual labels:  attention
SBR
⌛ Introducing Self-Attention to Target Attentive Graph Neural Networks (AISP '22)
Stars: ✭ 22 (-83.21%)
Mutual labels:  attention
keras cv attention models
Keras/Tensorflow attention models including beit,botnet,CMT,CoaT,CoAtNet,convnext,cotnet,davit,efficientdet,efficientnet,fbnet,gmlp,halonet,lcnet,levit,mlp-mixer,mobilevit,nfnets,regnet,resmlp,resnest,resnext,resnetd,swin,tinynet,uniformer,volo,wavemlp,yolor,yolox
Stars: ✭ 159 (+21.37%)
Mutual labels:  attention
NTUA-slp-nlp
💻Speech and Natural Language Processing (SLP & NLP) Lab Assignments for ECE NTUA
Stars: ✭ 19 (-85.5%)
Mutual labels:  attention
RNNSearch
An implementation of attention-based neural machine translation using Pytorch
Stars: ✭ 43 (-67.18%)
Mutual labels:  attention
edge2view
This is a pix2pix demo that learns from edges and translates them into views. An interactive application that translates edges to views is also provided.
Stars: ✭ 22 (-83.21%)
Mutual labels:  generative-adversarial-networks
interpretable-han-for-document-classification-with-keras
Keras implementation of hierarchical attention network for document classification with options to predict and present attention weights on both word and sentence level.
Stars: ✭ 18 (-86.26%)
Mutual labels:  attention
automatic-personality-prediction
[AAAI 2020] Modeling Personality with Attentive Networks and Contextual Embeddings
Stars: ✭ 43 (-67.18%)
Mutual labels:  attention
GAN-Project-2018
GAN in Tensorflow to be run via Linux command line
Stars: ✭ 21 (-83.97%)
Mutual labels:  generative-adversarial-networks
Image-Captioning
Image Captioning with Keras
Stars: ✭ 60 (-54.2%)
Mutual labels:  attention
awesome-GAN-papers
papers and codes about GAN
Stars: ✭ 55 (-58.02%)
Mutual labels:  generative-adversarial-networks
generative deep learning
Generative Deep Learning Sessions led by Anugraha Sinha (Machine Learning Tokyo)
Stars: ✭ 24 (-81.68%)
Mutual labels:  generative-adversarial-networks
interactive-spectrogram-inpainting
Implementation of the framework described in the paper Spectrogram Inpainting for Interactive Generation of Instrument Sounds published at the 2020 Joint Conference on AI Music Creativity.
Stars: ✭ 26 (-80.15%)
Mutual labels:  vq-vae
dhs summit 2019 image captioning
Image captioning using attention models
Stars: ✭ 34 (-74.05%)
Mutual labels:  attention

Diverse Structure Inpainting

Paper | Supplementary Material | arXiv | BibTeX

This repository is for the CVPR 2021 paper, "Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE".

Introduction

(Top) Input incomplete image, where the missing region is depicted in gray. (Middle) Visualization of the generated diverse structures. (Bottom) Output images of our method.

Places2 Results

Results on the Places2 validation set using the center-mask Places2 model.

CelebA-HQ Results

Results on one CelebA-HQ test image with different holes using the random-mask CelebA-HQ model.

Installation

This code was tested with TensorFlow 1.12.0 (later 1.x versions may work; 2.x is not supported), CUDA 9.0, Python 3.6 and Ubuntu 16.04.

Clone this repository:

git clone https://github.com/USTC-JialunPeng/Diverse-Structure-Inpainting.git
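
The repository does not document an installation procedure; assuming a conda/pip-based setup, one plausible way to reproduce the tested environment is:

conda create -n diverse-inpainting python=3.6
conda activate diverse-inpainting
pip install tensorflow-gpu==1.12.0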

Datasets

  • CelebA-HQ: the high-resolution face images from Growing GANs. 24,183 images for training, 2,993 images for validation and 2,824 images for testing.
  • Places2: the challenge data from 365 scene categories. 8 million images for training, 36K images for validation and 328K images for testing.
  • ImageNet: the data from 1,000 natural categories. 1 million images for training and 50K images for validation.

Training

  • Collect the dataset. For CelebA-HQ, we collect the 1024x1024 version. For Places2 and ImageNet, we collect the original version.
  • Prepare the file list. Collect the path of each image and make a file, where each line is a path (each line ends with a newline, except possibly the last). A minimal sketch is given after this list.
  • Modify checkpoints_dir, dataset, train_flist and valid_flist arguments in train_vqvae.py, train_structure_generator.py and train_texture_generator.py.
  • Modify data/data_loader.py according to the dataset. For CelebA-HQ, we resize each image to 266x266 and randomly crop a 256x256 patch. For Places2 and ImageNet, we randomly crop a 256x256 patch (see the second sketch after this list).
  • Run python train_vqvae.py to train VQ-VAE.
  • Modify vqvae_network_dir argument in train_structure_generator.py and train_texture_generator.py based on the path of pre-trained VQ-VAE.
  • Modify the mask setting arguments in train_structure_generator.py and train_texture_generator.py to choose center mask or random mask.
  • Run python train_structure_generator.py to train the structure generator.
  • Run python train_texture_generator.py to train the texture generator.
  • Modify structure_generator_dir and texture_generator_dir arguments in save_full_model.py based on the paths of pre-trained structure generator and texture generator.
  • Run python save_full_model.py to save the whole model.
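
A minimal sketch of the file-list step above (the directory layout, extensions and output names here are hypothetical; adapt them to your dataset):

import os

def make_flist(image_dir, flist_path, exts=('.png', '.jpg', '.jpeg')):
    # Walk the dataset directory and collect one image path per line, sorted.
    paths = sorted(
        os.path.join(root, name)
        for root, _, names in os.walk(image_dir)
        for name in names
        if name.lower().endswith(exts)
    )
    with open(flist_path, 'w') as f:
        # '\n'.join puts a newline between paths but none after the last line.
        f.write('\n'.join(paths))

make_flist('/path/to/celeba_hq/train', 'train.flist')
make_flist('/path/to/celeba_hq/valid', 'valid.flist')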
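
And a sketch of the resize-and-crop preprocessing from the data-loader step, in TF 1.x style (an illustration of the described logic, not the repository's data/data_loader.py):

import tensorflow as tf

def preprocess_celeba_hq(image):
    # image: HxWx3 tensor. Resize to 266x266, then take a random 256x256 crop.
    image = tf.image.resize_images(image, [266, 266])
    return tf.random_crop(image, [256, 256, 3])

def preprocess_places2_or_imagenet(image):
    # Places2 / ImageNet: random 256x256 crop only, no resize.
    return tf.random_crop(image, [256, 256, 3])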

Testing

  • Collect the testing set. For CelebA-HQ, we resize each image to 256x256. For Places2 and ImageNet, we center-crop a 256x256 patch.
  • Collect the corresponding mask set (2D grayscale, where 0 indicates the known region and 255 indicates the missing region); a sketch for generating a center mask follows this list.
  • Prepare the image file list and the mask file list as in training. An example can be seen here.
  • Modify checkpoints_dir, dataset, img_flist and mask_flist arguments in test.py.
  • Download the pre-trained model and put model.ckpt.meta, model.ckpt.index, model.ckpt.data-00000-of-00001 and checkpoint under model_logs/ directory.
  • Run python test.py.
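
For reference, a center mask in the format described above can be generated as follows (the output file name is an example):

import numpy as np
from PIL import Image

mask = np.zeros((256, 256), dtype=np.uint8)  # 0 = known region
mask[64:192, 64:192] = 255                   # 255 = central 128x128 missing region
Image.fromarray(mask, mode='L').save('center_mask.png')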

Pre-trained Models

Download the pre-trained models using the following links and put them under model_logs/ directory.

The center_mask models are trained with images of 256x256 resolution with center 128x128 holes. The random_mask models are trained with random regular and irregular holes.

Inference Time

One advantage of GAN-based and VAE-based methods is their fast inference. We measure that Mutual Encoder-Decoder with Feature Equalizations runs at 0.2 seconds per image on a single NVIDIA 1080 Ti GPU for images of resolution 256×256, whereas our model runs at 45 seconds per image. Naive sampling of our autoregressive network accounts for most of this time. Fortunately, it can be reduced by an order of magnitude with an incremental sampling technique that caches and reuses intermediate states of the network; consider using this technique for faster inference. A toy illustration of the idea follows.
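
Below is a toy, self-contained illustration of the caching idea, not the paper's sampler: the "model" is just a running sum of past embeddings, but the pattern of updating a cached state instead of recomputing the whole prefix is the same one that speeds up autoregressive sampling.

import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=(10, 4))  # hypothetical embedding table

def naive_features(tokens):
    # Recompute the full prefix at every step: O(T^2) work overall.
    return [emb[tokens[:t + 1]].sum(axis=0) for t in range(len(tokens))]

def incremental_features(tokens):
    # Cache the intermediate state and update it per step: O(T) work overall.
    cache = np.zeros(4)
    feats = []
    for tok in tokens:
        cache = cache + emb[tok]
        feats.append(cache.copy())
    return feats

tokens = [1, 3, 2, 7]
assert np.allclose(naive_features(tokens), incremental_features(tokens))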

Citing

If our method is useful for your research, please consider citing:

@inproceedings{peng2021generating,
  title={Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE},
  author={Peng, Jialun and Liu, Dong and Xu, Songcen and Li, Houqiang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={10775-10784},
  year={2021}
}