All Projects → clovaai → Stargan V2

clovaai / Stargan V2

Licence: other
StarGAN v2 - Official PyTorch Implementation (CVPR 2020)

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives of or similar to Stargan V2

overlord
Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.
Stars: ✭ 35 (-98.7%)
Mutual labels:  image-to-image-translation, generative-models
PanoDR
Code and models for "PanoDR: Spherical Panorama Diminished Reality for Indoor Scenes" presented at the OmniCV workshop of CVPR21.
Stars: ✭ 22 (-99.19%)
Mutual labels:  image-to-image-translation, generative-models
Stargan
StarGAN - Official PyTorch Implementation (CVPR 2018)
Stars: ✭ 4,946 (+83.19%)
Mutual labels:  image-to-image-translation, generative-models
nemar
[CVPR2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
Stars: ✭ 120 (-95.56%)
Mutual labels:  image-to-image-translation, cvpr2020
Cvpr2021 Papers With Code
CVPR 2021 论文和开源项目合集
Stars: ✭ 7,138 (+164.37%)
Mutual labels:  cvpr2020
RecycleGAN
The simplest implementation toward the idea of Re-cycle GAN
Stars: ✭ 68 (-97.48%)
Mutual labels:  image-to-image-translation
C3Net
C3Net: Demoireing Network Attentive in Channel, Color and Concatenation (CVPRW 2020)
Stars: ✭ 17 (-99.37%)
Mutual labels:  cvpr2020
Meta-Fine-Tuning
[CVPR 2020 VL3] The repository for meta fine-tuning in cross-domain few-shot learning.
Stars: ✭ 29 (-98.93%)
Mutual labels:  cvpr2020
Deblurgan
Image Deblurring using Generative Adversarial Networks
Stars: ✭ 2,033 (-24.7%)
Mutual labels:  image-to-image-translation
Pix2pix
Image-to-image translation with conditional adversarial nets
Stars: ✭ 8,765 (+224.63%)
Mutual labels:  image-to-image-translation
Alae
[CVPR2020] Adversarial Latent Autoencoders
Stars: ✭ 3,178 (+17.7%)
Mutual labels:  cvpr2020
DeepDream
Generative deep learning: DeepDream
Stars: ✭ 17 (-99.37%)
Mutual labels:  generative-models
HiCMD
[CVPR2020] Hi-CMD: Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification
Stars: ✭ 64 (-97.63%)
Mutual labels:  cvpr2020
private-data-generation
A toolbox for differentially private data generation
Stars: ✭ 80 (-97.04%)
Mutual labels:  generative-models
Pi Rec
🔥 PI-REC: Progressive Image Reconstruction Network With Edge and Color Domain. 🔥 图像翻译,条件GAN,AI绘画
Stars: ✭ 1,619 (-40.04%)
Mutual labels:  image-to-image-translation
handobjectconsist
[cvpr 20] Demo, training and evaluation code for joint hand-object pose estimation in sparsely annotated videos
Stars: ✭ 100 (-96.3%)
Mutual labels:  cvpr2020
Cvpr2021 Paper Code Interpretation
cvpr2021/cvpr2020/cvpr2019/cvpr2018/cvpr2017 论文/代码/解读/直播合集,极市团队整理
Stars: ✭ 8,075 (+199.07%)
Mutual labels:  cvpr2020
Awesome-ICCV2021-Low-Level-Vision
A Collection of Papers and Codes for ICCV2021 Low Level Vision and Image Generation
Stars: ✭ 163 (-93.96%)
Mutual labels:  image-to-image-translation
pytorch-neural-enhance
Experiments on CNN-based image enhancement in pytorch
Stars: ✭ 29 (-98.93%)
Mutual labels:  image-to-image-translation
Tensorflow Generative Model Collections
Collection of generative models in Tensorflow
Stars: ✭ 3,785 (+40.19%)
Mutual labels:  generative-models

StarGAN v2 - Official PyTorch Implementation

StarGAN v2: Diverse Image Synthesis for Multiple Domains
Yunjey Choi*, Youngjung Uh*, Jaejun Yoo*, Jung-Woo Ha
In CVPR 2020. (* indicates equal contribution)

Paper: https://arxiv.org/abs/1912.01865
Video: https://youtu.be/0EVh5Ki4dIY

Abstract: A good image-to-image translation model should learn a mapping between different visual domains while satisfying the following properties: 1) diversity of generated images and 2) scalability over multiple domains. Existing methods address either of the issues, having limited diversity or multiple models for all domains. We propose StarGAN v2, a single framework that tackles both and shows significantly improved results over the baselines. Experiments on CelebA-HQ and a new animal faces dataset (AFHQ) validate our superiority in terms of visual quality, diversity, and scalability. To better assess image-to-image translation models, we release AFHQ, high-quality animal faces with large inter- and intra-domain variations. The code, pre-trained models, and dataset are available at clovaai/stargan-v2.

Teaser video

Click the figure to watch the teaser video.

IMAGE ALT TEXT HERE

TensorFlow implementation

The TensorFlow implementation of StarGAN v2 by our team member junho can be found at clovaai/stargan-v2-tensorflow.

Software installation

Clone this repository:

git clone https://github.com/clovaai/stargan-v2.git
cd stargan-v2/

Install the dependencies:

conda create -n stargan-v2 python=3.6.7
conda activate stargan-v2
conda install -y pytorch=1.4.0 torchvision=0.5.0 cudatoolkit=10.0 -c pytorch
conda install x264=='1!152.20180717' ffmpeg=4.0.2 -c conda-forge
pip install opencv-python==4.1.2.30 ffmpeg-python==0.2.0 scikit-image==0.16.2
pip install pillow==7.0.0 scipy==1.2.1 tqdm==4.43.0 munch==2.5.0

Datasets and pre-trained networks

We provide a script to download datasets used in StarGAN v2 and the corresponding pre-trained networks. The datasets and network checkpoints will be downloaded and stored in the data and expr/checkpoints directories, respectively.

CelebA-HQ. To download the CelebA-HQ dataset and the pre-trained network, run the following commands:

bash download.sh celeba-hq-dataset
bash download.sh pretrained-network-celeba-hq
bash download.sh wing

AFHQ. To download the AFHQ dataset and the pre-trained network, run the following commands:

bash download.sh afhq-dataset
bash download.sh pretrained-network-afhq

Generating interpolation videos

After downloading the pre-trained networks, you can synthesize output images reflecting diverse styles (e.g., hairstyle) of reference images. The following commands will save generated images and interpolation videos to the expr/results directory.

CelebA-HQ. To generate images and interpolation videos, run the following command:

python main.py --mode sample --num_domains 2 --resume_iter 100000 --w_hpf 1 \
               --checkpoint_dir expr/checkpoints/celeba_hq \
               --result_dir expr/results/celeba_hq \
               --src_dir assets/representative/celeba_hq/src \
               --ref_dir assets/representative/celeba_hq/ref

To transform a custom image, first crop the image manually so that the proportion of face occupied in the whole is similar to that of CelebA-HQ. Then, run the following command for additional fine rotation and cropping. All custom images in the inp_dir directory will be aligned and stored in the out_dir directory.

python main.py --mode align \
               --inp_dir assets/representative/custom/female \
               --out_dir assets/representative/celeba_hq/src/female

AFHQ. To generate images and interpolation videos, run the following command:

python main.py --mode sample --num_domains 3 --resume_iter 100000 --w_hpf 0 \
               --checkpoint_dir expr/checkpoints/afhq \
               --result_dir expr/results/afhq \
               --src_dir assets/representative/afhq/src \
               --ref_dir assets/representative/afhq/ref

Evaluation metrics

To evaluate StarGAN v2 using Fréchet Inception Distance (FID) and Learned Perceptual Image Patch Similarity (LPIPS), run the following commands:

# celeba-hq
python main.py --mode eval --num_domains 2 --w_hpf 1 \
               --resume_iter 100000 \
               --train_img_dir data/celeba_hq/train \
               --val_img_dir data/celeba_hq/val \
               --checkpoint_dir expr/checkpoints/celeba_hq \
               --eval_dir expr/eval/celeba_hq

# afhq
python main.py --mode eval --num_domains 3 --w_hpf 0 \
               --resume_iter 100000 \
               --train_img_dir data/afhq/train \
               --val_img_dir data/afhq/val \
               --checkpoint_dir expr/checkpoints/afhq \
               --eval_dir expr/eval/afhq

Note that the evaluation metrics are calculated using random latent vectors or reference images, both of which are selected by the seed number. In the paper, we reported the average of values from 10 measurements using different seed numbers. The following table shows the calculated values for both latent-guided and reference-guided synthesis.

Dataset FID (latent) LPIPS (latent) FID (reference) LPIPS (reference) Elapsed time
celeba-hq 13.73 ± 0.06 0.4515 ± 0.0006 23.84 ± 0.03 0.3880 ± 0.0001 49min 51s
afhq 16.18 ± 0.15 0.4501 ± 0.0007 19.78 ± 0.01 0.4315 ± 0.0002 64min 49s

Training networks

To train StarGAN v2 from scratch, run the following commands. Generated images and network checkpoints will be stored in the expr/samples and expr/checkpoints directories, respectively. Training takes about three days on a single Tesla V100 GPU. Please see here for training arguments and a description of them.

# celeba-hq
python main.py --mode train --num_domains 2 --w_hpf 1 \
               --lambda_reg 1 --lambda_sty 1 --lambda_ds 1 --lambda_cyc 1 \
               --train_img_dir data/celeba_hq/train \
               --val_img_dir data/celeba_hq/val

# afhq
python main.py --mode train --num_domains 3 --w_hpf 0 \
               --lambda_reg 1 --lambda_sty 1 --lambda_ds 2 --lambda_cyc 1 \
               --train_img_dir data/afhq/train \
               --val_img_dir data/afhq/val

Animal Faces-HQ dataset (AFHQ)

We release a new dataset of animal faces, Animal Faces-HQ (AFHQ), consisting of 15,000 high-quality images at 512×512 resolution. The figure above shows example images of the AFHQ dataset. The dataset includes three domains of cat, dog, and wildlife, each providing about 5000 images. By having multiple (three) domains and diverse images of various breeds per each domain, AFHQ sets a challenging image-to-image translation problem. For each domain, we select 500 images as a test set and provide all remaining images as a training set. To download the dataset, run the following command:

bash download.sh afhq-dataset

[Update: 2021.07.01] We rebuild the original AFHQ dataset by using high-quality resize filtering (i.e., Lanczos resampling). Please see the clean FID paper that brings attention to the unfortunate software library situation for downsampling. We thank to Alias-Free GAN authors for their suggestion and contribution to the updated AFHQ dataset. If you use the updated dataset, we recommend to cite not only our paper but also their paper.

The differences from the original dataset are as follows:

  • We resize the images using Lanczos resampling instead of nearest neighbor downsampling.
  • About 2% of the original images had been removed. So the set is now has 15803 images, whereas the original had 16130.
  • Images are saved as PNG format to avoid compression artifacts. This makes the files bigger than the original, but it's worth it.

To download the updated dataset, run the following command:

bash download.sh afhq-v2-dataset

License

The source code, pre-trained models, and dataset are available under Creative Commons BY-NC 4.0 license by NAVER Corporation. You can use, copy, tranform and build upon the material for non-commercial purposes as long as you give appropriate credit by citing our paper, and indicate if changes were made.

For business inquiries, please contact [email protected].
For technical and other inquires, please contact [email protected].

Citation

If you find this work useful for your research, please cite our paper:

@inproceedings{choi2020starganv2,
  title={StarGAN v2: Diverse Image Synthesis for Multiple Domains},
  author={Yunjey Choi and Youngjung Uh and Jaejun Yoo and Jung-Woo Ha},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2020}
}

Acknowledgements

We would like to thank the full-time and visiting Clova AI Research (now NAVER AI Lab) members for their valuable feedback and an early review: especially Seongjoon Oh, Junsuk Choe, Muhammad Ferjad Naeem, and Kyungjune Baek. We also thank Alias-Free GAN authors for their contribution to the updated AFHQ dataset.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].