

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation (CVPR 2021, oral presentation)

[Teaser figure]

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation
CVPR 2021, oral presentation
Xingran Zhou, Bo Zhang, Ting Zhang, Pan Zhang, Jianmin Bao, Dong Chen, Zhongfei Zhang, Fang Wen

Paper | Slides

Abstract

We present full-resolution correspondence learning for cross-domain images, which aids image translation. We adopt a hierarchical strategy that uses the correspondence from coarse levels to guide the fine levels. At each level of the hierarchy, the correspondence can be computed efficiently via PatchMatch, which iteratively leverages the matchings from the neighborhood. Within each PatchMatch iteration, a ConvGRU module refines the current correspondence, considering not only the matchings of a larger context but also the historic estimates. The proposed CoCosNet v2, a GRU-assisted PatchMatch approach, is fully differentiable and highly efficient. When jointly trained with image translation, full-resolution semantic correspondence can be established in an unsupervised manner, which in turn facilitates exemplar-based image translation. Experiments on diverse translation tasks show that CoCosNet v2 performs considerably better than state-of-the-art methods at producing high-resolution images.
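
To make the loop described above concrete, here is a heavily simplified, self-contained sketch of a coarse-to-fine, GRU-assisted refinement. It is a didactic toy under strong assumptions (all pyramid levels share one channel width, and an elementwise feature product stands in for the real PatchMatch matching step), not the repository's implementation:

import torch
import torch.nn.functional as F

class ConvGRUCell(torch.nn.Module):
    # Refines the current estimate from spatial context (x) and history (h).
    def __init__(self, ch):
        super().__init__()
        self.gates = torch.nn.Conv2d(2 * ch, 2 * ch, 3, padding=1)
        self.cand = torch.nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, h, x):
        z, r = torch.sigmoid(self.gates(torch.cat([h, x], 1))).chunk(2, 1)
        h_tilde = torch.tanh(self.cand(torch.cat([r * h, x], 1)))
        return (1 - z) * h + z * h_tilde

def hierarchical_refine(feats_a, feats_b, gru, iters=5):
    # feats_a / feats_b: per-level features of the two domains, coarsest first.
    corr = torch.zeros_like(feats_a[0])
    for fa, fb in zip(feats_a, feats_b):
        # Upsample the coarser estimate so it guides the finer level.
        corr = F.interpolate(corr, size=fa.shape[2:], mode="bilinear", align_corners=False)
        for _ in range(iters):  # PatchMatch-style iterations at this level
            corr = gru(corr, fa * fb)  # toy stand-in for the matching update
    return corr

# Toy usage: a 3-level pyramid with 32 channels per level.
gru = ConvGRUCell(32)
fa = [torch.randn(1, 32, s, s) for s in (16, 32, 64)]
fb = [torch.randn(1, 32, s, s) for s in (16, 32, 64)]
print(hierarchical_refine(fa, fb, gru).shape)  # torch.Size([1, 32, 64, 64])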

News

2022.12 We propose Paint by Example, which enables in-the-wild image editing according to an exemplar and is built on Stable Diffusion. You can try our online demo.

2022.8 We propose PITI, a state-of-the-art image-to-image translation method based on a pretrained diffusion model.

Installation

First, install the dependencies for the experiments:

pip install -r requirements.txt

We recommend installing PyTorch 1.6.0 or later, since we use automatic mixed precision for acceleration (we used PyTorch 1.7.0 in our experiments).
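
Since the commands below pass an --amp flag, here is a minimal sketch of how automatic mixed precision is typically enabled in PyTorch 1.6+. The model and loss are placeholders, not the repository's classes, and a CUDA device is assumed:

import torch

model = torch.nn.Linear(512, 512).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()

for step in range(10):
    x = torch.randn(8, 512, device="cuda")
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():  # forward pass runs in mixed precision
        loss = model(x).pow(2).mean()
    scaler.scale(loss).backward()  # scale the loss to avoid fp16 underflow
    scaler.step(optimizer)  # unscales gradients, then takes the optimizer step
    scaler.update()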

Prepare the dataset

1. Download the DeepFashion dataset (high-resolution version) from this link. Note that the file name is img_highres.zip. Unzip the file and rename the folder to img. If a password is required, contact this link to request access to the dataset.
2. We use OpenPose to estimate the poses for DeepFashionHD. We provide the keypoint detection results used in our experiments at this link. Download and unzip the results file.
3. Since the original resolution of DeepFashionHD is 750x1101, we use a Python script to process the images to 512x512. You can find the script at data/preprocess.py; an illustrative sketch of this resizing step appears after the directory tree below. Note that this step requires our train-val split lists train.txt and val.txt, downloaded in the next step.
4. Download the train-val lists from this link and the retrieval pair lists from this link. train.txt and val.txt are our train-val lists; deepfashion_ref.txt, deepfashion_ref_test.txt, and deepfashion_self_pair.txt are the pairing lists used in our experiments. Download them all and move them into the data/ folder.
5. Finally, create the root folder deepfashionHD and move the img and pose folders into it. The directory structure should now look like this:

deepfashionHD
├── img
│   ├── MEN
│   │   └── ...
│   └── WOMEN
│       └── ...
└── pose
    ├── MEN
    │   └── ...
    └── WOMEN
        └── ...
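
The following is a hypothetical sketch of the 750x1101-to-512x512 resizing step performed by data/preprocess.py. The actual script may differ (for example, it may pad or crop to preserve the aspect ratio rather than resize directly), and the paths are placeholders:

import os
from PIL import Image

SRC, DST, SIZE = "deepfashionHD/img_highres", "deepfashionHD/img", 512

for root, _, files in os.walk(SRC):
    for name in files:
        if not name.lower().endswith((".jpg", ".png")):
            continue  # skip non-image files
        img = Image.open(os.path.join(root, name)).convert("RGB")
        img = img.resize((SIZE, SIZE), Image.BICUBIC)  # naive direct resize, for simplicity
        out_dir = root.replace(SRC, DST, 1)  # mirror the MEN/WOMEN subfolder layout
        os.makedirs(out_dir, exist_ok=True)
        img.save(os.path.join(out_dir, name))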

Inference Using Pretrained Model

Download the pretrained model from this link.
Move the models into the folder checkpoints/deepfashionHD, then run the following command:

python test.py --name deepfashionHD --dataset_mode deepfashionHD --dataroot dataset/deepfashionHD --PONO --PONO_C --no_flip --batchSize 8 --gpu_ids 0 --netCorr NoVGGHPM --nThreads 16 --nef 32 --amp --display_winsize 512 --iteration_count 5 --load_size 512 --crop_size 512

The inference results are saved in the folder checkpoints/deepfashionHD/test.
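
As a quick sanity check, you can browse the generated images from Python. A minimal sketch, assuming the outputs are ordinary image files in that folder (the file extension is a guess):

import glob
from PIL import Image

results = sorted(glob.glob("checkpoints/deepfashionHD/test/*.png"))  # adjust to *.jpg if needed
print(f"{len(results)} result images found")
if results:
    Image.open(results[0]).show()  # open the first result with the default viewer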

Training from scratch

Make sure you have prepared the DeepFashionHD dataset following the instructions above.
Download the pretrained VGG model from this link and move it to the vgg/ folder. We use this model to compute the training loss.
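
For intuition, here is an illustrative sketch of a VGG-based perceptual loss at relu4_2, matching the --which_perceptual 4_2 and --weight_perceptual 0.001 flags in the command below. It uses torchvision's ImageNet-pretrained VGG19 as a stand-in; the repository ships its own VGG checkpoint, and its loss implementation may differ:

import torch
import torch.nn.functional as F
import torchvision

# Truncate VGG19 after relu4_2 (feature index 22) and freeze it.
vgg = torchvision.models.vgg19(pretrained=True).features[:23].eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def perceptual_loss(fake, real, weight=0.001):  # weight mirrors --weight_perceptual
    # Inputs are expected to be ImageNet-normalized RGB batches.
    return weight * F.l1_loss(vgg(fake), vgg(real))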

Run the following command for training from scratch.

python train.py --name deepfashionHD --dataset_mode deepfashionHD --dataroot dataset/deepfashionHD --niter 100 --niter_decay 0 --real_reference_probability 0.0 --hard_reference_probability 0.0 --which_perceptual 4_2 --weight_perceptual 0.001 --PONO --PONO_C --vgg_normal_correct --weight_fm_ratio 1.0 --no_flip --video_like --batchSize 16 --gpu_ids 0,1,2,3,4,5,6,7 --netCorr NoVGGHPM --match_kernel 1 --featEnc_kernel 3 --display_freq 500 --print_freq 50 --save_latest_freq 2500 --save_epoch_freq 5 --nThreads 16 --weight_warp_self 500.0 --lr 0.0001 --nef 32 --amp --weight_warp_cycle 1.0 --display_winsize 512 --iteration_count 5 --temperature 0.01 --continue_train --load_size 550 --crop_size 512 --which_epoch 15

Note that the --dataroot parameter should point to your DeepFashionHD dataset root, e.g. dataset/deepfashionHD.
We use 8 32GB Tesla V100 GPUs to train the network. With fewer GPUs, set batchSize to 16, 8, or 4 and change gpu_ids accordingly; for example, with two GPUs you might pass --gpu_ids 0,1 --batchSize 4 and keep the remaining flags unchanged.

Citation

If you use this code for your research, please cite our papers.

@InProceedings{Zhou_2021_CVPR,
  author    = {Zhou, Xingran and Zhang, Bo and Zhang, Ting and Zhang, Pan and Bao, Jianmin and Chen, Dong and Zhang, Zhongfei and Wen, Fang},
  title     = {CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2021},
  pages     = {11465-11475}
}

You are also welcome to refer to our CoCosNet v1:

@InProceedings{Zhang_2020_CVPR,
  author    = {Zhang, Pan and Zhang, Bo and Chen, Dong and Yuan, Lu and Wen, Fang},
  title     = {Cross-Domain Correspondence Learning for Exemplar-Based Image Translation},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2020},
  pages     = {5143-5153}
}

Acknowledgments

This code borrows heavily from CoCosNet and DeepPruner. We also thank SPADE and RAFT.

License

The code and the pretrained model in this repository are under the MIT license, as specified by the LICENSE file.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
