bryandlee / Malnyun_faces

A calm generative-model training diary


Calm Generative Models

Introduction

Out of pure curiosity, I built a dataset of Malnyun cartoon faces and tested several recently proposed deep generative models on it. Starting from a pre-trained face generation model and using fine-tuning techniques designed for limited data, I was able to train a generator at 256x256 resolution in about 10 hours on a single RTX 2080 Ti GPU using only 500 images.

Data Preparation

I used webcomic images by 이말년 (also known as 침착맨). My first attempt was to run a simple face detector on the comic images, but the cascade classifier for human faces provided in OpenCV does not work well in this cartoon domain.

Nagadomi's animeface version did not work either, so I decided to mark the boxes manually and ended up with 500 cartoon face images. Since face size varies a lot from scene to scene, I used a CRAN-based cartoon super-resolution model to upscale the small images to 256x256 resolution.

StyleGAN + FreezeD

FreezeD ("FreezeD: A Simple Baseline for Fine-tuning GANs") freezes the first few layers of a trained discriminator and fine-tunes the rest of the model on a new dataset. I used a StyleGAN model pre-trained on the FFHQ dataset, and fine-tuning it for 50k steps took about 10 hours in my environment.
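The freezing step can be sketched in a few lines. This is a framework-free toy rather than the FreezeD code: the layer names, the `requires_grad` flag on plain dictionaries, and the helper names are all hypothetical. In PyTorch the same idea would amount to setting `requires_grad = False` on the parameters of the lowest discriminator blocks.

```python
# Toy illustration of FreezeD: freeze the first k discriminator layers
# and let a plain SGD step update only the remaining ones.
# Layer names and helpers are hypothetical, not taken from the FreezeD repo.

def make_discriminator():
    """Return an ordered list of layers, lowest (input side) first."""
    names = ["from_rgb", "block_256", "block_128", "block_64", "final"]
    return [{"name": n, "weight": 1.0, "requires_grad": True} for n in names]


def freeze_first_layers(layers, num_frozen):
    """Mark the first `num_frozen` layers as not trainable."""
    for layer in layers[:num_frozen]:
        layer["requires_grad"] = False
    return layers


def sgd_step(layers, grads, lr=0.1):
    """Apply one SGD update, skipping frozen layers."""
    for layer, grad in zip(layers, grads):
        if layer["requires_grad"]:
            layer["weight"] -= lr * grad


disc = freeze_first_layers(make_discriminator(), num_frozen=3)
sgd_step(disc, grads=[1.0] * len(disc))
frozen = [layer["name"] for layer in disc if not layer["requires_grad"]]
```

How many layers to freeze is a tuning choice; the value 3 above is only an example.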

[original code]

Interestingly, early in training some of the originally learned semantic features are mapped to corresponding elements of the cartoon domain: sunglasses turned into glaring eyes, and hats turned into hairbands.

The FID converged after 20k steps, with no significant improvement in sample quality afterwards. Below are style mixing results from the trained generator. Character identity and facial expression/direction are disentangled quite well.
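As a rough sketch of what style mixing does: the generator receives one style vector per synthesis layer, and mixing takes the first layers' styles from one source and the rest from another. The 14-layer count (two per resolution from 4x4 to 256x256) and the crossover index are assumptions for illustration, and the two-element lists stand in for real latent vectors.

```python
# Style mixing sketch: coarse layers take styles from source A,
# the remaining (finer) layers take styles from source B.
# The 14-layer count and the crossover index are illustrative assumptions.

def mix_styles(w_a, w_b, num_layers=14, crossover=4):
    """Return one style vector per synthesis layer."""
    if not 0 <= crossover <= num_layers:
        raise ValueError("crossover must lie in [0, num_layers]")
    return [w_a if i < crossover else w_b for i in range(num_layers)]


w_a = [0.1, 0.2]  # stand-in for the latent of source A
w_b = [0.9, 0.8]  # stand-in for the latent of source B
per_layer = mix_styles(w_a, w_b)
```

Which attributes follow source A or source B depends on the trained model and on where the crossover is placed.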

StyleGAN2 + ADA

"Training Generative Adversarial Networks with Limited Data" (ADA) applies differentiable, non-leaking data augmentation to both the real and generated images, and chooses the augmentation probability adaptively according to the discriminator's output distribution. With the proposed augmentation, the authors managed to train the generator with only 1k images.
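The adaptive part can be sketched as a small feedback loop. The overfitting indicator below is the mean sign of the discriminator's outputs on real images, and the probability `p` is nudged up or down by a fixed step. The target of 0.6 and the 500k-image ramp follow the defaults reported in the paper; the function names, the batch accounting, and the upper clamp at 1.0 are simplifying assumptions, not the official implementation.

```python
# Sketch of ADA's adaptive augmentation probability p.
# r_t = E[sign(D(real))] estimates how overfit the discriminator is;
# p is nudged up when r_t exceeds the target and down otherwise.

def overfitting_indicator(real_logits):
    """Mean sign of the discriminator logits on real images."""
    signs = [(x > 0) - (x < 0) for x in real_logits]
    return sum(signs) / len(signs)


def update_p(p, real_logits, target=0.6, images_seen=256, ramp_images=500_000):
    """Move p by a fixed step so that r_t is pushed toward the target."""
    step = images_seen / ramp_images
    r_t = overfitting_indicator(real_logits)
    if r_t > target:
        p += step
    elif r_t < target:
        p -= step
    return min(max(p, 0.0), 1.0)  # clamp to a valid probability


p = 0.0
p = update_p(p, real_logits=[2.1, 0.7, 1.5, 0.3])  # D is confident on reals
p_after_drop = update_p(p, real_logits=[-1.0, 0.5, -0.2, 0.1])
```

In the first call every logit is positive, so the indicator is 1.0 and `p` rises by one step; in the second call the signs cancel, so `p` falls back.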

[original code]

In my experiments with the default settings, the model collapsed when trained from scratch and never recovered.

Train from scratch: 100k steps

Starting from the FFHQ pre-trained model, it successfully learned to generate realistic cartoon images.

Transfer from FFHQ: 30k steps

Latent Space Exploration

GANSpace

GANSpace samples a large number of style vectors and estimates the principal axes using the activated features.
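A minimal sketch of the idea, using only the standard library: sample many vectors, then find the leading principal axis with power iteration. The 3-D Gaussian samples are a stand-in for real style vectors or features, and the helper name `principal_axis` is hypothetical; a real pipeline would typically use a PCA routine from a numerical library and keep several axes.

```python
import random

# GANSpace-style sketch: sample many vectors, then find the leading
# principal axis of the samples with power iteration.
# The 3-D Gaussian below is a stand-in for real style vectors or features.

def principal_axis(samples, iters=200):
    """Return the sample mean and the unit-length first principal axis."""
    dim = len(samples[0])
    n = len(samples)
    mean = [sum(s[j] for s in samples) / n for j in range(dim)]
    centered = [[s[j] - mean[j] for j in range(dim)] for s in samples]
    # Sample covariance matrix C = X^T X / (n - 1).
    cov = [[sum(x[a] * x[b] for x in centered) / (n - 1) for b in range(dim)]
           for a in range(dim)]
    v = [1.0] * dim
    for _ in range(iters):
        v = [sum(cov[a][b] * v[b] for b in range(dim)) for a in range(dim)]
        norm = sum(c * c for c in v) ** 0.5
        v = [c / norm for c in v]
    return mean, v


rng = random.Random(0)
# Variance is largest along the first coordinate by construction.
samples = [[rng.gauss(0, 3.0), rng.gauss(0, 1.0), rng.gauss(0, 0.5)]
           for _ in range(2000)]
mean, axis = principal_axis(samples)

# Editing means moving a vector along the discovered axis.
w = samples[0]
w_edited = [wi + 2.0 * ai for wi, ai in zip(w, axis)]
```

The step size of 2.0 is arbitrary; in practice it is the knob one turns to strengthen or weaken an edit.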

[original code]

Closed-form factorization

Closed-form factorization does not require sampling. It simply uses the eigen-directions of the first affine layer's weight matrix.
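In equation form, and using notation introduced here for illustration (A for the affine layer's weight matrix, n for a unit direction in latent space, α for the edit strength), the directions that change the layer's output the most solve a constrained maximization whose solutions are eigenvectors:

```latex
\mathbf{n}^{*}
  = \arg\max_{\mathbf{n}^{\top}\mathbf{n} = 1} \lVert A\mathbf{n} \rVert_2^{2}
\quad\Longrightarrow\quad
A^{\top} A \,\mathbf{n} = \lambda\, \mathbf{n},
\qquad
\mathbf{z}' = \mathbf{z} + \alpha\, \mathbf{n}
```

The eigenvectors of AᵀA with the largest eigenvalues give the strongest directions, which is why no sampling is needed.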

[original code]

I found the gradio module pretty useful for latent space exploration. It lets one test deep models interactively in a web browser.

[Demo panels: seed, hair, mouth, tilt]

*Meme generation time*

       

U-GAT-IT

U-GAT-IT is an image-to-image translation method that achieved great success on the face2anime task. It uses CAM modules to extract attention and AdaLIN modules to learn the balance between instance and layer normalization. I used 1,000 samples from an Asian face dataset as the input faces. The model uses a cycle-consistency loss with multiple discriminators and generators, so I had to downscale the images to 128x128 to fit a batch size of 4 in 11 GB of GPU memory.
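The instance/layer balance learned by AdaLIN can be written compactly. In the notation below (taken from the usual presentation of AdaLIN, not from this project's code), a is the input activation, the I subscript denotes per-channel statistics, the L subscript denotes statistics over the whole layer, γ and β are predicted from the attention features, and ρ is the learned mixing weight:

```latex
\hat{a}_I = \frac{a - \mu_I}{\sqrt{\sigma_I^{2} + \epsilon}},
\qquad
\hat{a}_L = \frac{a - \mu_L}{\sqrt{\sigma_L^{2} + \epsilon}}

\mathrm{AdaLIN}(a, \gamma, \beta)
  = \gamma \cdot \bigl( \rho\, \hat{a}_I + (1 - \rho)\, \hat{a}_L \bigr) + \beta,
\qquad \rho \in [0, 1]
```

A ρ close to 1 favors instance normalization, and a ρ close to 0 favors layer normalization.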

[original code]

250k steps

The model captures the direction and shape of the face but does not preserve detailed attributes. This is mainly because U-GAT-IT is an unsupervised method that finds a one-to-one mapping between the two distributions, while the attributes differ between the input and output domains.

StyleGAN2 + U-GAT-IT

I fed the output of the FFHQ-trained StyleGAN2 into the trained U-GAT-IT model to explore the learned space. Even though the face generation model is trained mostly on Caucasian faces and the image translation model is trained on Asian faces, they work pretty well together.

   
   

Conclusion

  • By transferring a model that has already been trained on a large dataset sharing similar semantics with the small target dataset, it is possible to learn a 256x256 image generation model within half a day.
  • Since a generative model essentially learns the distribution of the prepared data, characteristics that are absent from the training data can be neither learned nor generated. If one wants to train an unsupervised image translation model that preserves the characteristics of a person (such as hairstyle or gender), one may need to prepare the data so that the feature distributions match, or approach it with a style transfer method that only changes the low-level texture. A simple yet effective baseline: FreezeG.

Additional Results

Method               FID (30k iter)
Baseline             75.96
FreezeD              50.74
DiffAug              63.74
ADA                  46.23
FreezeD + DiffAug    45.24
FreezeD + ADA        38.94
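For reference, FID (lower is better) is the Fréchet distance between Gaussian fits to Inception features of real and generated images. The symbols below are notation introduced here: μ and Σ are the feature mean and covariance, with subscript r for real and g for generated images.

```latex
\mathrm{FID}
  = \lVert \mu_r - \mu_g \rVert_2^{2}
  + \mathrm{Tr}\!\left( \Sigma_r + \Sigma_g - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```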