IIGROUP / MM-CelebA-HQ-Dataset

Licence: other
[CVPR 2021] A large-scale face image dataset that supports text-to-image generation, text-guided image manipulation, sketch-to-image generation, GANs for face generation and editing, image captioning, and VQA

Multi-Modal-CelebA-HQ

Multi-Modal-CelebA-HQ is a large-scale face image dataset of 30,000 high-resolution face images selected from the CelebA dataset, following CelebA-HQ. Each image is accompanied by a high-quality segmentation mask, a sketch, descriptive text, and a version with a transparent background.

Multi-Modal-CelebA-HQ can be used to train and evaluate algorithms for text-to-image generation, text-guided image manipulation, sketch-to-image generation, image captioning, and VQA. This dataset was proposed and used in TediGAN.

Data Generation

  • The textual descriptions are generated using a probabilistic context-free grammar (PCFG) based on the given attributes. We create ten unique single-sentence descriptions per image to obtain more training data, following the format of the popular CUB and COCO datasets. A previous study proposed CelebTD-HQ, but it is not publicly available.
  • For labels, we use the CelebAMask-HQ dataset, which contains manually-annotated semantic masks of facial attributes corresponding to CelebA-HQ.
  • For sketches, we follow the same data generation pipeline as in DeepFaceDrawing. We first apply the Photocopy filter in Photoshop to extract edges, which preserves facial details but introduces excessive noise, and then apply sketch simplification to obtain edge maps resembling hand-drawn sketches.
  • For background removal, we use the open-source tool Rembg and the commercial software remove.bg. Different backgrounds can then be added using image composition or harmonization methods such as DoveNet.
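The PCFG step above can be sketched with a toy grammar. Note that the nonterminals, templates, and attribute words below are illustrative assumptions, not the actual grammar used by the authors:

```python
import random

# Toy probabilistic context-free grammar: each nonterminal maps to a list of
# (expansion, probability) pairs. Tokens not in the grammar are terminals.
# These templates are illustrative assumptions, not the authors' grammar.
GRAMMAR = {
    "S": [("the NP has VP", 0.5), ("this NP is wearing ACC", 0.5)],
    "NP": [("young woman", 0.5), ("man", 0.5)],
    "VP": [("HAIR hair", 0.6), ("HAIR hair and a slight smile", 0.4)],
    "HAIR": [("wavy", 0.5), ("straight", 0.5)],
    "ACC": [("earrings", 0.5), ("lipstick", 0.5)],
}

def expand(symbol, rng):
    """Recursively expand a symbol by sampling one of its productions."""
    if symbol not in GRAMMAR:
        return symbol  # terminal word, emitted as-is
    expansions, weights = zip(*GRAMMAR[symbol])
    choice = rng.choices(expansions, weights=weights)[0]
    return " ".join(expand(token, rng) for token in choice.split())

def describe(rng):
    """Sample one single-sentence description, as the dataset stores ten per image."""
    sentence = expand("S", rng)
    return sentence[0].upper() + sentence[1:] + "."
```

In the real pipeline, the grammar's terminal choices would be conditioned on each image's annotated CelebA attributes rather than sampled freely.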

Overview

image

Note: Upon request, the download links for the raw data and annotations have been removed from this repo. Please refer to the original sites for the raw data and email me for the post-processing scripts.

All data is hosted on Google Drive (not available).

Path                 Size     Files    Format  Description
multi-modal-celeba   ~20 GB   420,002          Main folder
├  image             ~2 GB    30,000   JPG     images from CelebA-HQ of size 512×512
├  text              11 MB    300,000  TXT     10 descriptions of each image in CelebA-HQ
├  train             347 KB   1        PKL     filenames of training images
├  test              81 KB    1        PKL     filenames of test images
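A minimal sketch of consuming this layout: loading a split file and locating the ten captions of an image. The `<stem>_<i>.txt` caption naming is an assumption for illustration; check the actual layout of the `text` folder after downloading:

```python
import pickle
from pathlib import Path

def load_split(pkl_path):
    """Load the list of image filenames stored in train.pkl / test.pkl."""
    with open(pkl_path, "rb") as f:
        return pickle.load(f)

def caption_paths(image_name, text_dir="text"):
    """Return the assumed paths of the ten caption files for one image.

    The '<stem>_<i>.txt' naming is a hypothetical convention used here for
    illustration; verify it against the downloaded `text` folder.
    """
    stem = Path(image_name).stem
    return [Path(text_dir) / f"{stem}_{i}.txt" for i in range(10)]
```

For example, `caption_paths("0.jpg")` yields ten paths such as `text/0_0.txt` through `text/0_9.txt`, one per description.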

Pretrained Models

We provide the pretrained models of AttnGAN, ControlGAN, DM-GAN, DF-GAN, and ManiGAN. Please consider citing our paper if you use these pretrained models. Feel free to open a pull request if you have any updates.

Method      FID     LPIPS   Download
AttnGAN     125.98  0.512   Google Drive
ControlGAN  116.32  0.522   Google Drive
DF-GAN      137.60  0.581   Google Drive
DM-GAN      131.05  0.544   Google Drive
TediGAN     106.37  0.456   Google Drive
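The FID values above are Fréchet distances between Gaussians fit to Inception-v3 activations of real versus generated images. A minimal numpy/scipy sketch of the distance formula itself (the feature-extraction step is omitted):

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """FID-style Fréchet distance between two Gaussians N(mu, sigma).

    For the actual FID metric, mu/sigma are the mean and covariance of
    Inception-v3 activations over real vs. generated images; here they
    are taken as given.
    """
    diff = mu1 - mu2
    # Matrix square root of the covariance product; numerical error can
    # introduce a tiny imaginary component, which we discard.
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)
```

Identical distributions give a distance of zero; lower FID means the generated-image statistics are closer to the real ones.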

The pretrained model of ManiGAN is here. The training scripts and pretrained models on faces for sketch-to-image and label-to-image generation can be found here. Those who have trouble accessing Google Drive can use the alternative link at Baidu Cloud (code: b273) for the dataset and pretrained models.

Related Works

  • CelebA dataset:
    Ziwei Liu, Ping Luo, Xiaogang Wang and Xiaoou Tang, "Deep Learning Face Attributes in the Wild", in IEEE International Conference on Computer Vision (ICCV), 2015
  • CelebA-HQ was collected from CelebA and further post-processed by the following paper:
    Karras et al., "Progressive Growing of GANs for Improved Quality, Stability, and Variation", in International Conference on Learning Representations (ICLR), 2018
  • CelebAMask-HQ contains manually-annotated masks with a size of 512 × 512 and 19 classes covering all facial components and accessories, such as skin, nose, eyes, eyebrows, ears, mouth, lip, hair, hat, eyeglasses, earring, necklace, neck, and cloth. It was collected by the following paper:
    Lee et al., "MaskGAN: Towards Diverse and Interactive Facial Image Manipulation", in Computer Vision and Pattern Recognition (CVPR), 2020

License and Citation

If you find the dataset and pretrained models helpful for your research, please consider citing:

@inproceedings{xia2021tedigan,
  title={TediGAN: Text-Guided Diverse Face Image Generation and Manipulation},
  author={Xia, Weihao and Yang, Yujiu and Xue, Jing-Hao and Wu, Baoyuan},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

@article{xia2021open,
  title={Towards Open-World Text-Guided Face Image Generation and Manipulation},
  author={Xia, Weihao and Yang, Yujiu and Xue, Jing-Hao and Wu, Baoyuan},
  journal={arXiv preprint arXiv:2104.08910},
  year={2021}
}

@inproceedings{karras2017progressive,
  title={Progressive Growing of GANs for Improved Quality, Stability, and Variation},
  author={Karras, Tero and Aila, Timo and Laine, Samuli and Lehtinen, Jaakko},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2018}
}

@inproceedings{liu2015faceattributes,
 title = {Deep Learning Face Attributes in the Wild},
 author = {Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou},
 booktitle = {Proceedings of International Conference on Computer Vision (ICCV)},
 year = {2015} 
}

If you use the labels, please cite:

@inproceedings{CelebAMask-HQ,
  title={MaskGAN: Towards Diverse and Interactive Facial Image Manipulation},
  author={Lee, Cheng-Han and Liu, Ziwei and Wu, Lingyun and Luo, Ping},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020}
}

The use of this software is RESTRICTED to non-commercial research and educational purposes. The license is the same as in CelebAMask-HQ.
