
danielroich / PTI

License: MIT
Official Implementation for "Pivotal Tuning for Latent-based editing of Real Images" (ACM TOG 2022) https://arxiv.org/abs/2106.05744

Programming Languages

  • Jupyter Notebook
  • Python

Projects that are alternatives of or similar to PTI

HistoGAN
Reference code for the paper HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms (CVPR 2021).
Stars: ✭ 158 (-69.79%)
Mutual labels:  image-editing, stylegan
HFGI
CVPR 2022 HFGI: High-Fidelity GAN Inversion for Image Attribute Editing
Stars: ✭ 384 (-26.58%)
Mutual labels:  image-editing, gan-inversion
Styleflow
StyleFlow: Attribute-conditioned Exploration of StyleGAN-generated Images using Conditional Continuous Normalizing Flows (ACM TOG 2021)
Stars: ✭ 1,982 (+278.97%)
Mutual labels:  stylegan
SwiftyJot
Use your finger to annotate images.
Stars: ✭ 14 (-97.32%)
Mutual labels:  image-editing
hms-image-vision-java
This sample code shows developers how to integrate the Image Vision sub-service of Image Kit and call the image filter function. This sub-service provides 24 unique filter effects to enhance the artistic conception and artistic sense of images.
Stars: ✭ 22 (-95.79%)
Mutual labels:  image-editing
DeepSIM
Official PyTorch implementation of the paper: "DeepSIM: Image Shape Manipulation from a Single Augmented Training Sample" (ICCV 2021 Oral)
Stars: ✭ 389 (-25.62%)
Mutual labels:  image-editing
Jspaint
🎨 Classic MS Paint, REVIVED + ✨Extras
Stars: ✭ 5,972 (+1041.87%)
Mutual labels:  image-editing
steam-stylegan2
Train a StyleGAN2 model on Colaboratory to generate Steam banners.
Stars: ✭ 30 (-94.26%)
Mutual labels:  stylegan
StyleGAN-nada
stylegan-nada.github.io/
Stars: ✭ 1,018 (+94.65%)
Mutual labels:  stylegan
powerpaint
Kreative PowerPaint - Library and Application for Bitmap and Vector Image Editing
Stars: ✭ 27 (-94.84%)
Mutual labels:  image-editing
ManTraNet-pytorch
Implementation of the famous image manipulation/forgery detector "ManTraNet" in PyTorch
Stars: ✭ 47 (-91.01%)
Mutual labels:  image-editing
SDEdit
PyTorch implementation for SDEdit: Image Synthesis and Editing with Stochastic Differential Equations
Stars: ✭ 394 (-24.67%)
Mutual labels:  image-editing
Flameshot
Powerful yet simple to use screenshot software 🖥️ 📸
Stars: ✭ 15,429 (+2850.1%)
Mutual labels:  image-editing
pytorch sscr
A PyTorch implementation of SSCR
Stars: ✭ 25 (-95.22%)
Mutual labels:  image-editing
Litrato
Android photo editing app with various filters and tools. Included advanced features like masking, histogram, color picker, EXIF viewer...
Stars: ✭ 54 (-89.67%)
Mutual labels:  image-editing
Alae
[CVPR2020] Adversarial Latent Autoencoders
Stars: ✭ 3,178 (+507.65%)
Mutual labels:  stylegan
Bild
Image processing algorithms in pure Go
Stars: ✭ 3,431 (+556.02%)
Mutual labels:  image-editing
goat
Annotate Images (or goats) On The Web™
Stars: ✭ 75 (-85.66%)
Mutual labels:  image-editing
latent space adventures
Buckle up, adventure in the styleGAN2-ada-pytorch network latent space awaits
Stars: ✭ 59 (-88.72%)
Mutual labels:  stylegan2-ada-pytorch
Paddlegan
PaddlePaddle GAN library, including lots of interesting applications like First-Order motion transfer, wav2lip, picture repair, image editing, photo2cartoon, image style transfer, and so on.
Stars: ✭ 4,987 (+853.54%)
Mutual labels:  image-editing

PTI: Pivotal Tuning for Latent-based editing of Real Images (ACM TOG 2022)


Inference Notebook:


Pivotal Tuning Inversion (PTI) enables employing off-the-shelf latent-based semantic editing techniques on real images using StyleGAN. PTI excels in identity-preserving edits, portrayed through recognizable figures such as Serena Williams and Robert Downey Jr. (top), and in handling faces that are clearly out of domain, e.g., due to heavy makeup (bottom).

Description

Official implementation of our PTI paper, plus code for the evaluation metrics. PTI introduces an optimization mechanism for solving the StyleGAN inversion task, providing near-perfect reconstruction results while maintaining the high editing capabilities of the native StyleGAN latent space W. For more details, see the paper: https://arxiv.org/abs/2106.05744

Recent Updates

2021.07.01: Fixed the file-download phase in the inference notebook, which might have caused the notebook not to run smoothly.

2021.06.29: Added support for CPU. To run PTI on CPU, change the device parameter in configs/global_config.py from "cuda" to "cpu", as sketched below.
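A minimal sketch of that change (the parameter name device comes from the note above; the rest of the file is untouched):

    # configs/global_config.py
    device = 'cpu'  # default is 'cuda'; set to 'cpu' to run without a GPU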

2021.06.25: Added a mohawk edit using StyleCLIP+PTI to the inference notebook. Updated the notebook's documentation due to a Google Drive rate limit: currently, Google Drive does not allow Colab to download the pretrained models automatically, so manual intervention might be needed.

Getting Started

Prerequisites

  • Linux or macOS
  • NVIDIA GPU + CUDA cuDNN (not mandatory but recommended)
  • Python 3

Installation

  • Dependencies:
    1. lpips
    2. wandb
    3. pytorch
    4. torchvision
    5. matplotlib
    6. dlib
  • All dependencies can be installed with pip install followed by the package name; see the example below.
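For example (note that on PyPI the PyTorch packages are named torch and torchvision; for a CUDA-matched build, follow the instructions on pytorch.org):

    pip install lpips wandb torch torchvision matplotlib dlib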

Pretrained Models

Please download the pretrained models from the following links.

Auxiliary Models

We provide various auxiliary models needed for the PTI inversion task.
This includes the StyleGAN generator and pre-trained models used for loss computation.

Path                Description
FFHQ StyleGAN       StyleGAN2-ada model trained on FFHQ with 1024x1024 output resolution.
Dlib alignment      Dlib alignment model used for image preprocessing.
FFHQ e4e encoder    Pretrained e4e encoder, used for StyleCLIP editing.

Note: The StyleGAN model is used directly from the official stylegan2-ada-pytorch implementation. For StyleCLIP pretrained mappers, please see StyleCLIP's official repository.

By default, we assume that all auxiliary models are downloaded and saved to the directory pretrained_models. However, you may use your own paths by changing the necessary values in configs/paths_config.py.
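A hedged sketch of such an override; the variable names below are illustrative placeholders, so use the names actually defined in configs/paths_config.py:

    # configs/paths_config.py (illustrative variable names)
    stylegan2_ada_ffhq = 'pretrained_models/ffhq.pkl'    # FFHQ StyleGAN weights
    dlib = 'pretrained_models/align.dat'                 # Dlib alignment model
    e4e = 'pretrained_models/e4e_ffhq_encode.pt'         # e4e encoder (StyleCLIP editing)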

Inversion

Preparing your Data

To invert and edit a real image, you should first align and crop it to the correct size. To do so, perform one of the following steps (an example for option 2 follows the list):

  1. Run notebooks/align_data.ipynb and change the "images_path" variable to the raw images path
  2. Run utils/align_data.py and change the "images_path" variable to the raw images path
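For option 2, a minimal sketch (images_path is the variable named above; the directory value is a placeholder):

    # in utils/align_data.py, point images_path at your raw images:
    images_path = 'raw_images/'  # placeholder directory of unaligned images

    # then run from the repository root:
    #   python utils/align_data.py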

Weights And Biases

The project supports the Weights & Biases framework for experiment tracking. For the inversion task, it enables visualizing the loss progression and the generator's intermediate results during both the initial inversion and the Pivotal Tuning (PT) procedure.

The log frequency can be adjusted using the parameters defined in configs/global_config.py under the "Logs" subsection.
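A hedged sketch of what adjusting those parameters might look like; the names below are hypothetical placeholders, so use the ones actually defined in the "Logs" subsection:

    # configs/global_config.py -- "Logs" subsection (hypothetical parameter names)
    training_step_log_interval = 1        # log scalar losses every N steps
    image_snapshot_log_interval = 100     # log intermediate reconstructions every N steps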

An account is not needed to run PTI; however, to use the experiment-tracking features provided by Weights & Biases, you first have to register on their site.

Running PTI

The main training script is scripts/run_pti.py. The script receives aligned and cropped images from the paths configured in the "Input info" subsection of configs/paths_config.py. Results, including the inversion latent codes and tuned generators, are saved to the directories listed under "Dirs for output files" in configs/paths_config.py. The hyperparameters for the inversion task can be found in configs/hyperparameters.py; they are initialized to the default values used in the paper.
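Since the input and output paths come from configs/paths_config.py rather than CLI flags, a typical invocation (assuming no arguments are required) is simply:

    python scripts/run_pti.py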

Editing

By default, we assume that all auxiliary edit directions are downloaded and saved to the directory editings. However, you may use your own paths by changing the necessary values in configs/paths_config.py under the "Edit directions" subsection.

An example of editing code can be found in scripts/latent_editor_wrapper.py.
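As a rough illustration of the idea, here is a generic sketch of InterfaceGAN-style editing on top of a PTI-tuned generator. This is not the repository's API: load_tuned_generator and all file paths are hypothetical placeholders, and only G.synthesis follows the stylegan2-ada-pytorch interface.

    import torch

    # Hypothetical placeholders: a generator tuned by PTI, the pivot latent code
    # saved during inversion, and a pre-computed edit direction (e.g. pose).
    G = load_tuned_generator('model_sample.pt')            # placeholder helper
    w_pivot = torch.load('embeddings/sample_w_pivot.pt')   # shape [1, num_ws, 512]
    direction = torch.load('editings/pose.pt')             # InterfaceGAN direction

    alpha = 3.0                             # edit strength; negate to reverse
    w_edited = w_pivot + alpha * direction  # linear walk in latent space

    # Synthesize with the tuned generator so the subject's identity is preserved.
    img = G.synthesis(w_edited, noise_mode='const')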

Inference Notebooks

To help visualize the results of PTI, we provide a Jupyter notebook at notebooks/inference_playground.ipynb.
The notebook downloads the pretrained models and runs inference on a sample image found online or on images of your choosing. It is recommended to run it in Google Colab.

The notebook demonstrates how to:

  • Invert an image using PTI
  • Visualize the inversion and use the PTI output
  • Edit the image after PTI using InterfaceGAN and StyleCLIP
  • Compare to other inversion methods

Evaluation

Currently, the repository supports qualitative evaluation of reconstruction for PTI, SG2 (W space), e4e, and SG2Plus (W+ space), as well as editing with InterfaceGAN and GANSpace for the same inversion methods. To run the evaluation, see evaluation/qualitative_edit_comparison.py. Examples of the evaluation outputs are:


Reconstruction comparison between different methods. The image order is: original image, W+ inversion, e4e inversion, W inversion, PTI inversion


InterfaceGAN pose-edit comparison between different methods. The image order is: original, W+, e4e, W, PTI


Image per edit or several edits without comparison
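To reproduce comparisons like these (hedged: any CLI arguments the script takes are not documented here), run:

    python evaluation/qualitative_edit_comparison.py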

Coming Soon - Quantitative evaluation and StyleCLIP qualitative evaluation

Repository structure

Path            Description
├  configs      Folder containing configs defining hyperparameters, paths and logging
├  criteria     Folder containing various loss and regularization criteria for the optimization
├  dnnlib       Folder containing internal utils for StyleGAN2-ada
├  docs         Folder containing images displayed in the README
├  editings     Folder containing the latent space edit directions
├  environment  Folder containing the Anaconda environment used in our experiments
├  licenses     Folder containing licenses of the open source projects used in this repository
├  models       Folder containing models used in the different editing techniques and the first-phase inversion
├  notebooks    Folder with Jupyter notebooks demonstrating end-to-end usage of PTI
├  scripts      Folder with running scripts for inversion, editing and metric computations
├  torch_utils  Folder containing internal utils for StyleGAN2-ada
├  training     Folder containing the core training logic of PTI
└  utils        Folder with various utility functions

Credits

StyleGAN2-ada model and implementation:
https://github.com/NVlabs/stylegan2-ada-pytorch Copyright © 2021, NVIDIA Corporation.
Nvidia Source Code License https://nvlabs.github.io/stylegan2-ada-pytorch/license.html

LPIPS model and implementation:
https://github.com/richzhang/PerceptualSimilarity
Copyright (c) 2020, Sou Uchida
License (BSD 2-Clause) https://github.com/richzhang/PerceptualSimilarity/blob/master/LICENSE

e4e model and implementation:
https://github.com/omertov/encoder4editing Copyright (c) 2021 omertov
License (MIT) https://github.com/omertov/encoder4editing/blob/main/LICENSE

StyleCLIP model and implementation:
https://github.com/orpatashnik/StyleCLIP Copyright (c) 2021 orpatashnik
License (MIT) https://github.com/orpatashnik/StyleCLIP/blob/main/LICENSE

InterfaceGAN implementation:
https://github.com/genforce/interfacegan Copyright (c) 2020 genforce
License (MIT) https://github.com/genforce/interfacegan/blob/master/LICENSE

GANSpace implementation:
https://github.com/harskish/ganspace Copyright (c) 2020 harskish
License (Apache License 2.0) https://github.com/harskish/ganspace/blob/master/LICENSE

Acknowledgments

This repository's structure is based on the encoder4editing and ReStyle repositories.

Contact

For any inquiry please contact us at our email addresses: [email protected] or [email protected]

Citation

If you use this code for your research, please cite:

@article{roich2021pivotal,
  title={Pivotal Tuning for Latent-based Editing of Real Images},
  author={Roich, Daniel and Mokady, Ron and Bermano, Amit H and Cohen-Or, Daniel},
  publisher = {Association for Computing Machinery},
  journal={ACM Trans. Graph.},
  year={2021}
}