
ysymyth / 3D-SDN

Licence: other
[NeurIPS 2018] 3D-Aware Scene Manipulation via Inverse Graphics

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to 3D-SDN

Mimicry
[CVPR 2020 Workshop] A PyTorch GAN library that reproduces research results for popular GANs.
Stars: ✭ 458 (+78.91%)
Mutual labels:  gans, generative-adversarial-networks
Torchgan
Research Framework for easy and efficient training of GANs based on Pytorch
Stars: ✭ 1,156 (+351.56%)
Mutual labels:  gans, generative-adversarial-networks
Gansformer
Generative Adversarial Transformers
Stars: ✭ 421 (+64.45%)
Mutual labels:  gans, generative-adversarial-networks
Pytorch Gans
My implementation of various GAN (generative adversarial networks) architectures like vanilla GAN (Goodfellow et al.), cGAN (Mirza et al.), DCGAN (Radford et al.), etc.
Stars: ✭ 271 (+5.86%)
Mutual labels:  gans, generative-adversarial-networks
Generative adversarial networks 101
Keras implementations of Generative Adversarial Networks. GANs, DCGAN, CGAN, CCGAN, WGAN and LSGAN models with MNIST and CIFAR-10 datasets.
Stars: ✭ 138 (-46.09%)
Mutual labels:  gans, generative-adversarial-networks
Ylg
[CVPR 2020] Official Implementation: "Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models".
Stars: ✭ 109 (-57.42%)
Mutual labels:  gans, generative-adversarial-networks
Delving Deep Into Gans
Generative Adversarial Networks (GANs) resources sorted by citations
Stars: ✭ 834 (+225.78%)
Mutual labels:  gans, generative-adversarial-networks
stylegan-v
[CVPR 2022] StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2
Stars: ✭ 136 (-46.87%)
Mutual labels:  gans, generative-adversarial-networks
Gdwct
Official PyTorch implementation of GDWCT (CVPR 2019, oral)
Stars: ✭ 122 (-52.34%)
Mutual labels:  gans, generative-adversarial-networks
Energy based generative models
PyTorch code accompanying our paper on Maximum Entropy Generators for Energy-Based Models
Stars: ✭ 114 (-55.47%)
Mutual labels:  gans, generative-adversarial-networks
gan-vae-pretrained-pytorch
Pretrained GANs + VAEs + classifiers for MNIST/CIFAR in pytorch.
Stars: ✭ 134 (-47.66%)
Mutual labels:  gans, generative-adversarial-networks
generative deep learning
Generative Deep Learning Sessions led by Anugraha Sinha (Machine Learning Tokyo)
Stars: ✭ 24 (-90.62%)
Mutual labels:  gans, generative-adversarial-networks
fusion gan
Codes for the paper 'Learning to Fuse Music Genres with Generative Adversarial Dual Learning' ICDM 17
Stars: ✭ 18 (-92.97%)
Mutual labels:  generative-adversarial-networks
transganformer
Implementation of TransGanFormer, an all-attention GAN that combines the finding from the recent GanFormer and TransGan paper
Stars: ✭ 137 (-46.48%)
Mutual labels:  generative-adversarial-networks
mSRGAN-A-GAN-for-single-image-super-resolution-on-high-content-screening-microscopy-images.
Generative Adversarial Network for single image super-resolution in high content screening microscopy images
Stars: ✭ 52 (-79.69%)
Mutual labels:  generative-adversarial-networks
Diverse-Structure-Inpainting
CVPR 2021: "Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE"
Stars: ✭ 131 (-48.83%)
Mutual labels:  generative-adversarial-networks
EigenGAN-Tensorflow
EigenGAN: Layer-Wise Eigen-Learning for GANs (ICCV 2021)
Stars: ✭ 294 (+14.84%)
Mutual labels:  gans
MNIST-invert-color
Invert the color of MNIST images with PyTorch
Stars: ✭ 13 (-94.92%)
Mutual labels:  generative-adversarial-networks
ACCV TinyGAN
BigGAN; Knowledge Distillation; Black-Box; Fast Training; 16x compression
Stars: ✭ 62 (-75.78%)
Mutual labels:  gans
Machine-Learning
The projects I do in Machine Learning with PyTorch, keras, Tensorflow, scikit learn and Python.
Stars: ✭ 54 (-78.91%)
Mutual labels:  gans

3D Scene De-rendering Networks (3D-SDN)

Project | Paper | Poster

PyTorch implementation for 3D-aware scene de-rendering and editing. Our method integrates disentangled representations of semantics, geometry, and appearance into a deep generative model. This disentanglement supports 3D-aware scene manipulations such as (a) translation, (b) rotation, (c) color and texture editing, and (d) object removal and occlusion recovery.

3D-Aware Scene Manipulation via Inverse Graphics
Shunyu Yao*, Tzu-Ming Harry Hsu*, Jun-Yan Zhu, Jiajun Wu, Antonio Torralba, William T. Freeman, Joshua B. Tenenbaum
In Neural Information Processing Systems (NeurIPS) 2018.
MIT CSAIL, Tsinghua University, and Google Research.

Framework

Our de-renderer consists of a semantic, a textural, and a geometric branch. The textural and geometric renderers then learn to reconstruct the original image from the representations produced by the de-renderer modules.
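To make the data flow concrete, here is a minimal sketch of the de-render/re-render loop described above. All class and argument names (SceneDeRenderer, geometric_renderer, textural_renderer, etc.) are illustrative placeholders, not the actual modules in this repository.

    import torch.nn as nn

    class SceneDeRenderer(nn.Module):
        # Illustrative composition of the three de-renderer branches.
        def __init__(self, semantic_branch, geometric_branch, textural_branch):
            super().__init__()
            self.semantic = semantic_branch    # per-pixel semantic map
            self.geometric = geometric_branch  # per-object 3D attributes (pose, shape)
            self.textural = textural_branch    # appearance / texture codes

        def forward(self, image):
            semantics = self.semantic(image)
            geometry = self.geometric(image, semantics)
            texture = self.textural(image, semantics, geometry)
            return semantics, geometry, texture

    def rerender(geometric_renderer, textural_renderer, semantics, geometry, texture):
        # The geometric renderer produces 2.5D maps (e.g. normals and instance masks)
        # from the 3D attributes; the textural renderer reconstructs the final image.
        maps_25d = geometric_renderer(geometry)
        return textural_renderer(semantics, maps_25d, texture)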

Example Results on Cityscapes

Example user editing results on Cityscapes. (a) We move two cars closer to the camera.
(b) We rotate a car by different angles.
(c) We recover a tiny, occluded car and move it closer. Our model can synthesize the occluded region.
(d) We move a small car closer and then change its location.

Prerequisites

  • Linux
  • Python 3.6+
  • PyTorch 0.4
  • NVIDIA GPU (GPU memory > 8GB) + CUDA 9.0

Getting Started

Installation

  1. Clone this repository

    git clone https://github.com/ysymyth/3D-SDN.git && cd 3D-SDN
    
  2. Download the pre-trained weights

    ./models/download_models.sh
    
  3. Set up the conda environment

    conda env create -f environment.yml && conda activate 3dsdn
    
  4. Compile dependencies in geometric/maskrcnn

    ./scripts/build.sh
    
  5. Set up environment variables

    source ./scripts/env.sh
    

Image Editing

We use ./assets/0006_30-deg-right_00043.png as the example image for editing.

Semantic Branch

python semantic/vkitti_test.py \
    --ckpt ./models \
    --id vkitti-semantic \
    --root_dataset ./assets \
    --test_img 0006_30-deg-right_00043.png \
    --result ./assets/example/semantic

Geometric Branch

python geometric/scripts/main.py \
    --do test \
    --dataset vkitti \
    --mode extend \
    --source maskrcnn \
    --ckpt_dir ./models/vkitti-geometric-derender3d \
    --maskrcnn_path ./models/vkitti-geometric-maskrcnn/mask_rcnn_vkitti_0100.pth \
    --edit_json ./assets/vkitti_edit_example.json \
    --input_file ./assets/0006_30-deg-right_00043.png \
    --output_dir ./assets/example/geometric

Textural Branch

python textural/edit_vkitti.py \
    --name vkitti-textural \
    --checkpoints_dir ./models \
    --edit_dir ./assets/example/geometric/vkitti/maskrcnn/0006/30-deg-right \
    --edit_source ./assets/0006_30-deg-right_00043.png \
    --edit_num 5 \
    --segm_precomputed_path ./assets/example/semantic/0006_30-deg-right_00043.png \
    --results_dir ./assets/example \
    --feat_pose True \
    --feat_normal True

Then the edit results can be viewed at ./assets/example/vkitti-textural_edit_edit_60/index.html.

Simply run cd ./assets/example/vkitti-textural_edit_edit_60 && python -m http.server 1234 and open the server in your browser. You should see the edit results along with the intermediate 2.5D representations.

Training/Testing

Please set up the datasets first and refer to semantic/README.md, geometric/README.md, and textural/README.md for training and testing details.

./datasets/download_vkitti.sh

Please cite the Virtual KITTI paper if you use the data.

Experiments

Virtual KITTI Benchmark

Here is a fragment of our Virtual KITTI benchmark edit specification, given as a JSON file. For each edit pair, the source image is world/topic/source.png and the target image is world/topic/target.png. A list of operations specifies how to transform the source image into the target image. Aligned with human cognition, each operation either moves (modify) an object from one position to another or deletes (delete) it from the view. Additionally, an operation may enlarge the object (zoom) or rotate it about the y-axis (ry). Note that the y-axis points downwards, consistent with the axis convention of the Virtual KITTI dataset. The u and v values denote an object's 3D center projected onto the image plane, and roi specifies a target region of interest in addition to the target (u, v) position. There are 92 such pairs in the benchmark.

{
    "world": "0006",
    "topic": "fog",
    "source": "00055",
    "target": "00050",
    "operations": [
        {
            "type": "modify",
            "from": {"u": "750.9", "v": "213.9"},
            "to": {"u": "804.4", "v": "227.1", "roi": [194, 756, 269, 865]},
            "zoom": "1.338",
            "ry": "0.007"
        }
    ]
}
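For illustration, here is a minimal sketch of reading such a specification. It assumes the benchmark JSON is a list of edit-pair records shaped like the fragment above; since the fragment does not show the fields of a delete operation, only modify operations are handled.

    import json

    with open("./assets/vkitti_edit_benchmark.json") as f:
        pairs = json.load(f)  # assumed: a list of edit-pair records as above

    for pair in pairs:
        source = "{}/{}/{}.png".format(pair["world"], pair["topic"], pair["source"])
        target = "{}/{}/{}.png".format(pair["world"], pair["topic"], pair["target"])
        for op in pair["operations"]:
            if op["type"] != "modify":
                continue  # deletions omitted in this sketch
            u0, v0 = float(op["from"]["u"]), float(op["from"]["v"])
            u1, v1 = float(op["to"]["u"]), float(op["to"]["v"])
            print("%s -> %s: move (%.1f, %.1f) to (%.1f, %.1f), zoom %.3f, ry %.3f"
                  % (source, target, u0, v0, u1, v1, float(op["zoom"]), float(op["ry"])))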

Semantic Branch

python semantic/vkitti_test.py \
    --ckpt ./models \
    --id vkitti-semantic \
    --root_dataset ./datasets/vkitti \
    --test_img benchmark \
    --benchmark_json ./assets/vkitti_edit_benchmark.json \
    --result ./assets/vkitti-benchmark/semantic

Geometric Branch

python geometric/scripts/main.py \
    --do test \
    --dataset vkitti \
    --mode extend \
    --source maskrcnn \
    --ckpt_dir ./models/vkitti-geometric-derender3d \
    --maskrcnn_path ./models/vkitti-geometric-maskrcnn/mask_rcnn_vkitti_0100.pth \
    --output_dir ./assets/vkitti-benchmark/geometric \
    --edit_json ./assets/vkitti_edit_benchmark.json

Textural Branch

python textural/edit_benchmark.py \
    --name vkitti-textural \
    --checkpoints_dir ./models \
    --dataroot ./datasets/vkitti \
    --edit_dir ./assets/vkitti-benchmark/geometric/vkitti/maskrcnn \
    --edit_list ./assets/vkitti_edit_benchmark.json \
    --experiment_name benchmark_3D \
    --segm_precomputed_path ./assets/vkitti-benchmark/semantic \
    --results_dir ./assets/vkitti-benchmark/ \
    --feat_pose True \
    --feat_normal True

Then the benchmark edit results can be viewed at ./assets/vkitti-benchmark/vkitti-textural_benchmark_3D_edit_60/index.html.

Reference

If you find this useful for your research, please cite the following paper.

@inproceedings{3dsdn2018,
  title={3D-Aware Scene Manipulation via Inverse Graphics},
  author={Yao, Shunyu and Hsu, Tzu Ming Harry and Zhu, Jun-Yan and Wu, Jiajun and Torralba, Antonio and Freeman, William T. and Tenenbaum, Joshua B.},
  booktitle={Advances in Neural Information Processing Systems},
  year={2018}
}

For any questions, please contact Shunyu Yao and Tzu-Ming Harry Hsu.

Acknowledgements

This work is supported by NSF #1231216, NSF #1524817, ONR MURI N00014-16-1-2007, Toyota Research Institute, and Facebook.

The semantic branch borrows from Semantic Segmentation on MIT ADE20K dataset in PyTorch, the geometric branch borrows from pytorch-mask-rcnn and neural_renderer, and the textural branch borrows from pix2pixHD.
