
SeokjuLee / Insta Dm

License: MIT
Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency (AAAI 2021)

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives to or similar to Insta Dm

Transferlearning
Transfer learning / domain adaptation / domain generalization / multi-task learning, etc. Papers, code, datasets, applications, and tutorials.
Stars: ✭ 8,481 (+12558.21%)
Mutual labels:  unsupervised-learning
Tadw
An implementation of "Network Representation Learning with Rich Text Information" (IJCAI '15).
Stars: ✭ 43 (-35.82%)
Mutual labels:  unsupervised-learning
Dgi
TensorFlow implementation of Deep Graph Infomax
Stars: ✭ 58 (-13.43%)
Mutual labels:  unsupervised-learning
Uc Davis Cs Exams Analysis
📈 Regression and Classification with UC Davis student quiz data and exam data
Stars: ✭ 33 (-50.75%)
Mutual labels:  unsupervised-learning
Susi
SuSi: Python package for unsupervised, supervised and semi-supervised self-organizing maps (SOM)
Stars: ✭ 42 (-37.31%)
Mutual labels:  unsupervised-learning
Voxelmorph
Unsupervised Learning for Image Registration
Stars: ✭ 1,057 (+1477.61%)
Mutual labels:  unsupervised-learning
Summary loop
Codebase for the Summary Loop paper at ACL2020
Stars: ✭ 26 (-61.19%)
Mutual labels:  unsupervised-learning
Neuralhmm
Code for the "Unsupervised Neural Hidden Markov Models" paper
Stars: ✭ 64 (-4.48%)
Mutual labels:  unsupervised-learning
Student Teacher Anomaly Detection
Student–Teacher Anomaly Detection with Discriminative Latent Embeddings
Stars: ✭ 43 (-35.82%)
Mutual labels:  unsupervised-learning
Hypergan
Composable GAN framework with an API and user interface
Stars: ✭ 1,104 (+1547.76%)
Mutual labels:  unsupervised-learning
Iva
IVA: Independent Vector Analysis implementation
Stars: ✭ 35 (-47.76%)
Mutual labels:  unsupervised-learning
Unsuprevised seg via cnn
Stars: ✭ 38 (-43.28%)
Mutual labels:  unsupervised-learning
Lir For Unsupervised Ir
This is an implementation for the CVPR2020 paper "Learning Invariant Representation for Unsupervised Image Restoration"
Stars: ✭ 53 (-20.9%)
Mutual labels:  unsupervised-learning
Discogan Pytorch
PyTorch implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"
Stars: ✭ 961 (+1334.33%)
Mutual labels:  unsupervised-learning
Weakly Supervised 3d Object Detection
Weakly Supervised 3D Object Detection from Point Clouds (VS3D), ACM MM 2020
Stars: ✭ 61 (-8.96%)
Mutual labels:  unsupervised-learning
Domain Transfer Network
TensorFlow Implementation of Unsupervised Cross-Domain Image Generation
Stars: ✭ 850 (+1168.66%)
Mutual labels:  unsupervised-learning
Php Ml
PHP-ML - Machine Learning library for PHP
Stars: ✭ 7,900 (+11691.04%)
Mutual labels:  unsupervised-learning
Sine
A PyTorch Implementation of "SINE: Scalable Incomplete Network Embedding" (ICDM 2018).
Stars: ✭ 67 (+0%)
Mutual labels:  unsupervised-learning
Dmgi
Unsupervised Attributed Multiplex Network Embedding (AAAI 2020)
Stars: ✭ 62 (-7.46%)
Mutual labels:  unsupervised-learning
Rakun
Rank-based Unsupervised Keyword Extraction via Metavertex Aggregation
Stars: ✭ 54 (-19.4%)
Mutual labels:  unsupervised-learning

Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency

[ Install | Datasets | Training | Models | Evaluation | Demo | References | License ]

This is the official PyTorch implementation of the system proposed in the paper:

Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency

Seokju Lee, Sunghoon Im, Stephen Lin, and In So Kweon

AAAI-21 [PDF] [Project]

⟹ Unified Visual Odometry: a holistic visualization of depth and motion estimation from self-supervised monocular training.

If you find our work useful in your research, please consider citing our paper:

@inproceedings{lee2021learning,
  title={Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency},
  author={Lee, Seokju and Im, Sunghoon and Lin, Stephen and Kweon, In So},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence (AAAI)},
  year={2021}
}

Install

Our code is tested with CUDA 10.2/11.0, Python 3.7.x (conda environment), and PyTorch 1.4.0/1.7.0.

At least two GPUs (12 GB each) are required to train the models with batch_size=4 and maximum_number_of_instances_per_frame=3.

Create a conda environment with the PyTorch library:

conda create -n my_env python=3.7.4 pytorch=1.7.0 torchvision torchaudio cudatoolkit=11.0 -c pytorch
conda activate my_env

Install the prerequisite packages listed in requirements.txt:

pip3 install -r requirements.txt

or manually install the following packages:

opencv-python
imageio
matplotlib
scipy==1.1.0
scikit-image
argparse
tensorboardX
blessings
progressbar2
path
tqdm
pypng
open3d==0.8.0.0

Please install torch-scatter and torch-sparse following this link.

pip3 install torch-scatter torch-sparse -f https://pytorch-geometric.com/whl/torch-1.7.0+cu110.html

Datasets

We provide our KITTI-VIS and Cityscapes-VIS datasets (download link), which are composed of pre-processed images, auto-annotated instance segmentation, and optical flow.

  • Images are pre-processed with SC-SfMLearner.

  • Instance segmentation is pre-processed with PANet.

  • Optical flow is pre-processed with PWC-Net.

We associate these to perform video instance segmentation, as implemented in datasets/sequence_folders.py (a rough sketch of the idea follows the directory layout below).

Please organize the dataset with the following file structure:

kitti_256 (or cityscapes_256)
    ├── image
    │     └── $SCENE_DIR
    ├── segmentation
    │     └── $SCENE_DIR
    ├── flow_f
    │     └── $SCENE_DIR
    ├── flow_b
    │     └── $SCENE_DIR
    ├── train.txt
    └── val.txt

The training and validation scene lists in train.txt and val.txt can be randomly generated, as in the sketch below.
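
For instance, a minimal split script could look like the following (it simply assumes that every subdirectory of image/ is a scene and writes a random 90/10 split; the ratio and seed are arbitrary):

import random
from pathlib import Path

def make_split(root, val_ratio=0.1, seed=0):
    """Minimal sketch: write train.txt / val.txt listing the scene directories
    found under <root>/image, split at random."""
    root = Path(root)
    scenes = sorted(d.name for d in (root / "image").iterdir() if d.is_dir())
    random.Random(seed).shuffle(scenes)
    n_val = max(1, int(len(scenes) * val_ratio))
    (root / "val.txt").write_text("\n".join(scenes[:n_val]) + "\n")
    (root / "train.txt").write_text("\n".join(scenes[n_val:]) + "\n")

make_split("kitti_256")  # or "cityscapes_256"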

Training

You can train the models on KITTI-VIS by running:

sh scripts/train_resnet_256_kt.sh

You can train the models on Cityscapes-VIS by running:

sh scripts/train_resnet_256_cs.sh

Please indicate the location of the dataset with $TRAIN_SET.

The hyperparameters (batch size, learning rate, loss weights, etc.) are defined in each script file and in the default arguments of train.py. Please also check our main paper.

During training, checkpoints will be saved in checkpoints/.

You can also start a tensorboard session by running:

tensorboard --logdir=checkpoints/ --port 8080 --bind_all

and visualize the training progress by opening http://localhost:8080 in your browser.

For convenience, we provide two breakpoints (using pdb), commented as BREAKPOINT in train.py. Each breakpoint marks a key step in projecting the objects.

BREAKPOINT-1: After the 1st projection with camera motion. Visualize ego-warped images.
BREAKPOINT-2: After the 2nd projection with each object's motion. Visualize fully-warped images and motion fields.

You can visualize the intermediate outputs with the commented code, which makes debugging much easier.
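
If you add your own breakpoints, a small helper such as the one below can be handy at the pdb prompt (the tensor name ego_warped and the [B, 3, H, W] layout are placeholder assumptions, not the actual variable names in train.py):

import matplotlib.pyplot as plt

def show_warped(warped, title="warped image"):
    """Hypothetical helper: display the first image of a [B, 3, H, W] tensor,
    e.g. `show_warped(ego_warped)` at the BREAKPOINT-1 pdb prompt."""
    img = warped[0].detach().cpu().permute(1, 2, 0).numpy()
    plt.imshow(img.clip(0, 1))
    plt.title(title)
    plt.show()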

Models

We provide KITTI-VIS and Cityscapes-VIS pretrained models (download link).

The architectures are based on a ResNet18 encoder; see models/ for details.

Models trained under three different conditions are released:

KITTI: Trained on KITTI-VIS using an ImageNet (ResNet18) pretrained model.
CS: Trained on Cityscapes-VIS using an ImageNet (ResNet18) pretrained model. This model is intended only for pretraining and the demo.
CS+KITTI: Pretrained on Cityscapes-VIS and finetuned on KITTI-VIS.
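
A downloaded checkpoint can be inspected with standard PyTorch calls before loading it into the networks in models/; here is a minimal sketch (the file name and the 'state_dict' key are assumptions about the checkpoint format):

import torch

# Hypothetical file name; whether the weights sit under a 'state_dict' key is an assumption.
ckpt = torch.load("dispnet_checkpoint.pth.tar", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)
print(f"{len(state_dict)} entries")
print(sorted(state_dict.keys())[:5])  # peek at the first few parameter names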

Evaluation

We evaluate our depth estimation following the KITTI Eigen split. For the evaluation, you need to download the KITTI raw dataset from the official website. The tested scenes are listed in kitti_eval/test_files_eigen.txt.

You can evaluate the models by running:

sh scripts/run_eigen_test.sh

Please indicate the location of the raw dataset with $DATA_ROOT, and the models with $DISP_NET.

Our results are as follows:

Models                                  Abs Rel   Sq Rel   RMSE    RMSE log   Acc 1   Acc 2   Acc 3
ResNet18, 832x256, ImageNet → KITTI     0.112     0.777    4.772   0.191      0.872   0.959   0.982
ResNet18, 832x256, Cityscapes → KITTI   0.109     0.740    4.547   0.184      0.883   0.962   0.983

For convenience, we also provide precomputed depth maps in this link.
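
If you want to score the precomputed depth maps yourself, the columns above are the standard Eigen-split measures; below is a self-contained sketch of their usual definitions (not code taken from kitti_eval/):

import numpy as np

def depth_metrics(gt, pred):
    """Standard monocular depth metrics: Abs Rel, Sq Rel, RMSE, RMSE log,
    and the threshold accuracies (delta < 1.25, 1.25**2, 1.25**3).
    `gt` and `pred` are 1-D arrays of valid, positive depths in meters."""
    thresh = np.maximum(gt / pred, pred / gt)
    acc1 = (thresh < 1.25).mean()
    acc2 = (thresh < 1.25 ** 2).mean()
    acc3 = (thresh < 1.25 ** 3).mean()

    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, sq_rel, rmse, rmse_log, acc1, acc2, acc3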

Demo

We demonstrate Unified Visual Odometry, which shows the results of depth, ego-motion, and object motion holistically.

You can visualize them by running:

sh scripts/run_demo.sh

Please indicate the location of the image samples with $SCENE. We recommend visualizing Cityscapes scenes, since they contain more dynamic objects than KITTI.

More results are demonstrated in this link.

References

License

The source code is released under the MIT license.
