
SeokjuLee / Insta Dm

License: MIT
Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency (AAAI 2021)

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives to or similar to Insta Dm

Transferlearning
Transfer learning / domain adaptation / domain generalization / multi-task learning, etc. Papers, code, datasets, applications, and tutorials.
Stars: ✭ 8,481 (+12558.21%)
Mutual labels:  unsupervised-learning
Tadw
An implementation of "Network Representation Learning with Rich Text Information" (IJCAI '15).
Stars: ✭ 43 (-35.82%)
Mutual labels:  unsupervised-learning
Dgi
TensorFlow implementation of Deep Graph Infomax
Stars: ✭ 58 (-13.43%)
Mutual labels:  unsupervised-learning
Uc Davis Cs Exams Analysis
📈 Regression and Classification with UC Davis student quiz data and exam data
Stars: ✭ 33 (-50.75%)
Mutual labels:  unsupervised-learning
Susi
SuSi: Python package for unsupervised, supervised and semi-supervised self-organizing maps (SOM)
Stars: ✭ 42 (-37.31%)
Mutual labels:  unsupervised-learning
Voxelmorph
Unsupervised Learning for Image Registration
Stars: ✭ 1,057 (+1477.61%)
Mutual labels:  unsupervised-learning
Summary loop
Codebase for the Summary Loop paper at ACL2020
Stars: ✭ 26 (-61.19%)
Mutual labels:  unsupervised-learning
Neuralhmm
Code for the "Unsupervised Neural Hidden Markov Models" paper
Stars: ✭ 64 (-4.48%)
Mutual labels:  unsupervised-learning
Student Teacher Anomaly Detection
Student–Teacher Anomaly Detection with Discriminative Latent Embeddings
Stars: ✭ 43 (-35.82%)
Mutual labels:  unsupervised-learning
Hypergan
Composable GAN framework with an API and user interface
Stars: ✭ 1,104 (+1547.76%)
Mutual labels:  unsupervised-learning
Iva
IVA: Independent Vector Analysis implementation
Stars: ✭ 35 (-47.76%)
Mutual labels:  unsupervised-learning
Unsuprevised seg via cnn
Stars: ✭ 38 (-43.28%)
Mutual labels:  unsupervised-learning
Lir For Unsupervised Ir
This is an implementation for the CVPR2020 paper "Learning Invariant Representation for Unsupervised Image Restoration"
Stars: ✭ 53 (-20.9%)
Mutual labels:  unsupervised-learning
Discogan Pytorch
PyTorch implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"
Stars: ✭ 961 (+1334.33%)
Mutual labels:  unsupervised-learning
Weakly Supervised 3d Object Detection
Weakly Supervised 3D Object Detection from Point Clouds (VS3D), ACM MM 2020
Stars: ✭ 61 (-8.96%)
Mutual labels:  unsupervised-learning
Domain Transfer Network
TensorFlow Implementation of Unsupervised Cross-Domain Image Generation
Stars: ✭ 850 (+1168.66%)
Mutual labels:  unsupervised-learning
Php Ml
PHP-ML - Machine Learning library for PHP
Stars: ✭ 7,900 (+11691.04%)
Mutual labels:  unsupervised-learning
Sine
A PyTorch Implementation of "SINE: Scalable Incomplete Network Embedding" (ICDM 2018).
Stars: ✭ 67 (+0%)
Mutual labels:  unsupervised-learning
Dmgi
Unsupervised Attributed Multiplex Network Embedding (AAAI 2020)
Stars: ✭ 62 (-7.46%)
Mutual labels:  unsupervised-learning
Rakun
Rank-based Unsupervised Keyword Extraction via Metavertex Aggregation
Stars: ✭ 54 (-19.4%)
Mutual labels:  unsupervised-learning

Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency

[ Install | Datasets | Training | Models | Evaluation | Demo | References | License ]

This is the official PyTorch implementation of the system proposed in the paper:

Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency

Seokju Lee, Sunghoon Im, Stephen Lin, and In So Kweon

AAAI-21 [PDF] [Project]

⟹ Unified Visual Odometry: a holistic visualization of depth and motion estimation from self-supervised monocular training.

If you find our work useful in your research, please consider citing our paper:

@inproceedings{lee2021learning,
  title={Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency},
  author={Lee, Seokju and Im, Sunghoon and Lin, Stephen and Kweon, In So},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence (AAAI)},
  year={2021}
}

Install

Our code is tested with CUDA 10.2/11.0, Python 3.7.x (conda environment), and PyTorch 1.4.0/1.7.0.

At least two GPUs (12 GB each) are required to train the models with batch_size=4 and maximum_number_of_instances_per_frame=3.

Create a conda environment with the PyTorch library:

conda create -n my_env python=3.7.4 pytorch=1.7.0 torchvision torchaudio cudatoolkit=11.0 -c pytorch
conda activate my_env

Install the prerequisite packages listed in requirements.txt:

pip3 install -r requirements.txt

or manually install the following packages:

opencv-python
imageio
matplotlib
scipy==1.1.0
scikit-image
argparse
tensorboardX
blessings
progressbar2
path
tqdm
pypng
open3d==0.8.0.0

Please install torch-scatter and torch-sparse following this link.

pip3 install torch-scatter torch-sparse -f https://pytorch-geometric.com/whl/torch-1.7.0+cu110.html

Datasets

We provide our KITTI-VIS and Cityscapes-VIS datasets (download link), which are composed of pre-processed images, auto-annotated instance segmentation, and optical flow.

  • Images are pre-processed with SC-SfMLearner.

  • Instance segmentation is pre-processed with PANet.

  • Optical flow is pre-processed with PWC-Net.

We associate these to perform video instance segmentation, as implemented in datasets/sequence_folders.py (a rough sketch of the idea follows the directory layout below).

Please organize the dataset with the following file structure:

kitti_256 (or cityscapes_256)
    ├── image
    │     └── $SCENE_DIR
    ├── segmentation
    │     └── $SCENE_DIR
    ├── flow_f
    │     └── $SCENE_DIR
    ├── flow_b
    │     └── $SCENE_DIR
    ├── train.txt
    └── val.txt

The training and validation scene lists in train.txt and val.txt can be randomly generated, as in the sketch below.
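
For instance, a minimal split script could look like the following (it simply assumes that every subdirectory of image/ is a scene and writes a random 90/10 split; the ratio and seed are arbitrary):

import random
from pathlib import Path

def make_split(root, val_ratio=0.1, seed=0):
    """Minimal sketch: write train.txt / val.txt listing the scene directories
    found under <root>/image, split at random."""
    root = Path(root)
    scenes = sorted(d.name for d in (root / "image").iterdir() if d.is_dir())
    random.Random(seed).shuffle(scenes)
    n_val = max(1, int(len(scenes) * val_ratio))
    (root / "val.txt").write_text("\n".join(scenes[:n_val]) + "\n")
    (root / "train.txt").write_text("\n".join(scenes[n_val:]) + "\n")

make_split("kitti_256")  # or "cityscapes_256"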

Training

You can train the models on KITTI-VIS by running:

sh scripts/train_resnet_256_kt.sh

You can train the models on Cityscapes-VIS by running:

sh scripts/train_resnet_256_cs.sh

Please indicate the location of the dataset with $TRAIN_SET.

The hyperparameters (batch size, learning rate, loss weights, etc.) are defined in each script file and in the default arguments of train.py. Please also check our main paper.

During training, checkpoints will be saved in checkpoints/.

You can also start a tensorboard session by running:

tensorboard --logdir=checkpoints/ --port 8080 --bind_all

and visualize the training progress by opening http://localhost:8080 in your browser.

For convenience, we provide two breakpoints (using pdb), commented as BREAKPOINT in train.py. Each breakpoint marks a key step in projecting the objects.

BREAKPOINT-1: After the 1st projection with camera motion. Visualize ego-warped images.
BREAKPOINT-2: After the 2nd projection with each object's motion. Visualize fully-warped images and motion fields.

You can visualize the intermediate outputs with the commented code, which makes debugging much easier.
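
If you add your own breakpoints, a small helper such as the one below can be handy at the pdb prompt (the tensor name ego_warped and the [B, 3, H, W] layout are placeholder assumptions, not the actual variable names in train.py):

import matplotlib.pyplot as plt

def show_warped(warped, title="warped image"):
    """Hypothetical helper: display the first image of a [B, 3, H, W] tensor,
    e.g. `show_warped(ego_warped)` at the BREAKPOINT-1 pdb prompt."""
    img = warped[0].detach().cpu().permute(1, 2, 0).numpy()
    plt.imshow(img.clip(0, 1))
    plt.title(title)
    plt.show()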

Models

We provide KITTI-VIS and Cityscapes-VIS pretrained models (download link).

The architectures are based on a ResNet18 encoder; see models/ for details.

Models trained under three different conditions are released:

KITTI: Trained on KITTI-VIS using an ImageNet (ResNet18) pretrained model.
CS: Trained on Cityscapes-VIS using an ImageNet (ResNet18) pretrained model. This model is intended only for pretraining and the demo.
CS+KITTI: Pretrained on Cityscapes-VIS and finetuned on KITTI-VIS.
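
A downloaded checkpoint can be inspected with standard PyTorch calls before loading it into the networks in models/; here is a minimal sketch (the file name and the 'state_dict' key are assumptions about the checkpoint format):

import torch

# Hypothetical file name; whether the weights sit under a 'state_dict' key is an assumption.
ckpt = torch.load("dispnet_checkpoint.pth.tar", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)
print(f"{len(state_dict)} entries")
print(sorted(state_dict.keys())[:5])  # peek at the first few parameter names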

Evaluation

We evaluate our depth estimation following the KITTI Eigen split. For the evaluation, you need to download the KITTI raw dataset from the official website. The tested scenes are listed in kitti_eval/test_files_eigen.txt.

You can evaluate the models by running:

sh scripts/run_eigen_test.sh

Please indicate the location of the raw dataset with $DATA_ROOT, and the models with $DISP_NET.

Our results are as follows:

Models                                  Abs Rel   Sq Rel   RMSE    RMSE log   Acc 1   Acc 2   Acc 3
ResNet18, 832x256, ImageNet → KITTI     0.112     0.777    4.772   0.191      0.872   0.959   0.982
ResNet18, 832x256, Cityscapes → KITTI   0.109     0.740    4.547   0.184      0.883   0.962   0.983

For convenience, we also provide precomputed depth maps in this link.
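
If you want to score the precomputed depth maps yourself, the columns above are the standard Eigen-split measures; below is a self-contained sketch of their usual definitions (not code taken from kitti_eval/):

import numpy as np

def depth_metrics(gt, pred):
    """Standard monocular depth metrics: Abs Rel, Sq Rel, RMSE, RMSE log,
    and the threshold accuracies (delta < 1.25, 1.25**2, 1.25**3).
    `gt` and `pred` are 1-D arrays of valid, positive depths in meters."""
    thresh = np.maximum(gt / pred, pred / gt)
    acc1 = (thresh < 1.25).mean()
    acc2 = (thresh < 1.25 ** 2).mean()
    acc3 = (thresh < 1.25 ** 3).mean()

    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, sq_rel, rmse, rmse_log, acc1, acc2, acc3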

Demo

We demonstrate Unified Visual Odometry, which shows the results of depth, ego-motion, and object motion holistically.

You can visualize them by running:

sh scripts/run_demo.sh

Please indicate the location of the image samples with $SCENE. We recommend visualizing Cityscapes scenes, since they contain more dynamic objects than KITTI.

More results are demonstrated in this link.

References

License

The source code is released under the MIT license.
