Learning Topology from Synthetic Data for Unsupervised Depth Completion

Tensorflow implementation of Learning Topology from Synthetic Data for Unsupervised Depth Completion

Published in RA-L January 2021 and ICRA 2021

[publication] [arxiv] [talk]

The models have been tested on Ubuntu 16.04 and 20.04 using Python 3.5 and 3.6 with Tensorflow 1.14 and 1.15.

Authors: Alex Wong, Safa Cicek

If this work is useful to you, please cite our paper:

@article{wong2021learning,
    title={Learning topology from synthetic data for unsupervised depth completion},
    author={Wong, Alex and Cicek, Safa and Soatto, Stefano},
    journal={IEEE Robotics and Automation Letters},
    volume={6},
    number={2},
    pages={1495--1502},
    year={2021},
    publisher={IEEE}
}

Table of Contents

  1. About sparse-to-dense depth completion
  2. About ScaffNet and FusionNet
  3. Setting up
  4. Downloading pretrained models
  5. Running ScaffNet and FusionNet
  6. Training ScaffNet and FusionNet
  7. Related projects
  8. License and disclaimer

About sparse-to-dense depth completion

In the sparse-to-dense depth completion problem, we seek to infer the dense depth map of a 3-D scene from an RGB image and its associated sparse depth measurements, given in the form of a sparse depth map obtained either from computational methods such as SfM (Structure-from-Motion) or from active sensors such as lidar or structured-light sensors.

Figure: RGB image from the VOID dataset; our densified depth map, colored and backprojected to 3D.
Figure: RGB image from the KITTI dataset; our densified depth map, colored and backprojected to 3D.
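
To make the inputs concrete, the sketch below (an illustration only, not repository code; the shapes and point count are hypothetical) shows how a sparse depth map is typically represented: a dense array that is zero at unmeasured pixels, alongside the RGB image and a validity map.

import numpy as np

# Hypothetical input shapes; the repository's data loaders define the actual format.
height, width = 480, 640
image = np.zeros((height, width, 3), dtype=np.uint8)        # RGB image
sparse_depth = np.zeros((height, width), dtype=np.float32)  # depth in meters, 0 = unmeasured

# Suppose ~1500 pixels carry depth measurements (e.g. from SfM or lidar).
rows = np.random.randint(0, height, 1500)
cols = np.random.randint(0, width, 1500)
sparse_depth[rows, cols] = np.random.uniform(0.5, 10.0, 1500)

validity_map = (sparse_depth > 0).astype(np.float32)  # 1 where depth was measured
# The task: predict a dense (height, width) depth map from image and sparse_depth.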

To follow the literature and benchmarks for this task, you may visit: Awesome State of Depth Completion

About ScaffNet and FusionNet

We propose a method that leverages the abundance of synthetic data (where ground truth comes for free) and unannotated real data to learn cross-modal fusion for depth completion.

The challenge of Sim2Real: There is a covariate shift, mostly photometric, between the synthetic and real domains, which makes it difficult to transfer models trained on synthetic source data to real target data. One might observe, however, that unlike photometry, the geometry of a given scene persists across domains. So we can sidestep the photometric domain gap by learning the association not from photometry to geometry (i.e. from images to shapes), but from sparse geometry (point clouds) to topology, using the abundance of synthetic data. In doing so we bypass the synthetic-to-real domain gap without having to face concerns about covariate shift and domain adaptation.

ScaffNet: The challenge of sparse-to-dense depth completion is precisely the sparsity. To learn a representation of the sparse point cloud that can capture the complex geometry of objects, we introduce ScaffNet, an encoder-decoder network augmented with our version of a Spatial Pyramid Pooling (SPP) module. Our SPP module performs max pooling with kernels of various sizes to densify the input and capture different receptive fields, and it learns to balance the trade-off between the density and the detail of the sparse point cloud.
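
As a rough illustration of this idea (not the repository's actual SPP implementation; the pooling sizes, feature width, and 1x1 convolution below are placeholders), multi-scale max pooling followed by a learned weighting might look like:

import tensorflow as tf

def spatial_pyramid_pool(sparse_depth, pool_sizes=(3, 5, 7, 9)):
    # sparse_depth: [batch, height, width, 1] with zeros at unmeasured pixels.
    # Max pooling with stride 1 and 'SAME' padding densifies the sparse input;
    # larger kernels fill more holes but blur fine detail.
    pooled = [sparse_depth]
    for k in pool_sizes:
        pooled.append(tf.nn.max_pool(
            sparse_depth, ksize=[1, k, k, 1], strides=[1, 1, 1, 1], padding='SAME'))
    # A learned 1x1 convolution balances the trade-off between density and detail.
    features = tf.concat(pooled, axis=-1)
    return tf.layers.conv2d(features, filters=16, kernel_size=1, activation=tf.nn.relu)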

FusionNet: Because the topology estimated by ScaffNet is informed only by sparse points, we can expect its performance to degrade when there are very few points, or none at all. This is where the image comes back into the picture. We propose a second network that refines the initial estimate by incorporating information from the image to amend any mistakes. Here we show our full inference pipeline:

First, ScaffNet estimates an initial scene topology from the sparse point cloud. Then FusionNet performs cross-modal fusion and learns the residual beta from the image to refine the coarse topology estimate. By learning the residual around the initial estimate, we relieve FusionNet of the need to learn depth from scratch, which allows us to achieve better results with fewer parameters and faster inference.
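
Schematically, the residual refinement amounts to the following hypothetical sketch (`fusionnet` stands in for the refinement network; the exact inputs and parameterization of the residual are described in the paper):

import tensorflow as tf

def refine_depth(image, initial_depth, fusionnet):
    # Cross-modal fusion: condition on the RGB image together with the coarse
    # topology estimated by ScaffNet from the sparse point cloud.
    fused_input = tf.concat([image, initial_depth], axis=-1)
    # Predict only a residual beta rather than relearning depth from scratch.
    beta = fusionnet(fused_input)
    # Refined depth is the coarse estimate plus the learned correction.
    return initial_depth + beta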

Setting up your virtual environment

We will create a virtual environment with the necessary dependencies

virtualenv -p /usr/bin/python3 scaffnet-fusionnet-py3env
source scaffnet-fusionnet-py3env/bin/activate
pip install opencv-python scipy scikit-learn scikit-image Pillow matplotlib gdown
pip install tensorflow-gpu==1.15

Setting up your datasets

For datasets, we will use Virtual KITTI 1 and KITTI for outdoor scenes, and SceneNet and VOID for indoor scenes.

mkdir data
ln -s /path/to/virtual_kitti data/
ln -s /path/to/kitti_raw_data data/
ln -s /path/to/kitti_depth_completion data/
ln -s /path/to/scenenet data/
ln -s /path/to/void_release data/

In case you do not already have the KITTI and VOID datasets downloaded, we provide download scripts for them:

bash bash/setup_dataset_kitti.sh
bash bash/setup_dataset_void.sh

The bash/setup_dataset_void.sh script downloads the VOID dataset using gdown. However, gdown intermittently fails. As a workaround, you may download them via:

https://drive.google.com/open?id=1GGov8MaBKCEcJEXxY8qrh8Ldt2mErtWs
https://drive.google.com/open?id=1c3PxnOE0N8tgkvTgPbnUZXS6ekv7pd80
https://drive.google.com/open?id=14PdJggr2PVJ6uArm9IWlhSHO2y3Q658v

which will give you three files: void_150.zip, void_500.zip, and void_1500.zip.

Assuming you are in the root of the repository, run the following to construct the same dataset structure as the setup script above:

mkdir void_release
unzip -o void_150.zip -d void_release/
unzip -o void_500.zip -d void_release/
unzip -o void_1500.zip -d void_release/
bash bash/setup_dataset_void.sh unpack-only

For more detailed instructions on downloading and using VOID and obtaining the raw rosbags, you may visit the VOID dataset webpage.

Downloading our pretrained models

To use our ScaffNet models trained on Virtual KITTI and SceneNet, and our FusionNet models trained on KITTI and VOID, you can download them from Google Drive:

gdown https://drive.google.com/uc?id=1K5aiI3aIwsMC85LcwgeUAeEQkxK-vEdH
unzip pretrained_models.zip

Note: gdown fails intermittently and complains about permissions. If that happens, you may also download the models via:

https://drive.google.com/file/d/1K5aiI3aIwsMC85LcwgeUAeEQkxK-vEdH/view?usp=sharing

We note that if you would like to train FusionNet directly, you may use our pretrained ScaffNet model.

In addition to the models trained with the code available at the time our paper was submitted, we have, for reproducibility, retrained both ScaffNet and FusionNet after code clean-up. You will find both the paper and the retrained models in the pretrained_models directory. For example:

pretrained_models/fusionnet/kitti/paper/fusionnet.ckpt-kitti
pretrained_models/fusionnet/kitti/retrained/fusionnet.ckpt-kitti

For KITTI:

Model MAE (mm) RMSE (mm) iMAE (1/km) iRMSE (1/km)
ScaffNet (paper) 318.42 1425.54 1.40 5.01
ScaffNet (retrained) 317.17 1425.95 1.40 4.95
FusionNet (paper) 286.32 1182.78 1.18 3.55
FusionNet (retrained) 282.97 1184.36 1.17 3.48

For VOID:

Model MAE (mm) RMSE (mm) iMAE (1/km) iRMSE (1/km)
ScaffNet (paper) 72.88 162.75 42.56 90.15
ScaffNet (retrained) 65.90 153.96 35.62 77.73
FusionNet (paper) 60.68 122.01 35.24 67.34
FusionNet (retrained) 56.24 117.94 31.58 63.78
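
For reference, the metrics above follow the standard depth completion conventions: MAE and RMSE are reported in millimeters and iMAE and iRMSE in 1/kilometers, computed only over pixels with valid ground truth. A minimal sketch of how they are typically computed (assuming depth maps in meters; this is not the repository's evaluation code):

import numpy as np

def evaluate(pred, gt):
    # Evaluate only where ground-truth depth is available.
    valid = gt > 0
    pred, gt = pred[valid], gt[valid]
    mae = np.mean(np.abs(pred - gt)) * 1000.0                        # meters -> mm
    rmse = np.sqrt(np.mean((pred - gt) ** 2)) * 1000.0               # meters -> mm
    imae = np.mean(np.abs(1.0 / pred - 1.0 / gt)) * 1000.0           # 1/m -> 1/km
    irmse = np.sqrt(np.mean((1.0 / pred - 1.0 / gt) ** 2)) * 1000.0  # 1/m -> 1/km
    return mae, rmse, imae, irmse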

Running ScaffNet and FusionNet

To run our pretrained ScaffNet on the KITTI dataset, you may use

bash bash/run_scaffnet_kitti.sh

To run our pretrained ScaffNet on the VOID dataset, you may use

bash bash/run_scaffnet_void1500.sh

To run our pretrained FusionNet on the KITTI dataset, you may use

bash bash/run_fusionnet_kitti.sh

To run our pretrained FusionNet on the VOID dataset, you may use

bash bash/run_fusionnet_void1500.sh

If you have data that has not been preprocessed into the form output by our setup scripts, you can also run our standalone scripts:

bash bash/run_fusionnet_standalone_kitti.sh
bash bash/run_fusionnet_standalone_void1500.sh

You may replace the restore_path and output_path arguments to evaluate your own checkpoints.

Additionally, we have scripts to do batch evaluation over a directory of checkpoints:

bash bash/run_batch_scaffnet_kitti.sh path/to/directory <first checkpoint> <increment between checkpoints> <last checkpoint>
bash bash/run_batch_scaffnet_void1500.sh path/to/directory <first checkpoint> <increment between checkpoints> <last checkpoint>
bash bash/run_batch_fusionnet_kitti.sh path/to/directory <first checkpoint> <increment between checkpoints> <last checkpoint>
bash bash/run_batch_fusionnet_void1500.sh path/to/directory <first checkpoint> <increment between checkpoints> <last checkpoint>

Training ScaffNet and FusionNet

To train ScaffNet on the Virtual KITTI dataset, you may run

sh bash/train_scaffnet_vkitti.sh

To train ScaffNet on the SceneNet dataset, you may run

sh bash/train_scaffnet_scenenet.sh

To monitor your training progress, you may use TensorBoard:

tensorboard --logdir trained_scaffnet/vkitti/<model_name>
tensorboard --logdir trained_scaffnet/scenenet/<model_name>

To train FusionNet, we first need to generate ScaffNet predictions using:

bash bash/setup_dataset_vkitti_to_kitti.sh
bash bash/setup_dataset_scenenet_to_void.sh

By default, the bash scripts will use our pretrained models. If you have trained your own models and would like to use them, you may modify the above scripts to point to your model checkpoints.

To train FusionNet on the KITTI dataset, you may run

sh bash/train_fusionnet_kitti.sh

To train FusionNet on the VOID dataset, you may run

sh bash/train_fusionnet_void1500.sh

To monitor your training progress, you may use TensorBoard:

tensorboard --logdir trained_fusionnet/kitti/<model_name>
tensorboard --logdir trained_fusionnet/void/<model_name>

Related projects

You may also find the following projects useful:

  • VOICED: Unsupervised Depth Completion from Visual Inertial Odometry. An unsupervised sparse-to-dense depth completion method, developed by the authors. The paper introduces Scaffolding for depth completion and a light-weight network to refine it. This work is published in the Robotics and Automation Letters (RA-L) 2020 and the International Conference on Robotics and Automation (ICRA) 2020.
  • VOID: the dataset from Unsupervised Depth Completion from Visual Inertial Odometry. A dataset, developed by the authors, containing indoor and outdoor scenes with non-trivial 6 degrees of freedom motion. The dataset was published along with that work in the Robotics and Automation Letters (RA-L) 2020 and the International Conference on Robotics and Automation (ICRA) 2020.
  • XIVO: The Visual-Inertial Odometry system developed at UCLA Vision Lab. This work is built on top of XIVO. The VOID dataset used by this work also leverages XIVO to obtain sparse points and camera poses.
  • GeoSup: Geo-Supervised Visual Depth Prediction. A single image depth prediction method developed by the authors, published in the Robotics and Automation Letters (RA-L) 2019 and the International Conference on Robotics and Automation (ICRA) 2019. This work was awarded Best Paper in Robot Vision at ICRA 2019.
  • AdaReg: Bilateral Cyclic Constraint and Adaptive Regularization for Unsupervised Monocular Depth Prediction. A single image depth prediction method that introduces adaptive regularization. This work was published in the proceedings of Conference on Computer Vision and Pattern Recognition (CVPR) 2019.

We also have work on adversarial attacks against depth estimation methods:

  • Stereopagnosia: Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations. Adversarial perturbations for stereo depth estimation, published in the Proceedings of AAAI Conference on Artificial Intelligence (AAAI) 2021.
  • Targeted Attacks for Monodepth: Targeted Adversarial Perturbations for Monocular Depth Prediction. Targeted adversarial perturbation attacks for monocular depth estimation, published in the proceedings of Neural Information Processing Systems (NeurIPS) 2020.

License and disclaimer

This software is property of the UC Regents, and is provided free of charge for research purposes only. It comes with no warranties, expressed or implied, according to these terms and conditions. For commercial use, please contact UCLA TDG.
