
alexklwong / Unsupervised Depth Completion Visual Inertial Odometry

License: other
Tensorflow implementation of Unsupervised Depth Completion from Visual Inertial Odometry (in RA-L January 2020 & ICRA 2020)

Programming Languages

python

Projects that are alternatives of or similar to Unsupervised Depth Completion Visual Inertial Odometry

learning-topology-synthetic-data
Tensorflow implementation of Learning Topology from Synthetic Data for Unsupervised Depth Completion (RAL 2021 & ICRA 2021)
Stars: ✭ 22 (-79.82%)
Mutual labels:  depth, unsupervised-learning, 3d-reconstruction
G2LTex
Code for CVPR 2018 paper --- Texture Mapping for 3D Reconstruction with RGB-D Sensor
Stars: ✭ 104 (-4.59%)
Mutual labels:  depth, 3d-reconstruction
adareg-monodispnet
Repository for Bilateral Cyclic Constraint and Adaptive Regularization for Unsupervised Monocular Depth Prediction (CVPR2019)
Stars: ✭ 22 (-79.82%)
Mutual labels:  unsupervised-learning, 3d-reconstruction
Pysad
Streaming Anomaly Detection Framework in Python (Outlier Detection for Streaming Data)
Stars: ✭ 87 (-20.18%)
Mutual labels:  unsupervised-learning
Igr
Implicit Geometric Regularization for Learning Shapes
Stars: ✭ 90 (-17.43%)
Mutual labels:  3d-reconstruction
Vizuka
Explore high-dimensional datasets and how your algo handles specific regions.
Stars: ✭ 100 (-8.26%)
Mutual labels:  unsupervised-learning
Pcl Learning
🔥PCL (Point Cloud Library) learning notes
Stars: ✭ 106 (-2.75%)
Mutual labels:  3d-reconstruction
Grounder
Implementation of Grounding of Textual Phrases in Images by Reconstruction in Tensorflow
Stars: ✭ 83 (-23.85%)
Mutual labels:  unsupervised-learning
Planematch
[ECCV'18 Oral] PlaneMatch: Patch Coplanarity Prediction for Robust RGB-D Reconstruction
Stars: ✭ 105 (-3.67%)
Mutual labels:  3d-reconstruction
3d Recgan Extended
🔥3D-RecGAN++ in Tensorflow (TPAMI 2018)
Stars: ✭ 98 (-10.09%)
Mutual labels:  3d-reconstruction
Awesome Transfer Learning
Best transfer learning and domain adaptation resources (papers, tutorials, datasets, etc.)
Stars: ✭ 1,349 (+1137.61%)
Mutual labels:  unsupervised-learning
Self Supervised Relational Reasoning
Official PyTorch implementation of the paper "Self-Supervised Relational Reasoning for Representation Learning", NeurIPS 2020 Spotlight.
Stars: ✭ 89 (-18.35%)
Mutual labels:  unsupervised-learning
Ddflow
DDFlow: Learning Optical Flow with Unlabeled Data Distillation
Stars: ✭ 101 (-7.34%)
Mutual labels:  unsupervised-learning
Bundler sfm
Bundler Structure from Motion Toolkit
Stars: ✭ 1,296 (+1088.99%)
Mutual labels:  3d-reconstruction
Awsome deep geometry learning
A list of resources about deep learning solutions on 3D shape processing
Stars: ✭ 105 (-3.67%)
Mutual labels:  3d-reconstruction
Pointglr
Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds (CVPR 2020)
Stars: ✭ 86 (-21.1%)
Mutual labels:  unsupervised-learning
Back2future.pytorch
Unsupervised Learning of Multi-Frame Optical Flow with Occlusions
Stars: ✭ 104 (-4.59%)
Mutual labels:  unsupervised-learning
Text Summarizer
Python Framework for Extractive Text Summarization
Stars: ✭ 96 (-11.93%)
Mutual labels:  unsupervised-learning
360sd Net
Pytorch implementation of ICRA 2020 paper "360° Stereo Depth Estimation with Learnable Cost Volume"
Stars: ✭ 94 (-13.76%)
Mutual labels:  depth
Objectron
Objectron is a dataset of short, object-centric video clips. In addition, the videos also contain AR session metadata including camera poses, sparse point-clouds and planes. In each video, the camera moves around and above the object and captures it from different views. Each object is annotated with a 3D bounding box. The 3D bounding box describes the object’s position, orientation, and dimensions. The dataset contains about 15K annotated video clips and 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes
Stars: ✭ 1,352 (+1140.37%)
Mutual labels:  3d-reconstruction

Unsupervised Depth Completion from Visual Inertial Odometry

Project VOICED: Depth Completion from Inertial Odometry and Vision

Tensorflow implementation of Unsupervised Depth Completion from Visual Inertial Odometry

Published in RA-L January 2020 and ICRA 2020

[arxiv] [poster]

The model has been tested on Ubuntu 16.04 using Python 3.5 and Tensorflow 1.14

Authors: Alex Wong, Xiaohan Fei, Stephanie Tsuei

If you use this work, please cite our paper:

@article{wong2020unsupervised,
  title={Unsupervised Depth Completion From Visual Inertial Odometry},
  author={Wong, Alex and Fei, Xiaohan and Tsuei, Stephanie and Soatto, Stefano},
  journal={IEEE Robotics and Automation Letters},
  volume={5},
  number={2},
  pages={1899--1906},
  year={2020},
  publisher={IEEE}
}

Table of Contents

  1. About sparse-to-dense depth completion
  2. About VOICED
  3. Setting up
  4. Training VOICED
  5. Downloading pretrained models
  6. Evaluating VOICED
  7. Related projects
  8. License and disclaimer

About sparse-to-dense depth completion

In the sparse-to-dense depth completion problem, we seek to infer the dense depth map of a 3-D scene from an RGB image and its associated sparse depth measurements, given as a sparse depth map obtained either from computational methods such as SfM (Structure-from-Motion) or from active sensors such as lidar or structured light.
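
As a rough illustration (a hypothetical sketch, not the data loader used in this repository), a training sample pairs a dense RGB image with a mostly empty depth map whose nonzero entries are the sparse measurements:

import numpy as np

# Hypothetical shapes and density, for illustration only.
height, width = 480, 640
image = np.zeros((height, width, 3), dtype=np.uint8)        # dense RGB image
sparse_depth = np.zeros((height, width), dtype=np.float32)  # depth in meters, 0 where no measurement

# Scatter a few fake measurements to mimic a ~0.5% density (VOID-like indoor setting).
n_points = int(0.005 * height * width)
rows = np.random.randint(0, height, n_points)
cols = np.random.randint(0, width, n_points)
sparse_depth[rows, cols] = np.random.uniform(0.5, 5.0, n_points)

validity_map = (sparse_depth > 0).astype(np.float32)
print('sparse density: {:.2f}%'.format(100.0 * validity_map.mean()))

# Depth completion: predict a dense (height, width) depth map from
# (image, sparse_depth, validity_map).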

Figure: Input RGB image from the VOID dataset and the densified depth map, colored and back-projected to 3-D.
Figure: Input RGB image from the KITTI dataset and the densified depth map, colored and back-projected to 3-D.

To follow the literature and benchmarks for this task, you may visit: Awesome State of Depth Completion

About VOICED

VOICED is an unsupervised depth completion method that is built on top of XIVO. Unlike previous methods, we build a scaffolding of the scene using the sparse depth measurements (~5% density for outdoor driving scenarios such as KITTI and ~0.5% to ~0.05% for indoor scenes such as VOID) and refine the scaffolding with a light-weight network.

This paradigm allows us to achieve state of the art on the unsupervised depth completion task while reducing parameters by as much as 80% compared to prior art. As an added bonus, our approach does not require top-of-the-line GPUs (e.g. Tesla V100, Titan V) and can be deployed on much cheaper hardware.
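
To give some intuition for the scaffolding idea, the simplified sketch below densifies the sparse points with plain linear interpolation; this is only an illustration, not the interpolation used in the paper or in this code, where a light-weight network refines the resulting coarse geometry:

import numpy as np
from scipy.interpolate import griddata

def build_scaffolding(sparse_depth):
    # sparse_depth: (H, W) array with zeros where there is no measurement.
    rows, cols = np.nonzero(sparse_depth)
    values = sparse_depth[rows, cols]
    grid_r, grid_c = np.mgrid[0:sparse_depth.shape[0], 0:sparse_depth.shape[1]]
    # Linearly interpolate between measurements; fill pixels outside the convex hull
    # with the nearest measurement.
    scaffolding = griddata((rows, cols), values, (grid_r, grid_c), method='linear')
    nearest = griddata((rows, cols), values, (grid_r, grid_c), method='nearest')
    scaffolding[np.isnan(scaffolding)] = nearest[np.isnan(scaffolding)]
    return scaffolding.astype(np.float32)

# A light-weight network then only needs to predict a correction to this coarse
# scene geometry instead of regressing depth from scratch.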

Setting up your virtual environment

We will create a virtual environment with the necessary dependencies

virtualenv -p /usr/bin/python3 voiced-py3env
source voiced-py3env/bin/activate
pip install opencv-python scipy scikit-learn Pillow matplotlib gdown
pip install numpy==1.16.4 gast==0.2.2
pip install tensorflow-gpu==1.14
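
From inside the activated environment, you may run a quick sanity check (the expected output depends on your local CUDA and cuDNN setup):

import tensorflow as tf

print(tf.__version__)              # should print 1.14.0 with the pins above
print(tf.test.is_gpu_available())  # True if the GPU build can see a CUDA device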

Setting up your datasets

For datasets, we will use KITTI for outdoor scenes and VOID for indoor scenes

mkdir data
bash bash/setup_dataset_kitti.sh
bash bash/setup_dataset_void.sh

The bash script downloads the VOID dataset using gdown. However, gdown intermittently fails; as a workaround, you may download the files directly via:

https://drive.google.com/open?id=1GGov8MaBKCEcJEXxY8qrh8Ldt2mErtWs
https://drive.google.com/open?id=1c3PxnOE0N8tgkvTgPbnUZXS6ekv7pd80
https://drive.google.com/open?id=14PdJggr2PVJ6uArm9IWlhSHO2y3Q658v

which will give you three files: void_150.zip, void_500.zip, and void_1500.zip.

Assuming you are in the root of the repository, run the following to construct the same dataset structure as the setup script above:

mkdir void_release
unzip -o void_150.zip -d void_release/
unzip -o void_500.zip -d void_release/
unzip -o void_1500.zip -d void_release/
bash bash/setup_dataset_void.sh unpack-only

For more detailed instructions on downloading and using VOID and obtaining the raw rosbags, you may visit the VOID dataset webpage.
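
If you would like to spot-check the unpacked data, the sketch below assumes the depth maps are stored as 16-bit PNGs scaled by 256 (the common KITTI-style convention); the path is a placeholder, so point it at a sparse depth map in your local copy:

import numpy as np
from PIL import Image

# Placeholder path -- point this at any sparse depth PNG under void_release/.
path = 'void_release/void_1500/data/<sequence>/sparse_depth/<frame>.png'

# Assumption: depth is stored as a 16-bit PNG scaled by 256.
depth = np.asarray(Image.open(path), dtype=np.float32) / 256.0
density = np.count_nonzero(depth) / float(depth.size)
print('depth range: {:.2f} m to {:.2f} m, density: {:.2f}%'.format(
    depth[depth > 0].min(), depth.max(), 100.0 * density))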

Training VOICED

To train VOICED on the KITTI dataset, you may run

sh bash/train_voiced_kitti.sh

To train VOICED on the VOID datasets, you may run

sh bash/train_voiced_void.sh

To monitor your training progress, you may use Tensorboard

tensorboard --logdir trained_models/<model_name>

Downloading our pretrained models

To use our KITTI and VOID models, you can download them:

gdown https://drive.google.com/uc?id=18jr9l1YvxDUzqAa_S-LYTdfi6zN1OEE9
unzip pretrained_models.zip

Note: gdown fails intermittently and complains about permission. If that happens, you may also download the models via:

https://drive.google.com/open?id=18jr9l1YvxDUzqAa_S-LYTdfi6zN1OEE9

We note that the VOID dataset has been improved (its size increased from ~40K to ~47K frames) since this work was published in RA-L and ICRA 2020. We thank the individuals who reached out and gave their feedback. To reflect these changes, we have retrained our model on VOID and achieve slightly better performance than the numbers reported in the paper.

Model              MAE (mm)   RMSE (mm)   iMAE (1/km)   iRMSE (1/km)
VGG11 from paper   85.05      169.79      48.92         104.02
VGG11 retrained    82.27      141.99      49.23         99.67
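
For reference, these are the standard depth completion metrics, computed over valid ground-truth pixels; the numpy sketch below shows the usual definitions (the evaluation script in this repository may differ in details such as averaging order):

import numpy as np

def depth_completion_metrics(prediction, ground_truth):
    # prediction, ground_truth: depth maps in meters; evaluate only where ground truth is valid.
    valid = ground_truth > 0
    p, g = prediction[valid], ground_truth[valid]
    mae   = 1000.0 * np.mean(np.abs(p - g))                       # mm
    rmse  = 1000.0 * np.sqrt(np.mean((p - g) ** 2))               # mm
    imae  = 1000.0 * np.mean(np.abs(1.0 / p - 1.0 / g))           # 1/km
    irmse = 1000.0 * np.sqrt(np.mean((1.0 / p - 1.0 / g) ** 2))   # 1/km
    return mae, rmse, imae, irmse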

To achieve these results, we trained for 20 epochs with a starting learning rate of 5 × 10⁻⁵ up to the 12th epoch, then 2.5 × 10⁻⁵ for 4 epochs, and 1.2 × 10⁻⁵ for the remaining 4 epochs. The weight for the smoothness term (w_sm) is changed to 0.15. This is reflected in the train_voiced_void.sh bash script.
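
The schedule above is piecewise constant over training steps; the sketch below shows one way to express it in Tensorflow 1.x (steps_per_epoch is a hypothetical placeholder, and in practice the schedule is set through the bash script rather than hard-coded like this):

import tensorflow as tf

steps_per_epoch = 1000  # hypothetical value; depends on dataset size and batch size
global_step = tf.train.get_or_create_global_step()

# 5e-5 up to epoch 12, then 2.5e-5 for 4 epochs, then 1.2e-5 for the final 4 epochs.
boundaries = [12 * steps_per_epoch, 16 * steps_per_epoch]
values = [5e-5, 2.5e-5, 1.2e-5]
learning_rate = tf.train.piecewise_constant(global_step, boundaries, values)

# Adam is shown as an example optimizer; see the training script for the actual setup.
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)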

Evaluating VOICED

To evaluate the pretrained VOICED on the KITTI dataset, you may run

sh bash/evaluate_voiced_kitti.sh

To evaluate the pretrained VOICED on the VOID dataset, you may run

sh bash/evaluate_voiced_void.sh

You may replace the restore_path and output_path arguments to evaluate your own checkpoints.

Related projects

You may also find the following projects useful:

  • VOID: from Unsupervised Depth Completion from Visual Inertial Odometry. A dataset, developed by the authors, containing indoor and outdoor scenes with non-trivial 6 degrees of freedom camera motion. The dataset is published along with this work in the Robotics and Automation Letters (RA-L) 2020 and the International Conference on Robotics and Automation (ICRA) 2020.
  • XIVO: The Visual-Inertial Odometry system developed at UCLA Vision Lab. This work is built on top of XIVO. The VOID dataset used by this work also leverages XIVO to obtain sparse points and camera poses.
  • GeoSup: Geo-Supervised Visual Depth Prediction. A single image depth prediction method developed by the authors, published in the Robotics and Automation Letters (RA-L) 2019 and the International Conference on Robotics and Automation (ICRA) 2019. This work was awarded Best Paper in Robot Vision at ICRA 2019.

License and disclaimer

This software is property of the UC Regents, and is provided free of charge for research purposes only. It comes with no warranties, expressed or implied, according to these terms and conditions. For commercial use, please contact UCLA TDG.
