
Huangying-Zhan / Depth Vo Feat

License: other
Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction

Programming Languages

python

Projects that are alternatives to or similar to Depth Vo Feat

Flownet2 Docker
Dockerfile and runscripts for FlowNet 2.0 (estimation of optical flow)
Stars: ✭ 137 (-53.56%)
Mutual labels:  caffe, cvpr
Dispnet Flownet Docker
Dockerfile and runscripts for DispNet and FlowNet1 (estimation of disparity and optical flow)
Stars: ✭ 78 (-73.56%)
Mutual labels:  caffe, cvpr
Flownet2
FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
Stars: ✭ 938 (+217.97%)
Mutual labels:  caffe, cvpr
Orn
Oriented Response Networks, in CVPR 2017
Stars: ✭ 207 (-29.83%)
Mutual labels:  caffe, cvpr
GuidedLabelling
Exploiting Saliency for Object Segmentation from Image Level Labels, CVPR'17
Stars: ✭ 35 (-88.14%)
Mutual labels:  cvpr
waifu2x-chainer
Chainer implementation of waifu2x
Stars: ✭ 137 (-53.56%)
Mutual labels:  caffe
Awesome-Computer-Vision-Paper-List
This repository lists papers accepted at top computer vision conferences, making it convenient to search for related papers.
Stars: ✭ 248 (-15.93%)
Mutual labels:  cvpr
Restoring-Extremely-Dark-Images-In-Real-Time
The project is the official implementation of our CVPR 2021 paper, "Restoring Extremely Dark Images in Real Time"
Stars: ✭ 79 (-73.22%)
Mutual labels:  cvpr
Deep Learning Model Convertor
Converters for deep learning models between different deep learning frameworks.
Stars: ✭ 3,044 (+931.86%)
Mutual labels:  caffe
Polyaxon
Machine Learning Platform for Kubernetes (MLOps tools for experimentation and automation)
Stars: ✭ 2,966 (+905.42%)
Mutual labels:  caffe
Facenet-Caffe
FaceNet recognition and retrieval using hnswlib and Flask; converts a TensorFlow model to Caffe.
Stars: ✭ 30 (-89.83%)
Mutual labels:  caffe
caffe-cifar-10-and-cifar-100-datasets-preprocessed-to-HDF5
Both datasets, preprocessed to HDF5, can be imported directly in Python with h5py or converted with a Python script.
Stars: ✭ 14 (-95.25%)
Mutual labels:  caffe
Caffe Android Demo
An Android demo app that uses a Caffe pre-trained ImageNet model for image classification
Stars: ✭ 254 (-13.9%)
Mutual labels:  caffe
Caffe Rotate Pool
Rotate RoI Align and Rotate Position Sensitive RoI Align Operation in Caffe
Stars: ✭ 16 (-94.58%)
Mutual labels:  caffe
Caffe Hrt
Heterogeneous Run Time version of Caffe. It adds heterogeneous computing capabilities to Caffe, using a heterogeneous computing infrastructure framework to speed up deep learning on Arm-based heterogeneous embedded platforms, while retaining all the features of the original Caffe architecture so that users can deploy their applications seamlessly.
Stars: ✭ 271 (-8.14%)
Mutual labels:  caffe
uai-sdk
UCloud AI SDK
Stars: ✭ 34 (-88.47%)
Mutual labels:  caffe
fast-image-retrieval
A lightweight framework using binary hash codes and deep learning for fast image retrieval.
Stars: ✭ 22 (-92.54%)
Mutual labels:  caffe
Caffemodel2pytorch
Convert Caffe models to PyTorch
Stars: ✭ 258 (-12.54%)
Mutual labels:  caffe
adareg-monodispnet
Repository for Bilateral Cyclic Constraint and Adaptive Regularization for Unsupervised Monocular Depth Prediction (CVPR2019)
Stars: ✭ 22 (-92.54%)
Mutual labels:  cvpr
caffe-android-opencl-fp16
Optimised Caffe with OpenCL support for less powerful devices such as mobile phones
Stars: ✭ 17 (-94.24%)
Mutual labels:  caffe

Introduction

This repo implements the system described in the CVPR-2018 paper:

Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction

Huangying Zhan, Ravi Garg, Chamara Saroj Weerasekera, Kejie Li, Harsh Agarwal, Ian Reid

@InProceedings{Zhan_2018_CVPR,
author = {Zhan, Huangying and Garg, Ravi and Saroj Weerasekera, Chamara and Li, Kejie and Agarwal, Harsh and Reid, Ian},
title = {Unsupervised Learning of Monocular Depth Estimation and Visual Odometry With Deep Feature Reconstruction},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}

This repo includes (1) the training procedure of our models; (2) evaluation scripts for the results; (3) trained models and results.

Contents

  1. Requirements
  2. Download dataset and models
  3. Depth
  4. Depth and odometry
  5. Feature Reconstruction Loss for Depth
  6. Depth, odometry and feature
  7. Result evaluation

Part 1. Requirements

This code was tested with Python 2.7, CUDA 8.0 and Ubuntu 14.04 using Caffe.

Caffe: Add the required layers in ./caffe into your own Caffe. Remember to enable Python Layers in the Caffe configuration.
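As a rough sketch, the setup looks like the following, assuming a standard Makefile-based BVLC Caffe build (the exact paths are illustrative):

    # merge the layers provided in ./caffe of this repo into your own Caffe tree
    cp -r ./caffe/* $YOUR_CAFFE_DIR/
    # in $YOUR_CAFFE_DIR/Makefile.config, uncomment the line:
    #   WITH_PYTHON_LAYER := 1
    cd $YOUR_CAFFE_DIR && make all pycaffe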

Most of the required models, trained models, and results can be downloaded from here. The instructions below also include direct links to specific items.

Part 2. Download dataset and models

The main dataset used in this project is the KITTI Driving Dataset. Please follow the instructions in ./data/README.md to prepare the required dataset.

For our trained models and prerequisite models, please visit here to download the models and put them into the directory ./models.

Part 3. Depth

In this part, the training of the single-view depth estimation network from stereo pairs is introduced. The photometric loss is used as the main supervision signal, and only stereo pairs are used in this experiment.

  1. Update $YOUR_CAFFE_DIR in ./experiments/depth/train.sh.
  2. Run bash ./experiments/depth/train.sh.

The trained models are saved in ./snapshots/depth
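For intuition, here is a minimal NumPy sketch of the photometric loss on a rectified stereo pair. It is illustrative only: the function names are ours, and the actual training uses the provided Caffe layers with bilinear sampling.

    import numpy as np

    def warp_right_to_left(right, disparity):
        # Reconstruct the left view by sampling the right image at
        # x - d(x, y) for each left pixel; rectified stereo means
        # only a horizontal shift is needed.
        h, w = disparity.shape
        xs = np.clip(np.arange(w)[None, :] - disparity, 0, w - 1)
        x0 = np.floor(xs).astype(int)
        x1 = np.clip(x0 + 1, 0, w - 1)
        a = xs - x0                      # linear interpolation weight
        rows = np.arange(h)[:, None]
        return (1 - a) * right[rows, x0] + a * right[rows, x1]

    def photometric_loss(left, right, disparity):
        # L1 difference between the real left image and its
        # reconstruction from the right image.
        return np.abs(left - warp_right_to_left(right, disparity)).mean()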

Part 4. Depth and odometry

In this part, the joint training of the depth estimation network and the visual odometry network is introduced. Photometric losses for spatial pairs and temporal pairs are used as the main supervision signal. Both spatial (stereo) pairs and temporal pairs (i.e. stereo sequences) are used in this experiment.

To facilitate the training, the model trained in the Depth experiment is used as an initialization.

  1. Update $YOUR_CAFFE_DIR in ./experiments/depth_odometry/train.sh.
  2. Run bash ./experiments/depth_odometry/train.sh.

The trained models are saved in ./snapshots/depth_odometry
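The temporal supervision relies on warping one frame into another through the predicted depth and relative pose. Below is a minimal NumPy sketch of this inverse warp; the names are ours, and the real implementation uses bilinear sampling inside the provided Caffe layers.

    import numpy as np

    def warp_source_to_target(src, depth, K, T):
        # Back-project each target pixel with its predicted depth,
        # move it into the source camera's frame with the predicted
        # relative pose T (4x4), and re-project with intrinsics K (3x3).
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        pix = np.stack([u, v, np.ones_like(u)], 0).reshape(3, -1)
        cam = np.linalg.inv(K).dot(pix) * depth.reshape(1, -1)
        cam_h = np.vstack([cam, np.ones((1, cam.shape[1]))])
        proj = K.dot(T.dot(cam_h)[:3])
        us = np.clip(np.round(proj[0] / proj[2]).astype(int), 0, w - 1)
        vs = np.clip(np.round(proj[1] / proj[2]).astype(int), 0, h - 1)
        # nearest-neighbour sampling keeps the sketch short
        return src[vs.reshape(h, w), us.reshape(h, w)]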

Part 5. Feature Reconstruction Loss for Depth

In this part, the training of the single-view depth estimation network from stereo pairs is introduced. Both the photometric loss and the feature reconstruction loss are used as supervision signals, and only stereo pairs are used in this experiment. We have tried several feature extractors for this experiment; currently only the example using KITTI Feat. is shown here, and details on using other features will be added later.

To facilitate the training, the model trained in the Depth experiment is used as an initialization.

  1. Update $YOUR_CAFFE_DIR in ./experiments/depth_feature/train.sh.
  2. Run bash ./experiments/depth_feature/train.sh.

The trained models are saved in ./snapshots/depth_feature
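Conceptually, the feature reconstruction loss applies the same warping to dense feature maps instead of raw intensities. A minimal sketch of how the two terms combine (the weight lam is illustrative, not the exact value used in the paper):

    import numpy as np

    def total_loss(img_t, img_s_warped, feat_t, feat_s_warped, lam=0.1):
        # Photometric term on raw intensities plus a feature
        # reconstruction term on warped dense feature maps.
        photometric = np.abs(img_t - img_s_warped).mean()
        feature = np.abs(feat_t - feat_s_warped).mean()
        return photometric + lam * feature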

Part 6. Depth, odometry and feature

In this part, we show the training that includes the feature reconstruction loss. Stereo sequences are used in this experiment.

With the feature extractor proposed in Weerasekera et al., we can fine-tune the trained depth model and/or odometry model with our proposed deep feature reconstruction loss.

  1. Update $YOUR_CAFFE_DIR in ./experiments/depth_odometry_feature/train.sh.
  2. Run bash ./experiments/depth_odometry_feature/train.sh.

NOTE: The link to download the feature extractor proposed in Weerasekera et al. will be released soon.

Part 7. Result evaluation

Note that the evaluation script provided here uses a different image interpolation for resizing input images (i.e. Python's interpolation vs. Caffe's interpolation), so the quantitative results may differ slightly from the published results.

Depth estimation

Using the test set (697 image-depth pairs from 28 scenes) of the Eigen split is a common protocol for evaluating depth estimation results.

We use the evaluation script provided by monodepth to evaluate depth estimation results.

To run the evaluation, an .npy file storing the predicted depths is required. The steps below generate this file and then evaluate the performance.

  1. Update caffe_root in ./tools/evaluation_tools.py.
  2. Generate the depth predictions and save them in an .npy file:
 python ./tools/evaluation_tools.py --func generate_depth_npy --dataset kitti_eigen --depth_net_def ./experiments/networks/depth_deploy.prototxt --model models/trained_models/eigen_split/Baseline.caffemodel --npy_dir ./result/depth/inv_depths_baseline.npy
  3. Evaluate the predictions:
python ./tools/eval_depth.py --split eigen --predicted_inv_depth_path ./result/depth/inv_depths_baseline.npy --gt_path data/kitti_raw_data/ --min_depth 1 --max_depth 50 --garg_crop
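For reference, the monodepth-style evaluation computes the standard KITTI depth metrics. A minimal NumPy sketch of these metrics over valid pixels (the function name is ours; predictions stored as inverse depths are first converted to depth and clipped to the [min_depth, max_depth] range):

    import numpy as np

    def depth_errors(gt, pred):
        # Standard KITTI metrics: abs rel, sq rel, RMSE, log RMSE
        # and the delta < 1.25^k accuracies.
        thresh = np.maximum(gt / pred, pred / gt)
        a1 = (thresh < 1.25).mean()
        a2 = (thresh < 1.25 ** 2).mean()
        a3 = (thresh < 1.25 ** 3).mean()
        abs_rel = np.mean(np.abs(gt - pred) / gt)
        sq_rel = np.mean((gt - pred) ** 2 / gt)
        rmse = np.sqrt(np.mean((gt - pred) ** 2))
        rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
        return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3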

Some of our results (inverse depths) are released and can be downloaded from here.

Visual Odometry

The KITTI Odometry benchmark contains 22 stereo sequences, 11 of which are provided with ground truth. These 11 sequences are used for the evaluation or training of visual odometry.

  1. Update caffe_root in ./tools/evaluation_tools.py.
  2. Generate the odometry predictions (relative camera motions):
python ./tools/evaluation_tools.py --func generate_odom_result --model models/trained_models/odometry_split/Temporal.caffemodel --odom_net_def ./experiments/networks/odometry_deploy.prototxt --odom_result_dir ./result/odom_result
  3. Evaluate the performance by comparing the predictions against the ground-truth poses:
python ./tools/evaluation_tools.py --func eval_odom --odom_result_dir ./result/odom_result
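Under the hood, the predicted relative motions must be chained into absolute poses before they can be compared with the ground truth. A minimal NumPy sketch, assuming each prediction is a 4x4 homogeneous transform between consecutive frames (the exact pose convention of the evaluation script may differ):

    import numpy as np

    def integrate_relative_poses(rel_poses):
        # Chain frame-to-frame motions into absolute camera poses
        # along the sequence, starting from the identity.
        pose = np.eye(4)
        trajectory = [pose.copy()]
        for T in rel_poses:
            pose = pose.dot(T)
            trajectory.append(pose.copy())
        return trajectory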

Our odometry results are released and can be downloaded from here.

License

For academic usage, the code is released under the permissive BSD license. For any commercial purpose, please contact the authors.
