
cogsys-tuebingen / mobilestereonet

License: Apache-2.0
Lightweight stereo matching network based on MobileNet blocks


MobileStereoNet

Python 3.6

This repository contains the code for "MobileStereoNet: Towards Lightweight Deep Networks for Stereo Matching", presented at WACV 2022 [Paper] [Supp] [arXiv] [Video Presentation].

[Figure: input image, 2D-MobileStereoNet prediction and 3D-MobileStereoNet prediction]

Evaluation Results

MobileStereoNets are trained and tested on the SceneFlow (SF), KITTI and DrivingStereo (DS) datasets.
In the following tables, the first column shows the training sets. For instance, in the case of "SF + KITTI2015", the model is first pretrained on the SceneFlow dataset and then finetuned on KITTI images.
The results are reported as End-point Error (EPE); lower is better.
Note that some experiments evaluate zero-shot cross-dataset generalization, e.g. when the model is trained on "SF + DS" and evaluated on "KITTI2015 val" or "KITTI2012 train".
The corresponding trained models are provided in the tables as hyperlinks.
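
For readers unfamiliar with the metric, EPE is commonly computed as the mean absolute difference between predicted and ground-truth disparity over the pixels that have valid ground truth. The following is a minimal, dependency-free sketch of that idea. The function name, the flat-list input and the validity rule `0 < target < max_disp` with `max_disp=192` are assumptions made for illustration; the repository's own evaluation code may mask pixels differently.

```python
def end_point_error(pred, target, max_disp=192):
    """Mean absolute disparity error (in pixels) over valid pixels.

    pred, target: flat sequences of per-pixel disparities of equal length.
    ASSUMPTION: a pixel is valid when 0 < target < max_disp; the actual
    mask used by the repository may differ.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    errors = [abs(p - t) for p, t in zip(pred, target) if 0 < t < max_disp]
    if not errors:
        raise ValueError("no valid ground-truth pixels")
    return sum(errors) / len(errors)


# Toy example: the last pixel has no ground truth (disparity 0) and is skipped.
pred = [10.0, 20.5, 31.0, 5.0]
target = [11.0, 20.0, 30.0, 0.0]
print(f"EPE = {end_point_error(pred, target):.3f} px")
```

In practice the same computation would be applied to whole disparity maps as tensors; the list version above only shows the arithmetic.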

  • 2D-MobileStereoNet

Training set           SF test   DS test   KITTI2015 val   KITTI2012 train
SF                     1.14      6.59      2.42            2.45
DS                     -         0.67      1.02            0.96
SF + DS                -         0.73      1.04            1.04
SF + KITTI2015         -         1.41      0.79            1.18
DS + KITTI2015         -         0.79      0.65            0.91
SF + DS + KITTI2015    -         0.83      0.68            0.90

  • 3D-MobileStereoNet

Training set           SF test   DS test   KITTI2015 val   KITTI2012 train
SF                     0.80      4.50      10.30           9.38
DS                     -         0.60      1.16            1.14
SF + DS                -         0.57      1.12            1.10
SF + KITTI2015         -         1.53      0.65            0.90
DS + KITTI2015         -         0.65      0.60            0.85
SF + DS + KITTI2015    -         0.62      0.59            0.83

Results on KITTI 2015 validation

Predictions of different networks

Results on KITTI 2015 Leaderboard

Leaderboard
2D-MobileStereoNet on the leaderboard
3D-MobileStereoNet on the leaderboard

Computational Complexity

Requirements for computing the complexity by two methods:

pip install --upgrade git+https://github.com/sovrasov/flops-counter.pytorch.git
pip install --upgrade git+https://github.com/Lyken17/pytorch-OpCounter.git
pip install onnx

Run the following command to see the complexity in terms of number of operations and parameters.

python cost.py

You can also compute the complexity of each part of the network separately; for this purpose, the input size of each module is listed in cost.py.
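
To give a feel for the kind of numbers such tools report, the sketch below counts parameters and multiply-accumulate operations (MACs) by hand for a single layer. It compares a standard convolution with a depthwise separable one, the building block that MobileNet designs are known for. This is not what cost.py does and does not describe any specific layer of MobileStereoNet; the function names and the layer sizes are hypothetical, and biases are ignored.

```python
def conv2d_cost(h, w, c_in, c_out, k):
    """Parameters and MACs of a standard k x k convolution (no bias)
    that produces an h x w output map."""
    params = c_in * c_out * k * k
    macs = params * h * w  # every weight is used once per output position
    return params, macs


def separable_conv2d_cost(h, w, c_in, c_out, k):
    """Parameters and MACs of a depthwise k x k convolution followed by
    a pointwise 1 x 1 convolution (no bias), with an h x w output map."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    params = depthwise + pointwise
    macs = params * h * w
    return params, macs


# Hypothetical layer: 32 -> 64 channels, 3 x 3 kernel, 128 x 256 output map.
std_params, std_macs = conv2d_cost(128, 256, 32, 64, 3)
sep_params, sep_macs = separable_conv2d_cost(128, 256, 32, 64, 3)
print(f"standard : {std_params:>6} params, {std_macs / 1e6:.1f} MMACs")
print(f"separable: {sep_params:>6} params, {sep_macs / 1e6:.1f} MMACs")
print(f"reduction: {std_macs / sep_macs:.1f}x")
```

Profilers may count an operation as one MAC or as two FLOPs, so figures from different tools can differ by a factor of two.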

Installation

Requirements

The code is tested on:

  • Ubuntu 18.04
  • Python 3.6
  • PyTorch 1.4.0
  • Torchvision 0.5.0
  • CUDA 10.0

Setting up the environment

conda env create --file mobilestereonet.yml
conda activate mobilestereonet

SceneFlow Dataset Preparation

Download the finalpass images and the disparity data for SceneFlow FlyingThings3D, Driving and Monkaa. For both the image and the disparity data, move the directories contained in the TRAIN and TEST directories of the Driving and Monkaa datasets (15mm_focallength/35mm_focallength for Driving, funnyworld_x2 etc. for Monkaa) into the FlyingThings3D TRAIN and TEST directories, respectively.

It should look like this:

frames_finalpass
│
└───TEST
│   │
│   └───A
│   └───B
│   └───C
│   
│
└───TRAIN
│   │
│   └───15mm_focallength
│   └───35mm_focallength
│   └───A
│   └───a_rain_of_stones_x2
│   └─── ..
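
The move described above can be scripted. The sketch below assumes that each source dataset has already been arranged with TRAIN and/or TEST sub-directories, as the instructions describe, and moves their contents into the matching FlyingThings3D split. The function name and the example paths are hypothetical; adjust them to wherever the archives were extracted, and run it once for the image data and once for the disparity data.

```python
import shutil
from pathlib import Path


def merge_into_flyingthings(src_root, dst_root):
    """Move every sub-directory of src_root/TRAIN and src_root/TEST into
    the matching split directory under dst_root. Returns the new paths."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    moved = []
    for split in ("TRAIN", "TEST"):
        src_split = src_root / split
        if not src_split.is_dir():
            continue  # nothing to move for this split
        dst_split = dst_root / split
        dst_split.mkdir(parents=True, exist_ok=True)
        for child in sorted(src_split.iterdir()):
            if not child.is_dir():
                continue
            target = dst_split / child.name
            if target.exists():
                # Refuse to overwrite instead of silently merging scenes.
                raise FileExistsError(f"{target} already exists")
            shutil.move(str(child), str(target))
            moved.append(target)
    return moved


# Hypothetical usage (paths are placeholders):
# merge_into_flyingthings("/Datasets/Driving/frames_finalpass",
#                         "/Datasets/SceneFlow/frames_finalpass")
```

Raising on a name clash is a deliberate choice here: it avoids mixing scenes from two datasets under one directory name without the user noticing.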

Training

Set a variable for the dataset directory, e.g. DATAPATH="/Datasets/SceneFlow/". Then, run train.py as below:

Pretraining on SceneFlow

python train.py --dataset sceneflow --datapath $DATAPATH --trainlist ./filenames/sceneflow_train.txt --testlist ./filenames/sceneflow_test.txt --epochs 20 --lrepochs "10,12,14,16:2" --batch_size 8 --test_batch_size 8 --model MSNet2D

Finetuning on KITTI

python train.py --dataset kitti --datapath $DATAPATH --trainlist ./filenames/kitti15_train.txt --testlist ./filenames/kitti15_val.txt --epochs 400 --lrepochs "200:10" --batch_size 8 --test_batch_size 8 --loadckpt ./checkpoints/pretrained.ckpt --model MSNet2D

The arguments in both cases can be set differently depending on the model, dataset and hardware resources.
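
The README does not spell out the format of the --lrepochs argument. Since the code is based on GwcNet, one plausible reading is that the value lists the epochs at which the learning rate is reduced, followed by a colon and the division factor. The sketch below implements that interpretation only; the function name is hypothetical, the base learning rate of 0.001 is an illustrative value not taken from this README, and whether epochs are counted from 0 or 1 should be checked in train.py.

```python
def learning_rate_at(epoch, base_lr, lrepochs):
    """Learning rate at `epoch` for a schedule string "e1,e2,...:rate".

    ASSUMPTION: the rate is divided by `rate` once for every listed epoch
    that has been reached (GwcNet-style); verify against train.py.
    """
    epochs_part, rate_part = lrepochs.split(":")
    downscale_epochs = [int(e) for e in epochs_part.split(",")]
    downscale_rate = float(rate_part)
    lr = base_lr
    for e in downscale_epochs:
        if epoch >= e:
            lr /= downscale_rate
    return lr


# base_lr=0.001 is illustrative only.
for epoch in (0, 10, 12, 14, 16):
    print(epoch, learning_rate_at(epoch, 0.001, "10,12,14,16:2"))
```

Under this reading, "10,12,14,16:2" halves the rate four times during pretraining, and "200:10" divides it by ten once, halfway through the 400 finetuning epochs.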

Prediction

The following script creates disparity maps for a specified model:

python prediction.py --datapath $DATAPATH --testlist ./filenames/kitti15_test.txt --loadckpt ./checkpoints/finetuned.ckpt --dataset kitti --colored True --model MSNet2D

Credits

The implementation of this code is based on PSMNet and GwcNet. Also, we would like to thank the authors of THOP: PyTorch-OpCounter, Flops counter and KITTI python utils.

License

This project is released under the Apache 2.0 license.

Citation

If you use this code, please cite this paper:

@inproceedings{shamsafar2022mobilestereonet,
  title={MobileStereoNet: Towards Lightweight Deep Networks for Stereo Matching},
  author={Shamsafar, Faranak and Woerz, Samuel and Rahim, Rafia and Zell, Andreas},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={2417--2426},
  year={2022}
}

Contact

The repository is maintained by Faranak Shamsafar.
