Cheap and reliable Node.js hosting starts at $3/month, and $1/month static HTML hosting

Created with love in Canada, visit hostnodejs.com today

Feel like to post an Ad? Learn Details

All Projects → felixchenfy → Monocular Visual Odometry

felixchenfy / Monocular Visual Odometry

Licence: mit

A simple monocular visual odometry (part of vSLAM) by ORB keypoints with initialization, tracking, local map and bundle adjustment. (WARNING: Hi, I'm sorry that this project is just tuned for course demo, not for real world applications !!!)

Programming Languages

cpp

1120 projects

Labels

opencv tracking eigen

Projects that are alternatives of or similar to Monocular Visual Odometry

Jpdaf tracking

A tracker based on joint probabilistic data association filtering.

Stars: ✭ 107 (-27.21%)

Mutual labels: opencv, tracking, eigen

Person Detection And Tracking

A tensorflow implementation with SSD model for person detection and Kalman Filtering combined for tracking

Stars: ✭ 193 (+31.29%)

Mutual labels: opencv, tracking

Poseflow

PoseFlow: Efficient Online Pose Tracking (BMVC'18)

Stars: ✭ 330 (+124.49%)

Mutual labels: opencv, tracking

Multi Camera Live Object Tracking

Multi-camera live traffic and object counting with YOLO v4, Deep SORT, and Flask.

Stars: ✭ 375 (+155.1%)

Mutual labels: opencv, tracking

Haskell Opencv

Haskell binding to OpenCV-3.x

Stars: ✭ 145 (-1.36%)

Mutual labels: opencv

Animoji Animate

Facial-Landmarks Detection based animating application similar to Apple-Animoji™

Stars: ✭ 142 (-3.4%)

Mutual labels: opencv

Bgslibrary

A C++ Background Subtraction Library with wrappers for Python, MATLAB, Java and GUI on QT

Stars: ✭ 1,838 (+1150.34%)

Mutual labels: opencv

Legocv

Native OpenCV Swift Framework

Stars: ✭ 140 (-4.76%)

Mutual labels: opencv

Stb Tester

Automated Testing for Set-Top Boxes and Smart TVs

Stars: ✭ 148 (+0.68%)

Mutual labels: opencv

Face Detect

A Python based tool to extract faces from any picture.

Stars: ✭ 146 (-0.68%)

Mutual labels: opencv

Sql tracker

Rails SQL Query Tracker

Stars: ✭ 144 (-2.04%)

Mutual labels: tracking

Arkitexperiments

Quick and dirty experiments with ARKit

Stars: ✭ 142 (-3.4%)

Mutual labels: opencv

Dice

Digital Image Correlation Engine (DICe): a stereo DIC application that runs on a desktop or high performance computing platform (massively parallel)

Stars: ✭ 144 (-2.04%)

Mutual labels: tracking

Find face landmarks

C++ \ Matlab library for finding face landmarks and bounding boxes in video\image sequences.

Stars: ✭ 141 (-4.08%)

Mutual labels: tracking

Jeelizfacefilter

Javascript/WebGL lightweight face tracking library designed for augmented reality webcam filters. Features : multiple faces detection, rotation, mouth opening. Various integration examples are provided (Three.js, Babylon.js, FaceSwap, Canvas2D, CSS3D...).

Stars: ✭ 2,042 (+1289.12%)

Mutual labels: tracking

Opencv transforms torchvision

opencv reimplement for transforms in torchvision

Stars: ✭ 141 (-4.08%)

Mutual labels: opencv

Self Driving Car 3d Simulator With Cnn

Implementing a self driving car using a 3D Driving Simulator. CNN will be used for training

Stars: ✭ 143 (-2.72%)

Mutual labels: opencv

Scene Text Recognition

Scene text detection and recognition based on Extremal Region(ER)

Stars: ✭ 146 (-0.68%)

Mutual labels: opencv

Img term

Display image, video or USB camera in your ANSI terminal!

Stars: ✭ 143 (-2.72%)

Mutual labels: opencv

Hololenswithopencvforunityexample

HoloLens With OpenCVforUnity Example

Stars: ✭ 142 (-3.4%)

Mutual labels: opencv

View All Similar Projects ➔

Monocular Visual Odometry

A monocular visual odometry (VO) with 4 components: initialization, tracking, local map, and bundle adjustment.

I did this project after I read the Slambook.
It's also my final project for the course EESC-432 Advanced Computer Vision in NWU in 2019 March.

A demo:

In the above figure:
Left is a video and the detected key points.
Right is the camera trajectory corresponding to the left video: White line is from VO; Green line is ground truth. Red markers on white line are the keyframes. Points are the map points, where points with red color are newly triangulated.
You can download video here.

Report

My pdf-version course report is here. It has a more clear decription about the algorithms than this README, so I suggest to read it.

1. Algorithm

This VO is achieved by the following procedures/algorithms:

1.1. Initialization

Estimate relative camera pose:
Given a video, set the 1st frame(image) as reference, and do feature matching with the 2nd frame. Compute the Essential Matrix (E) and Homography Matrix (H) between the two frames. Compute their Symmetric Transfer Error by method in ORB-SLAM paper and choose the better one (i.e., choose H if H/(E+H)>0.45). Decompose E or H into the relative pose between two frames, which is the rotation (R) and translation (t). By using OpenCV, E gives 1 result, and H gives 2 results, satisfying the criteria that points are in front of camera. For E, only single result to choose; For H, choose the one that makes the image plane and world-points plane more parallel.

Keyframe and local map:
Insert both 1st and K_th frame as keyframe. Triangulate their inlier matched keypoints to obtain the points' world positions. These points are called map points and are pushed to local map.

Check Triangulation Result
If the median triangulation angle is smaller than threshold, I will abandon this 2nd frame, and repeat the above process on frame 3, 4, etc. If at frame K, the triangulation angle is large than threshold, the initialization is completed.

Change scale:
Scale the translation t to be the same length as the ground truth, so that I can make comparison with ground truth. Then, scale the map points correspondingly.

1.2. Tracking

Keep on estimating the next camera pose. First, find map points that are in the camera view. Do feature matching to find 2d-3d correspondance between 3d map points and 2d image keypoints. Estimate camera pose by RANSAC and PnP.

1.3. Local Map

Insert keyframe: If the relative pose between current frame and previous keyframe is large enough with a translation or rotation larger than the threshold, insert current frame as a keyframe.
Do feature matching between current and previous keyframe. Get inliers by epipoloar constraint. If a inlier cv::KeyPoint hasn't been triangulated before, then triangulate it and push it to local map.

Clean up local map: Remove map points that are: (1) not in current view, (2) whose view_angle is larger than threshold, (3) rarely be matched as inlier point. (See Slambook Chapter 9.4.)

Graph/Connections between map points and frames:
Graphs are built at two stages of the algorithm:

After PnP, based on the 3d-2d correspondances, I update the connectionts between map points and current keypoints.
During triangulation, I also update the 2d-3d correspondance between current keypoints and triangulated mappoints, by either a direct link or going through previous keypoints that have been triangulated.

1.4. Bundle Adjustment

Since I've built the graph in previous step, I know what the 3d-2d point correspondances are in all frames.

Apply optimization to the previous N frames, where the cost function is the sum of reprojection error of each 3d-2d point pair. By computing the deriviate wrt (1) points 3d pos and (2) camera poses, we can solve the optimization problem using Gauss-Newton Method and its variants. These are done by g2o and its built-in datatypes of VertexSBAPointXYZ, VertexSE3Expmap, and EdgeProjectXYZ2UV. See Slambook Chapter 4 and Chapter 7.8.2 for more details.

1.5. Other details

Image features:
Extract ORB keypoints and features. Then, a simple grid sampling is applied to obtain keypoints uniformly distributed across image.

Feature matching:
Two methods are implemented, where good match is:
(1) Feature's distance is smaller than threshold, described in Slambook.
(2) Ratio of smallest and second smallest distance is smaller than threshold, proposed in Prof. Lowe's 2004 SIFT paper.
The first one is adopted, which is easier to tune the parameters to generate fewer error matches.

2. File Structure

2.1. Folders

include/: c++ header files.
src/: c++ definitions.
test/: Testing scripts for c++ functions.
data/: Store images.

Main scripts and classes for VO are in include/my_slam/vo/. I referenced this structure from the Slambook Chapter 9.

2.2. Functions

Functions are declared in include/. Some of its folders contain a README. See the tree structure for overview:

include/
└── my_slam
    ├── basics
    │   ├── basics.h
    │   ├── config.h
    │   ├── yaml.h
    │   ├── eigen_funcs.h
    │   ├── opencv_funcs.h
    │   └── README.md
    ├── common_include.h
    ├── display
    │   ├── pcl_display.h
    │   └── pcl_display_lib.h
    ├── geometry
    │   ├── camera.h
    │   ├── epipolar_geometry.h
    │   ├── feature_match.h
    │   └── motion_estimation.h
    ├── optimization
    │   └── g2o_ba.h
    └── vo
        ├── frame.h
        ├── map.h
        ├── mappoint.h
        ├── README.md
        ├── vo_commons.h
        ├── vo_io.h
        └── vo.h

3. Dependencies

Require: OpenCV, Eigen, Sophus, g2o.
See details below:

(1) OpenCV 4.0
Tutorial for install OpenCV 4.0: link.

You may need a version newer than 3.4.5, because I used this function:
filterHomographyDecompByVisibleRefpoints, which appears in OpenCV 3.4.5.

(2) Eigen 3
It's about matrix arithmetic. See its official page. Install by:

$ sudo apt-get install libeigen3-dev

(Note: Eigen only has header files. No ".so" or ".a" files.)

(3) Sophus

It's based on Eigen, and contains datatypes for Lie Group and Lie Algebra (SE3/SO3/se3/so3).

Download this lib here: https://github.com/strasdat/Sophus. Do cmake and make. Since I failed to make install it, I manually moved “/Sophus/sophus” to “/usr/include/sophus”, and moved “libSophus.so” to “usr/lib”. Then, in my CMakeLists.txt, I add this: set (THIRD_PARTY_LIBS libSophus.so ).

If there is an error of "unit_complex_.real() = 1.;" replace it and its following line with "unit_complex_ = std::complex(1,0);"

(4) g2o

First install either of the following two packages:

$ sudo apt-get install libsuitesparse $ sudo apt-get install libsuitesparse-dev

Download here: https://github.com/RainerKuemmerle/g2o.
Checkout to the last version in year 2017. Do cmake, make, make install.

4. How to Run

$ mkdir build && mkdir lib && mkdir bin
$ cd build && cmake .. && make && cd ..

Then, set up things in config/config.yaml, and run:

$ bin/run_vo config/config.yaml

5. Results

I tested the current implementation on TUM fr1_desk and fr1_xyz dataset, but both performances are bad. I guess its due to too few detected keypoints, which causes too few keypoints matches. The solution I guess is to use the ORB-SLAM's method for extracting enough uniformly destributed keypoints across different scales, and doing guided matching based on the estimated camera motion.

Despite bad performance on fr1 dataset, my program does work well on this New Tsukuba Stereo Database, whose images and scenes are synthetic and have abundant high quality keypoints. The results are shown below.

I tested my VO with 3 different settings: (1) No optimization. (2) Optimize on map points and current camera pose. (3) Optimize on previous 5 camera poses. See videos below:

(1) No optimization:

(2) Optimize on points + current pose:

(2) Optimize on prev 5 poses:

The result shows: (1) Optimization improves accuracy. (2) The estiamted trajectory is close to the ground truth.

6. Reference

(1) Slambook:
I read this Dr. Xiang Gao's Slambook before writing code. The book provides both vSLAM theory as well as easy-to-read code examples in every chapter.

The framework of my program is based on Chapter 9 of Slambook, which is a RGB-D visual odometry project. Classes declared in include/vo/ are based on this Chapter.

These files are mainly copied or built on top of the Slambook's code:

I also borrowed other codes from the slambook. But since they are small pieces and lines, I didn't list them here.

In short, the Slambook provides huge help for me and my this project.

(2) Matlab VO tutorial:
This is a matlab tutorial of monocular visual odometry. Since Slambook doesn't write a lot about monocular VO, I resorted to this Matlab tutorial for solution. It helped me a lot for getting clear the whole workflow.

The dataset I used is also the same as this Matlab tutorial, which is the New Tsukuba Stereo Database.

(3) ORB-SLAM/ORB-SLAM2 papers

I borrowed its code of the criteria for choosing Essential or Homography (for decomposition to obtain relative camera pose.). The copied functions are checkEssentialScore and checkHomographyScore in motion_estimation.h.

7. To Do

Improvements

In bundle adjustment, I cannot optimize (1) multiple frames and (b) map points at the same time. It returns huge error. I haven't figure out why.
If certain region of the image has only few keypoints, then extract more.
Utilize epipolar constraint to do feature matching.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].

Stars: ✭ 147

Visit Git Page 🔗Visit User Page 🔗Visit Issues Page (5) 🔗

Cheap and reliable Node.js hosting starts at $3/month, and $1/month static HTML hosting

felixchenfy / Monocular Visual Odometry

Programming Languages

Labels

Projects that are alternatives of or similar to Monocular Visual Odometry

Monocular Visual Odometry

Report

Directory

1. Algorithm

1.1. Initialization

1.2. Tracking

1.3. Local Map

1.4. Bundle Adjustment

1.5. Other details

2. File Structure

2.1. Folders

2.2. Functions

3. Dependencies

4. How to Run

5. Results

6. Reference

7. To Do