Cheap and reliable Node.js hosting starts at $3/month, and $1/month static HTML hosting

Created with love in Canada, visit hostnodejs.com today

Feel like to post an Ad? Learn Details

All Projects → jzbontar → Mc Cnn

jzbontar / Mc Cnn

Licence: bsd-2-clause

Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches

Labels

cuda

Projects that are alternatives of or similar to Mc Cnn

Rustacuda

Rusty wrapper for the CUDA Driver API

Stars: ✭ 511 (-19.91%)

Mutual labels: cuda

Cudasift

A CUDA implementation of SIFT for NVidia GPUs (1.2 ms on a GTX 1060)

Stars: ✭ 555 (-13.01%)

Mutual labels: cuda

Speedtorch

Library for faster pinned CPU <-> GPU transfer in Pytorch

Stars: ✭ 615 (-3.61%)

Mutual labels: cuda

Arrayfire Rust

Rust wrapper for ArrayFire

Stars: ✭ 525 (-17.71%)

Mutual labels: cuda

Cupy

NumPy & SciPy for GPU

Stars: ✭ 5,625 (+781.66%)

Mutual labels: cuda

Trtorch

PyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT

Stars: ✭ 583 (-8.62%)

Mutual labels: cuda

Docker Pytorch

A Docker image for PyTorch

Stars: ✭ 505 (-20.85%)

Mutual labels: cuda

Kmcuda

Large scale K-means and K-nn implementation on NVIDIA GPU / CUDA

Stars: ✭ 627 (-1.72%)

Mutual labels: cuda

Cudamat

Python module for performing basic dense linear algebra computations on the GPU using CUDA.

Stars: ✭ 554 (-13.17%)

Mutual labels: cuda

Luxcore

LuxCore source repository

Stars: ✭ 601 (-5.8%)

Mutual labels: cuda

Stdgpu

stdgpu: Efficient STL-like Data Structures on the GPU

Stars: ✭ 531 (-16.77%)

Mutual labels: cuda

Lighthouse2

Lighthouse 2 framework for real-time ray tracing

Stars: ✭ 542 (-15.05%)

Mutual labels: cuda

Taskflow

A General-purpose Parallel and Heterogeneous Task Programming System

Stars: ✭ 6,128 (+860.5%)

Mutual labels: cuda

Depthwiseconvolution

A personal depthwise convolution layer implementation on caffe by liuhao.(only GPU)

Stars: ✭ 512 (-19.75%)

Mutual labels: cuda

Twostreamfusion

Code release for "Convolutional Two-Stream Network Fusion for Video Action Recognition", CVPR 2016.

Stars: ✭ 618 (-3.13%)

Mutual labels: cuda

Convnet

A GPU implementation of Convolutional Neural Nets in C++

Stars: ✭ 506 (-20.69%)

Mutual labels: cuda

Xmrig Nvidia

Monero (XMR) NVIDIA miner

Stars: ✭ 560 (-12.23%)

Mutual labels: cuda

Slang

Making it easier to work with shaders

Stars: ✭ 627 (-1.72%)

Mutual labels: cuda

Vexcl

VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP

Stars: ✭ 626 (-1.88%)

Mutual labels: cuda

Thundergbm

ThunderGBM: Fast GBDTs and Random Forests on GPUs

Stars: ✭ 586 (-8.15%)

Mutual labels: cuda

View All Similar Projects ➔

Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches

The repository contains

procedures to compute the stereo matching cost with a convolutional neural network;
procedures to train a convolutional neural network on the stereo matching task; and
a basic stereo method (cross-based cost aggregation, semiglobal matching, left-right consistency check, median filter, and bilateral filter);

A NVIDIA GPU with at least 6 GB of memory is required to run on the KITTI data set and 12 GB to run on the Middlebury data set. We tested the code on GTX Titan (KITTI only), K80, and GTX Titan X. The code is released under the BSD 2-Clause license. Please cite our paper if you use code from this repository in your work.

@article{zbontar2016stereo,
  title={Stereo matching by training a convolutional neural network to compare image patches},
  author={Zbontar, Jure and LeCun, Yann},
  journal={Journal of Machine Learning Research},
  volume={17},
  pages={1--32},
  year={2016}
}

Download trained networks

Compute the Matching Cost

Install Torch, OpenCV 2.4, and png++.

Run the following commands in the same directory as this README file.

Compile the shared libraries:

$ cp Makefile.proto Makefile
$ make

The command should produce two files: libadcensus.so and libcv.so.

To run the stereo algorithm on a stereo pair from the KITTI 2012 training set—

Left input image
Right input image

—download the pretrained network and call main.lua with the following arguments:

$ wget -P net/ https://s3.amazonaws.com/mc-cnn/net_kitti_fast_-a_train_all.t7
$ ./main.lua kitti fast -a predict -net_fname net/net_kitti_fast_-a_train_all.t7 -left samples/input/kittiL.png -right samples/input/kittiR.png -disp_max 70
Writing right.bin, 1 x 70 x 370 x 1226
Writing left.bin, 1 x 70 x 370 x 1226
Writing disp.bin, 1 x 1 x 370 x 1226

The first two arguments (kitti fast) are used to set the default hyperparameters of the stereo method. The outputs are stored as three binary files:

left.bin: The matching cost after semiglobal matching and cross-based cost aggregation with the left image treated as the reference image.
right.bin: Same as left.bin, but with the right image treated as the reference image.
disp.bin: The disparity map after the full stereo method.

Use the bin2png.lua script to generate the .png images like the ones above:

$ luajit samples/bin2png.lua 
Writing left.png
Writing right.png
Writing disp.png

If you wish to use the raw convolutional neural network outputs, that is, without applying cross-based cost aggregation and semiglobal matching, run the following command:

$ ./main.lua kitti fast -a predict -net_fname net/net_kitti_fast_-a_train_all.t7 -left samples/input/kittiL.png -right samples/input/kittiR.png -disp_max 70 -sm_terminate cnn
Writing right.bin, 1 x 70 x 370 x 1226
Writing left.bin, 1 x 70 x 370 x 1226
Writing disp.bin, 1 x 1 x 370 x 1226

The resulting disparity maps should look like this:

left.png
right.png

Note that -disp_max 70 is used only as an example. To reproduce our results on the KITTI data sets use -disp_max 228.

See the predict_kitti.lua script for how you might call main.lua in a loop, for multiple image pairs.

Load the Output Binary Files

You can load the binary files (if, for example, you want to apply different post-processing steps that you have written yourself) by memory mapping them. We include examples of memory mapping for some of the more popular programming languages.

Lua

  require 'torch'
  left = torch.FloatTensor(torch.FloatStorage('../left.bin')):view(1, 70, 370, 1226)
  right = torch.FloatTensor(torch.FloatStorage('../right.bin')):view(1, 70, 370, 1226)
  disp = torch.FloatTensor(torch.FloatStorage('../disp.bin')):view(1, 1, 370, 1226)

Python

  import numpy as np
  left = np.memmap('../left.bin', dtype=np.float32, shape=(1, 70, 370, 1226))
  right = np.memmap('../right.bin', dtype=np.float32, shape=(1, 70, 370, 1226))
  disp = np.memmap('../disp.bin', dtype=np.float32, shape=(1, 1, 370, 1226))

Matlab

  left = memmapfile('../left.bin', 'Format', 'single').Data;
  left = permute(reshape(left, [1226 370 70]), [3 2 1]);
  right = memmapfile('../right.bin', 'Format', 'single').Data;
  right = permute(reshape(right, [1226 370 70]), [3 2 1]);
  disparity = memmapfile('../disp.bin', 'Format', 'single').Data;
  disparity = reshape(disparity, [1226 370])';

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <sys/stat.h>
  #include <sys/types.h>
  int main(void)
  {
  	int fd;
  	float *left, *right, *disp;
  	fd = open("../left.bin", O_RDONLY);
  	left = mmap(NULL, 1 * 70 * 370 * 1226 * sizeof(float), PROT_READ, MAP_SHARED, fd, 0);
  	close(fd);
  	fd = open("../right.bin", O_RDONLY);
  	right = mmap(NULL, 1 * 70 * 370 * 1226 * sizeof(float), PROT_READ, MAP_SHARED, fd, 0);
  	close(fd);
  	fd = open("../disp.bin", O_RDONLY);
  	disp = mmap(NULL, 1 * 1 * 370 * 1226 * sizeof(float), PROT_READ, MAP_SHARED, fd, 0);
  	close(fd);
  	return 0;
  }

Train

This section explains how to train the convolutional neural network on the KITTI and Middlebury data sets.

KITTI

Download both

the KITTI 2012 data set and unzip it into data.kitti/unzip (you should end up with a file data.kitti/unzip/training/image_0/000000_10.png) and
the KITTI 2015 data set and unzip it into data.kitti2015/unzip (you should end up with a file data.kitti2015/unzip/training/image_2/000000_10.png).

Run the preprocessing script:

$ ./preprocess_kitti.lua
dataset 2012
1
...
389
dataset 2015
1
...
400

Run main.lua to train the network:

$ ./main.lua kitti slow -a train_tr
kitti slow -a train_tr 
conv(in=1, out=112, k=3)
cudnn.ReLU
conv(in=112, out=112, k=3)
cudnn.ReLU
conv(in=112, out=112, k=3)
cudnn.ReLU
conv(in=112, out=112, k=3)
cudnn.ReLU
nn.Reshape(128x224)
nn.Linear(224 -> 384)
cudnn.ReLU
nn.Linear(384 -> 384)
cudnn.ReLU
nn.Linear(384 -> 384)
cudnn.ReLU
nn.Linear(384 -> 384)
cudnn.ReLU
nn.Linear(384 -> 1)
cudnn.Sigmoid
...

The network is trained on a subset of all training examples with the remaining examples used for validation; to train on all examples use

$ ./main.lua kitti slow -a train_all

In the previous command, the KITTI 2012 data set is used. If you wish to train on the KITTI 2015 run

$ ./main.lua kitti2015 slow -a train_tr

To train the fast architecture instead use

$ ./main.lua kitti fast -a train_tr

The network is stored in the net/ directory.

$ ls net/
...
net_kitti2012_fast_-action_train_tr.t7
...

Middlebury

Run download_middlebury.sh to download the training data (this can take a long time, depending on your internet connection).

$ ./download_middlebury.sh

The data set is downloaded into the data.mb/unzip directory.

Compile the MiddEval3-SDK. You should end up with the computemask binary in one of the directories listed in your PATH enviromential variable.

Install ImageMagick; the preprocessing steps requires the convert binary to resize the images.

Run the preprocessing script:

$ mkdir data.mb.imperfect_gray
$ ./preprocess_mb.py imperfect gray
Adirondack
Backpack
...
testH/Staircase

The preprocessing is slow (it takes around 30 minutes) the first time it is run, because the images have to be resized.

Use main.lua to train the network:

$ ./main.lua mb slow -a train_tr

Other Useful Commands

Compute the error rate on the validation set (useful for setting hyperparameters):

$ ./main.lua kitti fast -a test_te -net_fname net/net_kitti_fast_-a_train_tr.t7 
kitti fast -a test_te -net_fname net/net_kitti_fast_-a_train_tr.t7 
0.86836290359497        0.0082842716717202
...
0.73244595527649        0.024202708004929
0.72730183601379        0.023603160822285
0.030291934952454

The validation error rate of the fast architecture on the KITTI 2012 data set is 3.029%.

***

Compute the error rate on the validation set of one dataset for a network that has been trained on a different dataset.

$ ./main.lua kitti fast -a test_te -net_fname net/net_mb_fast_-a_train_all.t7
kitti fast -a test_te -net_fname net/net_mb_fast_-a_train_all.t7
2.1474301815033	0.0071447750148986
...
1.4276049137115	0.024273838024622
1.4282908439636	0.01881285579564
1.408842086792	0.021741689597834
0.031564540460366

The validation error rate of the fast architecture tested on KITTI 2012 but trained on Middlebury is 3.156%.

***

Prepare files for submission to the KITTI and Middlebury evaluation server.

$ ./main.lua kitti fast -a submit -net_fname net/net_kitti_fast_-a_train_all.t7 
kitti fast -a submit -net_fname net/net_kitti_fast_-a_train_all.t7 
  adding: 000038_10.png (deflated 0%)
  adding: 000124_10.png (deflated 0%)
  ...
  adding: 000021_10.png (deflated 0%)

The output is stored in out/submission.zip and can be used to submit to the KITTI evaluation server.

***

Experiment with different network architectures:

$ ./main.lua kitti slow -a train_tr -l1 2 -fm 128 -l2 3 -nh2 512
kitti slow -a train_tr -l1 2 -fm 128 -l2 3 -nh2 512 
conv(in=1, out=128, k=3)
cudnn.ReLU
conv(in=128, out=128, k=3)
cudnn.ReLU
nn.Reshape(128x256)
nn.Linear(256 -> 512)
cudnn.ReLU
nn.Linear(512 -> 512)
cudnn.ReLU
nn.Linear(512 -> 512)
cudnn.ReLU
nn.Linear(512 -> 1)
cudnn.Sigmoid
...

***

Measure the runtime on a particular data set:

$ ./main.lua kitti fast -a time
kitti fast -a time 
conv(in=1, out=64, k=3)
cudnn.ReLU
conv(in=64, out=64, k=3)
cudnn.ReLU
conv(in=64, out=64, k=3)
cudnn.ReLU
conv(in=64, out=64, k=3)
nn.Normalize2
nn.StereoJoin1
0.73469495773315

It take 0.73 seconds to run the fast architecure on the KITTI 2012 data set. If you care only about the time spent in the neural network, you can terminate the stereo method early:

$ ./main.lua kitti fast -a time -sm_terminate cnn
kitti fast -a time -sm_terminate cnn 
conv(in=1, out=64, k=3)
cudnn.ReLU
conv(in=64, out=64, k=3)
cudnn.ReLU
conv(in=64, out=64, k=3)
cudnn.ReLU
conv(in=64, out=64, k=3)
nn.Normalize2
nn.StereoJoin1
0.31126594543457

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].

Stars: ✭ 638

Visit Git Page 🔗Visit User Page 🔗Visit Issues Page (42) 🔗