
ClementPinard / Pytorch Correlation Extension

License: MIT
Custom implementation of Correlation Module

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Pytorch Correlation Extension

LSUV-keras
Simple implementation of the LSUV initialization in keras
Stars: ✭ 65 (-75.75%)
Mutual labels:  deeplearning
Deep-Learning
It contains the coursework and the practice I have done while learning Deep Learning.🚀 👨‍💻💥 🚩🌈
Stars: ✭ 21 (-92.16%)
Mutual labels:  deeplearning
Yolov5-deepsort-driverDistracted-driving-behavior-detection
A deep-learning-based early-warning system for distracted driving behavior (fatigue + dangerous behavior), using YOLOv5 + DeepSort to monitor and warn of dangerous driver behavior.
Stars: ✭ 107 (-60.07%)
Mutual labels:  deeplearning
Image-Forgery-using-Deep-Learning
Image Forgery Detection using Deep Learning, implemented in PyTorch.
Stars: ✭ 46 (-82.84%)
Mutual labels:  deeplearning
battery-rul-estimation
Remaining Useful Life (RUL) estimation of Lithium-ion batteries using deep LSTMs
Stars: ✭ 25 (-90.67%)
Mutual labels:  deeplearning
recurrent-defocus-deblurring-synth-dual-pixel
Reference github repository for the paper "Learning to Reduce Defocus Blur by Realistically Modeling Dual-Pixel Data". We propose a procedure to generate realistic DP data synthetically. Our synthesis approach mimics the optical image formation found on DP sensors and can be applied to virtual scenes rendered with standard computer software. Lev…
Stars: ✭ 30 (-88.81%)
Mutual labels:  deeplearning
stylizeapp
A flask website for style transfer
Stars: ✭ 34 (-87.31%)
Mutual labels:  deeplearning
Jejunet
Real-Time Video Segmentation on Mobile Devices with DeepLab V3+, MobileNet V2. Worked on the project in 🏝 Jeju island
Stars: ✭ 258 (-3.73%)
Mutual labels:  deeplearning
Multi-Face-Comparison
This repo is meant as a backend API for face comparison and computer vision. It is built on the Python Flask framework.
Stars: ✭ 20 (-92.54%)
Mutual labels:  deeplearning
dilation-keras
Multi-Scale Context Aggregation by Dilated Convolutions in Keras.
Stars: ✭ 72 (-73.13%)
Mutual labels:  deeplearning
face recognition
Face Recognition based on DeepID
Stars: ✭ 22 (-91.79%)
Mutual labels:  deeplearning
Ensemble-Pytorch
A unified ensemble framework for PyTorch to improve the performance and robustness of your deep learning model.
Stars: ✭ 407 (+51.87%)
Mutual labels:  deeplearning
WSCNNTDSaliency
[BMVC17] Weakly Supervised Saliency Detection with A Category-Driven Map Generator
Stars: ✭ 19 (-92.91%)
Mutual labels:  deeplearning
ohsome2label
Historical OpenStreetMap Objects to Machine Learning Training Samples
Stars: ✭ 33 (-87.69%)
Mutual labels:  deeplearning
deep sort
Deep Sort algorithm C++ version
Stars: ✭ 60 (-77.61%)
Mutual labels:  deeplearning
deepstreet
Traffic Sign Recognition - Fine tuning VGG16 + GTSRB
Stars: ✭ 60 (-77.61%)
Mutual labels:  deeplearning
Similarity-Adaptive-Deep-Hashing
Unsupervised Deep Hashing with Similarity-Adaptive and Discrete Optimization (TPAMI2018)
Stars: ✭ 18 (-93.28%)
Mutual labels:  deeplearning
Fixmatch Pytorch
Unofficial PyTorch implementation of "FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence"
Stars: ✭ 259 (-3.36%)
Mutual labels:  deeplearning
Data-Analysis
Different types of data analytics projects: EDA, PDA, DDA, TSA, and much more…
Stars: ✭ 22 (-91.79%)
Mutual labels:  deeplearning
HistoGAN
Reference code for the paper HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms (CVPR 2021).
Stars: ✭ 158 (-41.04%)
Mutual labels:  deeplearning


Pytorch Correlation module

This is a custom C++/CUDA implementation of a correlation module, used e.g. in FlowNetC.

This tutorial was used as a basis for the implementation, along with NVIDIA's CUDA code.

  • Build and install the C++ and CUDA extensions by executing python setup.py install.
  • Benchmark C++ vs. CUDA by running python benchmark.py {cpu, cuda}.
  • Run gradient checks on the code by running python grad_check.py --backend {cpu, cuda}, as sketched below.
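
As an illustration of what the gradient check verifies, here is a minimal sketch in the spirit of grad_check.py (the parameter values are hypothetical; torch.autograd.gradcheck requires float64 inputs):

import torch
from torch.autograd import gradcheck
from spatial_correlation_sampler import spatial_correlation_sample

# Small double-precision inputs, as gradcheck requires.
input1 = torch.randn(1, 2, 6, 6, dtype=torch.float64, requires_grad=True)
input2 = torch.randn(1, 2, 6, 6, dtype=torch.float64, requires_grad=True)

# Fix the layer parameters so gradcheck only varies the two inputs.
func = lambda a, b: spatial_correlation_sample(
    a, b, kernel_size=1, patch_size=3, stride=1,
    padding=0, dilation=1, dilation_patch=1)

# Prints True if analytic and numeric gradients match.
print(gradcheck(func, (input1, input2)))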

Requirements

This module is expected to compile with PyTorch 1.6.

Installation

This module is available on PyPI:

pip install spatial-correlation-sampler

For a CPU-only version, you can install from source with:

python setup_cpu.py install

Known Problems

This module needs compatible versions of gcc and CUDA to be compiled. Namely, CUDA 9.1 and below need gcc 5, while CUDA 9.2 and 10.0 need gcc 7. See this issue for more information.

Usage

The API has a few differences from NVIDIA's module:

  • The output is now a 5D tensor, which reflects the horizontal and vertical shifts (see the sketch after this list):
input (B x C x H x W) -> output (B x PatchH x PatchW x oH x oW)
  • Output sizes oH and oW are no longer dependent on patch size, but only on kernel size and padding.
  • Patch size patch_size is now the whole patch, not only the radius.
  • stride1 is now stride, and stride2 is dilation_patch, which behaves like dilation in dilated convolutions.
  • The equivalent max_displacement is then dilation_patch * (patch_size - 1) / 2 (e.g. 2 * (21 - 1) / 2 = 20 for the FlowNetC parameters below).
  • dilation is a new parameter; it acts on the correlation kernel the same way dilation acts in a dilated convolution.
  • To get the right parameters for FlowNetC, you would use:
kernel_size=1,
patch_size=21,
stride=1,
padding=0,
dilation=1,
dilation_patch=2
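
To make the 5D output convention concrete, here is a naive pure-PyTorch sketch of what a correlation layer computes in the kernel_size=1, stride=1, padding=0, dilation=1 case. This is an unnormalized illustration written for this section, not this package's actual kernel, and the extension's normalization may differ:

import torch
import torch.nn.functional as F

def naive_correlation(a, b, patch_size=3, dilation_patch=1):
    # a, b: (B, C, H, W) -> output: (B, PatchH, PatchW, H, W)
    bs, c, h, w = a.shape
    r = dilation_patch * (patch_size - 1) // 2  # equivalent max_displacement
    b_pad = F.pad(b, (r, r, r, r))
    out = a.new_zeros(bs, patch_size, patch_size, h, w)
    for i in range(patch_size):      # vertical shift index
        for j in range(patch_size):  # horizontal shift index
            shifted = b_pad[:, :, i * dilation_patch:i * dilation_patch + h,
                                  j * dilation_patch:j * dilation_patch + w]
            out[:, i, j] = (a * shifted).sum(dim=1)  # dot product over channels
    return out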

Example

import torch
from spatial_correlation_sampler import SpatialCorrelationSampler, spatial_correlation_sample

device = "cuda"
batch_size = 1
channel = 1
H = 10
W = 10
dtype = torch.float32

input1 = torch.randint(1, 4, (batch_size, channel, H, W), dtype=dtype, device=device, requires_grad=True)
input2 = torch.randint_like(input1, 1, 4).requires_grad_(True)

# You can either use the function or the module. Note that the module doesn't contain any parameter tensor.

# function

out = spatial_correlation_sample(input1,
                                 input2,
                                 kernel_size=3,
                                 patch_size=1,
                                 stride=2,
                                 padding=0,
                                 dilation=2,
                                 dilation_patch=1)

# module

correlation_sampler = SpatialCorrelationSampler(
    kernel_size=3,
    patch_size=1,
    stride=2,
    padding=0,
    dilation=2,
    dilation_patch=1)
out = correlation_sampler(input1, input2)
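
The result of either call is the 5D tensor described above. A common follow-up in FlowNet-style networks (a usual pattern, not part of this package's API) is to flatten the two patch dimensions into channels to obtain a 4D cost volume:

# out has shape (B, PatchH, PatchW, oH, oW)
b, ph, pw, oh, ow = out.shape
cost_volume = out.view(b, ph * pw, oh, ow)  # shifts become channels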

Benchmark

  • Default parameters are from benchmark.py; FlowNetC parameters are the same as used in FlowNetC with a batch size of 4, described in this paper, implemented here and here.
  • Feel free to file an issue to add entries to this with your hardware!

CUDA Benchmark

  • See here for a benchmark script working with NVIDIA's code and PyTorch.
  • Benchmarks are launched with the environment variable CUDA_LAUNCH_BLOCKING set to 1.
  • Only float32 is benchmarked.
  • FlowNetC correlation parameters were launched with the following command:
CUDA_LAUNCH_BLOCKING=1 python benchmark.py --scale ms -k1 --patch 21 -s1 -p0 --patch_dilation 2 -b4 --height 48 --width 64 -c256 cuda -d float

CUDA_LAUNCH_BLOCKING=1 python NV_correlation_benchmark.py --scale ms -k1 --patch 21 -s1 -p0 --patch_dilation 2 -b4 --height 48 --width 64 -c256
implementation | Correlation parameters | device  | pass     | min time   | avg time
ours           | default                | 980 GTX | forward  |   5.745 ms |   5.851 ms
ours           | default                | 980 GTX | backward |  77.694 ms |  77.957 ms
NVIDIA         | default                | 980 GTX | forward  |  13.779 ms |  13.853 ms
NVIDIA         | default                | 980 GTX | backward |  73.383 ms |  73.708 ms
ours           | FlowNetC               | 980 GTX | forward  |  26.102 ms |  26.179 ms
ours           | FlowNetC               | 980 GTX | backward | 208.091 ms | 208.510 ms
NVIDIA         | FlowNetC               | 980 GTX | forward  |  35.363 ms |  35.550 ms
NVIDIA         | FlowNetC               | 980 GTX | backward | 283.748 ms | 284.346 ms

Notes

  • The overhead of our implementation during the backward pass for kernel_size > 1 needs some investigation; feel free to dive into the code to improve it!
  • NVIDIA's backward pass is not entirely correct when stride1 > 1 and kernel_size > 1, because not everything is computed; see here.

CPU Benchmark

  • No other implementation is available on CPU.
  • It is obviously not recommended to run it on CPU if you have a GPU.
Correlation parameters | device               | pass     | min time   | avg time
default                | E5-2630 v3 @ 2.40GHz | forward  | 159.616 ms | 188.727 ms
default                | E5-2630 v3 @ 2.40GHz | backward | 282.641 ms | 294.194 ms
FlowNetC               | E5-2630 v3 @ 2.40GHz | forward  |    2.138 s |    2.144 s
FlowNetC               | E5-2630 v3 @ 2.40GHz | backward |    7.006 s |    7.075 s