soumith / Cuda Convnet2.torch

Licence: apache-2.0
Torch7 bindings for cuda-convnet2 kernels!

cuda-convnet2.torch

Torch7 bindings for cuda-convnet2 kernels! Kept as a separate repo because of the License, and because the codebase is not small.

**This is a work in progress!** Do not use any modules that are not listed below.

####Modules that are usable:

ccn2.SpatialConvolution(nInputPlane, nOutputPlane, kH, [dH = 1], [padding = 0], [groups = 1], [partialSum = oH * oH])
ccn2.SpatialConvolutionLocal(nInputPlane, nOutputPlane, inputHeight, kH, [dH = 1], [padding = 0])
ccn2.SpatialMaxPooling(kW, [dW = kW])
ccn2.SpatialAvgPooling(kW, [dW = kW])
ccn2.SpatialCrossResponseNormalization(nCrossFeaturemaps, [addScale = 0.0001], [powScale = 0.75], [minDiv = 1])

####What's left to do?

All the modules from here: https://code.google.com/p/cuda-convnet/wiki/LayerParams

####How to do it? It is pretty simple:

  • Add the function signature from cudaconv3/include to ffi.lua
  • Call the function in your Lua module

For an example, look at SpatialConvolution.lua.
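The two steps above follow the usual LuaJIT FFI pattern: declare a C signature, then call it from Lua. A minimal sketch of that pattern, assuming LuaJIT is the interpreter. It uses the C library's `strlen` as a stand-in, not an actual cuda-convnet2 kernel, and the helper name `cStringLength` is hypothetical.

```lua
local ffi = require("ffi")  -- LuaJIT's built-in FFI library

-- Step 1: declare the C function signature (stand-in for a signature
-- copied from cudaconv3/include into ffi.lua).
ffi.cdef[[
size_t strlen(const char *s);
]]

-- Step 2: call the declared function from a Lua wrapper
-- (stand-in for calling a kernel from a module's updateOutput).
local function cStringLength(s)
  -- ffi.C exposes symbols from the default C namespace;
  -- tonumber converts the returned size_t cdata to a Lua number.
  return tonumber(ffi.C.strlen(s))
end

print(cStringLength("ccn2"))  --> 4
```

A real binding would look the function up in the compiled cuda-convnet2 library instead of `ffi.C`; SpatialConvolution.lua remains the reference for how this repo does it.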

####How to use them?

Either pass in an input with layout Depth x Height x Width x Batch, or wrap the ccn2 modules in nn.Transpose modules that convert from and back to the usual Batch x Depth x Height x Width layout.

Example:

-- Assumes the packages are already loaded (e.g. require 'cunn'; require 'ccn2')
-- and that featuresOut (the flattened feature size) is defined by the caller.
fSize = {3, 96, 128, 128, 384}
features = nn.Sequential()
-- Batch x Depth x Height x Width -> Depth x Height x Width x Batch
features:add(nn.Transpose({1,4},{1,3},{1,2}))
features:add(ccn2.SpatialConvolution(fSize[1], fSize[2], 9))
features:add(nn.ReLU())
features:add(ccn2.SpatialMaxPooling(2,2))
features:add(ccn2.SpatialConvolution(fSize[2], fSize[3], 5))
features:add(nn.ReLU())
features:add(ccn2.SpatialMaxPooling(2,2))
features:add(ccn2.SpatialConvolution(fSize[3], fSize[4], 4))
features:add(nn.ReLU())
features:add(ccn2.SpatialConvolution(fSize[4], fSize[5], 3))
features:add(nn.ReLU())
features:add(ccn2.SpatialMaxPooling(2,2))
-- Depth x Height x Width x Batch -> Batch x Depth x Height x Width
features:add(nn.Transpose({4,1},{4,2},{4,3}))
features:add(nn.View(featuresOut))
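
To see why those two nn.Transpose calls bracket the ccn2 modules, it can help to track the axis order by hand. The plain-Lua sketch below needs no Torch. It assumes nn.Transpose applies its pairwise swaps in the order given; the helper `applySwaps` and the axis labels are illustrative only.

```lua
-- Apply a sequence of pairwise axis swaps, left to right, to a list of
-- axis labels and return the resulting order (the input is not modified).
local function applySwaps(axes, swaps)
  local out = {}
  for i, name in ipairs(axes) do out[i] = name end
  for _, s in ipairs(swaps) do
    out[s[1]], out[s[2]] = out[s[2]], out[s[1]]
  end
  return out
end

local bdhw = {"Batch", "Depth", "Height", "Width"}

-- Swaps used by the first nn.Transpose in the example.
local toCcn2 = applySwaps(bdhw, {{1,4},{1,3},{1,2}})
print(table.concat(toCcn2, " x "))  --> Depth x Height x Width x Batch

-- Swaps used by the last nn.Transpose in the example.
local back = applySwaps(toCcn2, {{4,1},{4,2},{4,3}})
print(table.concat(back, " x "))    --> Batch x Depth x Height x Width
```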

###NVMatrix to THTensor cheatsheet

| NVMatrix            | THCudaTensor                         |
| ------------------- |:------------------------------------:|
| .getNumCols()       | .size[1]                             |
| .getNumRows()       | .size[0]                             |
| .getNumElements()   | THCudaTensor_nElement()              |
| .getNumDataBytes()  | THCudaTensor_nElement() * 4          |
| .getStride()        | .stride[0]                           |
| .isTrans()          | N/A                                  |
| .getDevData()       | THCudaTensor_data()                  |
| .resize()           | THCudaTensor_resizeXd where X = dims |
| .getTextureObject() | THCudaTensor_getTextureObject        |
| .isContiguous       | THCudaTensor_isContiguous            |
| .isSameDims         | THCudaTensor_isSameSizeAs            |
| .apply              | THCudaTensor_fill()                  |

  • Check the contiguity of all tensors; if a tensor is not contiguous, make it contiguous.
  • Ignore/remove assertions (because you are doing contiguity checks anyway).
  • Harmonize getTextureObject. Destroy all texture objects after use; treat them like pointers. NVMatrix does this in its destructor, but since the object is not a member of the THCudaTensor structure, we have to destroy it manually after use.
  • Double-check places where strides are allowed (especially conv): Agg = ?, Agg.getBaseValue, Agg.output(.., ..)
  • Remember that NVMatrix only supports 2D tensors!
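
As a rough illustration of the contiguity and 2D points above, the plain-Lua sketch below works on size/stride tables only. The helpers `isContiguous` and `as2D` are hypothetical; in Torch itself you would call `tensor:isContiguous()` and `tensor:contiguous()`. Collapsing the leading dimensions into rows is an assumption about how a Depth x Height x Width x Batch tensor maps onto a 2D matrix.

```lua
-- Row-major contiguity check: walking from the last dimension backwards,
-- each stride must equal the product of the sizes of all later dimensions.
-- Dimensions of size 1 are skipped, since their stride is never used.
local function isContiguous(size, stride)
  local expected = 1
  for d = #size, 1, -1 do
    if size[d] ~= 1 and stride[d] ~= expected then
      return false
    end
    expected = expected * size[d]
  end
  return true
end

-- View an N-D shape as 2D: all leading dimensions become rows,
-- the last dimension (Batch here) becomes columns.
local function as2D(size)
  local rows = 1
  for d = 1, #size - 1 do rows = rows * size[d] end
  return rows, size[#size]
end

-- A freshly allocated 3 x 32 x 32 x 128 tensor is contiguous.
print(isContiguous({3, 32, 32, 128}, {131072, 4096, 128, 1}))  --> true

-- The same shape obtained by transposing a 128 x 3 x 32 x 32 tensor
-- keeps the old strides, so it is not contiguous and needs a copy.
print(isContiguous({3, 32, 32, 128}, {1024, 32, 1, 3072}))     --> false

print(as2D({3, 32, 32, 128}))  --> 3072    128
```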