
utsaslab / Monet

License: MIT
MONeT framework for reducing memory consumption of DNN training

Programming Languages

Python
139,335 projects; #7 most used programming language

Projects that are alternatives to or similar to Monet

Onnx
Open standard for machine learning interoperability
Stars: ✭ 11,829 (+9288.1%)
Mutual labels:  deep-neural-networks, dnn
Rmdl
RMDL: Random Multimodel Deep Learning for Classification
Stars: ✭ 375 (+197.62%)
Mutual labels:  deep-neural-networks, dnn
Pytorch Kaldi
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
Stars: ✭ 2,097 (+1564.29%)
Mutual labels:  deep-neural-networks, dnn
Mobilnet ssd opencv
MobilNet-SSD object detection in opencv 3.4.1
Stars: ✭ 64 (-49.21%)
Mutual labels:  deep-neural-networks, dnn
Chaidnn
HLS based Deep Neural Network Accelerator Library for Xilinx Ultrascale+ MPSoCs
Stars: ✭ 258 (+104.76%)
Mutual labels:  deep-neural-networks, dnn
Opentpod
Open Toolkit for Painless Object Detection
Stars: ✭ 106 (-15.87%)
Mutual labels:  deep-neural-networks, dnn
Ml Fraud Detection
Credit card fraud detection through logistic regression, k-means, and deep learning.
Stars: ✭ 117 (-7.14%)
Mutual labels:  deep-neural-networks
Imagecluster
Cluster images based on image content using a pre-trained deep neural network, optional time distance scaling and hierarchical clustering.
Stars: ✭ 122 (-3.17%)
Mutual labels:  deep-neural-networks
Tenginekit
TengineKit - Free, Fast, Easy, Real-Time Face Detection & Face Landmarks & Face Attributes & Hand Detection & Hand Landmarks & Body Detection & Body Landmarks & Iris Landmarks & Yolov5 SDK On Mobile.
Stars: ✭ 2,103 (+1569.05%)
Mutual labels:  deep-neural-networks
Yolo mark
GUI for marking bounding boxes of objects in images for training neural network Yolo v3 and v2
Stars: ✭ 1,624 (+1188.89%)
Mutual labels:  dnn
Echo
Python package containing all custom layers used in Neural Networks (Compatible with PyTorch, TensorFlow and MegEngine)
Stars: ✭ 126 (+0%)
Mutual labels:  deep-neural-networks
Chinese Speech To Text
Chinese Speech To Text Using Wavenet
Stars: ✭ 124 (-1.59%)
Mutual labels:  deep-neural-networks
Lenet 5
PyTorch implementation of LeNet-5 with live visualization
Stars: ✭ 122 (-3.17%)
Mutual labels:  deep-neural-networks
Deephyper
DeepHyper: Scalable Asynchronous Neural Architecture and Hyperparameter Search for Deep Neural Networks
Stars: ✭ 117 (-7.14%)
Mutual labels:  deep-neural-networks
Perceptualsimilarity
LPIPS metric. pip install lpips
Stars: ✭ 2,037 (+1516.67%)
Mutual labels:  deep-neural-networks
Keras Kaldi
Keras Interface for Kaldi ASR
Stars: ✭ 124 (-1.59%)
Mutual labels:  deep-neural-networks
Tfg Voice Conversion
Deep Learning-based Voice Conversion system
Stars: ✭ 115 (-8.73%)
Mutual labels:  deep-neural-networks
Nlp Pretrained Model
A collection of Natural language processing pre-trained models.
Stars: ✭ 122 (-3.17%)
Mutual labels:  deep-neural-networks
Hyperdensenet
This repository contains the code of HyperDenseNet, a hyper-densely connected CNN to segment medical images in multi-modal image scenarios.
Stars: ✭ 124 (-1.59%)
Mutual labels:  deep-neural-networks
Microexpnet
MicroExpNet: An Extremely Small and Fast Model For Expression Recognition From Frontal Face Images
Stars: ✭ 121 (-3.97%)
Mutual labels:  deep-neural-networks

MONeT: Memory Optimization for Deep Networks

Implemented on top of PyTorch, MONeT schedules allow training deep networks under a constrained memory budget with minimal computational overhead. MONeT jointly determines checkpointing and operator implementations, reducing GPU memory by as much as 3x at a compute overhead of 9-16%.

Memory Optimization for Deep Networks
Aashaka Shah, Chao-Yuan Wu, Jayashree Mohan, Vijay Chidambaram, Philipp Krähenbühl
[paper]

Installation

MONeT has been tested with PyTorch 1.5.1, torchvision 0.6.1, and cudatoolkit 10.1. Create a conda environment with Python 3.7 or greater. Inside the environment, install the following packages: cvxpy, gurobi, pandas, ninja-build, coinor-cbc, coinor-libcbc-dev, cylp.

install.sh provides the installation script.
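For reference, the setup might look roughly like the following sketch (the channels and package managers here are assumptions, and several of the listed names, such as ninja-build and coinor-libcbc-dev, are system packages rather than Python ones; install.sh is the authoritative script):

conda create -n monet python=3.7
conda activate monet
conda install pytorch=1.5.1 torchvision=0.6.1 cudatoolkit=10.1 -c pytorch
conda install -c conda-forge cvxpy pandas   # solver front-end and utilities
conda install -c gurobi gurobi              # requires a Gurobi license
sudo apt-get install ninja-build coinor-cbc coinor-libcbc-dev
pip install cylp                            # Python bindings for the CBC solver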

Clone this repo and install the package. Ensure that the conda environment is activated.

git clone --recursive https://github.com/utsaslab/MONeT
cd MONeT
pip install -e .

Getting Started

MONeT usage

MONeT has been tested for single-GPU training and single-machine multi-GPU Distributed Data Parallel training. To get started with MONeT using solutions in the schedule zoo, add the following imports to your code:

from monet.cvxpy_solver import Solution
from monet.monet_wrapper import MONeTWrapper

Wrap your model using a MONeTWrapper

monet_model = MONeTWrapper(model, solution_file, input_shape)

Use the model like you normally would

output = monet_model(input) # Forward pass
output.sum().backward() # Backward pass

A working version of this code can be found at examples/training.py.
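Putting the pieces together, a minimal end-to-end sketch might look as follows (the solution file, input shape, and toy loss are illustrative, and the exact form of input_shape is an assumption; examples/training.py is the authoritative version):

import torch
import torchvision

from monet.cvxpy_solver import Solution  # needed so the solution pickle can load
from monet.monet_wrapper import MONeTWrapper

# Wrap a torchvision ResNet-50 with a 10 GB schedule from the zoo.
model = torchvision.models.resnet50()
monet_model = MONeTWrapper(
    model,
    "data/solution_resnet50_184_inplace_conv_multiway_newnode_10.00.pkl",
    (3, 224, 224),  # assumed per-sample input shape
).cuda()

input = torch.randn(184, 3, 224, 224, device="cuda")
output = monet_model(input)  # forward pass runs under the MONeT schedule
output.sum().backward()      # backward pass recomputes checkpointed tensors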

For Distributed Data Parallel training, monet_model can be wrapped by torch.nn.parallel.DistributedDataParallel like any other model. A working distributed training code can be found at examples/dist_training.py.
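The wrapping step itself is a one-liner; a sketch, assuming the process group is already initialized and local_rank is provided by your launcher:

monet_model = monet_model.to(local_rank)  # local_rank: this process's GPU index
ddp_model = torch.nn.parallel.DistributedDataParallel(
    monet_model, device_ids=[local_rank])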

examples/imagenet.py has been modified to use MONeT schedules for ImageNet training:

python imagenet.py DATA_DIR -a [arch] --gpu 0 \
        --epochs [num_epochs] \
        --batch-size [batch_size] \
        --solution_file [path to solution file]
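For example, to train ResNet-50 with the 10 GB schedule from the zoo (the dataset path and epoch count below are placeholders):

python imagenet.py /path/to/imagenet -a resnet50 --gpu 0 \
        --epochs 90 \
        --batch-size 184 \
        --solution_file data/solution_resnet50_184_inplace_conv_multiway_newnode_10.00.pkl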

Schedule zoo

We have already created some schedules which can be used right off the bat. Simply install MONeT, modify your training script as in examples/imagenet.py, and train with the memory-efficient schedules! The schedule zoo is hosted in the data directory. Use the results below to pick the right schedule for your requirements.

For example, the solution solution_resnet50_184_inplace_conv_multiway_newnode_10.00.pkl uses 10 GB of memory to train ResNet-50 with a batch size of 184 and, according to the results below, incurs a 3.22% overhead over the original PyTorch implementation, which uses 15.06 GB.
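Inferred from this example, the file names appear to follow the pattern:

solution_<model>_<batch size>_<mode>_<memory budget in GB>.pkl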

Results

ResNet-50 (batch size 184)    Memory (GB)    Compute Overhead (%)
PyTorch                       15.06           0
MONeT                         10.01           3.22
MONeT                          9.01           4.68
MONeT                          8.01           5.56
MONeT                          6.99           7.28
MONeT                          6.00           9.31
MONeT                          4.99          11.95

GoogLeNet (batch size 320)    Memory (GB)    Compute Overhead (%)
PyTorch                       14.93           0
MONeT                          9.98           7.13
MONeT                          8.99           7.87
MONeT                          8.01           8.44
MONeT                          7.02           9.71
MONeT                          6.01          12.14
MONeT                          4.99          15.77

UNet (batch size 11)          Memory (GB)    Compute Overhead (%)
PyTorch                       14.32           0
MONeT                         10.01          -4.10
MONeT                          9.01          -2.07
MONeT                          8.02          -0.09
MONeT                          7.00           1.39
MONeT                          6.01           4.95
MONeT                          5.01          11.51

MobileNet (batch size 272)    Memory (GB)    Compute Overhead (%)
PyTorch                       14.46           0
MONeT                         10.02           2.40
MONeT                          9.01           3.10
MONeT                          8.02           4.77
MONeT                          7.01           5.53
MONeT                          6.01           7.55
MONeT                          5.01           8.72

VGG-16 (batch size 176)       Memory (GB)    Compute Overhead (%)
PyTorch                       14.12           0
MONeT                          9.71          -5.30
MONeT                          8.66          -4.64
MONeT                          7.88          -2.18
MONeT                          6.82           1.99
MONeT                          5.90           5.44
MONeT                          5.51           9.11

A negative overhead indicates that the MONeT schedule ran faster than the baseline PyTorch implementation.

Advanced MONeT usage

Obtain a Gurobi academic license from the Gurobi website; log in with a .edu email to get the free license.

  1. To create a MONeT solution:
python cvxpy_solver.py MODEL BATCH_SIZE BUDGET MODE "GUROBI" --time_limit TIME_LIMIT

  • MODEL format: "torchvision.models.<model>()"; for UNet, the format is "unet"
  • BUDGET is the memory budget in GB
  • MODE is "inplace_conv_multiway_newnode" for complete MONeT
  • TIME_LIMIT is the solver time limit in seconds
  • The flag --ablation can be added to disable checkpointing when creating a solution
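For example, the 10 GB ResNet-50 schedule from the zoo might be produced with an invocation like this (the time limit is illustrative):

python cvxpy_solver.py "torchvision.models.resnet50()" 184 10 \
        inplace_conv_multiway_newnode "GUROBI" --time_limit 86400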

  2. To profile a MONeT schedule given a solution:
python schedule.py MODEL BATCH_SIZE BUDGET MODE "GUROBI" \
        --run_bs --solution_file SOLUTION_FILE

The flag --run_bs can be replaced by --check_runtime to check the runtime of the schedule, or by --check_diff to check the gradients of MONeT against original PyTorch.
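For example, to verify that the ResNet-50 schedule above produces the same gradients as the original PyTorch model (arguments mirror the solver invocation and are illustrative):

python schedule.py "torchvision.models.resnet50()" 184 10 \
        inplace_conv_multiway_newnode "GUROBI" \
        --check_diff --solution_file data/solution_resnet50_184_inplace_conv_multiway_newnode_10.00.pkl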

Other modes may be used for experimenting with MONeT:

  • inplace_ prefix enables operator optimization
  • conv_normal selects conv-optimization
  • multiway selects output-activated optimization
  • newnode selects intermediate-activated optimization

Refer to the paper for details about the optimizations.

Citation

If you use MONeT in your work, please consider citing us as follows:

@misc{shah2020memory,
      title={Memory Optimization for Deep Networks},
      author={Aashaka Shah and Chao-Yuan Wu and Jayashree Mohan and Vijay Chidambaram and Philipp Krähenbühl},
      year={2020},
      eprint={2010.14501},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Acknowledgements

The code for UNet is taken from Pytorch-UNet by milesial. The Distributed Data Parallel training example code is borrowed from the distributed tutorial by yangkky.
