
utsaslab / Monet

License: MIT
MONeT framework for reducing memory consumption of DNN training

Programming Languages

Python
139,335 projects; #7 most used programming language

Projects that are alternatives to or similar to Monet

Onnx
Open standard for machine learning interoperability
Stars: ✭ 11,829 (+9288.1%)
Mutual labels:  deep-neural-networks, dnn
Rmdl
RMDL: Random Multimodel Deep Learning for Classification
Stars: ✭ 375 (+197.62%)
Mutual labels:  deep-neural-networks, dnn
Pytorch Kaldi
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
Stars: ✭ 2,097 (+1564.29%)
Mutual labels:  deep-neural-networks, dnn
Mobilnet ssd opencv
MobilNet-SSD object detection in opencv 3.4.1
Stars: ✭ 64 (-49.21%)
Mutual labels:  deep-neural-networks, dnn
Chaidnn
HLS based Deep Neural Network Accelerator Library for Xilinx Ultrascale+ MPSoCs
Stars: ✭ 258 (+104.76%)
Mutual labels:  deep-neural-networks, dnn
Opentpod
Open Toolkit for Painless Object Detection
Stars: ✭ 106 (-15.87%)
Mutual labels:  deep-neural-networks, dnn
Ml Fraud Detection
Credit card fraud detection through logistic regression, k-means, and deep learning.
Stars: ✭ 117 (-7.14%)
Mutual labels:  deep-neural-networks
Imagecluster
Cluster images based on image content using a pre-trained deep neural network, optional time distance scaling and hierarchical clustering.
Stars: ✭ 122 (-3.17%)
Mutual labels:  deep-neural-networks
Tenginekit
TengineKit - Free, Fast, Easy, Real-Time Face Detection & Face Landmarks & Face Attributes & Hand Detection & Hand Landmarks & Body Detection & Body Landmarks & Iris Landmarks & Yolov5 SDK On Mobile.
Stars: ✭ 2,103 (+1569.05%)
Mutual labels:  deep-neural-networks
Yolo mark
GUI for marking bounding boxes of objects in images for training neural network Yolo v3 and v2
Stars: ✭ 1,624 (+1188.89%)
Mutual labels:  dnn
Echo
Python package containing all custom layers used in Neural Networks (Compatible with PyTorch, TensorFlow and MegEngine)
Stars: ✭ 126 (+0%)
Mutual labels:  deep-neural-networks
Chinese Speech To Text
Chinese Speech To Text Using Wavenet
Stars: ✭ 124 (-1.59%)
Mutual labels:  deep-neural-networks
Lenet 5
PyTorch implementation of LeNet-5 with live visualization
Stars: ✭ 122 (-3.17%)
Mutual labels:  deep-neural-networks
Deephyper
DeepHyper: Scalable Asynchronous Neural Architecture and Hyperparameter Search for Deep Neural Networks
Stars: ✭ 117 (-7.14%)
Mutual labels:  deep-neural-networks
Perceptualsimilarity
LPIPS metric. pip install lpips
Stars: ✭ 2,037 (+1516.67%)
Mutual labels:  deep-neural-networks
Keras Kaldi
Keras Interface for Kaldi ASR
Stars: ✭ 124 (-1.59%)
Mutual labels:  deep-neural-networks
Tfg Voice Conversion
Deep Learning-based Voice Conversion system
Stars: ✭ 115 (-8.73%)
Mutual labels:  deep-neural-networks
Nlp Pretrained Model
A collection of Natural language processing pre-trained models.
Stars: ✭ 122 (-3.17%)
Mutual labels:  deep-neural-networks
Hyperdensenet
This repository contains the code of HyperDenseNet, a hyper-densely connected CNN to segment medical images in multi-modal image scenarios.
Stars: ✭ 124 (-1.59%)
Mutual labels:  deep-neural-networks
Microexpnet
MicroExpNet: An Extremely Small and Fast Model For Expression Recognition From Frontal Face Images
Stars: ✭ 121 (-3.97%)
Mutual labels:  deep-neural-networks

MONeT: Memory Optimization for Deep Networks

Implemented on top of PyTorch, MONeT schedules allow training deep networks under a constrained memory budget with minimal computational overhead. MONeT jointly determines checkpointing and operator implementations, reducing GPU memory by as much as 3x at a compute overhead of 9-16%.

Memory Optimization for Deep Networks
Aashaka Shah, Chao-Yuan Wu, Jayashree Mohan, Vijay Chidambaram, Philipp Krähenbühl
[paper]

Installation

MONeT has been tested with PyTorch 1.5.1, torchvision 0.6.1, and cudatoolkit 10.1. Create a conda environment with Python 3.7 or greater. Inside the environment, install the following packages: cvxpy, gurobi, pandas, ninja-build, coinor-cbc, coinor-libcbc-dev, cylp.

install.sh provides the installation script.
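For reference, the setup might look roughly like the following sketch (the channels and package managers here are assumptions, and several of the listed names, such as ninja-build and coinor-libcbc-dev, are system packages rather than Python ones; install.sh is the authoritative script):

conda create -n monet python=3.7
conda activate monet
conda install pytorch=1.5.1 torchvision=0.6.1 cudatoolkit=10.1 -c pytorch
conda install -c conda-forge cvxpy pandas   # solver front-end and utilities
conda install -c gurobi gurobi              # requires a Gurobi license
sudo apt-get install ninja-build coinor-cbc coinor-libcbc-dev
pip install cylp                            # Python bindings for the CBC solver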

Clone this repo and install the package. Ensure that the conda environment is activated.

git clone --recursive https://github.com/utsaslab/MONeT
cd MONeT
pip install -e .

Getting Started

MONeT usage

MONeT has been tested for single-GPU training and single-machine multi-GPU Distributed Data Parallel training. To get started with MONeT using solutions in the schedule zoo, add the following imports to your code:

from monet.cvxpy_solver import Solution
from monet.monet_wrapper import MONeTWrapper

Wrap your model using a MONeTWrapper

monet_model = MONeTWrapper(model, solution_file, input_shape)

Use the model like you normally would

output = monet_model(input) # Forward pass
output.sum().backward() # Backward pass

A working version of this code can be found at examples/training.py.
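Putting the pieces together, a minimal end-to-end sketch might look as follows (the solution file, input shape, and toy loss are illustrative, and the exact form of input_shape is an assumption; examples/training.py is the authoritative version):

import torch
import torchvision

from monet.cvxpy_solver import Solution  # needed so the solution pickle can load
from monet.monet_wrapper import MONeTWrapper

# Wrap a torchvision ResNet-50 with a 10 GB schedule from the zoo.
model = torchvision.models.resnet50()
monet_model = MONeTWrapper(
    model,
    "data/solution_resnet50_184_inplace_conv_multiway_newnode_10.00.pkl",
    (3, 224, 224),  # assumed per-sample input shape
).cuda()

input = torch.randn(184, 3, 224, 224, device="cuda")
output = monet_model(input)  # forward pass runs under the MONeT schedule
output.sum().backward()      # backward pass recomputes checkpointed tensors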

For Distributed Data Parallel training, monet_model can be wrapped by torch.nn.parallel.DistributedDataParallel like any other model. A working distributed training code can be found at examples/dist_training.py.
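The wrapping step itself is a one-liner; a sketch, assuming the process group is already initialized and local_rank is provided by your launcher:

monet_model = monet_model.to(local_rank)  # local_rank: this process's GPU index
ddp_model = torch.nn.parallel.DistributedDataParallel(
    monet_model, device_ids=[local_rank])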

examples/imagenet.py has been modified to use MONeT schedules for ImageNet training:

python imagenet.py DATA_DIR -a [arch] --gpu 0 \
        --epochs [num_epochs] \
        --batch-size [batch_size] \
        --solution_file [path to solution file]
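For example, to train ResNet-50 with the 10 GB schedule from the zoo (the dataset path and epoch count below are placeholders):

python imagenet.py /path/to/imagenet -a resnet50 --gpu 0 \
        --epochs 90 \
        --batch-size 184 \
        --solution_file data/solution_resnet50_184_inplace_conv_multiway_newnode_10.00.pkl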

Schedule zoo

We have already created some schedules which can be used right off the bat. Simply install MONeT, modify your training script as in examples/imagenet.py, and train with the memory-efficient schedules! The schedule zoo is hosted in the data directory. Use the results below to pick the right schedule for your requirements.

For example, the solution solution_resnet50_184_inplace_conv_multiway_newnode_10.00.pkl uses 10 GB of memory to train ResNet-50 with a batch size of 184 and, according to the results below, incurs a 3.22% overhead over the original PyTorch implementation, which uses 15.06 GB.
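Inferred from this example, the file names appear to follow the pattern:

solution_<model>_<batch size>_<mode>_<memory budget in GB>.pkl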

Results

ResNet-50 (batch size 184)    Memory (GB)    Compute Overhead (%)
PyTorch                       15.06           0
MONeT                         10.01           3.22
MONeT                          9.01           4.68
MONeT                          8.01           5.56
MONeT                          6.99           7.28
MONeT                          6.00           9.31
MONeT                          4.99          11.95

GoogLeNet (batch size 320)    Memory (GB)    Compute Overhead (%)
PyTorch                       14.93           0
MONeT                          9.98           7.13
MONeT                          8.99           7.87
MONeT                          8.01           8.44
MONeT                          7.02           9.71
MONeT                          6.01          12.14
MONeT                          4.99          15.77

UNet (batch size 11)          Memory (GB)    Compute Overhead (%)
PyTorch                       14.32           0
MONeT                         10.01          -4.10
MONeT                          9.01          -2.07
MONeT                          8.02          -0.09
MONeT                          7.00           1.39
MONeT                          6.01           4.95
MONeT                          5.01          11.51

MobileNet (batch size 272)    Memory (GB)    Compute Overhead (%)
PyTorch                       14.46           0
MONeT                         10.02           2.40
MONeT                          9.01           3.10
MONeT                          8.02           4.77
MONeT                          7.01           5.53
MONeT                          6.01           7.55
MONeT                          5.01           8.72

VGG-16 (batch size 176)       Memory (GB)    Compute Overhead (%)
PyTorch                       14.12           0
MONeT                          9.71          -5.30
MONeT                          8.66          -4.64
MONeT                          7.88          -2.18
MONeT                          6.82           1.99
MONeT                          5.90           5.44
MONeT                          5.51           9.11

A negative overhead indicates that the MONeT schedule ran faster than the baseline PyTorch implementation.

Advanced MONeT usage

Obtain a Gurobi academic license from the Gurobi website; log in with a .edu email to get the free license.

  1. To create a MONeT solution:
python cvxpy_solver.py MODEL BATCH_SIZE BUDGET MODE "GUROBI" --time_limit TIME_LIMIT

  • MODEL format: "torchvision.models.<model>()"; for UNet, the format is "unet"
  • BUDGET is the memory budget in GB
  • MODE is "inplace_conv_multiway_newnode" for complete MONeT
  • TIME_LIMIT is the solver time limit in seconds
  • The flag --ablation can be added to disable checkpointing when creating a solution
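For example, the 10 GB ResNet-50 schedule from the zoo might be produced with an invocation like this (the time limit is illustrative):

python cvxpy_solver.py "torchvision.models.resnet50()" 184 10 \
        inplace_conv_multiway_newnode "GUROBI" --time_limit 86400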

  2. To profile a MONeT schedule given a solution:
python schedule.py MODEL BATCH_SIZE BUDGET MODE "GUROBI" \
        --run_bs --solution_file SOLUTION_FILE

The flag --run_bs can be replaced by --check_runtime to check the runtime of the schedule, or by --check_diff to check the gradients of MONeT against original PyTorch.
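For example, to verify that the ResNet-50 schedule above produces the same gradients as the original PyTorch model (arguments mirror the solver invocation and are illustrative):

python schedule.py "torchvision.models.resnet50()" 184 10 \
        inplace_conv_multiway_newnode "GUROBI" \
        --check_diff --solution_file data/solution_resnet50_184_inplace_conv_multiway_newnode_10.00.pkl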

Other modes may be used for experimenting with MONeT:

  • inplace_ prefix enables operator optimization
  • conv_normal selects conv-optimization
  • multiway selects output-activated optimization
  • newnode selects intermediate-activated optimization

Refer to the paper for details about the optimizations.

Citation

If you use MONeT in your work, please consider citing us as follows:

@misc{shah2020memory,
      title={Memory Optimization for Deep Networks},
      author={Aashaka Shah and Chao-Yuan Wu and Jayashree Mohan and Vijay Chidambaram and Philipp Krähenbühl},
      year={2020},
      eprint={2010.14501},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Acknowledgements

The code for UNet is taken from Pytorch-UNet by milesial. The Distributed Data Parallel training example code is borrowed from the distributed tutorial by yangkky.
