
parasj / checkmate

License: Apache-2.0
Training neural networks in TensorFlow 2.0 with 5x less memory

Programming Languages

python
139335 projects - #7 most used programming language
Jupyter Notebook
11667 projects
Makefile
30231 projects

Projects that are alternatives to or similar to checkmate

Memtriage
Allows you to quickly query a Windows machine for RAM artifacts
Stars: ✭ 200 (+72.41%)
Mutual labels:  memory
tensorflow 2.0 tutorial
Practical tutorial for the official TensorFlow 2.0 release
Stars: ✭ 48 (-58.62%)
Mutual labels:  tensorflow2
string-combinations
A simple, low-memory footprint function to generate all string combinations from a series of characters.
Stars: ✭ 25 (-78.45%)
Mutual labels:  memory
Pubg mobile memory hacking examples
Pubg Mobile Emulator Gameloop Memory Hacking C++ code examples. Ex: Name, Coord, Bones, Weapons, Items, Box, Drop etc.
Stars: ✭ 224 (+93.1%)
Mutual labels:  memory
Superstring
A fast and memory-optimized string library for C++
Stars: ✭ 252 (+117.24%)
Mutual labels:  memory
TF2DeepFloorplan
TF2 Deep FloorPlan Recognition using a Multi-task Network with Room-boundary-Guided Attention. Enable tensorboard, quantization, flask, tflite, docker, github actions and google colab.
Stars: ✭ 98 (-15.52%)
Mutual labels:  tensorflow2
Process Governor
This application allows you to put various limits on a Windows process.
Stars: ✭ 190 (+63.79%)
Mutual labels:  memory
cache-bucket
Light Cache for nodeJs and browserJs with TTL.
Stars: ✭ 14 (-87.93%)
Mutual labels:  memory
Amonguscapture
Capture of the local Among Us executable state
Stars: ✭ 252 (+117.24%)
Mutual labels:  memory
TFLite-ModelMaker-EfficientDet-Colab-Hands-On
Hands-on materials for object detection with TensorFlow Lite Model Maker
Stars: ✭ 15 (-87.07%)
Mutual labels:  tensorflow2
Pm2 Server Monit
Monitor server CPU / Memory / Process / Zombie Process / Disk size / Security Packages / Network Input / Network Output
Stars: ✭ 247 (+112.93%)
Mutual labels:  memory
Browser Sec Whitepaper
Cure53 Browser Security White Paper
Stars: ✭ 251 (+116.38%)
Mutual labels:  memory
mmappickle
Python 3 library to store memory mappable objects into pickle-compatible files
Stars: ✭ 34 (-70.69%)
Mutual labels:  memory
Mytetra dev
MyTetra - smart cross-platform manager for information collecting / Official page:
Stars: ✭ 207 (+78.45%)
Mutual labels:  memory
AiSpace
AiSpace: Better practices for deep learning model development and deployment For Tensorflow 2.0
Stars: ✭ 28 (-75.86%)
Mutual labels:  tensorflow2
Onewirehub
OneWire slave device emulator
Stars: ✭ 195 (+68.1%)
Mutual labels:  memory
DataAugmentationTF
Implementation of modern data augmentation techniques in TensorFlow 2.x to be used in your training pipeline.
Stars: ✭ 35 (-69.83%)
Mutual labels:  tensorflow2
transformer
Build English-Vietnamese machine translation with ProtonX Transformer. :D
Stars: ✭ 41 (-64.66%)
Mutual labels:  tensorflow2
pyradox
State of the Art Neural Networks for Deep Learning
Stars: ✭ 61 (-47.41%)
Mutual labels:  tensorflow2
Grokking-Machine-Learning
This repo aims to contain different machine learning use cases along with the descriptions to the model architectures
Stars: ✭ 54 (-53.45%)
Mutual labels:  tensorflow2

See the paper! https://arxiv.org/abs/1910.02653

Checkmate breaks the GPU memory wall by enabling researchers to train large state-of-the-art models that do not fit in GPU memory. It applies optimal tensor rematerialization (as detailed in our paper at MLSys 2020) to trade off space and time.

At the moment, Checkmate only supports TensorFlow 2.0. PyTorch support is coming soon!

IF YOU ARE TRYING TO REPLICATE OUR MLSYS 2020 PAPER, USE THE mlsys20_artifact BRANCH!

Installation

Checkmate depends on:

  • TensorFlow 2.0, i.e. pip install tensorflow or pip install tensorflow-gpu.

  • CyLP solver

    Installing CyLP on Debian Linux / Ubuntu

    $ sudo apt install coinor-cbc coinor-libcbc-dev
    $ pip install cylp

    Installing CyLP on macOS

    The easiest way to set up CyLP is using Homebrew.

    $ brew tap coin-or-tools/coinor
    $ brew install coin-or-tools/coinor/cbc pkg-config
    $ pip install cylp

Once TensorFlow 2.0 and CyLP are installed, Checkmate can be installed using pip via pip install "https://github.com/parasj/checkmate/archive/master.zip#egg=checkmate".
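
As a quick sanity check after installation, a small script along the following lines can report which dependencies Python cannot find. This is only a sketch: the helper name `missing_modules` is made up for this example, and the import names `tensorflow`, `cylp`, and `checkmate` are assumed to match the packages installed above.

```python
import importlib.util

# Import names assumed to correspond to the packages installed above.
REQUIRED = ("tensorflow", "cylp", "checkmate")


def missing_modules(names=REQUIRED):
    """Return the module names that cannot be located on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```

The check only looks the modules up without importing them, so it will not catch a broken CyLP build that fails at import time.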

Quick start

Get started in 5 minutes with our TF 2.0 quickstart tutorial.

Adapt your Keras model to fit within the memory constraints of a single GPU:

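import tensorflow as tf
import checkmate

# Build any Keras model; VGG19 is used here as an example.
model = tf.keras.applications.vgg19.VGG19(...)
# (setup of loss, optimizer, sample_input, and train_ds is elided)
...

# Compile a training step for the model, loss, and optimizer.
train_iteration_fn = checkmate.tf2.compile(model, loss, optimizer,
    input_spec=sample_input[0], label_spec=sample_input[1])

# loss_value avoids shadowing the loss function passed to compile().
for image, label in train_ds:
    prediction, loss_value = train_iteration_fn(image, label)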

Key ideas

From our paper at MLSys 2020:

Modern neural networks are increasingly bottlenecked by the limited capacity of on-device
GPU memory. Prior work explores dropping activations as a strategy to scale to larger
neural networks under memory constraints. However, these heuristics assume uniform
per-layer costs and are limited to simple architectures with linear graphs, limiting their
usability. In this paper, we formalize the problem of trading-off DNN training time and
memory requirements as the tensor rematerialization optimization problem, a generalization
of prior checkpointing strategies. We introduce Checkmate, a system that solves for
optimal schedules in reasonable times (under an hour) using off-the-shelf MILP solvers,
then uses these schedules to accelerate millions of training iterations. Our method scales
to complex, realistic architectures and is hardware-aware through the use of
accelerator-specific, profile-based cost models. In addition to reducing training cost,
Checkmate enables real-world networks to be trained with up to 5.1× larger input sizes.
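
As a rough illustration of the trade-off described in the abstract, the sketch below models the simple linear-chain checkpointing setting that the abstract attributes to prior work. It is not Checkmate's MILP formulation: the cost model, the function names `evaluate` and `best_schedule`, and the brute-force search are all assumptions made for this example.

```python
from itertools import combinations


def evaluate(checkpoints, mem, cost):
    """Return (peak_memory, recompute_cost) for a linear chain of layers.

    Toy model (an assumption of this sketch): checkpointed activations stay
    resident for the whole iteration, and during the backward pass each run
    of consecutive non-checkpointed activations is recomputed once and held
    in memory until its gradients are done.
    """
    kept = set(checkpoints)
    resident = sum(mem[i] for i in kept)
    recompute = sum(cost[i] for i in range(len(mem)) if i not in kept)
    # Find the largest run of consecutive non-checkpointed activations.
    worst_segment, current = 0, 0
    for i in range(len(mem)):
        if i in kept:
            current = 0
        else:
            current += mem[i]
            worst_segment = max(worst_segment, current)
    return resident + worst_segment, recompute


def best_schedule(mem, cost, budget):
    """Brute-force the checkpoint set with the least recomputation within budget.

    Returns (checkpoints, recompute_cost, peak_memory), or None if no
    checkpoint set fits in the budget.
    """
    best = None
    for k in range(len(mem) + 1):
        for subset in combinations(range(len(mem)), k):
            peak, extra = evaluate(subset, mem, cost)
            if peak <= budget and (best is None or extra < best[1]):
                best = (subset, extra, peak)
    return best
```

With four unit-cost layers and a budget of 3 units, `best_schedule([1] * 4, [1] * 4, 3)` keeps activations 0 and 2 and recomputes the other two. The brute-force search is exponential in the number of layers and only handles a linear chain, which hints at why the paper formulates the general problem for an off-the-shelf solver instead.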

Citation

If you use Checkmate in your work, please cite us with:

@incollection{mlsys2020_196,
 author = {Jain, Paras and Jain, Ajay and Nrusimha, Aniruddha and Gholami, Amir and Abbeel, Pieter and Gonzalez, Joseph and Keutzer, Kurt and Stoica, Ion},
 booktitle = {Proceedings of Machine Learning and Systems 2020},
 pages = {497--511},
 title = {Checkmate: Breaking the Memory Wall with Optimal Tensor Rematerialization},
 year = {2020}
}