
dbaranchuk / Memory Efficient Maml

License: MIT
Memory efficient MAML using gradient checkpointing

Projects that are alternatives to or similar to Memory Efficient Maml

Optical Flow Filter
A real time optical flow algorithm implemented on GPU
Stars: ✭ 146 (+143.33%)
Mutual labels:  jupyter-notebook, gpu
Pycaret
An open-source, low-code machine learning library in Python
Stars: ✭ 4,594 (+7556.67%)
Mutual labels:  jupyter-notebook, gpu
Ml Workspace
🛠 All-in-one web-based IDE specialized for machine learning and data science.
Stars: ✭ 2,337 (+3795%)
Mutual labels:  jupyter-notebook, gpu
Cuxfilter
GPU accelerated cross filtering with cuDF.
Stars: ✭ 128 (+113.33%)
Mutual labels:  jupyter-notebook, gpu
Reinforcement learning tutorial with demo
Reinforcement Learning Tutorial with Demo: DP (Policy and Value Iteration), Monte Carlo, TD Learning (SARSA, Q-Learning), Function Approximation, Policy Gradient, DQN, Imitation, Meta Learning, Papers, Courses, etc.
Stars: ✭ 442 (+636.67%)
Mutual labels:  jupyter-notebook, meta-learning
Ipyexperiments
jupyter/ipython experiment containers for GPU and general RAM re-use
Stars: ✭ 128 (+113.33%)
Mutual labels:  jupyter-notebook, gpu
Adaptnlp
An easy to use Natural Language Processing library and framework for predicting, training, fine-tuning, and serving up state-of-the-art NLP models.
Stars: ✭ 278 (+363.33%)
Mutual labels:  jupyter-notebook, gpu
Deep Learning Boot Camp
A community run, 5-day PyTorch Deep Learning Bootcamp
Stars: ✭ 1,270 (+2016.67%)
Mutual labels:  jupyter-notebook, gpu
Trainyourownyolo
Train a state-of-the-art yolov3 object detector from scratch!
Stars: ✭ 399 (+565%)
Mutual labels:  jupyter-notebook, gpu
Adanet
Fast and flexible AutoML with learning guarantees.
Stars: ✭ 3,340 (+5466.67%)
Mutual labels:  jupyter-notebook, gpu
Ds bowl 2018
Kaggle Data Science Bowl 2018
Stars: ✭ 116 (+93.33%)
Mutual labels:  jupyter-notebook, gpu
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+9326.67%)
Mutual labels:  jupyter-notebook, gpu
Kmeans pytorch
kmeans using PyTorch
Stars: ✭ 98 (+63.33%)
Mutual labels:  jupyter-notebook, gpu
Benchmarks
Comparison tools
Stars: ✭ 139 (+131.67%)
Mutual labels:  jupyter-notebook, gpu
Pytorch
PyTorch tutorials A to Z
Stars: ✭ 87 (+45%)
Mutual labels:  jupyter-notebook, gpu
Keras Acgan
Auxiliary Classifier Generative Adversarial Networks in Keras
Stars: ✭ 196 (+226.67%)
Mutual labels:  jupyter-notebook, gpu
Glove As A Tensorflow Embedding Layer
Taking a pretrained GloVe model, and using it as a TensorFlow embedding weight layer **inside the GPU**. Therefore, you only need to send the index of the words through the GPU data transfer bus, reducing data transfer overhead.
Stars: ✭ 85 (+41.67%)
Mutual labels:  jupyter-notebook, gpu
Training Material
A collection of code examples as well as presentations for training purposes
Stars: ✭ 85 (+41.67%)
Mutual labels:  jupyter-notebook, gpu
Gdrl
Grokking Deep Reinforcement Learning
Stars: ✭ 304 (+406.67%)
Mutual labels:  jupyter-notebook, gpu
Fastai
The fastai deep learning library
Stars: ✭ 21,718 (+36096.67%)
Mutual labels:  jupyter-notebook, gpu

Memory Efficient MAML

Overview

A PyTorch implementation of Model-Agnostic Meta-Learning [1] with gradient checkpointing [2]. It allows you to perform significantly (~10-100x) more MAML steps within the same GPU memory budget.
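
To make the idea concrete, the sketch below illustrates the underlying technique on a toy least-squares problem (it is an illustration of the approach, not the library's actual code). Without checkpointing, the activations of every inner step stay alive until the outer backward pass, so memory grows linearly with the number of MAML steps; here, inner-loop SGD steps are grouped into chunks and each chunk is wrapped in torch.utils.checkpoint, so the forward pass keeps only the parameters at chunk boundaries and replays each chunk during backpropagation.

```python
# Illustration of the technique (not the library's internals): chunks of inner
# SGD steps are wrapped in torch.utils.checkpoint, so the forward pass keeps
# only the parameters at chunk boundaries and replays each chunk on backward.
import torch
import torch.nn.functional as F
from torch.utils.checkpoint import checkpoint

def functional_loss(params, x, y):
    # Toy linear model used purely for illustration.
    w, b = params
    return F.mse_loss(x @ w + b, y)

def inner_chunk(x, y, lr, *params):
    # One checkpointed chunk of inner-loop SGD steps. Grad must be re-enabled
    # because checkpoint() runs this function under no_grad in the outer forward.
    with torch.enable_grad():
        for _ in range(10):  # inner steps per chunk
            loss = functional_loss(params, x, y)
            grads = torch.autograd.grad(loss, params, create_graph=True)
            params = tuple(p - lr * g for p, g in zip(params, grads))
    return params

x, y = torch.randn(32, 8), torch.randn(32, 1)
w = torch.randn(8, 1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
params = (w, b)

for _ in range(10):  # 10 chunks x 10 steps = 100 inner steps
    # use_reentrant=True replays the chunk as a whole during backward
    # (the explicit keyword requires PyTorch >= 1.10).
    params = checkpoint(inner_chunk, x, y, 0.1, *params, use_reentrant=True)

meta_loss = functional_loss(params, x, y)
meta_loss.backward()  # gradients w.r.t. the initial w and b
print(w.grad.norm().item(), b.grad.norm().item())
```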

Install

For normal installation, run pip install torch_maml

For a development installation, clone the repository and run python setup.py develop

How to use

See examples in example.ipynb

Open In Colab
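
For a rough idea of what the notebook covers, a meta-training step has approximately the shape sketched below. The class and argument names (GradientCheckpointMAML, IngraphGradientDescent, checkpoint_steps, max_grad_grad_norm) are assumptions about the interface rather than a verified API reference; example.ipynb is the authoritative source.

```python
# Hypothetical usage sketch: the class and argument names below are assumptions
# and may not match the real API -- example.ipynb is the authoritative reference.
import torch
import torch_maml

model = torch.nn.Sequential(
    torch.nn.Linear(8, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1)
)

def compute_loss(model, data):
    x, y = data
    return torch.nn.functional.mse_loss(model(x), y)

# Differentiable ("in-graph") SGD for the inner loop (assumed name).
optimizer = torch_maml.IngraphGradientDescent(learning_rate=0.01)

# checkpoint_steps controls how many inner steps share one checkpoint (assumed name).
maml = torch_maml.GradientCheckpointMAML(
    model, compute_loss, optimizer=optimizer, checkpoint_steps=10
)

train_batch = (torch.randn(64, 8), torch.randn(64, 1))
updated_model, loss_history, _ = maml(train_batch, max_grad_grad_norm=1e2)

# Meta-objective: evaluate the adapted model and backpropagate through all
# inner steps back to the initial parameters of `model`.
meta_loss = compute_loss(updated_model, train_batch)
meta_loss.backward()
```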

Tips and tricks

  1. Make sure that your model doesn't perform implicit parameter updates, such as torch.nn.BatchNorm2d with track_running_stats=True. With gradient checkpointing, these updates are performed twice (once per forward pass). If you still want these updates, take a look at torch_maml.utils.disable_batchnorm_stats. Note that we already support this for vanilla BatchNorm{1-3}d. A quick audit of such layers is sketched after this list.

  2. When computing gradients through many MAML steps (e.g. 100 or 1000), watch out for vanishing and exploding gradients inside the optimizer, just as in RNNs. This implementation supports gradient clipping to mitigate the exploding part of the problem.

  3. Also, with a large number of MAML steps, be aware of accumulating numerical error caused by floating-point precision and, in particular, cuDNN operations. We recommend setting torch.backends.cudnn.deterministic = True (see the snippet after this list). The problem arises when gradients become slightly noisy due to these errors; during backpropagation through many MAML steps, the error can grow dramatically.

  4. You could also consider implicit-gradient MAML [3] as a memory-efficient alternative for meta-learning. While that algorithm requires even less memory, it assumes that the inner optimization converges to an optimum, so it is inapplicable if your task does not always converge by the time you start backpropagating. In contrast, our implementation lets you meta-learn even from a partially converged state.
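
Tips 1 and 3 translate into a few lines of setup code. The snippet below is a minimal sketch of that setup; the model is a placeholder, and how exactly disable_batchnorm_stats is applied is not shown (see torch_maml.utils).

```python
import torch

# Tip 3: make cuDNN deterministic so small numerical errors do not compound
# across many unrolled MAML steps.
torch.backends.cudnn.deterministic = True

# Tip 1: audit the model for layers that update running statistics during the
# forward pass; with checkpointing each forward runs twice, so those statistics
# would be updated twice per step. The model below is only a placeholder.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, kernel_size=3),
    torch.nn.BatchNorm2d(16),  # track_running_stats=True by default
    torch.nn.ReLU(),
)

for name, module in model.named_modules():
    if isinstance(module, torch.nn.modules.batchnorm._BatchNorm) and module.track_running_stats:
        print(f"{name}: tracks running stats -- consider "
              "torch_maml.utils.disable_batchnorm_stats or track_running_stats=False")
```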

References

[1] Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

[2] Gradient checkpointing technique (GitHub)

[3] Meta-Learning with Implicit Gradients
