All Projects → glassroom → heinsen_routing

glassroom / heinsen_routing

Licence: MIT license
Official implementation of "An Algorithm for Routing Capsules in All Domains" (Heinsen, 2019) in PyTorch.

Programming Languages

Jupyter Notebook
11667 projects
python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to heinsen routing

capsules-tensorflow
Another implementation of Hinton's capsule networks in tensorflow.
Stars: ✭ 18 (-56.1%)
Mutual labels:  capsule-network, capsules
Capsnet Tensorflow
A Tensorflow implementation of CapsNet(Capsules Net) in paper Dynamic Routing Between Capsules
Stars: ✭ 3,776 (+9109.76%)
Mutual labels:  routing-algorithm, capsule-network
BookLibrary
Book Library of P&W Studio
Stars: ✭ 13 (-68.29%)
Mutual labels:  paper
bug-localization
Source code of the paper "Leveraging textual properties of bug reports to localize relevant source files".
Stars: ✭ 15 (-63.41%)
Mutual labels:  paper
NLP-Natural-Language-Processing
Projects and useful articles / links
Stars: ✭ 149 (+263.41%)
Mutual labels:  paper
atari-demo
Code for the blog post "Learning Montezuma’s Revenge from a Single Demonstration"
Stars: ✭ 21 (-48.78%)
Mutual labels:  paper
STACP
Joint Geographical and Temporal Modeling based on Matrix Factorization for Point-of-Interest Recommendation - ECIR 2020
Stars: ✭ 19 (-53.66%)
Mutual labels:  paper
PaperDownload
知网/万方 论文/期刊批量检索下载
Stars: ✭ 18 (-56.1%)
Mutual labels:  paper
PhD
Incremental Methods of Deep Learning for Detection and Classifcation in a Robotics Environment
Stars: ✭ 13 (-68.29%)
Mutual labels:  paper
batu-gunting-kertas-nuxt
Rock Paper Scissors Game with Artificial Intellegence
Stars: ✭ 50 (+21.95%)
Mutual labels:  paper
best AI papers 2021
A curated list of the latest breakthroughs in AI (in 2021) by release date with a clear video explanation, link to a more in-depth article, and code.
Stars: ✭ 2,740 (+6582.93%)
Mutual labels:  paper
3d-detection-papers
The papers in this list are about 3d detection, especially those using point clouds.
Stars: ✭ 77 (+87.8%)
Mutual labels:  paper
FSL-Mate
FSL-Mate: A collection of resources for few-shot learning (FSL).
Stars: ✭ 1,346 (+3182.93%)
Mutual labels:  paper
minie
An open information extraction system that provides compact extractions
Stars: ✭ 83 (+102.44%)
Mutual labels:  paper
me recognition
CapsuleNet for Micro-expression Recognition (IEEE FG 2019)
Stars: ✭ 56 (+36.59%)
Mutual labels:  capsule-network
neural-gpu
Code for the Neural GPU model originally described in "Neural GPUs Learn Algorithms"
Stars: ✭ 100 (+143.9%)
Mutual labels:  paper
modules
The official repository for our paper "Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks". We develop a method for analyzing emerging functional modularity in neural networks based on differentiable weight masks and use it to point out important issues in current-day neural networks.
Stars: ✭ 25 (-39.02%)
Mutual labels:  paper
papers
Summarizing the papers I have read (Japanese)
Stars: ✭ 38 (-7.32%)
Mutual labels:  paper
pluGET
📦 Powerful Package manager which updates plugins & server software for minecraft servers
Stars: ✭ 87 (+112.2%)
Mutual labels:  paper
Orion
Mixin loader for Paper
Stars: ✭ 46 (+12.2%)
Mutual labels:  paper

heinsen_routing

Official implementation of "An Algorithm for Routing Capsules in All Domains" (Heinsen, 2019) in PyTorch. This learning algorithm, without change, achieves state-of-the-art results in two domains, vision and language.

For example, a capsule network using this algorithm outperforms Hinton et al. (2018)'s capsule network on a visual task using fewer parameters and requiring an order of magnitude less training. A capsule network using the same algorithm outperforms BERT on a language task. In both of these examples, the same training regime was used to train the model (same hyperparameters, learning rate schedule, regularization, etc.).

You can easily add the algorithm as a new layer to any model to improve its performance. Try it!

Sample usage

Detect objects from their component parts in images:

from heinsen_routing import Routing

part_scores = torch.randn(100)       # 100 scores, one per detected part
part_poses = torch.randn(100, 4, 4)  # 100 capsules, each a 4 x 4 pose matrix

detect_objs = Routing(d_cov=4, d_inp=4, d_out=4, n_inp=100, n_out=10)
obj_scores, obj_poses, obj_poses_sig2 = detect_objs(part_scores, part_poses)

print(obj_scores)                    # 10 scores, one per detected object
print(obj_poses)                     # 10 capsules, each a 4 x 4 pose matrix

Classify sequences of token embeddings:

from heinsen_routing import Routing

tok_scores = torch.randn(n)          # token scores, n is variable
tok_embs = torch.randn(n, 1024)      # token embeddings, n is variable
tok_embs = tok_embs.unsqueeze(1)     # reshape to n x 1 x 1024 (n matrices)

classify = Routing(d_cov=1, d_inp=1024, d_out=8, n_out=2)  # variable n_inp
class_scores, class_embs, class_embs_sig2 = classify(tok_scores, tok_embs)

print(class_scores)                  # 2 scores, one per class
print(class_embs)                    # 2 capsules, each a 1 x 8 matrix

Predict variable numbers of targets:

from heinsen_routing import Routing

attr_scores = torch.randn(10)        # 10 scores
attr_caps = torch.randn(10, 1, 256)  # 10 capsules with 1 x 256 features

predict = Routing(d_cov=1, d_inp=256, d_out=64, n_inp=10)  # variable n_out
pred_scores, pred_caps, pred_caps_sig2 = predict(attr_scores, attr_caps, n_out=n)

print(pred_scores)                   # n scores, one per prediction
print(pred_caps)                     # n capsules with 1 x 64 features

Installation

  1. Download one file: heinsen_routing.py.
  2. Import the module: from heinsen_routing import Routing.
  3. Use it as shown above.

Note: requires a working installation of PyTorch.

Why?

Initial evaluations show that our learning algorithm, without change, achieves state-of-the-art results in two domains, vision and language. In our experience, this is unusual, and therefore worthy of attention and further research:

Figs. 1 and 2 from paper

Moreover, we find evidence that our learning algorithm, when we apply it to a visual recognition task, learns to perform a form of "reverse graphics." The following visualization, from our paper, shows a two-dimensional approximation of the trajectories of the pose vectors of an activated class capsule as we change viewpoint elevation of the same object from one image to the next:

Fig. 4 from paper

Our algorithm is a new, general-purpose form of "routing by agreement" (Hinton et al., 2018) which uses expectation-maximization (EM) to cluster similar votes from input capsules to output capsules in a layer of a neural network. A capsule is a group of neurons whose outputs represent different properties of the same entity in different contexts. Routing by agreement is an iterative form of clustering in which each output capsule detects an entity by looking for agreement among votes from input capsules that have already detected parts of the entity in a previous layer.

Replication of results in paper

If you wish to replicate our results, we recommend recreating our setup in a virtual Python environment, with the same versions of all libraries and dependencies. Runing the code requires at least one Nvidia GPU with 11GB+ RAM, along with a working installation of CUDA 10 or newer. The code is meant to be easily modifiable to work with greater numbers of GPUs, or with TPUs. It is also meant to be easily modifiable to work with frameworks other than PyTorch (as long as they support Einsten summation notation for describing multilinear operations), such as TensorFlow.

To replicate our environment and results, follow these steps:

  1. Change to the directory in which you cloned this repository:
cd /home/<my_name>/<my_directory>
  1. Create a new Python 3 virtual environment:
virtualenv --python=python3 python
  1. Activate the virtual environment:
source ./python/bin/activate
  1. Install required Python libraries in environment:
pip install --upgrade pip
pip install --upgrade -r requirements.txt
  1. Install other dependencies:
mkdir deps
git clone https://github.com/glassroom/torch_train_test_loop.git deps/torch_train_test_loop
git clone https://github.com/ndrplz/small_norb.git deps/small_norb
  1. Download and decompress smallNORB files:
mkdir .data
mkdir .data/smallnorb
cd .data/smallnorb
wget https://cs.nyu.edu/~ylclab/data/norb-v1.0-small/smallnorb-5x46789x9x18x6x2x96x96-training-dat.mat.gz
wget https://cs.nyu.edu/~ylclab/data/norb-v1.0-small/smallnorb-5x46789x9x18x6x2x96x96-training-cat.mat.gz
wget https://cs.nyu.edu/~ylclab/data/norb-v1.0-small/smallnorb-5x46789x9x18x6x2x96x96-training-info.mat.gz
wget https://cs.nyu.edu/~ylclab/data/norb-v1.0-small/smallnorb-5x01235x9x18x6x2x96x96-testing-dat.mat.gz
wget https://cs.nyu.edu/~ylclab/data/norb-v1.0-small/smallnorb-5x01235x9x18x6x2x96x96-testing-cat.mat.gz
wget https://cs.nyu.edu/~ylclab/data/norb-v1.0-small/smallnorb-5x01235x9x18x6x2x96x96-testing-info.mat.gz
for FILE in *.gz; do gunzip -k $FILE; done
cd ../..
  1. Run the Jupyter notebooks:

Make sure the virtual environment is activated beforehand. Also, you may want to modify the code to use more than one GPU device (recommended). You can run the notebooks non-interactively or interactively:

  • To run the notebooks non-interactively, use jupyter nbconvert --execute, optionally specifying whether you want the output, including visualizations, in nicely formatted html, pdf, or some other format. See these instructions.

  • To run the notebooks interactively, run jupyter notebook. You should see two notebooks that replicate the results in our paper. Open and run them using the Jupyter interface.

The results shown in the paper were obtained by training each model 10 times and using the end-of-training snapshot with the lowest validation error for testing. Some variability in training is normal, because each output capsule must learn to execute an expectation-maximization (EM) loop, which is known to be sensitive to initialization. As we mention in the paper, you may be able to obtain better performance with more careful tweaking of layer sizes and training regime.

Pretrained weights

We have made pretrained weights available for the smallNORB and SST models:

import torch
from models import SmallNORBClassifier, SSTClassifier

# Load pretrained smallNORM model.
model = SmallNORBClassifier(n_objs=5, n_parts=64, d_chns=64)
model.load_state_dict(torch.load('smallNORB_pretrained_model_state_dict.pt'))

# Load SST model pretrained on binary dataset.
model = SSTClassifier(d_depth=37, d_emb=1280, d_inp=64, d_cap=2, n_parts=64, n_classes=2)
model.load_state_dict(torch.load('SST2R_pretrained_model_state_dict.pt'))

# Load SST model pretrained on fine-grained dataset.
model = SSTClassifier(d_depth=37, d_emb=1280, d_inp=64, d_cap=2, n_parts=64, n_classes=5)
model.load_state_dict(torch.load('SST5R_pretrained_model_state_dict.pt'))

Notes

Our paper is still work-in-progress, subject to revision. Comments and suggestions are welcome! We typeset the paper with the ACL conference's LaTeX template for no other reason than we find its two-column format, with fewer words per line, easier to read. We briefly considered submitting our work to an academic conference, but by the time we had finished running illustrative evaluations on academic datasets, documenting and writing up the results, and removing all traces of internal code, the deadline for NIPS had passed and we didn't want to wait much longer. We decided to post everything here and let the work speak for itself.

We have tested our code only on Ubuntu Linux 18.04 with Python 3.6+.

Citing

If our work is helpful to your research, please cite it:

@misc{heinsen2019algorithm,
    title={An Algorithm for Routing Capsules in All Domains},
    author={Franz A. Heinsen},
    year={2019},
    eprint={1911.00792},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

How is this used at GlassRoom?

We conceived and implemented this routing algorithm to be a component (i.e., a layer) of larger models that are in turn part of our AI software, nicknamed Graham. Our implementation of the algorithm is designed to be plugged into or tacked onto existing PyTorch models with minimal hassle.

Most of the original work we do at GlassRoom tends to be either proprietary in nature or tightly coupled to internal code, so we cannot share it with outsiders. In this case, however, we were able to isolate our code and release it as stand-alone open-source software without having to disclose any key intellectual property.

We hope others find our work and our code useful.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].