szagoruyko / Attention Transfer

Improving Convolutional Networks via Attention Transfer (ICLR 2017)

Projects that are alternatives of or similar to Attention Transfer

Graph attention pool
Attention over nodes in Graph Neural Networks using PyTorch (NeurIPS 2019)
Stars: ✭ 186 (-84.89%)
Mutual labels:  jupyter-notebook, attention
Ner Bert
BERT-NER (nert-bert) with google bert https://github.com/google-research.
Stars: ✭ 339 (-72.46%)
Mutual labels:  jupyter-notebook, attention
Doc Han Att
Hierarchical Attention Networks for Chinese Sentiment Classification
Stars: ✭ 206 (-83.27%)
Mutual labels:  jupyter-notebook, attention
Hey Jetson
Deep Learning based Automatic Speech Recognition with attention for the Nvidia Jetson.
Stars: ✭ 161 (-86.92%)
Mutual labels:  jupyter-notebook, attention
Attentive Neural Processes
implementing "recurrent attentive neural processes" to forecast power usage (w. LSTM baseline, MCDropout)
Stars: ✭ 33 (-97.32%)
Mutual labels:  jupyter-notebook, attention
Attentionn
All about attention in neural networks. Soft attention, attention maps, local and global attention and multi-head attention.
Stars: ✭ 175 (-85.78%)
Mutual labels:  jupyter-notebook, attention
Pytorch Seq2seq
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
Stars: ✭ 3,418 (+177.66%)
Mutual labels:  jupyter-notebook, attention
Bertqa Attention On Steroids
BertQA - Attention on Steroids
Stars: ✭ 112 (-90.9%)
Mutual labels:  jupyter-notebook, attention
Pytorch Gat
My implementation of the original GAT paper (Veličković et al.). I've additionally included the playground.py file for visualizing the Cora dataset, GAT embeddings, an attention mechanism, and entropy histograms. I've supported both Cora (transductive) and PPI (inductive) examples!
Stars: ✭ 908 (-26.24%)
Mutual labels:  jupyter-notebook, attention
Pytorch Original Transformer
My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing otherwise seemingly hard concepts. Currently included IWSLT pretrained models.
Stars: ✭ 411 (-66.61%)
Mutual labels:  jupyter-notebook, attention
Multihead Siamese Nets
Implementation of Siamese Neural Networks built upon multihead attention mechanism for text semantic similarity task.
Stars: ✭ 144 (-88.3%)
Mutual labels:  jupyter-notebook, attention
Nlp Tutorial
Natural Language Processing Tutorial for Deep Learning Researchers
Stars: ✭ 9,895 (+703.82%)
Mutual labels:  jupyter-notebook, attention
Chinese Chatbot
A Chinese chatbot trained on 100,000 dialogue pairs with an attention mechanism; it generates a meaningful reply to most everyday questions. The trained model is uploaded and runs out of the box. (If it doesn't run, I'll livestream myself eating my keyboard.)
Stars: ✭ 124 (-89.93%)
Mutual labels:  jupyter-notebook, attention
Rnn For Joint Nlu
Pytorch implementation of "Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling" (https://arxiv.org/abs/1609.01454)
Stars: ✭ 176 (-85.7%)
Mutual labels:  jupyter-notebook, attention
Nlp Models Tensorflow
Gathers machine learning and Tensorflow deep learning models for NLP problems, 1.13 < Tensorflow < 2.0
Stars: ✭ 1,603 (+30.22%)
Mutual labels:  jupyter-notebook, attention
Jddc solution 4th
4th-place solution to the 2018 JDDC competition
Stars: ✭ 235 (-80.91%)
Mutual labels:  jupyter-notebook, attention
Deep learning nlp
Keras, PyTorch, and NumPy Implementations of Deep Learning Architectures for NLP
Stars: ✭ 407 (-66.94%)
Mutual labels:  jupyter-notebook, attention
Deeplearning Nlp Models
A small, interpretable codebase containing the re-implementation of a few "deep" NLP models in PyTorch. Colab notebooks to run with GPUs. Models: word2vec, CNNs, transformer, gpt.
Stars: ✭ 64 (-94.8%)
Mutual labels:  jupyter-notebook, attention
Machine Learning
My Attempt(s) In The World Of ML/DL....
Stars: ✭ 78 (-93.66%)
Mutual labels:  jupyter-notebook, attention
Quantum machine learning live
This is the code for "Quantum Machine Learning LIVE" By Siraj Raval on Youtube
Stars: ✭ 80 (-93.5%)
Mutual labels:  jupyter-notebook

Attention Transfer

PyTorch code for "Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer" https://arxiv.org/abs/1612.03928
Conference paper at ICLR 2017: https://openreview.net/forum?id=Sks9_ajex
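
The central idea: collapse a convolutional activation tensor into a spatial attention map by summing powers of its absolute values over channels, then train the student to match the teacher's L2-normalized maps. A minimal sketch of the activation-based loss, following the paper's formulation (function names here are illustrative):

import torch
import torch.nn.functional as F

def attention_map(a, p=2):
    # Collapse C x H x W activations to an H x W map: sum over channels
    # of |A|^p (p = 2 in most of the paper's experiments).
    return a.abs().pow(p).sum(dim=1)

def at_loss(student_act, teacher_act):
    # Flatten each map and L2-normalize per sample before comparing,
    # so the loss is invariant to activation scale and channel count.
    qs = F.normalize(attention_map(student_act).flatten(1))
    qt = F.normalize(attention_map(teacher_act).flatten(1))
    return (qs - qt).pow(2).mean()

# Example with random activations of matching spatial size:
s = torch.randn(8, 16, 32, 32)   # student features (fewer channels)
t = torch.randn(8, 64, 32, 32)   # teacher features
print(at_loss(s, t))

Note that student and teacher only need to agree on spatial size, not channel count, since the channel dimension is collapsed before comparison.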

What's in this repo so far:

  • Activation-based AT code for CIFAR-10 experiments
  • Code for ImageNet experiments (ResNet-18-ResNet-34 student-teacher)
  • Jupyter notebook to visualize attention maps of ResNet-34 visualize-attention.ipynb
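
The notebook linked above overlays such maps on input images. A minimal sketch of the same idea using forward hooks, with torchvision's pretrained ResNet-34 as a stand-in for the repo's own model definition:

import torch
import torch.nn.functional as F
from torchvision import models

net = models.resnet34(weights='IMAGENET1K_V1').eval()

# Capture the output of each residual stage with forward hooks.
feats = {}
def save_to(name):
    def hook(module, inputs, output):
        feats[name] = output.detach()
    return hook

for name in ('layer1', 'layer2', 'layer3', 'layer4'):
    getattr(net, name).register_forward_hook(save_to(name))

x = torch.randn(1, 3, 224, 224)  # stand-in for a normalized input image
with torch.no_grad():
    net(x)

# Sum of squared activations over channels, upsampled to input resolution
# so the map can be overlaid on the image.
for name, a in feats.items():
    att = a.pow(2).sum(dim=1, keepdim=True)
    att = F.interpolate(att, size=x.shape[-2:], mode='bilinear',
                        align_corners=False)
    print(name, tuple(att.shape))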

Coming:

  • grad-based AT
  • Scenes and CUB activation-based AT code

The code uses PyTorch (https://pytorch.org). Note that the original experiments were done using torch-autograd. We have so far validated that the CIFAR-10 experiments are exactly reproducible in PyTorch, and we are in the process of doing the same for ImageNet (results there are currently very slightly worse in PyTorch, due to hyperparameters).

bibtex:

@inproceedings{Zagoruyko2017AT,
    author = {Sergey Zagoruyko and Nikos Komodakis},
    title = {Paying More Attention to Attention: Improving the Performance of
             Convolutional Neural Networks via Attention Transfer},
    booktitle = {ICLR},
    url = {https://arxiv.org/abs/1612.03928},
    year = {2017}}

Requirements

First install PyTorch, then install torchnet:

pip install git+https://github.com/pytorch/tnt.git@master

then install other Python packages:

pip install -r requirements.txt

Experiments

CIFAR-10

This section describes how to reproduce the results in Table 1 of the paper.

First, train teachers:

python cifar.py --save logs/resnet_40_1_teacher --depth 40 --width 1
python cifar.py --save logs/resnet_16_2_teacher --depth 16 --width 2
python cifar.py --save logs/resnet_40_2_teacher --depth 40 --width 2

To train with activation-based AT do:

python cifar.py --save logs/at_16_1_16_2 --teacher_id resnet_16_2_teacher --beta 1e+3
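
Here --beta weights the attention-transfer terms against cross-entropy: the paper's objective adds beta/2 times the summed per-group AT distances to the usual classification loss. A self-contained sketch of the combined objective (names illustrative; channel-wise mean rather than sum is used below, which is equivalent after normalization):

import torch.nn.functional as F

def at(x):
    # L2-normalized attention map per sample.
    return F.normalize(x.pow(2).mean(dim=1).flatten(1))

def at_objective(logits, targets, student_feats, teacher_feats, beta=1e3):
    # student_feats / teacher_feats: matched lists of feature maps,
    # one pair per residual group.
    ce = F.cross_entropy(logits, targets)
    at_sum = sum((at(s) - at(t)).pow(2).mean()
                 for s, t in zip(student_feats, teacher_feats))
    return ce + beta / 2 * at_sum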

To train with KD:

python cifar.py --save logs/kd_16_1_16_2 --teacher_id resnet_16_2_teacher --alpha 0.9
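
--alpha weights the soft-target term in standard Hinton-style knowledge distillation. A minimal sketch, assuming a softmax temperature T (T = 4 below is illustrative, not read from the repo's defaults):

import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, alpha=0.9, T=4.0):
    # Soft targets: KL divergence between temperature-softened
    # distributions, scaled by T^2 so gradient magnitudes stay
    # comparable across temperatures.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction='batchmean') * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard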

We plan to add AT+KD with decaying beta soon; this combination gives the best knowledge transfer results.

ImageNet

Pretrained model

We provide a ResNet-18 model pretrained with activation-based AT:

Model                    val error (top-1, top-5)
ResNet-18                30.4, 10.8
ResNet-18-ResNet-34-AT   29.3, 10.0

Download link: https://s3.amazonaws.com/modelzoo-networks/resnet-18-at-export.pth

Model definition: https://github.com/szagoruyko/functional-zoo/blob/master/resnet-18-at-export.ipynb
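
The export follows the functional-zoo convention of a flat dict of named parameter tensors rather than a full nn.Module state_dict; a minimal sketch of inspecting it, assuming that layout:

import torch

params = torch.load('resnet-18-at-export.pth', map_location='cpu')
for name, tensor in list(params.items())[:5]:
    print(name, tuple(tensor.shape))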

Convergence plot:

Train from scratch

Download pretrained weights for ResNet-34 (see also functional-zoo for more information):

wget https://s3.amazonaws.com/modelzoo-networks/resnet-34-export.pth

Prepare the data following fb.resnet.torch and run training (e.g. using 2 GPUs):

python imagenet.py --imagenetpath ~/ILSVRC2012 --depth 18 --width 1 \
                   --teacher_params resnet-34-export.pth --gpu_id 0,1 --ngpu 2 \
                   --beta 1e+3