
pomonam / Attentioncluster

Licence: apache-2.0
TensorFlow Implementation of "Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification"

Programming Languages

python

Projects that are alternatives to or similar to AttentionCluster

Punctuator2
A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text
Stars: ✭ 483 (+1363.64%)
Mutual labels:  attention
Text Classification
Implementation of papers for text classification task on DBpedia
Stars: ✭ 682 (+1966.67%)
Mutual labels:  attention
Nlp tensorflow project
Uses TensorFlow to implement several NLP projects, e.g. classification, chatbot, NER, attention, QA, etc.
Stars: ✭ 27 (-18.18%)
Mutual labels:  attention
Performer Pytorch
An implementation of Performer, a linear attention-based transformer, in Pytorch
Stars: ✭ 546 (+1554.55%)
Mutual labels:  attention
Vad
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Stars: ✭ 622 (+1784.85%)
Mutual labels:  attention
Tf Rnn Attention
Tensorflow implementation of attention mechanism for text classification tasks.
Stars: ✭ 735 (+2127.27%)
Mutual labels:  attention
Rnn Nlu
A TensorFlow implementation of Recurrent Neural Networks for Sequence Classification and Sequence Labeling
Stars: ✭ 463 (+1303.03%)
Mutual labels:  attention
Defactonlp
DeFactoNLP: An Automated Fact-checking System that uses Named Entity Recognition, TF-IDF vector comparison and Decomposable Attention models.
Stars: ✭ 30 (-9.09%)
Mutual labels:  attention
Awesome Fast Attention
List of efficient attention modules
Stars: ✭ 627 (+1800%)
Mutual labels:  attention
Cell Detr
Official and maintained implementation of the paper Attention-Based Transformers for Instance Segmentation of Cells in Microstructures [BIBM 2020].
Stars: ✭ 26 (-21.21%)
Mutual labels:  attention
Speech Transformer
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
Stars: ✭ 565 (+1612.12%)
Mutual labels:  attention
Simplecvreproduction
Reproductions of simple CV projects, including attention modules, classification, object detection, segmentation, keypoint detection, tracking 😄, etc.
Stars: ✭ 602 (+1724.24%)
Mutual labels:  attention
Spatial Transformer Network
A Tensorflow implementation of Spatial Transformer Networks.
Stars: ✭ 794 (+2306.06%)
Mutual labels:  attention
Residual Attention Network
Residual Attention Network for Image Classification
Stars: ✭ 525 (+1490.91%)
Mutual labels:  attention
Isab Pytorch
An implementation of (Induced) Set Attention Block, from the Set Transformers paper
Stars: ✭ 21 (-36.36%)
Mutual labels:  attention
Chinesenre
Chinese entity and relation extraction, PyTorch, BiLSTM + attention
Stars: ✭ 463 (+1303.03%)
Mutual labels:  attention
Nlp paper study
Careful reading of top-conference papers and reproduction of their code
Stars: ✭ 691 (+1993.94%)
Mutual labels:  attention
Attentive Neural Processes
Implementing "recurrent attentive neural processes" to forecast power usage (with an LSTM baseline and MC Dropout)
Stars: ✭ 33 (+0%)
Mutual labels:  attention
Banglatranslator
Bangla Machine Translator
Stars: ✭ 21 (-36.36%)
Mutual labels:  attention
Pytorch Gat
My implementation of the original GAT paper (Veličković et al.). I've additionally included the playground.py file for visualizing the Cora dataset, GAT embeddings, an attention mechanism, and entropy histograms. I've supported both Cora (transductive) and PPI (inductive) examples!
Stars: ✭ 908 (+2651.52%)
Mutual labels:  attention

AttentionCluster

This code implements attention clusters with the shifting operation. It was developed on top of the starter code provided by Google AI. A detailed table of contents and descriptions can be found in the original repository.

The module was implemented and tested with TensorFlow 1.8.0. AttentionCluster is distributed under the Apache-2.0 license (see the LICENCE file).
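
For intuition, the following is a minimal TensorFlow 1.x sketch of a single attention-cluster layer with the shifting operation from [1]. The function name, variable shapes, and the scalar/vector forms of the shift parameters are assumptions for illustration, not the code used in this repository.

import tensorflow as tf

def attention_cluster(features, num_clusters, shift=True):
    # Illustrative sketch only, not the repository's exact code.
    # features: float tensor of shape [batch, num_frames, feature_dim].
    # Returns a [batch, num_clusters * feature_dim] representation.
    feature_dim = features.shape.as_list()[-1]
    outputs = []
    for k in range(num_clusters):
        with tf.variable_scope("attention_unit_%d" % k):
            # One attention unit: a scalar score per frame, softmax over frames.
            scores = tf.layers.dense(features, 1, use_bias=False)       # [B, T, 1]
            weights = tf.nn.softmax(tf.squeeze(scores, axis=-1))        # [B, T]
            pooled = tf.reduce_sum(
                tf.expand_dims(weights, -1) * features, axis=1)         # [B, D]
            if shift:
                # Shifting operation: learnable scale and bias, L2-normalise,
                # then divide by sqrt(num_clusters) so clusters stay comparable.
                alpha = tf.get_variable("alpha", [], initializer=tf.ones_initializer())
                beta = tf.get_variable("beta", [feature_dim],
                                       initializer=tf.zeros_initializer())
                pooled = tf.nn.l2_normalize(alpha * pooled + beta, axis=-1)
                pooled /= tf.sqrt(tf.cast(num_clusters, tf.float32))
            outputs.append(pooled)
    return tf.concat(outputs, axis=1)

The concatenated cluster outputs are what the classifier (e.g. the MoE described below) consumes.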

Differences from the original paper

  • The repository uses the YouTube-8M dataset; the original paper uses Flash-MNIST.
  • Empirically, I found that a batch normalization layer in the attention mechanism increases the convergence time and the GAP.
  • In between the MoE layers, I used the wide context gating developed in [2] (a minimal sketch follows this list).
  • Dropout layers were used extensively to prevent overfitting, inspired by [3].
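
As a rough illustration of the gating mentioned above, the context gating of [2] multiplies a feature vector element-wise by learned sigmoid gates. The sketch below assumes a plain dense layer and a hypothetical function name; it is not the repository's exact implementation.

import tensorflow as tf

def context_gating(inputs):
    # Sketch of context gating from [2]: gates = sigmoid(W x + b),
    # output = gates * x (element-wise). Illustrative only; the function
    # name and layer choice are not taken from this repository.
    gates = tf.layers.dense(inputs,
                            inputs.shape.as_list()[-1],
                            activation=tf.nn.sigmoid)
    return gates * inputs

In [2], gating of this form is applied to the pooled features and to the classifier output to model dependencies between activations.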

Training

The training dataset is available on Google Cloud Platform. To use the following command, first download the Google Cloud SDK. Early stopping is recommended.

gcloud ml-engine local train \
  --package-path=youtube-8m --module-name=youtube-8m.train -- \
  --train_data_pattern='gs://youtube8m-ml-us-east1/2/frame/train/train*.tfrecord' \
  --frame_features=True --feature_names='rgb,audio' --feature_sizes='1024,128' \
  --model=AttentionClusterModule --train_dir=AttentionClusterModule \
  --batch_size=128 --base_learning_rate=0.0002 \
  --learning_rate_decay_examples=2000000 --learning_rate_decay=0.85 \
  --num_epochs=4 --max_step=400000 --runtime-version=1.8 \
  --video_cluster_size=128 --audio_cluster_size=16 --shift_operation=True \
  --filter_size=2 --cluster_dropout=0.7 --ff_dropout=0.8 \
  --hidden_size=512 --moe_num_mixtures=2 --moe_l2=1e-6

Evaluation

The validation and test datasets are also available on Google Cloud Platform. With these parameter settings, I was able to achieve a GAP of 86.8 on the test data.

gcloud ml-engine local train \
  --package-path=youtube-8m --module-name=youtube-8m.eval -- \
  --eval_data_pattern='gs://youtube8m-ml-us-east1/2/frame/validate/validate*.tfrecord' \
  --frame_features=True --feature_names='rgb,audio' --feature_sizes='1024,128' \
  --model=AttentionClusterModule --train_dir=AttentionClusterModule \
  --batch_size=128 --run_once=True \
  --base_learning_rate=0.0002 --learning_rate_decay_examples=2000000 \
  --learning_rate_decay=0.85 --num_epochs=4 --max_step=400000 \
  --video_cluster_size=128 --audio_cluster_size=16 --shift_operation=True \
  --filter_size=2 --cluster_dropout=0.7 --ff_dropout=0.8 \
  --hidden_size=512 --moe_num_mixtures=2 --moe_l2=1e-6
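
For reference, the GAP reported above is the YouTube-8M Global Average Precision: the top-k (k = 20) predictions of every video are pooled into a single global list, ranked by confidence, and average precision is computed over that list. A rough NumPy sketch under those assumptions (not the evaluation routine shipped with the starter code):

import numpy as np

def global_average_precision(predictions, labels, top_k=20):
    # Rough sketch of YouTube-8M GAP (illustrative, not the starter code):
    # pool each video's top_k predictions into one global list, rank by
    # confidence, and compute average precision over that list.
    # predictions, labels: arrays of shape [num_videos, num_classes].
    confidences, hits = [], []
    for pred, lab in zip(predictions, labels):
        top = np.argsort(pred)[::-1][:top_k]
        confidences.extend(pred[top])
        hits.extend(lab[top])
    order = np.argsort(confidences)[::-1]
    hits = np.asarray(hits, dtype=np.float64)[order]
    precisions = np.cumsum(hits) / (np.arange(len(hits)) + 1.0)
    total_positives = max(float(labels.sum()), 1.0)
    return float(np.sum(precisions * hits) / total_positives)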

References

Please note that I am not the author of the following references.

[1] Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification, https://arxiv.org/abs/1711.09550
[2] Learnable pooling with Context Gating for video classification, https://arxiv.org/abs/1706.06905
[3] Attention Is All You Need, https://arxiv.org/abs/1706.03762

Changes

  • 1.00 (05 August 2018)
    • Initial public release
