
lucidrains / memory-compressed-attention

License: MIT
Implementation of Memory-Compressed Attention, from the paper "Generating Wikipedia by Summarizing Long Sequences"

Programming Languages

Python

Projects that are alternatives of or similar to memory-compressed-attention

Lightnetplusplus
LightNet++: Boosted Light-weighted Networks for Real-time Semantic Segmentation
Stars: ✭ 218 (+363.83%)
Mutual labels:  attention-mechanism
lstm-attention
Attention-based bidirectional LSTM for Classification Task (ICASSP)
Stars: ✭ 87 (+85.11%)
Mutual labels:  attention-mechanism
SA-DL
Sentiment analysis with deep learning models, implemented with TensorFlow and Keras.
Stars: ✭ 35 (-25.53%)
Mutual labels:  attention-mechanism
X Transformers
A simple but complete full-attention transformer with a set of promising experimental features from various papers
Stars: ✭ 211 (+348.94%)
Mutual labels:  attention-mechanism
Aoanet
Code for paper "Attention on Attention for Image Captioning". ICCV 2019
Stars: ✭ 242 (+414.89%)
Mutual labels:  attention-mechanism
Transformers-RL
An easy PyTorch implementation of "Stabilizing Transformers for Reinforcement Learning"
Stars: ✭ 107 (+127.66%)
Mutual labels:  attention-mechanism
Keras Attention Mechanism
Attention mechanism implementation for Keras.
Stars: ✭ 2,504 (+5227.66%)
Mutual labels:  attention-mechanism
STAM-pytorch
Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification
Stars: ✭ 109 (+131.91%)
Mutual labels:  attention-mechanism
Attentionalpoolingaction
Code/Model release for NIPS 2017 paper "Attentional Pooling for Action Recognition"
Stars: ✭ 248 (+427.66%)
Mutual labels:  attention-mechanism
Im2LaTeX
An implementation of the Show, Attend and Tell paper in TensorFlow, for the OpenAI Im2LaTeX suggested problem
Stars: ✭ 16 (-65.96%)
Mutual labels:  attention-mechanism
Triplet Attention
Official PyTorch Implementation for "Rotate to Attend: Convolutional Triplet Attention Module." [WACV 2021]
Stars: ✭ 222 (+372.34%)
Mutual labels:  attention-mechanism
Linformer Pytorch
My take on a practical implementation of Linformer for PyTorch.
Stars: ✭ 239 (+408.51%)
Mutual labels:  attention-mechanism
question-generation
Neural Models for Key Phrase Detection and Question Generation
Stars: ✭ 29 (-38.3%)
Mutual labels:  attention-mechanism
Dalle Pytorch
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in PyTorch
Stars: ✭ 3,661 (+7689.36%)
Mutual labels:  attention-mechanism
Video-Description-with-Spatial-Temporal-Attention
[ACM MM 2017 & IEEE TMM 2020] This is the Theano code for the paper "Video Description with Spatial Temporal Attention"
Stars: ✭ 53 (+12.77%)
Mutual labels:  attention-mechanism
Neat Vision
Neat (Neural Attention) Vision is a visualization tool for the attention mechanisms of deep-learning models for Natural Language Processing (NLP) tasks. (framework-agnostic)
Stars: ✭ 213 (+353.19%)
Mutual labels:  attention-mechanism
DARNN
A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction
Stars: ✭ 90 (+91.49%)
Mutual labels:  attention-mechanism
Neural-Chatbot
A Neural Network based Chatbot
Stars: ✭ 68 (+44.68%)
Mutual labels:  attention-mechanism
amta-net
Asymmetric Multi-Task Attention Network for Prostate Bed Segmentation in CT Images
Stars: ✭ 26 (-44.68%)
Mutual labels:  attention-mechanism
TianChi AIEarth
TianChi AIEarth Contest Solution
Stars: ✭ 57 (+21.28%)
Mutual labels:  attention-mechanism

Memory Compressed Attention

Implementation of the self-attention layer of the proposed Memory-Compressed Attention, in PyTorch. This repository offers both the causal and non-causal variants, and takes care of padding when the sequence length is not divisible by the compression factor.
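
As a rough illustration of that padding step (a minimal sketch, not the library's internals; compress_kv and its arguments are invented for this example), the keys/values can be right-padded so the sequence length becomes divisible by the compression factor before a strided 1D convolution compresses them, as in the paper:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical helper: pad the sequence dimension, then compress with a
# Conv1d whose kernel size and stride both equal the compression factor.
def compress_kv(kv, conv, compression_factor):
    b, n, d = kv.shape                       # kv: (batch, seq_len, dim)
    remainder = n % compression_factor
    if remainder > 0:
        kv = F.pad(kv, (0, 0, 0, compression_factor - remainder))  # pad seq dim
    # Conv1d expects (batch, dim, seq_len)
    return conv(kv.transpose(1, 2)).transpose(1, 2)

dim, factor = 512, 3
conv = nn.Conv1d(dim, dim, kernel_size = factor, stride = factor)
kv = torch.randn(1, 1024, dim)               # 1024 is not divisible by 3
print(compress_kv(kv, conv, factor).shape)   # torch.Size([1, 342, 512])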

The code also resolves an edge case where, in the auto-regressive scenario, the very first query has no keys to attend to. The solution is to append null key/values to the final compressed set, so that every query always has at least one key to attend to.
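
A sketch of that null key/value trick (illustrative shapes and names, not the repository's code): a null pair, learned in practice, is concatenated onto the compressed keys/values along the sequence dimension, so the attention softmax always has at least one position available:

import torch

b, h, n_c, d = 1, 8, 342, 64        # batch, heads, compressed length, head dim
k = torch.randn(b, h, n_c, d)
v = torch.randn(b, h, n_c, d)

# In the real layer these would be learned nn.Parameter tensors
null_k = torch.zeros(1, h, 1, d)
null_v = torch.zeros(1, h, 1, d)

k = torch.cat((null_k.expand(b, -1, -1, -1), k), dim = 2)  # (b, h, n_c + 1, d)
v = torch.cat((null_v.expand(b, -1, -1, -1), v), dim = 2)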

Install

$ pip install memory_compressed_attention

Usage

import torch
from memory_compressed_attention import MemoryCompressedAttention

attn = MemoryCompressedAttention(
    dim = 512,
    heads = 8,                 # number of heads
    causal = False,            # auto-regressive or not
    compression_factor = 3,    # compression ratio
    dropout = 0.1              # post-attention dropout
)

x = torch.randn(1, 1024, 512)
mask = torch.ones(1, 1024).bool()

attn(x, input_mask = mask) # (1, 1024, 512)
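
For the auto-regressive variant, the only change to the API shown above is the causal flag (a usage sketch, assuming input_mask is optional as in the example above):

causal_attn = MemoryCompressedAttention(
    dim = 512,
    heads = 8,
    causal = True,             # causal masking for auto-regressive use
    compression_factor = 3
)

causal_attn(torch.randn(1, 1024, 512)) # (1, 1024, 512)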

Citations

@misc{liu2018generating,
    title={Generating Wikipedia by Summarizing Long Sequences},
    author={Peter J. Liu and Mohammad Saleh and Etienne Pot and Ben Goodrich and Ryan Sepassi and Lukasz Kaiser and Noam Shazeer},
    year={2018},
    eprint={1801.10198},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}