
lucidrains / linformer

License: MIT License
Implementation of Linformer for PyTorch


Projects that are alternatives to or similar to linformer

En-transformer
Implementation of E(n)-Transformer, which extends the ideas of Welling's E(n)-Equivariant Graph Neural Network to attention
Stars: ✭ 131 (+10.08%)
Mutual labels:  transformer, attention-mechanism
pynmt
A simple and complete PyTorch implementation of a neural machine translation system
Stars: ✭ 13 (-89.08%)
Mutual labels:  transformer, attention-mechanism
h-transformer-1d
Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning
Stars: ✭ 121 (+1.68%)
Mutual labels:  transformer, attention-mechanism
Self Attention Cv
Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.
Stars: ✭ 209 (+75.63%)
Mutual labels:  transformer, attention-mechanism
enformer-pytorch
Implementation of Enformer, DeepMind's attention network for predicting gene expression, in PyTorch
Stars: ✭ 146 (+22.69%)
Mutual labels:  transformer, attention-mechanism
Transformers-RL
An easy PyTorch implementation of "Stabilizing Transformers for Reinforcement Learning"
Stars: ✭ 107 (-10.08%)
Mutual labels:  transformer, attention-mechanism
NLP-paper
🎨🎨 NLP (natural language processing) tutorials 🎨🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (-80.67%)
Mutual labels:  transformer, attention-mechanism
Transformer In Generating Dialogue
An implementation of 'Attention Is All You Need' with a Chinese corpus
Stars: ✭ 121 (+1.68%)
Mutual labels:  transformer, attention-mechanism
OverlapPredator
[CVPR 2021, Oral] PREDATOR: Registration of 3D Point Clouds with Low Overlap.
Stars: ✭ 293 (+146.22%)
Mutual labels:  transformer, attention-mechanism
visualization
A collection of visualization functions
Stars: ✭ 189 (+58.82%)
Mutual labels:  transformer, attention-mechanism
Linear Attention Transformer
Transformer based on a variant of attention with linear complexity with respect to sequence length
Stars: ✭ 205 (+72.27%)
Mutual labels:  transformer, attention-mechanism
Image-Caption
Using an LSTM or Transformer to solve image captioning in PyTorch
Stars: ✭ 36 (-69.75%)
Mutual labels:  transformer, attention-mechanism
Eeg Dl
A deep learning library for EEG signal classification tasks, based on TensorFlow.
Stars: ✭ 165 (+38.66%)
Mutual labels:  transformer, attention-mechanism
TianChi AIEarth
TianChi AIEarth Contest Solution
Stars: ✭ 57 (-52.1%)
Mutual labels:  transformer, attention-mechanism
Routing Transformer
Fully featured implementation of Routing Transformer
Stars: ✭ 149 (+25.21%)
Mutual labels:  transformer, attention-mechanism
CrabNet
Predict materials properties using only the composition information!
Stars: ✭ 57 (-52.1%)
Mutual labels:  transformer, attention-mechanism
Eqtransformer
EQTransformer, a Python package for earthquake signal detection and phase picking using AI.
Stars: ✭ 95 (-20.17%)
Mutual labels:  transformer, attention-mechanism
dodrio
Exploring attention weights in transformer-based models with linguistic knowledge.
Stars: ✭ 233 (+95.8%)
Mutual labels:  transformer, attention-mechanism
FragmentVC
Any-to-any voice conversion by end-to-end extracting and fusing fine-grained voice fragments with attention
Stars: ✭ 134 (+12.61%)
Mutual labels:  transformer, attention-mechanism

Linformer for PyTorch

An implementation of Linformer in PyTorch. Linformer comes with two deficiencies: (1) it does not work for the auto-regressive case, and (2) it assumes a fixed sequence length. However, if benchmarks show it performs well enough, it will be added to this repository as a self-attention layer to be used in the encoder.
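
Both deficiencies follow from how Linformer achieves linear complexity: a learned (n × k) matrix projects the keys and values from sequence length n down to a fixed k, so attention costs O(n·k) rather than O(n²). Below is a minimal, self-contained sketch of that core idea from the paper; it is illustrative only (all tensor names are made up) and is not this library's internal code.

import torch

# Sketch of Linformer attention: keys/values are compressed from sequence
# length n down to k positions before attention, so cost is O(n * k)
n, k, d = 4096, 256, 64                   # sequence length, projection length, head dim

q      = torch.randn(1, n, d)
keys   = torch.randn(1, n, d)
values = torch.randn(1, n, d)

E = torch.randn(n, k) / n ** 0.5          # stand-in for the learned (n x k) projection

keys_p   = torch.einsum('bnd,nk->bkd', keys, E)     # (1, k, d)
values_p = torch.einsum('bnd,nk->bkd', values, E)   # (1, k, d)

attn = torch.einsum('bnd,bkd->bnk', q, keys_p) / d ** 0.5
attn = attn.softmax(dim = -1)             # (1, n, k): linear in n, not quadratic
out  = torch.einsum('bnk,bkd->bnd', attn, values_p)
print(out.shape)                          # torch.Size([1, 4096, 64])

Because E is sized (n × k), the model is tied to one sequence length, and because each query mixes information from all positions at once, causal masking cannot be applied: hence the two deficiencies above.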

Linformer has been put into production by Facebook!

Install

$ pip install linformer

Usage

Linformer language model

import torch
from linformer import LinformerLM

model = LinformerLM(
    num_tokens = 20000,
    dim = 512,
    seq_len = 4096,
    depth = 12,
    heads = 8,
    dim_head = 128,        # set the dimension of each head in multi-head attention
    k = 256,               # the length that keys/values are projected to along the sequence dimension
    one_kv_head = True,    # share one key/value head across all query heads
    share_kv = False,      # share the same projection for keys and values
    reversible = True      # make the network reversible, as in Reformer
)

x = torch.randint(0, 20000, (1, 4096))
model(x) # (1, 4096, 20000)
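
Since Linformer is not auto-regressive, a language model built on it would typically be trained with a masked-token (BERT-style) objective rather than causal next-token prediction. The following is a hypothetical training step, not part of this library; in particular, using token id 0 as a stand-in mask token is an assumption.

import torch
import torch.nn.functional as F
from linformer import LinformerLM

model = LinformerLM(num_tokens = 20000, dim = 512, seq_len = 4096, depth = 12, heads = 8, k = 256)
opt = torch.optim.Adam(model.parameters(), lr = 1e-4)

seq = torch.randint(1, 20000, (1, 4096))          # reserve id 0 as the mask stand-in
mask = torch.rand(seq.shape) < 0.15               # corrupt ~15% of positions

logits = model(seq.masked_fill(mask, 0))          # (1, 4096, 20000)
loss = F.cross_entropy(logits[mask], seq[mask])   # predict only the masked tokens
loss.backward()
opt.step()
opt.zero_grad()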

Linformer

import torch
from linformer import Linformer

model = Linformer(
    dim = 512,
    seq_len = 4096,
    depth = 12,
    heads = 8,
    k = 256,
    one_kv_head = True,
    share_kv = True
)

x = torch.randn(1, 4096, 512)
model(x) # (1, 4096, 512)
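
Because the model above is encoder-only, one natural pattern is to use it as the trunk of a classifier over long sequences. The wrapper below is purely illustrative; the embedding layer and pooling head are assumptions, not part of this library.

import torch
import torch.nn as nn
from linformer import Linformer

class LongSequenceClassifier(nn.Module):
    # Hypothetical wrapper: embed tokens, encode with Linformer, mean-pool, classify
    def __init__(self, num_tokens, num_classes, dim = 512, seq_len = 4096):
        super().__init__()
        self.embed = nn.Embedding(num_tokens, dim)
        self.encoder = Linformer(dim = dim, seq_len = seq_len, depth = 6, heads = 8, k = 256)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, tokens):
        x = self.embed(tokens)              # (batch, seq_len, dim)
        x = self.encoder(x)                 # (batch, seq_len, dim)
        return self.head(x.mean(dim = 1))   # (batch, num_classes)

clf = LongSequenceClassifier(num_tokens = 20000, num_classes = 2)
logits = clf(torch.randint(0, 20000, (1, 4096)))  # (1, 2)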

Single Self-Attention layer

import torch
from linformer import LinformerSelfAttention

attn = LinformerSelfAttention(
    dim = 512,
    seq_len = 4096,
    heads = 8,
    k = 256,
    one_kv_head = True,
    share_kv = True
)

x = torch.randn(1, 4096, 512)
attn(x) # (1, 4096, 512)

The self-attention layer above can also receive contextual keys. The sequence length is then validated against the length of the contextual keys rather than the source sequence.

import torch
from linformer import LinformerSelfAttention

attn = LinformerSelfAttention(
    dim = 512,
    seq_len = 8192,
    heads = 8,
    k = 256,
    one_kv_head = True,
    share_kv = True
)

x = torch.randn(1, 2048, 512)
context = torch.randn(1, 8192, 512)
attn(x, context) # (1, 2048, 512)
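
In this configuration the layer behaves like cross-attention over a fixed-length memory: the queries come from x and may have any length, while the (8192 × k) projection compresses the context, so the attention matrix is only (len(x) × k) no matter how long the memory is.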

Citations

@misc{wang2020linformer,
    title         = {Linformer: Self-Attention with Linear Complexity},
    author        = {Sinong Wang and Belinda Z. Li and Madian Khabsa and Han Fang and Hao Ma},
    year          = {2020},
    eprint        = {2006.04768},
    archivePrefix = {arXiv},
    primaryClass  = {cs.LG}
}

@inproceedings{kitaev2020reformer,
    title       = {Reformer: The Efficient Transformer},
    author      = {Nikita Kitaev and Lukasz Kaiser and Anselm Levskaya},
    booktitle   = {International Conference on Learning Representations},
    year        = {2020},
    url         = {https://openreview.net/forum?id=rkgNKkHtvB}
}