
lucidrains / long-short-transformer

License: MIT
Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch

Programming Languages

python

Projects that are alternatives of or similar to long-short-transformer

Vit Pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
Stars: ✭ 7,199 (+6889.32%)
Mutual labels:  transformers, attention-mechanism
nuwa-pytorch
Implementation of NÜWA, state of the art attention network for text to video synthesis, in Pytorch
Stars: ✭ 347 (+236.89%)
Mutual labels:  transformers, attention-mechanism
transganformer
Implementation of TransGanFormer, an all-attention GAN that combines the findings from the recent GanFormer and TransGan papers
Stars: ✭ 137 (+33.01%)
Mutual labels:  transformers, attention-mechanism
Reformer Pytorch
Reformer, the efficient Transformer, in Pytorch
Stars: ✭ 1,644 (+1496.12%)
Mutual labels:  transformers, attention-mechanism
Dalle Pytorch
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
Stars: ✭ 3,661 (+3454.37%)
Mutual labels:  transformers, attention-mechanism
uniformer-pytorch
Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification tasks, debuted in ICLR 2022
Stars: ✭ 90 (-12.62%)
Mutual labels:  transformers, attention-mechanism
STAM-pytorch
Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification
Stars: ✭ 109 (+5.83%)
Mutual labels:  transformers, attention-mechanism
RETRO-pytorch
Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch
Stars: ✭ 473 (+359.22%)
Mutual labels:  transformers, attention-mechanism
SAMN
This is our implementation of SAMN: Social Attentional Memory Network
Stars: ✭ 45 (-56.31%)
Mutual labels:  attention-mechanism
anonymisation
Anonymization of legal cases (Fr) based on Flair embeddings
Stars: ✭ 85 (-17.48%)
Mutual labels:  transformers
elastic transformers
Making BERT stretchy. Semantic Elasticsearch with Sentence Transformers
Stars: ✭ 153 (+48.54%)
Mutual labels:  transformers
Brain-Tumor-Segmentation
Attention-Guided Version of 2D UNet for Automatic Brain Tumor Segmentation
Stars: ✭ 125 (+21.36%)
Mutual labels:  attention-mechanism
Text-Summarization
Abstractive and Extractive Text summarization using Transformers.
Stars: ✭ 38 (-63.11%)
Mutual labels:  transformers
transformers-interpret
Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.
Stars: ✭ 861 (+735.92%)
Mutual labels:  transformers
deepconsensus
DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences (PacBio) Circular Consensus Sequencing (CCS) data.
Stars: ✭ 124 (+20.39%)
Mutual labels:  transformers
pysentimiento
A Python multilingual toolkit for Sentiment Analysis and Social NLP tasks
Stars: ✭ 274 (+166.02%)
Mutual labels:  transformers
Machine-Translation-Hindi-to-english-
Machine translation is the task of converting text from one language to another. Unlike traditional phrase-based translation systems, which consist of many small sub-components that are tuned separately, neural machine translation attempts to build and train a single, large neural network that reads a sentence and outputs a correct translation.
Stars: ✭ 19 (-81.55%)
Mutual labels:  attention-mechanism
resolutions-2019
A list of data mining and machine learning papers that I implemented in 2019.
Stars: ✭ 19 (-81.55%)
Mutual labels:  attention-mechanism
Linear-Attention-Mechanism
Attention mechanism
Stars: ✭ 27 (-73.79%)
Mutual labels:  attention-mechanism
NLP-paper
🎨 NLP (Natural Language Processing) tutorial 🎨 https://dataxujing.github.io/NLP-paper/
Stars: ✭ 23 (-77.67%)
Mutual labels:  attention-mechanism

Long-Short Transformer

Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch

Install

$ pip install long-short-transformer

Usage

import torch
from long_short_transformer import LongShortTransformer

model = LongShortTransformer(
    num_tokens = 20000,
    dim = 512,
    depth = 6,             # how deep
    heads = 8,             # number of heads
    dim_head = 64,         # dimension per head
    max_seq_len = 1024,    # maximum sequence length
    window_size = 128,     # local attention window size
    r = 256                # like Linformer, the sequence length is projected down to this value to avoid quadratic attention cost, where r << n (seq len)
)

x = torch.randint(0, 20000, (1, 1024))
mask = torch.ones(1, 1024).bool()

logits = model(x, mask = mask) # (1, 1024, 20000)
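The mask marks which positions hold real tokens, so padded batches of variable-length sequences can be handled by masking out the padding. Below is a minimal sketch of building such a mask, assuming the True-for-real-token convention shown above; the lengths and padding scheme are illustrative, not part of the package API.

import torch

lengths = torch.tensor([1024, 700])                    # true length of each sequence in the batch
x = torch.randint(0, 20000, (2, 1024))                 # token ids, padded out to max_seq_len
mask = torch.arange(1024)[None, :] < lengths[:, None]  # (2, 1024) bool, False on padding positions

logits = model(x, mask = mask)                         # (2, 1024, 20000)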

For the autoregressive case, you will also need to supply the segment_size and set causal to True

import torch
from long_short_transformer import LongShortTransformer

model = LongShortTransformer(
    num_tokens = 20000,
    dim = 512,
    depth = 6,             # how deep
    heads = 8,             # number of heads
    dim_head = 64,         # dimension per head
    causal = True,         # autoregressive or not
    max_seq_len = 1024,    # maximum sequence length
    window_size = 128,     # local attention window size
    segment_size = 16,     # sequence is divided into segments of this size, to be projected down to r
    r = 1                  # paper claims best results with a segment-to-r ratio of 16:1
)

x = torch.randint(0, 20000, (1, 1024))
mask = torch.ones(1, 1024).bool()

logits = model(x, mask = mask) # (1, 1024, 20000)
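In the causal setting, the logits at each position predict the next token, so a language-modeling loss can be taken against the inputs shifted by one. Here is a minimal training-step sketch under that assumption; the optimizer and learning rate are illustrative, not prescribed by the repository.

import torch.nn.functional as F

optim = torch.optim.Adam(model.parameters(), lr = 3e-4)

logits = model(x, mask = mask)                         # (1, 1024, 20000)
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, 20000),                 # predictions for positions 1..1023
    x[:, 1:].reshape(-1)                               # targets are the inputs shifted by one
)
loss.backward()
optim.step()
optim.zero_grad()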

You can test the autoregressive variant on enwik8 with

$ python train.py

Citations

@misc{zhu2021longshort,
    title   = {Long-Short Transformer: Efficient Transformers for Language and Vision}, 
    author  = {Chen Zhu and Wei Ping and Chaowei Xiao and Mohammad Shoeybi and Tom Goldstein and Anima Anandkumar and Bryan Catanzaro},
    year    = {2021},
    eprint  = {2107.02192},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}