self-attention-cv

License: MIT
Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.

Self-attention building blocks for computer vision applications in PyTorch

Implementation of self-attention mechanisms for computer vision in PyTorch, written with einsum and einops.
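To give a feel for that style, here is a minimal sketch (illustrative only, not the library's exact code) of multi-head self-attention written with einsum and einops; the to_qkv projection and the head sizes below are assumptions chosen for the example.

import torch
import torch.nn as nn
from einops import rearrange

def self_attention(x, to_qkv, heads):
    # x: [batch, tokens, dim]; to_qkv: nn.Linear(dim, 3 * heads * dim_head)
    q, k, v = rearrange(to_qkv(x), 'b t (k h d) -> k b h t d', k=3, h=heads)
    scale = q.shape[-1] ** -0.5
    attn = (torch.einsum('b h i d, b h j d -> b h i j', q, k) * scale).softmax(dim=-1)
    out = torch.einsum('b h i j, b h j d -> b h i d', attn, v)  # weighted sum of values
    return rearrange(out, 'b h t d -> b t (h d)')  # merge heads back into the dim axis

dim, heads = 64, 8
to_qkv = nn.Linear(dim, 3 * dim, bias=False)  # dim_head = dim // heads = 8
x = torch.rand(16, 10, dim)
y = self_attention(x, to_qkv, heads)  # [16, 10, 64]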

Install it via pip:

$ pip install self-attention-cv

Consider pre-installing PyTorch in your environment first, especially if you don't have a GPU, so that you get the build (CPU or CUDA) that matches your machine.
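After installing, a quick import check confirms that the package and PyTorch are visible; none of the examples below require a GPU.

import torch
import self_attention_cv

print(torch.__version__, 'CUDA available:', torch.cuda.is_available())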

Code Examples

Multi-head attention

import torch
from self_attention_cv import MultiHeadSelfAttention

model = MultiHeadSelfAttention(dim=64)
x = torch.rand(16, 10, 64)  # [batch, tokens, dim]
mask = torch.zeros(10, 10)  # tokens X tokens
mask[5:8, 5:8] = 1
y = model(x, mask)
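The output y keeps the input shape [batch, tokens, dim]; the optional mask is a tokens x tokens tensor applied to the attention scores before the softmax, so here it marks the entries between tokens 5 to 7.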

Axial attention

import torch
from self_attention_cv import AxialAttentionBlock
model = AxialAttentionBlock(in_channels=256, dim=64, heads=8)
x = torch.rand(1, 256, 64, 64)  # [batch, in_channels, height, width]
y = model(x)
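Axial attention decomposes full 2D self-attention into two cheaper 1D passes, one along the height axis and one along the width axis [2], and the block returns a feature map with the same shape as its input.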

Vanilla Transformer Encoder

import torch
from self_attention_cv import TransformerEncoder
model = TransformerEncoder(dim=64, blocks=6, heads=8)
x = torch.rand(16, 10, 64)  # [batch, tokens, dim]
mask = torch.zeros(10, 10)  # tokens X tokens
mask[5:8, 5:8] = 1
y = model(x, mask)
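The encoder output keeps the [batch, tokens, dim] layout, so it can feed any downstream head. Continuing from the snippet above, one illustrative option (the mean pooling and the linear head are assumptions, not part of the library) is:

feats = y.mean(dim=1)           # average over tokens -> [16, 64]
head = torch.nn.Linear(64, 10)  # hypothetical classification head
logits = head(feats)            # [16, 10]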

Vision Transformer with/without ResNet50 backbone for image classification

import torch
from self_attention_cv import ViT, ResNet50ViT

model1 = ResNet50ViT(img_dim=128, pretrained_resnet=False,
                     blocks=6, num_classes=10,
                     dim_linear_block=256, dim=256)
# or
model2 = ViT(img_dim=256, in_channels=3, patch_dim=16, num_classes=10, dim=512)
x = torch.rand(2, 3, 256, 256)
y = model2(x)  # [2, 10]
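Continuing from the snippet above, a minimal training step might look as follows; the optimizer, learning rate, and random labels are illustrative assumptions, not repo defaults.

import torch.nn.functional as F

optimizer = torch.optim.AdamW(model2.parameters(), lr=1e-4)
labels = torch.randint(0, 10, (2,))        # dummy targets for the 10 classes
loss = F.cross_entropy(model2(x), labels)  # standard classification loss
loss.backward()
optimizer.step()
optimizer.zero_grad()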

A re-implementation of Unet with the Vision Transformer encoder

import torch
from self_attention_cv.transunet import TransUnet
a = torch.rand(2, 3, 128, 128)
model = TransUnet(in_channels=3, img_dim=128, vit_blocks=8,
                  vit_dim_linear_mhsa_block=512, classes=5)
y = model(a)  # [2, 5, 128, 128]
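Since the output is a per-pixel score map over the 5 classes, a standard segmentation loss applies directly. Continuing from the snippet above (the random target mask is only there to demonstrate the shapes):

import torch.nn.functional as F

target = torch.randint(0, 5, (2, 128, 128))  # one class index per pixel
loss = F.cross_entropy(y, target)            # y: [2, 5, 128, 128]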

Bottleneck Attention block

import torch
from self_attention_cv.bottleneck_transformer import BottleneckBlock
inp = torch.rand(1, 512, 32, 32)
bottleneck_block = BottleneckBlock(in_channels=512, fmap_size=(32, 32), heads=4, out_channels=1024, pooling=True)
y = bottleneck_block(inp)
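With pooling=True the block downsamples the feature map while expanding the channels, following the BotNet design [3]; for this input y should come out as [1, 1024, 16, 16], whereas pooling=False would preserve the 32 x 32 spatial size.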

Position embeddings are also available

1D Positional Embeddings

import torch
from self_attention_cv.pos_embeddings import AbsPosEmb1D, RelPosEmb1D

model = AbsPosEmb1D(tokens=20, dim_head=64)
# batch heads tokens dim_head
q = torch.rand(2, 3, 20, 64)
y1 = model(q)

model = RelPosEmb1D(tokens=20, dim_head=64, heads=3)
q = torch.rand(2, 3, 20, 64)
y2 = model(q)
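AbsPosEmb1D learns one embedding per absolute token position, while RelPosEmb1D parameterizes relative offsets between positions in the style of Shaw et al. [9], which tends to generalize better when content shifts position.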

2D Positional Embeddings

import torch
from self_attention_cv.pos_embeddings import RelPosEmb2D
dim = 32  # spatial dim of the feat map
model = RelPosEmb2D(
    feat_map_size=(dim, dim),
    dim_head=128)

q = torch.rand(2, 4, dim*dim, 128)
y = model(q)
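A hedged sketch of how such position scores are typically consumed, following the Bottleneck Transformer formulation [3]: they are added to the content-content logits before the softmax. The [batch, heads, tokens, tokens] shape assumed for y below is an assumption about the module's output, not something guaranteed by the snippet above.

k = torch.rand(2, 4, dim * dim, 128)                         # keys matching q
content = torch.einsum('b h i d, b h j d -> b h i j', q, k)  # content-content logits
attn = (content + y).softmax(dim=-1)                         # position-aware attention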

Acknowledgments

Thanks to Alex Rogozhnikov @arogozhnikov for the awesome einops package. For my re-implementations I have studied and borrowed code from many repositories of Phil Wang @lucidrains. By studying his code I managed to grasp self-attention, discover NLP tricks that are never mentioned in the papers, and learn from his clean coding style.

Cited as

@article{adaloglou2021transformer,
    title   = "Transformers in Computer Vision",
    author  = "Adaloglou, Nikolas",
    journal = "https://theaisummer.com/",
    year    = "2021",
    howpublished = {https://github.com/The-AI-Summer/self-attention-cv},
  }

References

  1. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. arXiv preprint arXiv:1706.03762.
  2. Wang, H., Zhu, Y., Green, B., Adam, H., Yuille, A., & Chen, L. C. (2020, August). Axial-deeplab: Stand-alone axial-attention for panoptic segmentation. In European Conference on Computer Vision (pp. 108-126). Springer, Cham.
  3. Srinivas, A., Lin, T. Y., Parmar, N., Shlens, J., Abbeel, P., & Vaswani, A. (2021). Bottleneck Transformers for Visual Recognition. arXiv preprint arXiv:2101.11605.
  4. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., ... & Houlsby, N. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.
  5. Ramachandran, P., Parmar, N., Vaswani, A., Bello, I., Levskaya, A., & Shlens, J. (2019). Stand-alone self-attention in vision models. arXiv preprint arXiv:1906.05909.
  6. Chen, J., Lu, Y., Yu, Q., Luo, X., Adeli, E., Wang, Y., ... & Zhou, Y. (2021). Transunet: Transformers make strong encoders for medical image segmentation. arXiv preprint arXiv:2102.04306.
  7. Wang, S., Li, B., Khabsa, M., Fang, H., & Ma, H. (2020). Linformer: Self-attention with linear complexity. arXiv preprint arXiv:2006.04768.
  8. Bertasius, G., Wang, H., & Torresani, L. (2021). Is Space-Time Attention All You Need for Video Understanding?. arXiv preprint arXiv:2102.05095.
  9. Shaw, P., Uszkoreit, J., & Vaswani, A. (2018). Self-attention with relative position representations. arXiv preprint arXiv:1803.02155.

Support

If you really like this repository and find it useful, please consider (★) starring it so that it can reach a broader audience of like-minded people. It would be highly appreciated :)
