41 open source projects that are alternatives to or similar to Multi-Modal-Transformer

image-classification
A collection of SOTA Image Classification Models in PyTorch
Stars: ✭ 70 (+14.75%)
Mutual labels:  vision-transformer, mlp-mixer
deep-text-recognition-benchmark
PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)
Stars: ✭ 123 (+101.64%)
Mutual labels:  vision-transformer
Splice
Official Pytorch Implementation for "Splicing ViT Features for Semantic Appearance Transfer" presenting "Splice" (CVPR 2022)
Stars: ✭ 126 (+106.56%)
Mutual labels:  vision-transformer
GFNet
[NeurIPS 2021] Global Filter Networks for Image Classification
Stars: ✭ 199 (+226.23%)
Mutual labels:  vision-transformer
Combining-EfficientNet-and-Vision-Transformers-for-Video-Deepfake-Detection
Code for the video deepfake detection model from "Combining EfficientNet and Vision Transformers for Video Deepfake Detection", available on arXiv and submitted to ICIAP 2021.
Stars: ✭ 39 (-36.07%)
Mutual labels:  vision-transformer
MPViT
MPViT: Multi-Path Vision Transformer for Dense Prediction (CVPR 2022)
Stars: ✭ 193 (+216.39%)
Mutual labels:  vision-transformer
PASSL
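PASSL includes image self-supervised learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, BEiT, and MAE, as well as basic vision algorithms such as Vision Transformer, DEiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2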
Stars: ✭ 134 (+119.67%)
Mutual labels:  vision-transformer
koclip
KoCLIP: Korean port of OpenAI CLIP, in Flax
Stars: ✭ 80 (+31.15%)
Mutual labels:  vision-transformer
LaTeX-OCR
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Stars: ✭ 1,566 (+2467.21%)
Mutual labels:  vision-transformer
visualization
A collection of visualization functions
Stars: ✭ 189 (+209.84%)
Mutual labels:  vision-transformer
iPerceive
Applying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering | Python3 | PyTorch | CNNs | Causality | Reasoning | LSTMs | Transformers | Multi-Head Self Attention | Published in IEEE Winter Conference on Applications of Computer Vision (WACV) 2021
Stars: ✭ 52 (-14.75%)
Mutual labels:  multi-modal
OASIS
Official implementation of the paper "You Only Need Adversarial Supervision for Semantic Image Synthesis" (ICLR 2021)
Stars: ✭ 232 (+280.33%)
Mutual labels:  multi-modal
InterpretDL
InterpretDL: Interpretation of Deep Learning Models, a model interpretability algorithm library based on PaddlePaddle.
Stars: ✭ 121 (+98.36%)
Mutual labels:  vision-transformer
pytorch-vit
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Stars: ✭ 250 (+309.84%)
Mutual labels:  vision-transformer
nemar
[CVPR2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
Stars: ✭ 120 (+96.72%)
Mutual labels:  multi-modal
towhee
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
Stars: ✭ 821 (+1245.9%)
Mutual labels:  vision-transformer
YOLOS
You Only Look at One Sequence (NeurIPS 2021)
Stars: ✭ 612 (+903.28%)
Mutual labels:  vision-transformer
TransMorph Transformer for Medical Image Registration
TransMorph: Transformer for Unsupervised Medical Image Registration (PyTorch)
Stars: ✭ 130 (+113.11%)
Mutual labels:  vision-transformer
ViT-V-Net for 3D Image Registration Pytorch
Vision Transformer for 3D medical image registration (Pytorch).
Stars: ✭ 169 (+177.05%)
Mutual labels:  vision-transformer
VidSitu
[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
Stars: ✭ 41 (-32.79%)
Mutual labels:  video-language
EGSC-IT
Tensorflow implementation of ICLR2019 paper "Exemplar Guided Unsupervised Image-to-Image Translation with Semantic Consistency"
Stars: ✭ 29 (-52.46%)
Mutual labels:  multi-modal
Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
Stars: ✭ 2,828 (+4536.07%)
Mutual labels:  vision-transformer
ICON
(TPAMI2022) Salient Object Detection via Integrity Learning.
Stars: ✭ 125 (+104.92%)
Mutual labels:  mlp-mixer
Evo-ViT
Official implementation of Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer
Stars: ✭ 50 (-18.03%)
Mutual labels:  vision-transformer
semantic-segmentation
SOTA Semantic Segmentation Models in PyTorch
Stars: ✭ 464 (+660.66%)
Mutual labels:  vision-transformer
SReT
Official PyTorch implementation of our ECCV 2022 paper "Sliced Recursive Transformer"
Stars: ✭ 51 (-16.39%)
Mutual labels:  vision-transformer
TRAR-VQA
[ICCV 2021] TRAR: Routing the Attention Spans in Transformers for Visual Question Answering -- Official Implementation
Stars: ✭ 49 (-19.67%)
Mutual labels:  multi-modal
MMTOD
Multi-modal Thermal Object Detector
Stars: ✭ 38 (-37.7%)
Mutual labels:  multi-modal
ImageNet21K
Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)
Stars: ✭ 565 (+826.23%)
Mutual labels:  vision-transformer
ReferFormer
[CVPR2022] Official Implementation of ReferFormer
Stars: ✭ 230 (+277.05%)
Mutual labels:  video-language
mobilevit-pytorch
A PyTorch implementation of "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer".
Stars: ✭ 349 (+472.13%)
Mutual labels:  vision-transformer
transformer-ls
Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021).
Stars: ✭ 201 (+229.51%)
Mutual labels:  vision-transformer
skill-sample-nodejs-berry-bash
Demonstrates the use of interactive render template directives through multi modal screen design.
Stars: ✭ 22 (-63.93%)
Mutual labels:  multi-modal
libai
LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
Stars: ✭ 284 (+365.57%)
Mutual labels:  vision-transformer
keras-vision-transformer
The Tensorflow, Keras implementation of Swin-Transformer and Swin-UNET
Stars: ✭ 91 (+49.18%)
Mutual labels:  vision-transformer
pytorch-cifar-model-zoo
Implementation of Conv-based and ViT-based networks designed for CIFAR.
Stars: ✭ 62 (+1.64%)
Mutual labels:  vision-transformer
VT-UNet
[MICCAI2022] Official PyTorch implementation of "A Robust Volumetric Transformer for Accurate 3D Tumor Segmentation"
Stars: ✭ 151 (+147.54%)
Mutual labels:  vision-transformer
Dalle Pytorch
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
Stars: ✭ 3,661 (+5901.64%)
Mutual labels:  multi-modal
Valhalla
Open Source Routing Engine for OpenStreetMap
Stars: ✭ 1,794 (+2840.98%)
Mutual labels:  multi-modal
Ghostnet
CV backbones including GhostNet, TinyNet and TNT, developed by Huawei Noah's Ark Lab.
Stars: ✭ 1,744 (+2759.02%)
Mutual labels:  vision-transformer
SwinIR
SwinIR: Image Restoration Using Swin Transformer (official repository)
Stars: ✭ 1,260 (+1965.57%)
Mutual labels:  vision-transformer