Top 16 vision-and-language open source projects

rosita
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
MIA
Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" (NeurIPS 2019)
synse-zsl
Official PyTorch code for the ICIP 2021 paper 'Syntactically Guided Generative Embeddings For Zero Shot Skeleton Action Recognition'
clip playground
An ever-growing playground of notebooks showcasing CLIP's impressive zero-shot capabilities
iMIX
A framework for Multimodal Intelligence research from Inspur HSSLAB.
VidSitu
[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
CBP
Official TensorFlow implementation of the AAAI 2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction"
calvin
CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
X-VLM
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)