just-ask: [TPAMI Special Issue on ICCV 2021 Best Papers, Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
Stars: ✭ 57 (-52.1%)
rosita: ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
Stars: ✭ 36 (-69.75%)
VQ-APC: Vector Quantized Autoregressive Predictive Coding (VQ-APC)
Stars: ✭ 34 (-71.43%)
moment detr: [NeurIPS 2021] Moment-DETR code and QVHighlights dataset
Stars: ✭ 143 (+20.17%)
distill-and-select: Authors' official PyTorch implementation of "DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval" [IJCV 2022]
Stars: ✭ 43 (-63.87%)
Kaleido-BERT: (CVPR 2021) Kaleido-BERT: Vision-Language Pre-training on Fashion Domain.
Stars: ✭ 252 (+111.76%)
lang2seg: Referring Expression Object Segmentation with Caption-Aware Consistency, BMVC 2019
Stars: ✭ 30 (-74.79%)
robo-vln: PyTorch code for the ICRA'21 paper "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"
Stars: ✭ 34 (-71.43%)
MIA: Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" (NeurIPS 2019)
Stars: ✭ 57 (-52.1%)
synse-zsl: Official PyTorch code for the ICIP 2021 paper 'Syntactically Guided Generative Embeddings For Zero Shot Skeleton Action Recognition'
Stars: ✭ 14 (-88.24%)
clip playground: An ever-growing playground of notebooks showcasing CLIP's impressive zero-shot capabilities (see the sketch below)
Stars: ✭ 80 (-32.77%)
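For context on what those notebooks demonstrate, here is a minimal zero-shot classification sketch assuming the openai/CLIP package (`pip install git+https://github.com/openai/CLIP.git`); the image path and candidate labels are placeholders, not part of the playground itself:

```python
# Minimal CLIP zero-shot classification sketch (assumes the openai/CLIP package).
# "photo.jpg" and the candidate labels below are illustrative placeholders.
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("photo.jpg")).unsqueeze(0).to(device)
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
text = clip.tokenize(labels).to(device)

with torch.no_grad():
    # CLIP scores every (image, text) pair; a softmax over the label logits
    # yields zero-shot class probabilities with no task-specific training.
    logits_per_image, _ = model(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

for label, p in zip(labels, probs[0]):
    print(f"{label}: {p:.3f}")
```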
stanford-cs231n-assignments-2020: My solutions to the assignments for Stanford's CS231n "Convolutional Neural Networks for Visual Recognition" (Spring 2020).
Stars: ✭ 84 (-29.41%)
iMIX: A framework for Multimodal Intelligence research from Inspur HSSLAB.
Stars: ✭ 21 (-82.35%)
VidSitu: [CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
Stars: ✭ 41 (-65.55%)
CBP: Official TensorFlow implementation of the AAAI 2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction"
Stars: ✭ 52 (-56.3%)
wikiHow paper list: A list of research papers that build on wikiHow
Stars: ✭ 25 (-78.99%)
TRAR-VQA: [ICCV 2021] TRAR: Routing the Attention Spans in Transformers for Visual Question Answering -- official implementation
Stars: ✭ 49 (-58.82%)
calvin: CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
Stars: ✭ 105 (-11.76%)
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)
Stars: ✭ 283 (+137.82%)
SIGIR2021 Conure: One Person, One Model, One World: Learning Continual User Representation without Forgetting
Stars: ✭ 23 (-80.67%)
VarCLR: Variable Semantic Representation Pre-training via Contrastive Learning
Stars: ✭ 30 (-74.79%)
ViCC: [WACV'22] Code repository for the paper "Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting", https://arxiv.org/abs/2106.10137.
Stars: ✭ 33 (-72.27%)
TVQAplus: [ACL 2020] PyTorch code for TVQA+: Spatio-Temporal Grounding for Video Question Answering
Stars: ✭ 99 (-16.81%)
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)
Stars: ✭ 50 (-57.98%)