A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.

Stars: ✭ 295 (-55.77%)

Mutual labels: vqa

Nscl Pytorch Release

PyTorch implementation for the Neuro-Symbolic Concept Learner (NS-CL).

Stars: ✭ 276 (-58.62%)

Mutual labels: vqa

MICCAI21 MMQ

Multiple Meta-model Quantifying for Medical Visual Question Answering

Stars: ✭ 16 (-97.6%)

Mutual labels: vqa

bottom-up-features

Bottom-up features extractor implemented in PyTorch.

Stars: ✭ 62 (-90.7%)

Mutual labels: vqa

rosita

ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration

Stars: ✭ 36 (-94.6%)

Mutual labels: vqa

vqa-soft

Accompanying code for "A Simple Loss Function for Improving the Convergence and Accuracy of Visual Question Answering Models" CVPR 2017 VQA workshop paper.

Stars: ✭ 14 (-97.9%)

Mutual labels: vqa

FigureQA-baseline

TensorFlow implementation of the CNN-LSTM, Relation Network and text-only baselines for the paper "FigureQA: An Annotated Figure Dataset for Visual Reasoning"

Stars: ✭ 28 (-95.8%)

Mutual labels: vqa

DVQA dataset

DVQA Dataset: A Bar chart question answering dataset presented at CVPR 2018

Stars: ✭ 20 (-97%)

Mutual labels: vqa

just-ask

[TPAMI Special Issue on ICCV 2021 Best Papers, Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos

Stars: ✭ 57 (-91.45%)

Mutual labels: vqa

AoA-pytorch

A Pytorch implementation of Attention on Attention module (both self and guided variants), for Visual Question Answering

Stars: ✭ 33 (-95.05%)

Mutual labels: vqa

probnmn-clevr

Code for ICML 2019 paper "Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering" [long-oral]

Stars: ✭ 63 (-90.55%)

Mutual labels: vqa

iMIX

A framework for Multimodal Intelligence research from Inspur HSSLAB.

Stars: ✭ 21 (-96.85%)

Mutual labels: vqa

Transformer-MM-Explainability

[ICCV 2021 Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Includes examples for DETR and VQA.

Stars: ✭ 484 (-27.44%)

Mutual labels: vqa

mmgnn textvqa

A Pytorch implementation of CVPR 2020 paper: Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

Stars: ✭ 41 (-93.85%)

Mutual labels: vqa

hcrn-videoqa

Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)

Stars: ✭ 111 (-83.36%)

Mutual labels: vqa

cfvqa

[CVPR 2021] Counterfactual VQA: A Cause-Effect Look at Language Bias

Stars: ✭ 96 (-85.61%)

Mutual labels: vqa

ZS-F-VQA

Code and data for the paper "Zero-shot Visual Question Answering using Knowledge Graph" [ISWC 2021]

Stars: ✭ 51 (-92.35%)

Mutual labels: vqa

VideoNavQA

An alternative EQA paradigm and informative benchmark + models (BMVC 2019, ViGIL 2019 spotlight)

Stars: ✭ 22 (-96.7%)

Mutual labels: vqa

neuro-symbolic-ai-soc

Neuro-Symbolic Visual Question Answering on Sort-of-CLEVR using PyTorch

Stars: ✭ 41 (-93.85%)

Mutual labels: vqa

self critical vqa

Code for NeurIPS 2019 paper "Self-Critical Reasoning for Robust Visual Question Answering"

Stars: ✭ 39 (-94.15%)

Mutual labels: vqa

Openvqa

A lightweight, scalable, and general framework for visual question answering research

Stars: ✭ 198 (-70.31%)

Mutual labels: vqa

Clipbert

[CVPR 2021 Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning for image-text and video-text tasks.

Stars: ✭ 168 (-74.81%)

Mutual labels: vqa

Pytorch Vqa

Strong baseline for visual question answering

Stars: ✭ 158 (-76.31%)

Mutual labels: vqa

Vqa Mfb

Stars: ✭ 153 (-77.06%)

Mutual labels: vqa

Vqa regat

Research Code for ICCV 2019 paper "Relation-aware Graph Attention Network for Visual Question Answering"

Stars: ✭ 129 (-80.66%)

Mutual labels: vqa

Papers

Some computer vision papers I have read, covering image-to-text generation, weakly supervised segmentation, and more

Stars: ✭ 99 (-85.16%)

Mutual labels: vqa

Vqa Tensorflow

TensorFlow implementation of Deeper LSTM + normalized CNN for Visual Question Answering

Stars: ✭ 98 (-85.31%)

Mutual labels: vqa

Mullowbivqa

Hadamard Product for Low-rank Bilinear Pooling

Stars: ✭ 57 (-91.45%)

Mutual labels: vqa

Vqa

CloudCV Visual Question Answering Demo

Stars: ✭ 57 (-91.45%)

Mutual labels: vqa

Conditional Batch Norm

Pytorch implementation of NIPS 2017 paper "Modulating early visual processing by language"

Stars: ✭ 51 (-92.35%)

Mutual labels: vqa

Bottom Up Attention

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome

Stars: ✭ 989 (+48.28%)

Mutual labels: vqa

Vizwiz Vqa Pytorch

PyTorch VQA implementation that achieved top performances in the (ECCV18) VizWiz Grand Challenge: Answering Visual Questions from Blind People

Stars: ✭ 33 (-95.05%)

Mutual labels: vqa

Visual Question Answering

📷 ❓ Visual Question Answering Demo and Algorithmia API

Stars: ✭ 18 (-97.3%)

Mutual labels: vqa
