Bottom Up Attention: Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome (the underlying attention mechanism is sketched below)
Stars: ✭ 989 (+149.75%)
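The mechanism this repo popularized combines "bottom-up" region proposals from Faster R-CNN with "top-down" attention driven by the task. The following is not the repo's actual code, only a minimal PyTorch sketch of the top-down step under assumed names and sizes (TopDownAttention, 2048-d region features, a 512-d query):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownAttention(nn.Module):
    """Attend over K bottom-up region features given a task query
    (a question encoding for VQA, or the caption decoder's state)."""

    def __init__(self, feat_dim=2048, query_dim=512, hidden_dim=512):
        super().__init__()
        self.proj = nn.Linear(feat_dim + query_dim, hidden_dim)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, regions, query):
        # regions: (B, K, feat_dim) Faster R-CNN features; query: (B, query_dim)
        q = query.unsqueeze(1).expand(-1, regions.size(1), -1)
        logits = self.score(torch.tanh(self.proj(torch.cat([regions, q], -1))))
        alpha = F.softmax(logits, dim=1)        # one weight per region
        return (alpha * regions).sum(dim=1)     # attended image feature

# Toy usage: 36 regions per image, as in the common bottom-up setting.
att = TopDownAttention()
v = att(torch.randn(2, 36, 2048), torch.randn(2, 512))  # -> (2, 2048)
```

The attended vector then feeds the caption decoder or the VQA answer classifier.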
just-ask: [TPAMI Special Issue on ICCV 2021 Best Papers, Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
Stars: ✭ 57 (-85.61%)
BUTD model: A PyTorch implementation of "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering" for image captioning.
Stars: ✭ 28 (-92.93%)
hcrn-videoqa: Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)
Stars: ✭ 111 (-71.97%)
Show and Tell: A Neural Image Caption Generator
Stars: ✭ 74 (-81.31%)
image-captioning-DLCT: Official PyTorch implementation of the paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).
Stars: ✭ 134 (-66.16%)
udacity-cvnd-projects: My solutions to the projects assigned for the Udacity Computer Vision Nanodegree
Stars: ✭ 36 (-90.91%)
captioning chainer: A fast implementation of Neural Image Caption in Chainer
Stars: ✭ 17 (-95.71%)
AoA-pytorch: A PyTorch implementation of the Attention on Attention module (both self and guided variants) for Visual Question Answering
Stars: ✭ 33 (-91.67%)
neuro-symbolic-ai-soc: Neuro-Symbolic Visual Question Answering on Sort-of-CLEVR using PyTorch
Stars: ✭ 41 (-89.65%)
Pytorch Vqa: Strong baseline for visual question answering
Stars: ✭ 158 (-60.1%)
Image-Captioining: The objective is to generate a textual description of an image based on the objects and actions in it, using generative models so that the system creates novel sentences. Pipeline-style models (sketched below) use two separate learning processes, one for language modelling and the other for image recognition; they first identify objects in the image and prov…
Stars: ✭ 20 (-94.95%)
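As a rough illustration of the pipeline idea described above, here is a hedged sketch that uses a stock torchvision classifier for the recognition stage and a bare template where a trained language model would sit; the caption function and the three-label cutoff are illustrative assumptions, not this repo's code:

```python
import torch
from torchvision.models import resnet50, ResNet50_Weights

# Stage 1: image recognition, trained independently of the language side.
weights = ResNet50_Weights.DEFAULT
classifier = resnet50(weights=weights).eval()
preprocess = weights.transforms()

def caption(image):
    """Hypothetical two-stage pipeline: recognize objects, then verbalize
    them with a separately built language component."""
    with torch.no_grad():
        probs = classifier(preprocess(image).unsqueeze(0)).softmax(-1)[0]
    labels = [weights.meta["categories"][i] for i in probs.topk(3).indices]
    # Stage 2: language generation (a real pipeline would use a trained LM).
    return "A photo containing " + ", ".join(labels) + "."
```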
vqa-soft: Accompanying code for "A Simple Loss Function for Improving the Convergence and Accuracy of Visual Question Answering Models", a CVPR 2017 VQA workshop paper (the loss idea is sketched below).
Stars: ✭ 14 (-96.46%)
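The exact formulation is in the paper; the gist is to replace the single hard answer label with the annotators' empirical answer distribution. A generic sketch of such a soft-target cross-entropy, assuming a made-up 3000-way answer vocabulary:

```python
import torch
import torch.nn.functional as F

def soft_cross_entropy(logits, target_probs):
    """Cross-entropy against a soft target distribution: every answer the
    annotators gave contributes, weighted by how often it was given."""
    log_probs = F.log_softmax(logits, dim=-1)
    return -(target_probs * log_probs).sum(dim=-1).mean()

# Toy example: annotators split 7/3 between two answers (indices made up).
logits = torch.randn(4, 3000)
targets = torch.zeros(4, 3000)
targets[:, 42] = 0.7
targets[:, 7] = 0.3
loss = soft_cross_entropy(logits, targets)
```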
LaBERT: A length-controllable and non-autoregressive image captioning model.
Stars: ✭ 50 (-87.37%)
Transformer-MM-Explainability: [ICCV 2021 Oral] Official PyTorch implementation of Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network, including examples for DETR and VQA.
Stars: ✭ 484 (+22.22%)
Machine-Learning: Projects I have done in machine learning with PyTorch, Keras, TensorFlow, scikit-learn, and Python.
Stars: ✭ 54 (-86.36%)
Adaptiveattention: Implementation of "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning" (the sentinel mechanism is sketched below)
Stars: ✭ 303 (-23.48%)
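The paper's key device is a "visual sentinel": the attention softmax gets one extra slot, so for non-visual words ("of", "the") the decoder can fall back on its language state instead of the image. A simplified PyTorch sketch (the actual model also conditions the scores on the decoder hidden state; names and sizes here are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentinelMixer(nn.Module):
    """Mix the visual context with a learned sentinel vector via a gate
    that falls out of an extended attention softmax."""

    def __init__(self, dim=512):
        super().__init__()
        self.w_v = nn.Linear(dim, 1)  # scores for the spatial features
        self.w_s = nn.Linear(dim, 1)  # score for the sentinel slot

    def forward(self, feats, sentinel):
        # feats: (B, L, dim) spatial features; sentinel: (B, dim)
        z = self.w_v(feats).squeeze(-1)                      # (B, L)
        zs = self.w_s(sentinel)                              # (B, 1)
        alpha = F.softmax(torch.cat([z, zs], dim=1), dim=1)  # over L+1 slots
        beta = alpha[:, -1:]                                 # weight on sentinel
        c = (alpha[:, :-1].unsqueeze(-1) * feats).sum(1)     # visual context
        return beta * sentinel + (1 - beta) * c              # adaptive context
```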
ZS-F-VQA: Code and data for the paper "Zero-shot Visual Question Answering using Knowledge Graph" (ISWC 2021)
Stars: ✭ 51 (-87.12%)
RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words (CVPR 2021)
Stars: ✭ 71 (-82.07%)
Openvqa: A lightweight, scalable, and general framework for visual question answering research
Stars: ✭ 198 (-50%)
MICCAI21 MMQ: Multiple Meta-model Quantifying for Medical Visual Question Answering
Stars: ✭ 16 (-95.96%)
Show-Attend-and-Tell: A PyTorch implementation of the paper "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
Stars: ✭ 58 (-85.35%)
Vqa regat: Research code for the ICCV 2019 paper "Relation-aware Graph Attention Network for Visual Question Answering"
Stars: ✭ 129 (-67.42%)
Vqa Tensorflow: TensorFlow implementation of Deeper LSTM + Normalized CNN for Visual Question Answering
Stars: ✭ 98 (-75.25%)
gramtion: Twitter bot for generating photo descriptions (alt text)
Stars: ✭ 21 (-94.7%)
CS231n: My solutions to the assignments of CS231n: Convolutional Neural Networks for Visual Recognition
Stars: ✭ 30 (-92.42%)
Nscl Pytorch Release: PyTorch implementation of the Neuro-Symbolic Concept Learner (NS-CL).
Stars: ✭ 276 (-30.3%)
probnmn-clevr: Code for the ICML 2019 paper "Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering" (long oral)
Stars: ✭ 63 (-84.09%)
FigureQA-baseline: TensorFlow implementation of the CNN-LSTM, Relation Network, and text-only baselines for the paper "FigureQA: An Annotated Figure Dataset for Visual Reasoning"
Stars: ✭ 28 (-92.93%)
iMIX: A framework for multimodal intelligence research from Inspur HSSLAB.
Stars: ✭ 21 (-94.7%)
Scan: PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
Stars: ✭ 306 (-22.73%)
Udacity: This repo includes all the projects I have finished in the Udacity Nanodegree programs
Stars: ✭ 57 (-85.61%)
DVQA dataset: A bar-chart question answering dataset presented at CVPR 2018
Stars: ✭ 20 (-94.95%)
mmgnn textvqa: A PyTorch implementation of the CVPR 2020 paper "Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text"
Stars: ✭ 41 (-89.65%)
im2p: TensorFlow implementation of the paper "A Hierarchical Approach for Generating Descriptive Image Paragraphs"
Stars: ✭ 43 (-89.14%)
catr: Image captioning using a Transformer
Stars: ✭ 206 (-47.98%)
Image-Caption: Image captioning in PyTorch using an LSTM or a Transformer
Stars: ✭ 36 (-90.91%)
CS231n: Assignment solutions for CS231n, Spring 2020
Stars: ✭ 48 (-87.88%)
Virtex: [CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations
Stars: ✭ 323 (-18.43%)
cfvqa: [CVPR 2021] Counterfactual VQA: A Cause-Effect Look at Language Bias
Stars: ✭ 96 (-75.76%)
VideoNavQA: An alternative EQA paradigm with an informative benchmark and models (BMVC 2019, ViGIL 2019 spotlight)
Stars: ✭ 22 (-94.44%)
stylenet: A PyTorch implementation of "StyleNet: Generating Attractive Visual Captions with Styles"
Stars: ✭ 58 (-85.35%)
self critical vqa: Code for the NeurIPS 2019 paper "Self-Critical Reasoning for Robust Visual Question Answering"
Stars: ✭ 39 (-90.15%)
Awesome-Captioning: A curated list of multimodal captioning research (including image captioning, video captioning, and text captioning)
Stars: ✭ 56 (-85.86%)
Clipbert: [CVPR 2021 Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning for image-text and video-text tasks.
Stars: ✭ 168 (-57.58%)
Awesome Visual Question Answering: A curated list of Visual Question Answering (VQA, covering image and video question answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.
Stars: ✭ 295 (-25.51%)
Papers: Computer vision papers I have read, covering image-to-text generation, weakly supervised segmentation, and more
Stars: ✭ 99 (-75%)
bottom-up-features: Bottom-up feature extractor implemented in PyTorch.
Stars: ✭ 62 (-84.34%)
Mullowbivqa: Hadamard Product for Low-rank Bilinear Pooling (the fusion scheme is sketched below)
Stars: ✭ 57 (-85.61%)
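The title names a concrete fusion scheme: project the question and image features into a shared low-rank space, combine them with an elementwise (Hadamard) product, and project to the output, approximating full bilinear pooling. A minimal sketch with arbitrarily chosen dimensions, not this repo's code:

```python
import torch
import torch.nn as nn

class LowRankBilinearPooling(nn.Module):
    """Fuse an image feature v and a question feature q via a Hadamard
    product in a shared rank-d space, a cheap stand-in for the full
    (and huge) bilinear interaction."""

    def __init__(self, v_dim=2048, q_dim=1024, rank=1024, out_dim=3000):
        super().__init__()
        self.U = nn.Linear(q_dim, rank)
        self.V = nn.Linear(v_dim, rank)
        self.P = nn.Linear(rank, out_dim)

    def forward(self, v, q):
        joint = torch.tanh(self.U(q)) * torch.tanh(self.V(v))  # Hadamard
        return self.P(joint)  # e.g. answer logits

fuse = LowRankBilinearPooling()
logits = fuse(torch.randn(2, 2048), torch.randn(2, 1024))  # -> (2, 3000)
```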
Adaptive: PyTorch implementation of "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning"
Stars: ✭ 97 (-75.51%)
Tbd Nets: PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"
Stars: ✭ 345 (-12.88%)
Cs231: Complete assignments for CS231n: Convolutional Neural Networks for Visual Recognition
Stars: ✭ 317 (-19.95%)
Image Captioning: Image captioning using InceptionV3 and beam search (beam search is sketched below)
Stars: ✭ 290 (-26.77%)
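Beam search keeps the k most probable partial captions at each decoding step instead of greedily committing to the single best token. A model-agnostic sketch; step_fn, bos, and eos are placeholders for whatever the captioner provides:

```python
def beam_search(step_fn, bos, eos, beam_size=3, max_len=20):
    """step_fn(prefix) must return (token, log_prob) continuations for a
    prefix; returns the highest-scoring complete sequence found."""
    beams = [([bos], 0.0)]          # (token sequence, total log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, logp in step_fn(seq):
                candidates.append((seq + [tok], score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates:
            (finished if seq[-1] == eos else beams).append((seq, score))
            if len(beams) == beam_size:
                break
        if not beams:               # every surviving candidate has ended
            break
    return max(finished + beams, key=lambda c: c[1])[0]
```

Scores are summed log-probabilities, so longer captions are implicitly penalized; real implementations often add a length-normalization term.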
rosita: ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
Stars: ✭ 36 (-90.91%)