MmfA modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
AoA-pytorchA Pytorch implementation of Attention on Attention module (both self and guided variants), for Visual Question Answering
iPerceiveApplying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering | Python3 | PyTorch | CNNs | Causality | Reasoning | LSTMs | Transformers | Multi-Head Self Attention | Published in IEEE Winter Conference on Applications of Computer Vision (WACV) 2021
VidSitu[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
MTL-AQAWhat and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]