Show Control And Tell - Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions. CVPR 2019
Aoanet - Code for the paper "Attention on Attention for Image Captioning". ICCV 2019
Caption generator - A modular library built on top of Keras and TensorFlow to generate a caption in natural language for any input image.
Dataturks - ML data annotations made super easy for teams. Just upload data, add your team, and build training/evaluation datasets in hours.
Sca Cnn.cvpr17 - Image Caption Generation with Spatial and Channel-wise Attention
Up Down Captioner - Automatic image captioning model based on Caffe, using features from bottom-up attention.
Image Captioning - Implementation of "X-Linear Attention Networks for Image Captioning" (CVPR 2020)
Show Adapt And Tell - Code for "Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner" (ICCV 2017)
Sightseq - Computer vision tools for fairseq, containing PyTorch implementations of text recognition and object detection
Gis - gis (go image server), implemented in Go, providing basic image upload, download, storage, and proportional cropping.
Video2description - Video to Text: generates a natural-language description for a given video (Video Captioning)
Arnet - CVPR 2018 - Regularizing RNNs for Caption Generation by Reconstructing the Past with the Present
Cameramanager - Simple Swift class providing all the configuration you need to create a custom camera view in your app
Coco Cn - Enriching MS-COCO with Chinese sentences and tags for cross-lingual multimedia tasks
Image Captioning - Image Captioning: implementing the Neural Image Caption Generator with Python
Image captioning - Generate captions for images using a CNN-RNN model trained on the Microsoft Common Objects in COntext (MS COCO) dataset
Bottom Up Attention - Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
Punny captions - An implementation of the NAACL 2018 paper "Punny Captions: Witty Wordplay in Image Descriptions".
Im2p - TensorFlow implementation of the paper "A Hierarchical Approach for Generating Descriptive Image Paragraphs"
Self Critical.pytorch - Unofficial PyTorch implementation of "Self-critical Sequence Training for Image Captioning" and related methods.
Omninet - Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain
Neuralmonkey - An open-source tool for sequence learning in NLP built on TensorFlow.
Virtex - [CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations
Cs231 - Complete assignments for CS231n: Convolutional Neural Networks for Visual Recognition
Scan - PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
Adaptiveattention - Implementation of "Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning"
im2p - TensorFlow implementation of the paper "A Hierarchical Approach for Generating Descriptive Image Paragraphs"
stylenet - A PyTorch implementation of "StyleNet: Generating Attractive Visual Captions with Styles"
CS231n - My solutions for the assignments of CS231n: Convolutional Neural Networks for Visual Recognition
image-captioning-DLCT - Official PyTorch implementation of the paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).
Machine-Learning - Projects in machine learning with PyTorch, Keras, TensorFlow, scikit-learn, and Python.
Image-Caption - Using an LSTM or Transformer to solve image captioning in PyTorch
RSTNet - RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words (CVPR 2021)
Awesome-Captioning - A curated list of multimodal captioning research (including image captioning, video captioning, and text captioning)
Show-Attend-and-Tell - A PyTorch implementation of the paper "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
Adaptive - PyTorch implementation of "Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning"
MIA - Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" (NeurIPS 2019)
gramtion - Twitter bot for generating photo descriptions (alt text)
Image-Captioining - Generates a textual description of an image based on the objects and actions it contains, using generative models so that novel sentences are created. Pipeline-style models use two separate learning processes, one for language modelling and one for image recognition. It first identifies objects in the image and prov…
LaBERT - A length-controllable and non-autoregressive image captioning model.
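Many of the entries above (the CNN-RNN, Show-Attend-and-Tell, and adaptive-attention repos) share the same encoder-decoder pattern: a CNN encodes the image into a feature vector, which conditions an RNN language model over caption tokens. A minimal PyTorch sketch of that pattern, with all names, sizes, and the random stand-in features chosen purely for illustration (not taken from any specific repo listed here):

```python
import torch
import torch.nn as nn

class CaptionDecoder(nn.Module):
    """Minimal CNN-RNN captioner: image features initialize an LSTM over words."""
    def __init__(self, feat_dim, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        # Project the image feature into the LSTM's initial hidden/cell states
        self.init_h = nn.Linear(feat_dim, hidden_dim)
        self.init_c = nn.Linear(feat_dim, hidden_dim)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, feats, captions):
        h0 = self.init_h(feats).unsqueeze(0)   # (1, B, H)
        c0 = self.init_c(feats).unsqueeze(0)   # (1, B, H)
        emb = self.embed(captions)             # (B, T, E)
        hidden, _ = self.lstm(emb, (h0, c0))   # (B, T, H)
        return self.out(hidden)                # (B, T, V) per-step logits

# Toy usage: random features stand in for a CNN encoder (e.g. pooled ResNet output)
feats = torch.randn(2, 2048)            # batch of 2 image feature vectors
caps = torch.randint(0, 1000, (2, 7))   # batch of 2 token-id sequences, length 7
logits = CaptionDecoder(2048, 1000)(feats, caps)
print(logits.shape)  # torch.Size([2, 7, 1000])
```

Training then minimizes cross-entropy between these logits and the next ground-truth token at each step; several of the repos above (e.g. Self Critical.pytorch) further fine-tune with sequence-level rewards.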