
saahiluppal / catr

License: Apache-2.0
Image Captioning Using Transformer

Programming Languages

python
139335 projects - #7 most used programming language
Jupyter Notebook
11667 projects

Projects that are alternatives to or similar to catr

RSTNet
RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words (CVPR 2021)
Stars: ✭ 71 (-65.53%)
Mutual labels:  transformer, image-captioning
Sightseq
Computer vision tools for fairseq, containing PyTorch implementation of text recognition and object detection
Stars: ✭ 116 (-43.69%)
Mutual labels:  transformer, image-captioning
Image-Caption
Using LSTM or Transformer to solve Image Captioning in Pytorch
Stars: ✭ 36 (-82.52%)
Mutual labels:  transformer, image-captioning
Fairseq Image Captioning
Transformer-based image captioning extension for pytorch/fairseq
Stars: ✭ 180 (-12.62%)
Mutual labels:  transformer, image-captioning
Omninet
Official Pytorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain
Stars: ✭ 448 (+117.48%)
Mutual labels:  transformer, image-captioning
Meshed Memory Transformer
Meshed-Memory Transformer for Image Captioning. CVPR 2020
Stars: ✭ 230 (+11.65%)
Mutual labels:  transformer, image-captioning
cape
Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch
Stars: ✭ 29 (-85.92%)
Mutual labels:  transformer
Transformer Temporal Tagger
Code and data from the paper "BERT Got a Date: Introducing Transformers to Temporal Tagging"
Stars: ✭ 55 (-73.3%)
Mutual labels:  transformer
Cross-lingual-Summarization
Zero-Shot Cross-Lingual Abstractive Sentence Summarization through Teaching Generation and Attention
Stars: ✭ 28 (-86.41%)
Mutual labels:  transformer
php-serializer
Serialize PHP variables, including objects, in any format. Support to unserialize it too.
Stars: ✭ 47 (-77.18%)
Mutual labels:  transformer
CS231n
CS231n Assignments Solutions - Spring 2020
Stars: ✭ 48 (-76.7%)
Mutual labels:  image-captioning
dingo-serializer-switch
A middleware to switch fractal serializers in dingo
Stars: ✭ 49 (-76.21%)
Mutual labels:  transformer
les-military-mrc-rank7
莱斯杯 (Les Cup): Rank 7 solution for the 2nd National "Military Intelligent Machine Reading" Challenge
Stars: ✭ 37 (-82.04%)
Mutual labels:  transformer
kaggle-champs
Code for the CHAMPS Predicting Molecular Properties Kaggle competition
Stars: ✭ 49 (-76.21%)
Mutual labels:  transformer
transformer-ls
Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021).
Stars: ✭ 201 (-2.43%)
Mutual labels:  transformer
libai
LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
Stars: ✭ 284 (+37.86%)
Mutual labels:  transformer
text simplification
Text Simplification Model based on Encoder-Decoder (includes Transformer and Seq2Seq) model.
Stars: ✭ 66 (-67.96%)
Mutual labels:  transformer
project-code-py
Leetcode using AI
Stars: ✭ 100 (-51.46%)
Mutual labels:  transformer
TokenLabeling
Pytorch implementation of "All Tokens Matter: Token Labeling for Training Better Vision Transformers"
Stars: ✭ 385 (+86.89%)
Mutual labels:  transformer
Neural-Scam-Artist
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
Stars: ✭ 18 (-91.26%)
Mutual labels:  transformer

CA⫶TR: Image Captioning with Transformers

PyTorch training code and pretrained models for CATR (CAption TRansformer).

The models are also available via Torch Hub; to load a model with pretrained weights, simply do:

model = torch.hub.load('saahiluppal/catr', 'v3', pretrained=True)  # you can choose between v1, v2 and v3
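Once loaded, a captioning model of this kind generates a caption one token at a time. Below is a minimal, hypothetical sketch of greedy decoding; the `model(image, tokens)` call signature and the `start_token`/`end_token` values are assumptions for illustration, not CATR's exact API:

```python
import torch

@torch.no_grad()
def greedy_decode(model, image, start_token, end_token, max_len=20):
    """Greedily pick the most likely next token until the end token appears."""
    caption = [start_token]
    for _ in range(max_len - 1):
        # Assumed signature: model returns logits of shape (1, seq_len, vocab)
        logits = model(image, torch.tensor([caption]))
        next_token = int(logits[0, -1].argmax())
        caption.append(next_token)
        if next_token == end_token:
            break
    return caption
```

In practice the token ids would be decoded back to text with the tokenizer the model was trained with.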

Samples:

All of these images have been annotated by CATR.

Test with your own images:

$ python predict.py --path /path/to/image --v v2  # choose between v1, v2, and v3 (default: v3)

Or try it out in a Colab notebook.

Usage

There are no extra compiled components in CATR, and package dependencies are minimal, so the code is simple to use. We provide instructions on how to install dependencies. First, clone the repository locally:

$ git clone https://github.com/saahiluppal/catr.git

Then, install PyTorch 1.5+ and torchvision 0.6+ along with remaining dependencies:

$ pip install -r requirements.txt

That's it; you should be good to train and test caption models.

Data preparation

Download and extract COCO 2017 train and val images with annotations from http://cocodataset.org. We expect the directory structure to be the following:

path/to/coco/
  annotations/  # annotation json files
  train2017/    # train images
  val2017/      # val images
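Before training, it can save a failed run to verify the dataset actually matches this layout. The helper below is not part of CATR, just a small sketch that checks for the three expected directories:

```python
from pathlib import Path

def check_coco_layout(root):
    """Return the list of expected COCO 2017 subdirectories missing under root."""
    root = Path(root)
    expected = ["annotations", "train2017", "val2017"]
    return [d for d in expected if not (root / d).is_dir()]

# An empty return value means the layout looks right:
# check_coco_layout("path/to/coco")  -> []
```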

Training

Tweak the hyperparameters in the configuration file.

To train baseline CATR on a single GPU for 30 epochs, run:

$ python main.py

We train CATR with AdamW, setting the learning rate to 1e-4 in the transformer and 1e-5 in the backbone. Horizontal flips, scales, and crops are used for augmentation. Images are rescaled to have a maximum size of 299 pixels. The transformer is trained with a dropout of 0.1, and the whole model is trained with gradient clipping at 0.1.
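The two learning rates above are typically implemented with optimizer parameter groups. A hedged sketch (not the repo's exact code) of that setup, assuming backbone parameters are the ones whose names contain `"backbone"`:

```python
import torch

def build_optimizer(model):
    """AdamW with 1e-4 for the transformer and 1e-5 for the backbone."""
    backbone = [p for n, p in model.named_parameters() if "backbone" in n]
    other = [p for n, p in model.named_parameters() if "backbone" not in n]
    return torch.optim.AdamW([
        {"params": other, "lr": 1e-4},
        {"params": backbone, "lr": 1e-5},
    ])

# In the training loop, gradient clipping goes after loss.backward():
#   torch.nn.utils.clip_grad_norm_(model.parameters(), 0.1)
```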

Testing

To test CATR with your own images, run:

$ python predict.py --path /path/to/image --v v2  # choose between v1, v2, and v3 (default: v3)

License

CATR is released under the Apache 2.0 license. Please see the LICENSE file for more information.
