zjuchenlong / Sca Cnn.cvpr17

Image Captions Generation with Spatial and Channel-wise Attention

SCA-CNN

Source code for the paper: SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning

This code is based on arctic-captions and arctic-capgen-vid.

This code covers only the two-layered attention model with the ResNet-152 network on the MS COCO dataset. Other networks (VGG-19) or datasets (Flickr30k/Flickr8k) can also be used with minor modifications.

Dependencies

  • A Python library: Theano.

  • Other Python package dependencies such as numpy/scipy, skimage, opencv, sklearn, and hdf5, which can be installed with pip. Alternatively, simply run

    $ pip install -r requirements.txt
    
  • Caffe for image CNN feature extraction. You should install Caffe and build the pycaffe interface to extract the image CNN features.

  • The official COCO evaluation scripts, coco-caption, for evaluating results. Install them by simply adding coco-caption to $PYTHONPATH.
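
Before moving on, it can help to confirm that the dependencies listed above can actually be imported. The following is a minimal sketch and not part of the repository. The import names are assumptions about how these packages are usually imported (`cv2` for opencv, `h5py` for hdf5, `caffe` for pycaffe, `pycocoevalcap` for coco-caption) and may differ in your environment.

```python
import importlib

# Assumed import names for the dependencies listed above; adjust as needed.
REQUIRED_MODULES = [
    "theano", "numpy", "scipy", "skimage", "cv2", "sklearn", "h5py",
    "caffe",          # pycaffe interface (assumed module name)
    "pycocoevalcap",  # coco-caption, expected on PYTHONPATH (assumed module name)
]


def missing_modules(names):
    """Return the module names from `names` that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers both a missing package and a missing PYTHONPATH entry.
            missing.append(name)
    return missing
```

Calling `missing_modules(REQUIRED_MODULES)` returns the subset of names that raised `ImportError`. An empty list suggests the environment is ready for the steps that follow.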

Getting Started

  1. Get the code: git clone the repo and install the dependencies.

  2. Save the pretrained CNN weights. Save the ResNet-152 weights pretrained on ImageNet. Before running the code, set the variables deploy and model in save_resnet_weight.py to your own paths. Then run:

    $ cd cnn
    $ python save_resnet_weight.py

  3. Preprocess the dataset. For caption preprocessing, we directly use the processed JSON blob from neuraltalk. As in step 2, set the PATH in cnn_until.py and make_coco.py to your own install path. Then run:

    $ cd data
    $ python make_coco.py

  4. Train the model. The results are saved in the directory exp.

    $ THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python sca_resnet_branch2b.py
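
The training command above passes Theano options through the `THEANO_FLAGS` environment variable, which is a comma-separated list of `key=value` pairs. As a sketch only, the same launch could be scripted from Python. The helper names `format_theano_flags` and `run_training` are hypothetical and not part of the repository.

```python
import os
import subprocess
import sys


def format_theano_flags(flags):
    """Join (key, value) pairs into the comma-separated THEANO_FLAGS format."""
    return ",".join("{}={}".format(key, value) for key, value in flags)


def run_training(script="sca_resnet_branch2b.py", flags=None):
    """Run the training script with THEANO_FLAGS set and return its exit code."""
    if flags is None:
        # Same settings as the shell command above.
        flags = [("mode", "FAST_RUN"), ("device", "gpu"), ("floatX", "float32")]
    # Copy the current environment and add THEANO_FLAGS for the child process only.
    env = dict(os.environ, THEANO_FLAGS=format_theano_flags(flags))
    return subprocess.call([sys.executable, script], env=env)
```

Calling `run_training()` from the directory that contains `sca_resnet_branch2b.py` should be equivalent to the shell command. Passing a different `flags` list, for example with `("device", "cpu")`, changes the settings without editing the script.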

Citation

If you find this code useful, please cite the following paper:

@inproceedings{chen2016sca,
  title={SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning},
  author={Chen, Long and Zhang, Hanwang and Xiao, Jun and Nie, Liqiang and Shao, Jian and Liu, Wei and Chua, Tat-Seng},
  booktitle={CVPR},
  year={2017}
}