cai-lw / image-captioning-chinese

Licence: other
Image Captioning in Chinese using LSTM RNN with attention mechanism

Image Captioning in Chinese

A course project for the Pattern Recognition course at Tsinghua University, spring semester 2017.

This project implements two RNN-based image captioning models, each following a published paper:

  • "Show and Tell", simple LSTM RNN: Vinyals, Oriol, et al. "Show and tell: Lessons learned from the 2015 MSCOCO image captioning challenge."
  • "Show, Attend and Tell", LSTM RNN with attention: Xu, K., et al. "Show, attend and tell: Neural image caption generation with visual attention."
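The difference between the two models is the attention step: instead of feeding the LSTM a single image vector, "Show, Attend and Tell" recomputes a weighted average of per-region features at every time step. The following is a minimal NumPy sketch of one soft-attention step, not the code in lstm_attention.py. The function name, the parameter names (`W_feat`, `W_hid`, `v`), the additive scoring function, and all sizes are assumptions chosen for illustration.

```python
import numpy as np


def soft_attention(features, hidden, W_feat, W_hid, v):
    """One soft-attention step (illustrative sketch, hypothetical names).

    features: (L, D) annotation vectors, one per image region
    hidden:   (H,)   previous LSTM hidden state
    W_feat:   (D, K), W_hid: (H, K), v: (K,)  scoring parameters
    Returns the attention weights (L,) and the context vector (D,).
    """
    # Score each region against the hidden state with a small additive MLP.
    scores = np.tanh(features @ W_feat + hidden @ W_hid) @ v  # (L,)
    # Softmax over regions, shifted by the max for numerical stability.
    exp_scores = np.exp(scores - scores.max())
    weights = exp_scores / exp_scores.sum()
    # Context vector: attention-weighted average of the region features.
    context = weights @ features  # (D,)
    return weights, context


rng = np.random.default_rng(0)
L, D, H, K = 49, 512, 256, 128  # illustrative sizes only
features = rng.standard_normal((L, D))
hidden = rng.standard_normal(H)
W_feat = rng.standard_normal((D, K)) * 0.01
W_hid = rng.standard_normal((H, K)) * 0.01
v = rng.standard_normal(K)

weights, context = soft_attention(features, hidden, W_feat, W_hid, v)
print(weights.shape, context.shape)
```

In a full model the context vector would be fed to the LSTM together with the previous word embedding, and the scoring parameters would be learned jointly with the rest of the network.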

Dependencies

  • tensorflow 1.1
  • tensorlayer 1.4.3
  • jieba 0.38
  • h5py 2.7.0

Dataset

Images are from MS COCO. To avoid the cost of running large CNNs, the images are provided as feature vectors extracted by a pre-trained CNN. To prevent cheating (captioning the images by hand), only a small fraction of the original images is provided.
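Since h5py is among the dependencies, the feature vectors are presumably stored in HDF5 files, although this README does not state the file layout. The sketch below shows how such a file can be inspected and partially loaded with h5py. The file name `features.h5`, the dataset key `train_set`, and the shape `(8, 4096)` are all hypothetical; the example writes its own small stand-in file so that it runs end to end.

```python
import os
import tempfile

import h5py
import numpy as np

# Write a tiny stand-in file so the example is self-contained.
# "features.h5" and the key "train_set" are hypothetical names.
path = os.path.join(tempfile.mkdtemp(), "features.h5")
with h5py.File(path, "w") as f:
    f.create_dataset("train_set", data=np.zeros((8, 4096), dtype=np.float32))

# Inspect what the file contains before loading anything.
with h5py.File(path, "r") as f:
    for name, dataset in f.items():
        print(name, dataset.shape, dataset.dtype)
    # Slicing reads only the requested rows from disk.
    first_batch = f["train_set"][:4]

print(first_batch.shape)
```

For the real dataset, listing the keys first is the safest way to discover the actual dataset names and feature dimensions.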

Captions were written by students in the course, so their quality may vary.

The dataset can be downloaded from Google Drive or Baidu Netdisk (百度网盘).

Usage

Download METEOR and place it in the directory meteor-1.5, then run make_val_meteor.py to produce METEOR-compatible validation data.

lstm.py is the "Show and Tell" model, and lstm_attention.py is the "Show, Attend and Tell" model. Both models have many configurable hyperparameters; run either script with the --help argument to list them.
