All Projects → zhoubolei → Vqabaseline

zhoubolei / Vqabaseline

Simple Baseline for Visual Question Answering

Programming Languages

lua
6591 projects

Simple Baseline for Visual Question Answering

We descrive a very simple bag-of-words baseline for visual question answering. The description of the baseline is in the arXiv paper http://arxiv.org/pdf/1512.02167.pdf. The code is developed by Bolei Zhou and Yuandong Tian.

results

Demo is available at http://visualqa.csail.mit.edu/

To train the model using the code, the following data of the VQA dataset are needed:

The pre-trained model used in the paper is at http://visualqa.csail.mit.edu/coco_qadevi_BOWIMG_bestepoch93_final.t7model. It has 55.89 on the Open-Ended and 61.69 on Multiple-Choice for the test-standard of COCO VQA dataset.

Contact Bolei Zhou ([email protected]) if you have any questions.

Please cite our arXiv note if you use our code:

B. Zhou, Y. Tian, S. Suhkbaatar, A. Szlam, R. Fergus. Simple Baseline for Visual Question Answering. arXiv:1512.02167

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].