jiasenlu / Hiecoattenvqa

Projects that are alternatives of or similar to Hiecoattenvqa

Tf tutorial plus
Tutorials for TensorFlow APIs the official documentation doesn't cover
Stars: ✭ 293 (-1.68%)
Mutual labels:  jupyter-notebook
Master
A machine learning course using Python, Jupyter Notebooks, and OpenML
Stars: ✭ 297 (-0.34%)
Mutual labels:  jupyter-notebook
Tensorwatch
Debugging, monitoring and visualization for Python Machine Learning and Data Science
Stars: ✭ 3,191 (+970.81%)
Mutual labels:  jupyter-notebook
Sklearn Evaluation
Machine learning model evaluation made easy: plots, tables, HTML reports, experiment tracking and Jupyter notebook analysis.
Stars: ✭ 294 (-1.34%)
Mutual labels:  jupyter-notebook
Ethereum future
This is the Code for "Ethereum Future Prices" by Siraj Raval on Youtube
Stars: ✭ 294 (-1.34%)
Mutual labels:  jupyter-notebook
Amazon Forecast Samples
Notebooks and examples on how to onboard and use various features of Amazon Forecast.
Stars: ✭ 296 (-0.67%)
Mutual labels:  jupyter-notebook
Show and tell.tensorflow
Stars: ✭ 294 (-1.34%)
Mutual labels:  jupyter-notebook
Pycaret
An open-source, low-code machine learning library in Python
Stars: ✭ 4,594 (+1441.61%)
Mutual labels:  jupyter-notebook
Zero to deep learning video
Repository for the Zero to Deep Learning® Video Course
Stars: ✭ 296 (-0.67%)
Mutual labels:  jupyter-notebook
Ocropy
Python-based tools for document analysis and OCR
Stars: ✭ 3,138 (+953.02%)
Mutual labels:  jupyter-notebook
Lyrics Conditioned Neural Melody Generation
Stars: ✭ 296 (-0.67%)
Mutual labels:  jupyter-notebook
Human Segmentation Pytorch
Human segmentation models, training/inference code, and trained weights, implemented in PyTorch
Stars: ✭ 289 (-3.02%)
Mutual labels:  jupyter-notebook
Xhamster analysis
The data analysiser and predictor of https://xhamster.com/
Stars: ✭ 297 (-0.34%)
Mutual labels:  jupyter-notebook
Mlpractical
Machine Learning Practical course repository
Stars: ✭ 295 (-1.01%)
Mutual labels:  jupyter-notebook
Pyprobml
Python code for "Machine learning: a probabilistic perspective" (2nd edition)
Stars: ✭ 4,197 (+1308.39%)
Mutual labels:  jupyter-notebook
Awesome Gee
A curated list of Google Earth Engine resources
Stars: ✭ 292 (-2.01%)
Mutual labels:  jupyter-notebook
Musicnn
Pronounced as "musician", musicnn is a set of pre-trained deep convolutional neural networks for music audio tagging.
Stars: ✭ 297 (-0.34%)
Mutual labels:  jupyter-notebook
Public plstm
Phased LSTM
Stars: ✭ 298 (+0%)
Mutual labels:  jupyter-notebook
Nerf
Code release for NeRF (Neural Radiance Fields)
Stars: ✭ 4,062 (+1263.09%)
Mutual labels:  jupyter-notebook
Cascaded Fcn
Source code for the MICCAI 2016 Paper "Automatic Liver and Lesion Segmentation in CT Using Cascaded Fully Convolutional NeuralNetworks and 3D Conditional Random Fields"
Stars: ✭ 296 (-0.67%)
Mutual labels:  jupyter-notebook

Hierarchical Question-Image Co-Attention for Visual Question Answering

Train a Hierarchical Co-Attention model for Visual Question Answering. The current code achieves 62.1 on the Open-Ended task and 66.1 on the Multiple-Choice task on the test-standard split. On COCO-QA, it achieves 65.4 accuracy. For more information, please refer to the paper: https://arxiv.org/abs/1606.00061

Requirements

This code is written in Lua and requires Torch. The preprocessing code is in Python, and you need to install NLTK if you want to use it to tokenize the questions.
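
If you choose the NLTK route, one way to set it up (assuming the standard punkt tokenizer models) is:

$ pip install nltk
$ python -c "import nltk; nltk.download('punkt')"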

You also need to install a few additional packages in order to run the code successfully.

Training

We have prepared everything for you ;)

Download Dataset

The first thing you need to do is to download the data and do some preprocessing. Head over to the data/ folder and run

For VQA:

$ python vqa_preprocess.py --download 1 --split 1

--download 1 means you choose to download the VQA data from the VQA website, and --split 1 means you train on the COCO train set and evaluate on the validation set. --split 2 means you train on the COCO train+val set and evaluate on the test set. After this step, two files will be generated under the data folder: vqa_raw_train.json and vqa_raw_test.json.
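
As a quick sanity check (a minimal sketch, assuming the files were written to the data folder as described), you can inspect the raw training file from Python:

import json

# Load the raw question/answer entries produced by vqa_preprocess.py
with open('data/vqa_raw_train.json') as f:
    train = json.load(f)

print(len(train), 'training entries')
print(train[0])  # one raw entry (e.g. question text, image path, answer)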

For COCO-QA:

$ python vqa_preprocess.py --download 1 

This will download the COCO-QA dataset from here and generate two files under the data folder: cocoqa_raw_train.json and cocoqa_raw_test.json.

Download Image Model

Here we use the VGG_ILSVRC_19_layers model and the Deep Residual network model implemented by Facebook.

Head over to the image_model folder and run

$ python download_model.py --download 'VGG' 

This will download the VGG_ILSVRC_19_layers model into the image_model folder. To download the Deep Residual model instead, change 'VGG' to 'Residual'.
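
For example, the Residual variant of the same command:

$ python download_model.py --download 'Residual'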

Generate Image/Question Features

Head over to the prepro folder and run

For VQA:

$ python prepro_vqa.py --input_train_json ../data/vqa_raw_train.json --input_test_json ../data/vqa_raw_test.json --num_ans 1000

to get the question features. --num_ans specifies how many of the top answers to use during training. You will also see some question and answer statistics in the terminal output. This will generate two files in the data/ folder: vqa_data_prepro.h5 and vqa_data_prepro.json.
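
--num_ans follows the common practice of building the answer vocabulary from the most frequent answers. A minimal sketch of that idea (not the actual prepro_vqa.py code; the 'answer' key is illustrative):

from collections import Counter

def top_answers(annotations, num_ans=1000):
    # Count every answer string and keep the num_ans most frequent ones
    # as the output vocabulary used during training.
    counts = Counter(a['answer'] for a in annotations)
    return [ans for ans, _ in counts.most_common(num_ans)]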

For COCO-QA:

$ python prepro_cocoqa.py --input_train_json ../data/cocoqa_raw_train.json --input_test_json ../data/cocoqa_raw_test.json

COCO-QA uses all the answers in the training set, so there is no --num_ans option. This will generate two files in the data/ folder: cocoqa_data_prepro.h5 and cocoqa_data_prepro.json.
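
Once the files exist, a quick way to check what went into the HDF5 file (a hedged sketch; the dataset names are simply whatever the prepro script wrote):

import h5py

# Print every dataset stored in the preprocessed file with its shape and dtype
with h5py.File('data/cocoqa_data_prepro.h5', 'r') as f:
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(show)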

Then we are ready to extract the image features.

For VGG image feature:

$ th prepro_img_vgg.lua -input_json ../data/vqa_data_prepro.json -image_root /home/jiasenlu/data/ -cnn_proto ../image_model/VGG_ILSVRC_19_layers_deploy.prototxt -cnn_model ../image_model/VGG_ILSVRC_19_layers.caffemodel

You can change -gpuid, -backend, and -batch_size based on your GPU.
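
For example (the flag values here are illustrative; pick what fits your hardware):

$ th prepro_img_vgg.lua -input_json ../data/vqa_data_prepro.json -image_root /home/jiasenlu/data/ -cnn_proto ../image_model/VGG_ILSVRC_19_layers_deploy.prototxt -cnn_model ../image_model/VGG_ILSVRC_19_layers.caffemodel -gpuid 0 -backend cudnn -batch_size 10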

For Deep Residual image feature:

Train the model

We now have everything ready to train the VQA and COCO-QA models. Back in the main folder, run

th train.lua -input_img_train_h5 data/vqa_data_img_vgg_train.h5 -input_img_test_h5 data/vqa_data_img_vgg_test.h5 -input_ques_h5 data/vqa_data_prepro.h5 -input_json data/vqa_data_prepro.json -co_atten_type Alternating -feature_type VGG

to train the Alternating co-attention model on VQA using VGG image features. You can train the Parallel co-attention model by setting -co_atten_type Parallel. Parallel co-attention usually takes more time to train than Alternating co-attention.
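
For example, the equivalent Parallel co-attention run is:

th train.lua -input_img_train_h5 data/vqa_data_img_vgg_train.h5 -input_img_test_h5 data/vqa_data_img_vgg_test.h5 -input_ques_h5 data/vqa_data_prepro.h5 -input_json data/vqa_data_prepro.json -co_atten_type Parallel -feature_type VGG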

Note
  • The Deep Residual image feature is 4 times larger than the VGG feature, so make sure you have enough RAM when you extract or load the features.
  • If you don't have enough RAM, replace require 'misc.DataLoader' (line 11 in train.lua) with require 'misc.DataLoaderDisk'. The model will then read the data directly from the hard disk (SSD preferred).

Evaluation

Evaluate using Pre-trained Model

The pre-trained models can be downloaded here. Note that if you use the VQA train model, you should use the corresponding json file from here, and if you use the VQA train+val model, the corresponding json file from here.

Metric

To evaluate VQA, you need to download the VQA evaluation tools. To evaluate COCO-QA, you can use the script evaluate_cocoqa.py under the metric/ folder. If you need to evaluate based on WUPS, download the evaluation script from here.
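
For reference, the VQA metric scores a predicted answer against the 10 human answers collected per question. A minimal sketch of the formula (the official tool additionally normalizes punctuation, articles, and number words, and averages over annotator subsets):

def vqa_accuracy(pred, human_answers):
    # An answer counts as fully correct if at least 3 of the 10 annotators
    # gave exactly that answer; fewer matches earn partial credit.
    matches = sum(1 for a in human_answers if a == pred)
    return min(matches / 3.0, 1.0)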

VQA on Single Image with Free Form Question

We use iTorch to demo visual question answering with the pre-trained model. The script only does basic tokenization, so please make sure the question is all lowercase and separated by spaces. (It's better to use NLTK to tokenize and transform the question; check prepro.py for more details.)
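
A hedged sketch of the two options (basic lowercase + whitespace split vs. NLTK tokenization):

import nltk  # requires the punkt models, see the Requirements section

question = "What color is the frisbee?"

# What the demo script effectively expects: all lowercase, split on spaces.
basic_tokens = question.lower().split()

# Better: NLTK tokenization (see prepro.py for the full transform).
nltk_tokens = nltk.word_tokenize(question.lower())

print(basic_tokens)
print(nltk_tokens)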

In the root folder, open an iTorch notebook; you can then load any image and ask questions about it from the notebook.

Some of the data files can be downloaded here.

Attention Visualization

Reference

If you use this code as part of any published research, please cite the following paper:

@article{Lu2016Hie,
  author  = {Lu, Jiasen and Yang, Jianwei and Batra, Dhruv and Parikh, Devi},
  title   = {Hierarchical Question-Image Co-Attention for Visual Question Answering},
  journal = {arXiv preprint arXiv:1606.00061v2},
  year    = {2016}
}

Attention Demo

[Teaser and results figures: attention visualization examples.]
