jnhwkim / Mullowbivqa

Hadamard Product for Low-rank Bilinear Pooling

Multimodal Low-rank Bilinear Attention Networks (MLB) use low-rank bilinear pooling to build an efficient attention mechanism for visual question-answering tasks. MLB achieves a new state-of-the-art performance while being more parsimonious than previous methods.
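As a rough illustration of the pooling idea, the sketch below projects two feature vectors into a shared d-dimensional space, multiplies them element-wise (the Hadamard product), and projects the result to the output. It is plain Python rather than the repository's Torch7 code, it omits the attention mechanism, and the tanh nonlinearity, the function names, and the toy values are all assumptions made for the example.

```python
import math


def project(W, x):
    """Return W^T x, where W is a list of len(x) rows with d columns each."""
    d = len(W[0])
    return [sum(W[i][k] * x[i] for i in range(len(x))) for k in range(d)]


def low_rank_bilinear(x, y, U, V, P, b, act=math.tanh):
    """Compute f = P^T (act(U^T x) * act(V^T y)) + b, with * element-wise."""
    hx = [act(v) for v in project(U, x)]
    hy = [act(v) for v in project(V, y)]
    joint = [a * c for a, c in zip(hx, hy)]  # Hadamard product
    return [f + bias for f, bias in zip(project(P, joint), b)]


# Toy inputs (hypothetical values, not taken from the model).
x = [1.0, 2.0, 3.0]                        # e.g. a question feature, N = 3
y = [0.5, -1.0]                            # e.g. an image feature, M = 2
U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # N x d, with d = 2
V = [[2.0, 1.0], [0.0, 3.0]]               # M x d
P = [[1.0], [1.0]]                         # d x c, with c = 1
b = [0.0]

out = low_rank_bilinear(x, y, U, V, P, b)

# With an identity activation, the result equals the full bilinear form
# x^T (U V^T) y, which is the low-rank factorization being exploited.
linear_out = low_rank_bilinear(x, y, U, V, P, b, act=lambda v: v)
```

The point of the factorization is parameter count: a full bilinear weight for one output needs N x M entries, while the two projections need only (N + M) x d.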

The current code reaches 65.07 on Open-Ended and 68.89 on Multiple-Choice on the test-standard split of the VQA dataset. An ensemble model reaches 66.89 and 70.29, respectively.

Dependencies

You can install the dependencies:

$ luarocks install rnn

Training

Please follow the instructions in VQA_LSTM_CNN for preprocessing. The --split 2 option trains on the train+val set and evaluates on the test-dev or test-standard set. Set --num_ans to 2000 to reproduce the result.
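To make the role of the answer-count setting concrete, here is a minimal Python sketch that assumes --num_ans limits the classifier to the most frequent training answers and drops question-answer pairs outside that set. That reading is an assumption, and the function names and toy data are hypothetical; the actual preprocessing scripts in VQA_LSTM_CNN may differ in detail.

```python
from collections import Counter


def top_answers(answers, num_ans):
    """Return the num_ans most frequent answers, most frequent first."""
    counts = Counter(answers)
    return [answer for answer, _ in counts.most_common(num_ans)]


def filter_pairs(pairs, vocab):
    """Keep only (question, answer) pairs whose answer is in vocab."""
    keep = set(vocab)
    return [(q, a) for q, a in pairs if a in keep]


# Toy data (hypothetical): six question-answer pairs, three distinct answers.
pairs = [
    ("q1", "yes"), ("q2", "no"), ("q3", "yes"),
    ("q4", "2"), ("q5", "no"), ("q6", "yes"),
]

vocab = top_answers([a for _, a in pairs], num_ans=2)
kept = filter_pairs(pairs, vocab)
```

With a larger answer set, fewer training pairs are discarded, at the cost of a wider output layer.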

For question features, you need the following:

skip-thoughts
DPPnet (see 003_skipthoughts_porting)
make_lookuptable.lua

For image features, run:

$ th prepro_res.lua -input_json data_train-val_test-dev_2k/data_prepro.json -image_root path_to_image_root -cnn_model path_to_cnn_model

The pretrained ResNet-152 model and related scripts can be found in fb.resnet.torch.

$ th train.lua

With the default parameters, training takes around 2.6 days on a single NVIDIA Titan X GPU and writes the model under model/. To reproduce the results of the paper, use the -seconds option for answer sampling in Section 5. The seconds.json file can be obtained using prepro_seconds.lua or from here (updated as default).

Evaluation

$ th eval.lua

References

If you use this code as part of any published research, we'd really appreciate it if you could cite the following paper:

@inproceedings{Kim2017,
author = {Kim, Jin-Hwa and On, Kyoung Woon and Lim, Woosang and Kim, Jeonghee and Ha, Jung-Woo and Zhang, Byoung-Tak},
booktitle = {The 5th International Conference on Learning Representations},
title = {{Hadamard Product for Low-rank Bilinear Pooling}},
year = {2017}
}

This code uses the Torch7 rnn package and its TrimZero module for question embeddings. Please note the following papers:

@article{Leonard2015a,
author = {L{\'{e}}onard, Nicholas and Waghmare, Sagar and Wang, Yang and Kim, Jin-Hwa},
journal = {arXiv preprint arXiv:1511.07889},
title = {{rnn : Recurrent Library for Torch}},
year = {2015}
}
@inproceedings{Kim2016a,
author = {Kim, Jin-Hwa and Kim, Jeonghee and Ha, Jung-Woo and Zhang, Byoung-Tak},
booktitle = {Proceedings of KIIS Spring Conference},
isbn = {2093-4025},
number = {1},
pages = {165--166},
title = {{TrimZero: A Torch Recurrent Module for Efficient Natural Language Processing}},
volume = {26},
year = {2016}
}

License

BSD 3-Clause License

Patent (Pending)

METHOD AND SYSTEM FOR PROCESSING DATA USING ELEMENT-WISE MULTIPLICATION AND MULTIMODAL RESIDUAL LEARNING FOR VISUAL QUESTION-ANSWERING
