thunlp / Openqa

License: MIT
The source code of the ACL 2018 paper "Denoising Distantly Supervised Open-Domain Question Answering".

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Openqa

Dan Jurafsky Chris Manning Nlp
My solution to the Natural Language Processing course made by Dan Jurafsky, Chris Manning in Winter 2012.
Stars: ✭ 124 (-34.04%)
Mutual labels:  question-answering
Nspm
🤖 Neural SPARQL Machines for Knowledge Graph Question Answering.
Stars: ✭ 156 (-17.02%)
Mutual labels:  question-answering
Hq bot
📲 Bot to help solve HQ trivia
Stars: ✭ 167 (-11.17%)
Mutual labels:  question-answering
Kbqa Ar Smcnn
Question answering over Freebase (single-relation)
Stars: ✭ 129 (-31.38%)
Mutual labels:  question-answering
Cape Webservices
Entrypoint for all backend cape webservices
Stars: ✭ 149 (-20.74%)
Mutual labels:  question-answering
Denspi
Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index (DenSPI)
Stars: ✭ 162 (-13.83%)
Mutual labels:  question-answering
Dynamic Memory Networks Plus Pytorch
Implementation of Dynamic memory networks plus in Pytorch
Stars: ✭ 123 (-34.57%)
Mutual labels:  question-answering
Mspars
Stars: ✭ 177 (-5.85%)
Mutual labels:  question-answering
Pytorch Question Answering
Important paper implementations for Question Answering using PyTorch
Stars: ✭ 154 (-18.09%)
Mutual labels:  question-answering
Improved Dynamic Memory Networks Dmn Plus
Theano Implementation of DMN+ (Improved Dynamic Memory Networks) from the paper by Xiong, Merity, & Socher at MetaMind, http://arxiv.org/abs/1603.01417 (Dynamic Memory Networks for Visual and Textual Question Answering)
Stars: ✭ 165 (-12.23%)
Mutual labels:  question-answering
Question Answering
TensorFlow implementation of Match-LSTM and Answer pointer for the popular SQuAD dataset.
Stars: ✭ 133 (-29.26%)
Mutual labels:  question-answering
Question answering models
This repo collects and re-produces models related to domains of question answering and machine reading comprehension
Stars: ✭ 139 (-26.06%)
Mutual labels:  question-answering
Rczoo
question answering, reading comprehension toolkit
Stars: ✭ 163 (-13.3%)
Mutual labels:  question-answering
Medquad
Medical Question Answering Dataset of 47,457 QA pairs created from 12 NIH websites
Stars: ✭ 129 (-31.38%)
Mutual labels:  question-answering
Rat Sql
A relation-aware semantic parsing model from English to SQL
Stars: ✭ 169 (-10.11%)
Mutual labels:  question-answering
Knowledge Aware Reader
PyTorch implementation of the ACL 2019 paper "Improving Question Answering over Incomplete KBs with Knowledge-Aware Reader"
Stars: ✭ 123 (-34.57%)
Mutual labels:  question-answering
Chinese Rc Datasets
Collections of Chinese reading comprehension datasets
Stars: ✭ 159 (-15.43%)
Mutual labels:  question-answering
Triviaqa
Code for the TriviaQA reading comprehension dataset
Stars: ✭ 184 (-2.13%)
Mutual labels:  question-answering
Questgen.ai
Question generation using state-of-the-art Natural Language Processing algorithms
Stars: ✭ 169 (-10.11%)
Mutual labels:  question-answering
Awesomemrc
This repo is our research summary and playground for MRC. More features are coming.
Stars: ✭ 162 (-13.83%)
Mutual labels:  question-answering

Open-QA

The source code for the paper "Denoising Distantly Supervised Open-Domain Question Answering", built on the code released for the paper "Reading Wikipedia to Answer Open-Domain Questions".

Requirements

  • pytorch 0.3.0
  • numpy
  • scikit-learn
  • termcolor
  • regex
  • tqdm
  • prettytable
  • scipy
  • nltk
  • pexpect 4.2.1
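
A minimal setup sketch, assuming pip is available (the pinned pytorch 0.3.0 is an old release and generally has to be installed separately from the PyTorch archive for your platform and Python version):

    pip install numpy scikit-learn termcolor regex tqdm prettytable scipy nltk pexpect==4.2.1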

Evaluation Results

Dataset                     Quasar-T        SearchQA        TriviaQA        SQuAD
Models                      EM      F1      EM      F1      EM      F1      EM      F1
GA (Dhingra et al., 2017)   26.4    26.4    -       -       -       -       -       -
BiDAF (Seo et al., 2017)    25.9    28.5    28.6    34.6    -       -       -       -
AQA (Buck et al., 2017)     -       -       40.5    47.4    -       -       -       -
R^3 (Wang et al., 2018a)    35.3    41.7    49      55.3    47.3    53.7    29.1    37.5
Our Model                   42.2    49.3    58.8    64.5    48.7    56.3    28.7    36.6

Data

We provide the Quasar-T, SearchQA and TriviaQA datasets we used for the task in the data/ directory. We preprocessed the original data to match the input format of our code; the preprocessed data can be downloaded here.

To run our code, the datasets should be placed in the data/ folder with the following layout:

datasets/

  • train.txt, dev.txt, test.txt: one JSON object per line, in the format {"question": question, "answers": [answer1, answer2, ...]}.

  • train.json, dev.json, test.json: a JSON array in the format [{"question": question, "document": document1}, {"question": question, "document": document2}, ...].
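
As a quick sanity check that downloaded files match these two formats, they can be loaded with a few lines of Python (a sketch; the data/datasets/ paths follow the layout above):

    import json

    # train.txt: one JSON object per line, each a question with its answer strings.
    with open("data/datasets/train.txt") as f:
        qa_pairs = [json.loads(line) for line in f]
    print(qa_pairs[0]["question"], qa_pairs[0]["answers"])

    # train.json: a single JSON array of question-paragraph pairs.
    with open("data/datasets/train.json") as f:
        question_docs = json.load(f)
    print(len(question_docs), "question-paragraph pairs")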

embeddings/

  • glove.840B.300d.txt: word vectors obtained from here.
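
The embedding file is plain text: one token per line followed by its 300 space-separated float components. A minimal loader sketch (numpy assumed):

    import numpy as np

    embeddings = {}
    with open("data/embeddings/glove.840B.300d.txt", encoding="utf-8") as f:
        for line in f:
            # Split from the right: a few tokens in glove.840B contain spaces.
            parts = line.rstrip().rsplit(" ", 300)
            embeddings[parts[0]] = np.asarray(parts[1:], dtype=np.float32)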

corenlp/

  • all JAR files from Stanford CoreNLP.

Codes

The source code of our models is in the src/ folder.

Train and Test

To train and test the model, run the three stages below in order (they are collected into a single script after this list):

  1. Pre-train the paragraph reader: python main.py --batch-size 256 --model-name quasart_reader --num-epochs 10 --dataset quasart --mode reader

  2. Pre-train the paragraph selector: python main.py --batch-size 64 --model-name quasart_selector --num-epochs 10 --dataset quasart --mode selector --pretrained models/quasart_reader.mdl

  3. Train the whole model: python main.py --batch-size 32 --model-name quasart_all --num-epochs 10 --dataset quasart --mode all --pretrained models/quasart_selector.mdl
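
Each stage loads the previous stage's checkpoint from models/ via --pretrained, so the stages must complete in order. For convenience, the three Quasar-T commands above as one script:

    set -e  # stop if any stage fails, since later stages need the earlier checkpoint
    python main.py --batch-size 256 --model-name quasart_reader --num-epochs 10 --dataset quasart --mode reader
    python main.py --batch-size 64 --model-name quasart_selector --num-epochs 10 --dataset quasart --mode selector --pretrained models/quasart_reader.mdl
    python main.py --batch-size 32 --model-name quasart_all --num-epochs 10 --dataset quasart --mode all --pretrained models/quasart_selector.mdl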

Cite

If you use the code, please cite the following paper:

  1. Yankai Lin, Haozhe Ji, Zhiyuan Liu, and Maosong Sun. 2018. Denoising Distantly Supervised Open-Domain Question Answering. In Proceedings of ACL. pages 1736--1745. [pdf]

Reference

  1. Bhuwan Dhingra, Hanxiao Liu, Zhilin Yang, William Cohen, and Ruslan Salakhutdinov. 2017. Gated-attention readers for text comprehension. In Proceedings of ACL. pages 1832--1846.

  2. Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, and Hannaneh Hajishirzi. 2017. Bidirectional attention flow for machine comprehension. In Proceedings of ICLR.

  3. Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Andrea Gesmundo, Neil Houlsby, Wojciech Gajewski, and Wei Wang. 2017. Ask the right questions: Active question reformulation with reinforcement learning. arXiv preprint arXiv:1705.07830.

  4. Shuohang Wang, Mo Yu, Xiaoxiao Guo, Zhiguo Wang, Tim Klinger, Wei Zhang, Shiyu Chang, Gerald Tesauro, Bowen Zhou, and Jing Jiang. 2018. R^3: Reinforced ranker-reader for open-domain question answering. In Proceedings of AAAI. pages 5981--5988.
