
obryanlouis / qa

Licence: other
TensorFlow Models for the Stanford Question Answering Dataset

Programming Languages

  • python
  • shell

Projects that are alternatives to or similar to qa

Bi Att Flow
Bi-directional Attention Flow (BiDAF) network is a multi-stage hierarchical process that represents context at different levels of granularity and uses a bi-directional attention flow mechanism to achieve a query-aware context representation without early summarization.
Stars: ✭ 1,472 (+1944.44%)
Mutual labels:  question-answering, squad
Awesome Qa
😎 A curated list of the Question Answering (QA)
Stars: ✭ 596 (+727.78%)
Mutual labels:  question-answering, squad
Medi-CoQA
Conversational Question Answering on Clinical Text
Stars: ✭ 22 (-69.44%)
Mutual labels:  question-answering, squad
PersianQA
Persian (Farsi) Question Answering Dataset (+ Models)
Stars: ✭ 114 (+58.33%)
Mutual labels:  question-answering, squad
extractive rc by runtime mt
Code and datasets of "Multilingual Extractive Reading Comprehension by Runtime Machine Translation"
Stars: ✭ 36 (-50%)
Mutual labels:  question-answering, squad
Haystack
🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (+4634.72%)
Mutual labels:  question-answering, squad
co-attention
Pytorch implementation of "Dynamic Coattention Networks For Question Answering"
Stars: ✭ 54 (-25%)
Mutual labels:  question-answering, squad
Question-Answering-based-on-SQuAD
Question Answering System using BiDAF Model on SQuAD v2.0
Stars: ✭ 20 (-72.22%)
Mutual labels:  question-answering, squad
question-answering
No description or website provided.
Stars: ✭ 32 (-55.56%)
Mutual labels:  question-answering, squad
SQUAD2.Q-Augmented-Dataset
Augmented version of SQUAD 2.0 for Questions
Stars: ✭ 31 (-56.94%)
Mutual labels:  question-answering, squad
squadgym
Environment that can be used to evaluate reasoning capabilities of artificial agents
Stars: ✭ 27 (-62.5%)
Mutual labels:  question-answering
CompareModels TRECQA
Compare six baseline deep learning models on TrecQA
Stars: ✭ 61 (-15.28%)
Mutual labels:  question-answering
FastFusionNet
A PyTorch Implementation of FastFusionNet on SQuAD 1.1
Stars: ✭ 38 (-47.22%)
Mutual labels:  squad
text2text
Text2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (+161.11%)
Mutual labels:  question-answering
Giveme5W
Extraction of the five journalistic W-questions (5W) from news articles
Stars: ✭ 16 (-77.78%)
Mutual labels:  question-answering
WikiTableQuestions
A dataset of complex questions on semi-structured Wikipedia tables
Stars: ✭ 81 (+12.5%)
Mutual labels:  question-answering
head-qa
HEAD-QA: A Healthcare Dataset for Complex Reasoning
Stars: ✭ 20 (-72.22%)
Mutual labels:  question-answering
unanswerable qa
The official implementation for ACL 2021 "Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval".
Stars: ✭ 21 (-70.83%)
Mutual labels:  question-answering
Gradient-Samples
Samples for TensorFlow binding for .NET by Lost Tech
Stars: ✭ 53 (-26.39%)
Mutual labels:  tensorflow-tutorials
XORQA
This is the official repository for NAACL 2021, "XOR QA: Cross-lingual Open-Retrieval Question Answering".
Stars: ✭ 61 (-15.28%)
Mutual labels:  question-answering

Question Answering on SQuAD

This project implements models that train on the Stanford Question Answering Dataset (SQuAD). SQuAD consists of passage/question pairs in English text, where the answer to each question is a span of text in the passage. The goal of a model trained on SQuAD is to predict that answer span for a given passage/question pair. The official SQuAD website has examples of passages, questions, and answers, as well as a leaderboard ranking existing models.
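
For reference, here is a minimal, made-up illustration of what one SQuAD example boils down to (the real dataset is distributed as JSON with extra nesting and metadata; the passage, question, and field subset below are only for illustration):

example = {
    "context": "Tesla was born in 1856 in Smiljan.",
    "question": "When was Tesla born?",
    "answers": [{"text": "1856", "answer_start": 18}],  # character offset of the answer span
}

# The answer is always a span of the context, so the offset recovers it exactly:
answer = example["answers"][0]
start = answer["answer_start"]
assert example["context"][start:start + len(answer["text"])] == answer["text"]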

Specifically, this project implements the Match-LSTM, R-Net, Mnemonic Reader, and FusionNet models (see the Results section below).

I primarily made this for my own education, but the code could be used as a starting point for another project. The models are written in TensorFlow, and the project can optionally use AWS S3 for model checkpointing and data storage.

Results

Model            Dev EM    Dev F1
FusionNet        73.5%     82.0%
Mnemonic Reader  71.2%     80.1%
R-Net            ~60%      ~70%
Match-LSTM       ~58%      ~68%

Details:

FusionNet: checkout 82feaa3f78a51eaeb66c5578c5d5a9f125711312, then run
python3 train_local.py --model_type=fusion_net --rnn_size=128 --batch_size=16 --input_dropout=0.4 --rnn_dropout=0.3 --dropout=0.4
Training time: ~11 hours on 2 1080 Ti GPUs (~31 min/epoch).

Mnemonic Reader: checkout 82feaa3f78a51eaeb66c5578c5d5a9f125711312, then run
python3 train_local.py --model_type=mnemonic_reader --rnn_size=40 --batch_size=65 --input_dropout=0.3 --rnn_dropout=0.3 --dropout=0.3
Training time: ~6 hours on 2 1080 Ti GPUs (~8 min/epoch).

All results are for a single model rather than an ensemble. I didn't train all models for the same duration and there may be bugs or unoptimized hyperparameters in my implementation.

Thanks to @Bearsuny for identifying an issue in the evaluation. It now uses the official/correct scoring mechanism.
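
For reference, the official SQuAD v1.1 script computes these scores by normalizing both the prediction and the ground truth (lowercasing, then removing punctuation, articles, and extra whitespace) and taking exact string match and token-level F1, with the per-question score being the maximum over all ground-truth answers. A condensed sketch of that logic:

import re
import string
from collections import Counter

def normalize_answer(s):
    # Lowercase, strip punctuation and articles, collapse whitespace.
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match_score(prediction, ground_truth):
    return normalize_answer(prediction) == normalize_answer(ground_truth)

def f1_score(prediction, ground_truth):
    pred_tokens = normalize_answer(prediction).split()
    true_tokens = normalize_answer(ground_truth).split()
    common = Counter(pred_tokens) & Counter(true_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(true_tokens)
    return 2 * precision * recall / (precision + recall)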

Requirements

  • Python 3
  • spaCy and the "en" model
  • CoVe vectors - you can skip this, but you will probably need to manually remove any CoVe references from the setup. CoVe also requires PyTorch.
  • TensorFlow 1.4
  • cuDNN 7 (recommended); one or more GPUs (required)

Using AWS S3

In order to use AWS S3 for model checkpointing and data storage, you must set up AWS credentials; the AWS documentation explains how to do this.

After your credentials are set up, you can enable S3 in the project by setting the use_s3 flag to True and setting s3_bucket_name to the name of your S3 bucket.

f.DEFINE_boolean("use_s3", True, ...)
...
f.DEFINE_string("s3_bucket_name", "<YOUR S3 BUCKET HERE>",...)
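
For a rough idea of what S3 checkpointing involves, here is a minimal sketch using boto3. Whether the project actually uses boto3, and the key layout shown here, are assumptions made only for illustration:

import os
import boto3

s3 = boto3.client("s3")
bucket = "<YOUR S3 BUCKET HERE>"  # same value as the s3_bucket_name flag

def upload_checkpoint(local_dir, prefix="checkpoints"):
    # Push every file in the local checkpoint directory to S3.
    for name in os.listdir(local_dir):
        path = os.path.join(local_dir, name)
        if os.path.isfile(path):
            s3.upload_file(path, bucket, prefix + "/" + name)

def download_checkpoint(local_dir, prefix="checkpoints"):
    # Pull previously saved checkpoint files back down before restoring.
    os.makedirs(local_dir, exist_ok=True)
    response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    for obj in response.get("Contents", []):
        key = obj["Key"]
        s3.download_file(bucket, key, os.path.join(local_dir, os.path.basename(key)))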

How to run it

Setup

python3 setup.py

Training

The following command starts model training, restoring the model parameters from the last checkpoint if one exists (and creating them otherwise). After each epoch, the Dev EM/F1 scores are calculated, and if the F1 score is a new high score, the model parameters are saved. There is no mechanism to automatically stop training; it must be stopped manually. A rough sketch of this loop is shown after the command.

python3 train_local.py --num_gpus=<NUMBER OF GPUS>
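
The training policy above amounts to evaluating after every epoch and checkpointing only on a new best Dev F1. A minimal, illustrative sketch (the function names here are stand-ins, not the project's actual API):

# Stand-ins for the real training / evaluation / checkpointing code.
def train_one_epoch():
    pass  # one pass over the SQuAD training set

def evaluate_dev():
    return 0.0, 0.0  # (EM, F1) on the Dev set

def save_checkpoint(path):
    pass  # e.g. tf.train.Saver().save(...) plus an optional S3 upload

best_f1 = 0.0
for epoch in range(1000):  # the real loop has no automatic stop; interrupt it manually
    train_one_epoch()
    em, f1 = evaluate_dev()
    print("epoch {}: EM={:.1f} F1={:.1f}".format(epoch, em, f1))
    if f1 > best_f1:  # parameters are only saved on a new Dev F1 high score
        best_f1 = f1
        save_checkpoint("log/best_model")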

Evaluation

The following command evaluates the model on the Dev dataset and prints the exact match and F1 scores. To make the model outputs easy to reuse, the predicted string for each question is written, in SQuAD-compatible format, to a file called predictions.json in the evaluation_dir. In addition, if the visualize_evaluated_results flag is true, the passages, questions, and ground truth spans are written to output files in the directory given by the evaluation_dir flag.

python3 evaluate_local.py --num_gpus=<NUMBER OF GPUS>
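
The predictions.json file presumably follows the standard SQuAD prediction format, which is simply a JSON object mapping each question id to the predicted answer string (the ids and answers below are illustrative):

import json

predictions = {
    "571c8e9add7acb1400e4c10d": "a span copied from the passage",
    "571c8e9add7acb1400e4c10e": "another predicted answer",
}

# This is the format the official SQuAD evaluation script consumes.
with open("predictions.json", "w") as f:
    json.dump(predictions, f)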

Visualizing training

You can visualize the model loss, gradients, exact match, and F1 scores as the model trains by running TensorBoard from the top-level directory of this repository.

tensorboard --logdir=log
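
If you want extra quantities to show up in the same dashboard, the standard TF 1.x pattern is to write tf.summary values into the same log directory TensorBoard is pointed at. A minimal sketch (the scalar names are illustrative, not necessarily the ones the project logs):

import tensorflow as tf

f1_value = tf.placeholder(tf.float32, name="dev_f1")
em_value = tf.placeholder(tf.float32, name="dev_em")
tf.summary.scalar("dev_f1", f1_value)
tf.summary.scalar("dev_em", em_value)
merged = tf.summary.merge_all()

with tf.Session() as sess:
    writer = tf.summary.FileWriter("log", sess.graph)  # same directory as --logdir
    summary = sess.run(merged, feed_dict={f1_value: 82.0, em_value: 73.5})
    writer.add_summary(summary, global_step=1)
    writer.close()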