
YerevaNN / R-NET-in-Keras

License: MIT
Open R-NET implementation and detailed analysis: https://git.io/vd8dx

Programming Languages

Python

Projects that are alternatives to or similar to R-NET-in-Keras

FastFusionNet
A PyTorch Implementation of FastFusionNet on SQuAD 1.1
Stars: ✭ 38 (-79.01%)
Mutual labels:  squad
Awesome Qa
😎 A curated list of the Question Answering (QA)
Stars: ✭ 596 (+229.28%)
Mutual labels:  squad
Bi Att Flow
Bi-directional Attention Flow (BiDAF) network is a multi-stage hierarchical process that represents context at different levels of granularity and uses a bi-directional attention flow mechanism to achieve a query-aware context representation without early summarization.
Stars: ✭ 1,472 (+713.26%)
Mutual labels:  squad
co-attention
Pytorch implementation of "Dynamic Coattention Networks For Question Answering"
Stars: ✭ 54 (-70.17%)
Mutual labels:  squad
Lambdahack
Haskell game engine library for roguelike dungeon crawlers; please offer feedback, e.g., after trying out the sample game with the web frontend at
Stars: ✭ 439 (+142.54%)
Mutual labels:  squad
Reading comprehension tf
Machine Reading Comprehension in Tensorflow
Stars: ✭ 37 (-79.56%)
Mutual labels:  squad
SQUAD2.Q-Augmented-Dataset
Augmented version of SQUAD 2.0 for Questions
Stars: ✭ 31 (-82.87%)
Mutual labels:  squad
Allure
Allure of the Stars is a near-future Sci-Fi roguelike and tactical squad combat game written in Haskell; please offer feedback, e.g., after trying out the web frontend version at
Stars: ✭ 149 (-17.68%)
Mutual labels:  squad
R Net
Tensorflow Implementation of R-Net
Stars: ✭ 582 (+221.55%)
Mutual labels:  squad
Match Lstm
A PyTorch implementation of Match-LSTM, R-NET and M-Reader for Machine Reading Comprehension
Stars: ✭ 92 (-49.17%)
Mutual labels:  squad
Learning to retrieve reasoning paths
The official implementation of ICLR 2020, "Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering".
Stars: ✭ 318 (+75.69%)
Mutual labels:  squad
Drqa
A pytorch implementation of Reading Wikipedia to Answer Open-Domain Questions.
Stars: ✭ 378 (+108.84%)
Mutual labels:  squad
Qanet
A Tensorflow implementation of QANet for machine reading comprehension
Stars: ✭ 996 (+450.28%)
Mutual labels:  squad
qa
TensorFlow Models for the Stanford Question Answering Dataset
Stars: ✭ 72 (-60.22%)
Mutual labels:  squad
Haystack
🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (+1783.43%)
Mutual labels:  squad
Medi-CoQA
Conversational Question Answering on Clinical Text
Stars: ✭ 22 (-87.85%)
Mutual labels:  squad
Fusionnet
My implementation of the FusionNet for machine comprehension
Stars: ✭ 29 (-83.98%)
Mutual labels:  squad
Albert Tf2.0
ALBERT model Pretraining and Fine Tuning using TF2.0
Stars: ✭ 180 (-0.55%)
Mutual labels:  squad
Mnemonicreader
A PyTorch implementation of Mnemonic Reader for the Machine Comprehension task
Stars: ✭ 137 (-24.31%)
Mutual labels:  squad
Bidaf Pytorch
An Implementation of Bidirectional Attention Flow
Stars: ✭ 42 (-76.8%)
Mutual labels:  squad

R-NET implementation in Keras

This repository is an attempt to reproduce the results presented in the technical report by Microsoft Research Asia. The report describes a complex neural network called R-NET designed for question answering.

This blog post describes the details.

R-NET is currently (as of August 25, 2017) the best single model on the Stanford Question Answering Dataset (SQuAD). SQuAD uses two performance metrics: exact match (EM) and F1 score (F1). Human performance is estimated at EM=82.3% and F1=91.2% on the test set.
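
For reference, here is a simplified sketch of the two metrics. This is our illustration, not the official scorer: the real evaluate-v1.1.py script additionally lowercases, strips punctuation and articles, and takes the maximum score over all ground-truth answers.

```python
from __future__ import division  # safe under both Python 2 and 3
from collections import Counter

def exact_match(prediction, ground_truth):
    """EM: 1.0 if the predicted answer string equals the ground truth."""
    return float(prediction == ground_truth)

def f1_score(prediction, ground_truth):
    """Token-level F1 between the predicted and ground-truth answer spans."""
    pred_tokens = prediction.split()
    gold_tokens = ground_truth.split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(f1_score("in the park", "the park"))  # 0.8: precision 2/3, recall 1
```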

The report describes two versions of R-NET:

  1. The first version, called R-NET (Wang et al., 2017) (the reference is to a paper that is not yet available online), reaches EM=71.3% and F1=79.7% on the test set. It consists of input encoders, a modified version of Match-LSTM, a self-matching attention layer (the main contribution of the paper) and a pointer network; a rough sketch of this data flow is given after this list.
  2. The second version, called R-NET (March 2017), has one additional BiGRU between the self-matching attention layer and the pointer network and reaches EM=72.3% and F1=80.7%.
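
To make the stage ordering concrete, below is a minimal, hypothetical Keras sketch of the first version's data flow. It is our illustration, not the repository's actual code: the gated attention-based RNN, the self-matching attention layer and the pointer network are custom layers in the real implementation, and plain dot-product attention with stock Keras layers stands in for them here. All sizes except H=45 are illustrative.

```python
from keras.models import Model
from keras.layers import (Input, Embedding, Bidirectional, GRU, Dense,
                          TimeDistributed, Activation, concatenate, dot)

VOCAB, EMB, H = 100000, 300, 45   # H=45 is the hidden size we train with

passage = Input(shape=(None,), name='passage_tokens')
question = Input(shape=(None,), name='question_tokens')

embed = Embedding(VOCAB, EMB, name='shared_word_embedding')

# 1) Input encoders: BiGRUs over passage and question words.
u_P = Bidirectional(GRU(H, return_sequences=True))(embed(passage))
u_Q = Bidirectional(GRU(H, return_sequences=True))(embed(question))

# 2) Question-aware passage representation (stand-in for the gated
#    attention-based RNN / modified Match-LSTM).
scores = dot([u_P, u_Q], axes=[2, 2])      # (batch, |P|, |Q|)
alpha = Activation('softmax')(scores)      # attend over question words
ctx = dot([alpha, u_Q], axes=[2, 1])       # (batch, |P|, 2H)
v_P = Bidirectional(GRU(H, return_sequences=True))(concatenate([u_P, ctx]))

# 3) Self-matching attention: the passage attends to itself.
self_scores = dot([v_P, v_P], axes=[2, 2])
beta = Activation('softmax')(self_scores)
self_ctx = dot([beta, v_P], axes=[2, 1])
h_P = Bidirectional(GRU(H, return_sequences=True))(concatenate([v_P, self_ctx]))

# 4) Pointer network (stand-in): per-token scores; the real pointer
#    network predicts start and end positions with an attention-based RNN.
start_scores = TimeDistributed(Dense(1))(h_P)

model = Model(inputs=[passage, question], outputs=start_scores)
model.summary()
```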

The current best single model on the SQuAD leaderboard has a higher score, which suggests that R-NET development continued after March 2017. Ensemble models reach even higher scores.

This repository contains an implementation of the first version, but we cannot yet reproduce the reported results. The best performance we have achieved so far is EM=57.52% and F1=67.42% on the dev set. We are aware of a few differences between our implementation and the network described in the paper:

  1. The first formula in (11) of the report contains a strange summand W_v^Q V_r^Q. Both tensors are trainable but are not used anywhere else in the network, so we have replaced this product with a single trainable vector (see the sketch after this list).
  2. According to the report the size of the hidden layer should be 75, but we get better results with a smaller value; with 75 neurons the network overfits heavily.
  3. We are not sure whether we applied dropout correctly.
  4. The report says nothing about weight initialization or batch generation.
  5. Question-aware passage representation generation should probably be done by a bidirectional GRU.
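
Concretely, point 1 refers to the attention pooling that produces the initial hidden state of the pointer network. As we read equation (11) of the report (our transcription, not a quote), the question words are scored as follows:

```latex
% Equation (11) as we read it: V_r^Q is a trainable parameter that appears
% only inside this product, so W_v^Q V_r^Q is a product of two trainables.
s_j = v^\top \tanh\left(W_u^Q\, u_j^Q + W_v^Q\, V_r^Q\right)
% Our modification: collapse the product into one trainable bias vector b.
s_j = v^\top \tanh\left(W_u^Q\, u_j^Q + b\right)
```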

On the other hand, we can't rule out bugs in our own code.

Instructions (make sure you are running Keras version 2.0.6)

  1. Parse and split the data:
python parse_data.py data/train-v1.1.json --train_ratio 0.9 --outfile data/train_parsed.json --outfile_valid data/valid_parsed.json
python parse_data.py data/dev-v1.1.json --outfile data/dev_parsed.json
  2. Preprocess the data:
python preprocessing.py data/train_parsed.json --outfile data/train_data_str.pkl --include_str
python preprocessing.py data/valid_parsed.json --outfile data/valid_data_str.pkl --include_str
python preprocessing.py data/dev_parsed.json --outfile data/dev_data_str.pkl --include_str
  3. Train the model:
python train.py --hdim 45 --batch_size 50 --nb_epochs 50 --optimizer adadelta --lr 1 --dropout 0.2 --char_level_embeddings --train_data data/train_data_str.pkl --valid_data data/valid_data_str.pkl
  4. Predict on dev/test set samples:
python predict.py --batch_size 100 --dev_data data/dev_data_str.pkl models/31-t3.05458271443-v3.27696280528.model prediction.json
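
The raw SQuAD v1.1 files expected in step 1 (train-v1.1.json and dev-v1.1.json) can be downloaded into the data/ directory from the official SQuAD site, https://rajpurkar.github.io/SQuAD-explorer/. Assuming you have also downloaded the official evaluate-v1.1.py script from the same site, the resulting prediction.json can then be scored with:
python evaluate-v1.1.py data/dev-v1.1.json prediction.json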

Our best model can be downloaded from Release v0.1: https://github.com/YerevaNN/R-NET-in-Keras/releases/download/v0.1/31-t3.05458271443-v3.27696280528.model
