localminimum / R Net

Licence: MIT
A TensorFlow implementation of R-NET: Machine Reading Comprehension with Self-Matching Networks

Programming Languages

python

R-NET: MACHINE READING COMPREHENSION WITH SELF MATCHING NETWORKS

TensorFlow implementation of R-NET (https://www.microsoft.com/en-us/research/wp-content/uploads/2017/05/r-net.pdf).

The dataset used for this task is the Stanford Question Answering Dataset (SQuAD, https://rajpurkar.github.io/SQuAD-explorer/). Pretrained GloVe embeddings are used for both words (https://nlp.stanford.edu/projects/glove/) and characters (https://github.com/minimaxir/char-embeddings/blob/master/glove.840B.300d-char.txt).

As of 26 Feb 2018, thanks to @theSage21 we have a working demo of R-net!

Requirements

  • Python 2.7
  • NumPy
  • tqdm
  • spaCy
  • TensorFlow==1.2

Downloads and Setup

After cloning this repo, run the following commands from bash once to set up the environment and process the dataset (SQuAD).

$ pipenv install
$ bash setup.sh
$ pipenv shell
$ python process.py --reduce_glove True --process True
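
The `--reduce_glove True` flag suggests that the large GloVe file is trimmed to the tokens the dataset actually needs. The following is a minimal Python 3 sketch of that idea, not the code in process.py. The function name `reduce_glove`, its parameters, and the toy 3-dimensional vectors are assumptions made for illustration. It relies only on the GloVe text format, where each line holds a token followed by space-separated floats.

```python
import io


def reduce_glove(lines, vocab, dim):
    """Keep only the embedding rows whose token appears in `vocab`.

    `lines` is any iterable of GloVe-format text lines, `vocab` is a set of
    tokens, and `dim` is the embedding size. (All three names are
    hypothetical; they are not taken from process.py.)
    """
    kept = {}
    for line in lines:
        # Split from the right so a token that itself contains spaces
        # is not broken apart.
        parts = line.rstrip("\n").rsplit(" ", dim)
        if len(parts) != dim + 1:
            continue  # skip malformed rows
        token, values = parts[0], parts[1:]
        if token in vocab:
            kept[token] = [float(v) for v in values]
    return kept


# Toy stand-in for a GloVe file; real files use 300-dimensional vectors.
sample = io.StringIO(
    "the 0.1 0.2 0.3\n"
    "cat 0.4 0.5 0.6\n"
    "zebra 0.7 0.8 0.9\n"
)
reduced = reduce_glove(sample, vocab={"the", "cat"}, dim=3)
print(sorted(reduced))
```

Filtering while streaming the file line by line keeps memory usage proportional to the dataset vocabulary rather than to the full embedding file.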

Training / Testing / Debugging / Interactive Demo

You can change the hyperparameters in the params.py file to fit the model on your GPU. To train the model, run the following line.

$ python model.py

To test or debug your model after training, change mode="train" to mode="debug" or mode="test" in the params.py file and run the model again.

To use the interactive demo, set the batch size to 1.

Tensorboard

Run TensorBoard to visualise training.

$ tensorboard --logdir=r-net:train/

Log

26/02/18 As of 26th Feb 2018, thanks to @theSage21, we have an HTML demo that can easily be launched on the user's localhost to try out R-net on custom paragraphs and questions.

18/10/17 After some hyperparameter searching, our model reaches an EM/F1 score of 50/60 within 4 hours using the hyperparameters suggested in the params.py file. However, it overfits quickly after that. The current best model reaches an EM/F1 of 55/67 on the dev set.

05/09/17 After rewriting the architecture, the model converges on the full dataset. With a batch size of 54, it takes about 20 hours to reach F1/EM = 67/60 on the training set and 40/30 on the dev set. Reproducing the results reported for R-Net in the original paper is a work in progress.

02/09/17 One of the challenges I faced while training was fitting a minibatch of size 32 or larger into my GTX 1080. Because the SQuAD data show high variance, a larger batch size was essential for training (otherwise the model doesn't converge). Reducing GPU memory usage enough to fit a batch size of 32 or higher is a work in progress. If you have any suggestions for reducing GPU memory usage, please open a pull request.

27/08/17 As a sanity check, I trained the network on 3,000 independent, randomly sampled question-answer pairs. On my GTX 1080, it took about four and a half hours for the model to get the gist of what's going on with the data. With the full dataset (90,000+ pairs), we expect convergence to take longer. Some sort of normalization method might help speed up convergence (though the authors of the original paper didn't mention any normalization).
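
The EM/F1 figures in the log entries above are the standard SQuAD metrics: exact match and token-level F1 after answer normalization. As a rough illustration of how they are computed, here is a minimal Python 3 sketch modelled on the normalization steps of the SQuAD v1.1 evaluation (lowercasing, stripping punctuation and articles). It is not this repository's evaluation code, and it assumes a single reference answer per question, whereas the official evaluation takes the maximum over several references. The function names are chosen for illustration.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction, truth):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize_answer(prediction).split()
    true_tokens = normalize_answer(truth).split()
    # Multiset intersection counts each shared token at most as often
    # as it occurs in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("the Eiffel Tower", "Eiffel Tower"))
print(f1_score("in Paris, France", "Paris"))
```

Dataset-level scores such as 55/67 are these per-question values averaged over all questions and expressed as percentages.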
