
stevezheng23 / Reading_comprehension_tf

License: Apache-2.0
Machine Reading Comprehension in TensorFlow

Programming Languages

Python

Projects that are alternatives to or similar to Reading_comprehension_tf

Deep Learning With Pytorch Tutorials
Introductory hands-on video course on deep learning with PyTorch, with companion source code and PPT slides
Stars: ✭ 1,986 (+5267.57%)
Mutual labels:  artificial-intelligence, convolutional-neural-networks, recurrent-neural-networks
Articutapi
API of Articut, a Chinese word segmentation tool (with semantic part-of-speech tagging). Word segmentation, also called tokenization, is the foundation of Chinese text processing. Articut uses no machine learning and no data models; relying only on modern vernacular Chinese grammar rules, it achieves an F1-measure above 94% and recall above 96% on SIGHAN 2005.
Stars: ✭ 252 (+581.08%)
Mutual labels:  artificial-intelligence, natural-language-processing, natural-language-understanding
Image Caption Generator
[DEPRECATED] A Neural Network based generative model for captioning images using Tensorflow
Stars: ✭ 141 (+281.08%)
Mutual labels:  artificial-intelligence, convolutional-neural-networks, recurrent-neural-networks
Malware Classification
Towards Building an Intelligent Anti-Malware System: A Deep Learning Approach using Support Vector Machine for Malware Classification
Stars: ✭ 88 (+137.84%)
Mutual labels:  artificial-intelligence, convolutional-neural-networks, recurrent-neural-networks
Ciff
Cornell Instruction Following Framework
Stars: ✭ 23 (-37.84%)
Mutual labels:  artificial-intelligence, natural-language-processing, natural-language-understanding
Top Deep Learning
Top 200 deep learning GitHub repositories sorted by the number of stars.
Stars: ✭ 1,365 (+3589.19%)
Mutual labels:  artificial-intelligence, convolutional-neural-networks, recurrent-neural-networks
Catalyst
🚀 Catalyst is a C# Natural Language Processing library built for speed. Inspired by spaCy's design, it brings pre-trained models, out-of-the-box support for training word and document embeddings, and flexible entity recognition models.
Stars: ✭ 224 (+505.41%)
Mutual labels:  artificial-intelligence, natural-language-processing, natural-language-understanding
Attention Mechanisms
Implementations for a family of attention mechanisms, suitable for all kinds of natural language processing tasks and compatible with TensorFlow 2.0 and Keras.
Stars: ✭ 203 (+448.65%)
Mutual labels:  natural-language-processing, recurrent-neural-networks, natural-language-understanding
Zhihu
This repo contains the source code from my personal Zhihu column (https://zhuanlan.zhihu.com/zhaoyeyu), implemented in Python 3.6, including natural language processing and computer vision projects such as text generation, machine translation, and deep convolutional GANs, along with other hands-on example code.
Stars: ✭ 3,307 (+8837.84%)
Mutual labels:  natural-language-processing, convolutional-neural-networks, recurrent-neural-networks
Graphbrain
Language, Knowledge, Cognition
Stars: ✭ 294 (+694.59%)
Mutual labels:  artificial-intelligence, natural-language-processing, natural-language-understanding
Simplednn
SimpleDNN is a lightweight open-source machine learning library written in Kotlin, designed to support neural network architectures relevant to natural language processing tasks
Stars: ✭ 81 (+118.92%)
Mutual labels:  artificial-intelligence, natural-language-processing, recurrent-neural-networks
Botlibre
An open platform for artificial intelligence, chat bots, virtual agents, social media automation, and live chat automation.
Stars: ✭ 412 (+1013.51%)
Mutual labels:  artificial-intelligence, natural-language-processing, natural-language-understanding
Coursera Natural Language Processing Specialization
Programming assignments from all courses in the Coursera Natural Language Processing Specialization offered by deeplearning.ai.
Stars: ✭ 39 (+5.41%)
Mutual labels:  artificial-intelligence, natural-language-processing, natural-language-understanding
Xlnet extension tf
XLNet Extension in TensorFlow
Stars: ✭ 109 (+194.59%)
Mutual labels:  artificial-intelligence, natural-language-processing, natural-language-understanding
Awesome Tensorlayer
A curated list of dedicated resources and applications
Stars: ✭ 248 (+570.27%)
Mutual labels:  natural-language-processing, convolutional-neural-networks, recurrent-neural-networks
Deep Learning With Python
Deep learning codes and projects using Python
Stars: ✭ 195 (+427.03%)
Mutual labels:  artificial-intelligence, convolutional-neural-networks, recurrent-neural-networks
Image Caption Generator
A neural network to generate captions for an image using CNN and RNN with BEAM Search.
Stars: ✭ 126 (+240.54%)
Mutual labels:  convolutional-neural-networks, recurrent-neural-networks, attention-model
Komputation
Komputation is a neural network framework for the Java Virtual Machine written in Kotlin and CUDA C.
Stars: ✭ 295 (+697.3%)
Mutual labels:  artificial-intelligence, convolutional-neural-networks, recurrent-neural-networks
First Steps Towards Deep Learning
This is an open sourced book on deep learning.
Stars: ✭ 376 (+916.22%)
Mutual labels:  artificial-intelligence, convolutional-neural-networks, recurrent-neural-networks
Trending Deep Learning
Top 100 trending deep learning repositories sorted by the number of stars gained on a specific day.
Stars: ✭ 543 (+1367.57%)
Mutual labels:  artificial-intelligence, convolutional-neural-networks, recurrent-neural-networks

Machine Reading Comprehension

Machine reading comprehension (MRC), a task that asks a machine to read a given context and then answer questions based on its understanding, is considered one of the key problems in artificial intelligence and has attracted significant interest from both academia and industry. Over the past few years, great progress has been made in this field, thanks to various end-to-end trained neural models and high-quality datasets with large numbers of examples. In this repo, I'll share more details on the MRC task by re-implementing a few MRC models and testing them on standard MRC datasets.

Figure 1: MRC example from SQuAD 2.0 dev set
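To make the task concrete, an MRC example pairs a context passage with a question and (when answerable) an answer span. A minimal Python sketch with hypothetical text; the record layout mirrors the public SQuAD format, not this repo's internal one:

# A hypothetical SQuAD-style record: the answer is a span of the context,
# identified by its text and character start offset.
example = {
    "context": "TensorFlow was released by Google in 2015.",
    "question": "Who released TensorFlow?",
    "answers": [{"text": "Google", "answer_start": 27}],
}

# The answer span can be recovered directly from the offsets.
start = example["answers"][0]["answer_start"]
end = start + len(example["answers"][0]["text"])
assert example["context"][start:end] == example["answers"][0]["text"]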

Setting

  • Python 3.6.6
  • TensorFlow 1.12
  • NumPy 1.15.4
  • NLTK 3.3
  • spaCy 2.0.12
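
The pinned versions above can be installed in one step; a hedged sketch assuming pip is available (the pins simply mirror the list, with tensorflow==1.12.0 as the 1.12 release):

# install dependency versions matching the list above (pip assumed)
pip install tensorflow==1.12.0 numpy==1.15.4 nltk==3.3 spacy==2.0.12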

Dataset

  • SQuAD is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question may be unanswerable.
  • GloVe is an unsupervised learning algorithm for obtaining vector representations of words. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations showcase interesting linear substructures of the word vector space (see the loading sketch below).
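
A minimal sketch of loading pre-trained GloVe vectors into a word-to-vector dict; the file name and 300-dimensional choice are assumptions, not this repo's actual configuration:

# load GloVe vectors into a dict (file path and dimensionality are illustrative)
import numpy as np

def load_glove(path="data/glove/glove.6B.300d.txt"):
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            embeddings[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return embeddings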

Usage

  • Preprocess data
# preprocess train data
python squad/preprocess.py --format json --input_file data/squad/train-v1.1/train-v1.1.json --output_file data/squad/train-v1.1/train-v1.1.squad.json
# preprocess dev data
python squad/preprocess.py --format json --input_file data/squad/dev-v1.1/dev-v1.1.json --output_file data/squad/dev-v1.1/dev-v1.1.squad.json
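
For reference, the raw SQuAD JSON nests question-answer pairs under articles and paragraphs; a short sketch of walking that structure, independent of this repo's preprocess.py:

import json

# Count QA pairs in the official SQuAD v1.1 JSON; the nesting
# (data -> paragraphs -> qas) follows the public SQuAD format.
with open("data/squad/train-v1.1/train-v1.1.json", encoding="utf-8") as f:
    squad = json.load(f)

num_qas = sum(
    len(paragraph["qas"])
    for article in squad["data"]
    for paragraph in article["paragraphs"])
print("articles: %d, qa pairs: %d" % (len(squad["data"]), num_qas))
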
  • Run experiment
# run experiment in train + eval mode
python reading_comprehension_run.py --mode train_eval --config config/config_mrc_template.xxx.json
# run experiment in train only mode
python reading_comprehension_run.py --mode train --config config/config_mrc_template.xxx.json
# run experiment in eval only mode
python reading_comprehension_run.py --mode eval --config config/config_mrc_template.xxx.json
  • Search hyper-parameters
# random search hyper-parameters
python hparam_search.py --base-config config/config_mrc_template.xxx.json --search-config config/config_search_template.xxx.json --num-group 10 --random-seed 100 --output-dir config/search
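
Conceptually, random search draws each hyper-parameter group independently from a search space; a hedged sketch of that idea (the space below is illustrative and not this repo's config_search format):

import json
import random

# Sample 10 hyper-parameter groups (mirrors --num-group 10) with a fixed
# seed (mirrors --random-seed 100); the search space itself is made up.
random.seed(100)
search_space = {
    "learning_rate": [1e-3, 5e-4, 1e-4],
    "batch_size": [16, 32, 64],
    "attention_dim": [64, 128, 256],
}
for i in range(10):
    group = {name: random.choice(values) for name, values in search_space.items()}
    print(json.dumps(group))
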
  • Visualize summary
# visualize summary via tensorboard
tensorboard --logdir=output

Experiment

QANet

QANet is an MRC architecture proposed by Google Brain that does not require recurrent networks: its encoder consists exclusively of convolution and self-attention, where convolution models local interactions and self-attention models global interactions.

Figure 2: An overview of the QANet architecture
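
A simplified sketch of one such encoder block (convolution for local interactions, multi-head self-attention for global interactions), written against the TF 1.x API this repo targets; for brevity a standard convolution stands in for QANet's depthwise-separable convolution, and all names are illustrative rather than the repo's own:

import tensorflow as tf

def encoder_block(x, num_filters=128, kernel_size=7, num_heads=8):
    # x: [batch, seq_len, num_filters]
    # Convolution sublayer with residual connection and layer norm.
    conv = tf.layers.conv1d(x, num_filters, kernel_size,
                            padding="same", activation=tf.nn.relu)
    x = tf.contrib.layers.layer_norm(x + conv)
    # Multi-head scaled dot-product self-attention sublayer.
    head_dim = num_filters // num_heads
    q = tf.layers.dense(x, num_filters)
    k = tf.layers.dense(x, num_filters)
    v = tf.layers.dense(x, num_filters)
    # [batch, len, filters] -> [heads, batch, len, head_dim]
    split_heads = lambda t: tf.stack(tf.split(t, num_heads, axis=-1), axis=0)
    q, k, v = split_heads(q), split_heads(k), split_heads(v)
    scores = tf.matmul(q, k, transpose_b=True) / (head_dim ** 0.5)
    attn = tf.matmul(tf.nn.softmax(scores, axis=-1), v)
    # Merge heads back: [heads, batch, len, head_dim] -> [batch, len, filters]
    attn = tf.concat(tf.unstack(attn, axis=0), axis=-1)
    return tf.contrib.layers.layer_norm(x + attn)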

Figure 3: The experiment details are reported on the SQuAD v1 dataset. Both train & dev sets are processed using spaCy. Invalid samples are removed from both train & dev sets. EM results for the QANet model with/without EMA are shown on the left; F1 results for the QANet model with/without EMA are shown on the right
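
EMA above refers to an exponential moving average of the model weights, evaluated in place of the raw weights. A minimal runnable TF 1.x sketch of that technique; the toy variable and the 0.999 decay are assumptions, not this repo's settings:

import tensorflow as tf

# A toy variable and update step, just to make the sketch self-contained.
w = tf.Variable(1.0, name="w")
step = w.assign_add(-0.1)

# Maintain shadow (averaged) copies of trainable variables and update
# them after every training step; decay=0.999 is illustrative.
ema = tf.train.ExponentialMovingAverage(decay=0.999)
with tf.control_dependencies([step]):
    train_op = tf.group(ema.apply(tf.trainable_variables()))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(5):
        sess.run(train_op)
    print(sess.run([w, ema.average(w)]))  # raw weight vs. its EMA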

Model               | # Epoch | # Train Steps | Batch Size | Data Size    | # Head | # Dim | EM   | F1
This implementation | 13      | ~70,000       | 16         | 87k (no aug) | 8      | 128   | 70.2 | 80.0
Original Paper      | ~13     | 35,000        | 32         | 87k (no aug) | 8      | 128   | N/A  | 77.0
Original Paper      | ~55     | 150,000       | 32         | 87k (no aug) | 8      | 128   | 73.6 | 82.7

Table 1: The performance results are reported on the SQuAD v1 dataset. Both train & dev sets are processed using spaCy. Invalid samples are removed from the train set only. Settings for this QANet implementation are selected to be comparable with the settings in the original paper

BiDAF

BiDAF (Bi-Directional Attention Flow) is an MRC architecture proposed by the Allen Institute for Artificial Intelligence (AI2). It consists of a multi-stage hierarchical process that represents the context at different levels of granularity, and uses a bidirectional attention flow mechanism to obtain a query-aware context representation without early summarization.

Figure 4: An overview of the BiDAF architecture
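
The core of the architecture is the attention layer, which attends in both directions over a similarity matrix. A hedged NumPy sketch of that computation, assuming a trilinear similarity S[t, j] = w · [c_t; q_j; c_t ∘ q_j] has already been computed (shapes and names are illustrative):

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bidaf_attention(C, Q, S):
    # C: [T, d] context, Q: [J, d] query, S: [T, J] similarity matrix.
    c2q = softmax(S, axis=1) @ Q             # context-to-query attention: [T, d]
    b = softmax(S.max(axis=1))               # query-to-context weights: [T]
    q2c = np.tile(b @ C, (C.shape[0], 1))    # attended context, tiled: [T, d]
    # Query-aware context representation: [T, 4d]
    return np.concatenate([C, c2q, C * c2q, C * q2c], axis=1)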

Figure 5: The experiment details are reported on the SQuAD v1 dataset. Both train & dev sets are processed using spaCy. Invalid samples are removed from both train & dev sets. EM results for the BiDAF model with/without EMA are shown on the left; F1 results for the BiDAF model with/without EMA are shown on the right

Model               | # Epoch | # Train Steps | Batch Size | Attention Type | # Dim | EM   | F1
This implementation | 12      | ~17,500       | 60         | trilinear      | 100   | 68.5 | 78.2
Original Paper      | 12      | ~17,500       | 60         | trilinear      | 100   | 67.7 | 77.3

Table 2: The performance results are reported on the SQuAD v1 dataset. Both train & dev sets are processed using spaCy. Invalid samples are removed from the train set only. Settings for this BiDAF implementation are selected to be comparable with the settings in the original paper

R-Net

R-Net is an MRC architecture proposed by Microsoft Research Asia (MSRA). It first matches the question and passage with gated attention-based recurrent networks to obtain a question-aware passage representation, then uses a self-matching attention mechanism to refine the representation by matching the passage against itself, and finally employs pointer networks to locate answer positions in the passage.

Figure 6: An overview of the R-Net architecture
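
The distinctive piece is the gate applied to each attention-augmented RNN input [u_t; c_t], letting the model down-weight parts of the passage irrelevant to the question. A hedged NumPy sketch of that gating (weight shapes and names are illustrative):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_rnn_input(u_t, c_t, W_g):
    # u_t: [d] passage word representation,
    # c_t: [d] attention-pooled question vector,
    # W_g: [2d, 2d] gate weights.
    x = np.concatenate([u_t, c_t])   # RNN input [u_t; c_t]
    g = sigmoid(W_g @ x)             # element-wise gate in (0, 1)
    return g * x                     # gated input fed to the recurrent cell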
