MSMARCO with S-NET Extraction (Extraction-net)

A CNTK (Microsoft's deep learning toolkit) implementation of the extraction part of S-NET: From Answer Extraction to Answer Generation for Machine Reading Comprehension, with some modifications.
This project is designed for the MSMARCO dataset.
The code structure is based on the CNTK BiDAF example.
MSMARCO V1 and V2 are both supported.

Requirements

The following libraries are required for training and evaluation.

General

python3.6
cuda-9.0 (required by CNTK)
openmpi-1.10 (required by CNTK)
gcc >= 6 (required by CNTK)
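
As a quick sanity check before installing CNTK, the requirements above can be probed from Python. This is a minimal sketch, not part of the repository: the executable names (`nvcc`, `mpirun`, `gcc`) are assumptions about what each package usually installs, and the check only tests for presence, not for the exact versions listed.

```python
import shutil
import sys

# Requirements from the list above, mapped to the executable each one usually
# provides. The executable names are assumptions, not taken from the repo.
REQUIRED_TOOLS = {
    "cuda": "nvcc",
    "openmpi": "mpirun",
    "gcc": "gcc",
}


def check_environment(which=shutil.which, version=sys.version_info):
    """Return a dict mapping each requirement to True (found) or False."""
    # Only the major and minor version are compared for the interpreter.
    report = {"python3.6": tuple(version[:2]) == (3, 6)}
    for name, executable in REQUIRED_TOOLS.items():
        # shutil.which returns None when the executable is not on PATH.
        report[name] = which(executable) is not None
    return report
```

Calling `check_environment()` with no arguments inspects the current interpreter and `PATH`; the two parameters exist only so the check can be exercised without the tools installed.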

Python

Please refer to requirements.txt.

Evaluate with pretrained model

This repo provides a pretrained model and a pre-processed validation dataset for testing the performance.

Please download the pretrained model and the pre-processed data, put them in the MSMARCO/data and MSMARCO root directories respectively, then decompress them.

After decompressing, the directory structure should look like this:

MSMARCO
├── data
│   ├── elmo_embedding.bin
│   ├── test.tsv
│   ├── vocabs.pkl
│   ├── data.tar.gz
│   └── ... others
├── model
│   ├── pm.model
│   ├── pm.model.ckp
│   └── pm.model_out.json
└── ... others
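
The layout above can be verified before running the evaluation. The sketch below is a hypothetical helper, not a script shipped with the repo. It lists only the files named explicitly in the tree and skips `data.tar.gz` (the archive itself) and the `... others` entries. Whether `pm.model_out.json` ships in the archive or is written by the evaluation is not stated, so treat a report of that file as informational.

```python
from pathlib import Path

# Paths copied from the directory tree above, relative to the MSMARCO root.
EXPECTED_FILES = [
    "data/elmo_embedding.bin",
    "data/test.tsv",
    "data/vocabs.pkl",
    "model/pm.model",
    "model/pm.model.ckp",
    "model/pm.model_out.json",
]


def missing_files(root):
    """Return the expected relative paths that are not files under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

An empty list from `missing_files("MSMARCO")` means every file named in the tree is in place.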

After decompressing,

cd Evaluation
sh eval.sh

then you should get the generated answers and the rouge-l score.

Usage

Preprocess

MSMARCO V1

Download the MSMARCO v1 dataset and the GloVe embeddings.

cd data
python3.6 download.py v1

Convert the raw data to tsv format

python3.6 convert_msmarco.py v1 --threads=`nproc`

Convert the tsv files to ctf (CNTK input) format and build the vocabulary dictionary

python3.6 tsv2ctf.py
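
To make the tsv-to-ctf step more concrete, here is a minimal sketch of what a CNTK text format (ctf) file looks like: each line starts with a sequence id, followed by `|name value` fields, and lines that share an id form one sequence. The stream names `context` and `query`, the sparse one-hot `index:1` encoding, and both helper functions are illustrative assumptions. The streams and encoding that `tsv2ctf.py` actually emits may differ.

```python
def build_vocab(token_lists):
    """Assign an integer id to every distinct token, in first-seen order."""
    vocab = {}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def to_ctf_lines(seq_id, streams, vocab):
    """Render one example as ctf lines.

    `streams` maps a (hypothetical) stream name to its token list. Row i of
    the output holds the i-th token of every stream that is long enough,
    written as a sparse one-hot `index:1` entry.
    """
    length = max(len(tokens) for tokens in streams.values())
    lines = []
    for i in range(length):
        fields = [str(seq_id)]
        for name, tokens in streams.items():
            if i < len(tokens):
                fields.append(f"|{name} {vocab[tokens[i]]}:1")
        lines.append(" ".join(fields))
    return lines


context = ["the", "cat", "sat"]
query = ["who", "sat"]
vocab = build_vocab([context, query])
for line in to_ctf_lines(0, {"context": context, "query": query}, vocab):
    print(line)
# prints:
# 0 |context 0:1 |query 3:1
# 0 |context 1:1 |query 2:1
# 0 |context 2:1
```

The shorter stream simply stops contributing fields once it runs out of tokens, which is how streams of different lengths can share one sequence id.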

Generate the ELMo embeddings

sh elmo.sh

MSMARCO V2

Download the MSMARCO v2 dataset and the GloVe embeddings.

cd data
python3.6 download.py v2

Convert the raw data to tsv format

python3.6 convert_msmarco.py v2 --threads=`nproc`

Convert the tsv files to ctf (CNTK input) format and build the vocabulary dictionary

python3.6 tsv2ctf.py

Generate the ELMo embeddings

sh elmo.sh
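
Since the V1 and V2 preprocessing steps differ only in the version argument, they can be wrapped in one small driver. This is a sketch under assumptions: the function names are hypothetical, every command is assumed to run from the `data` directory (as the `cd data` step suggests), and the `threads` argument stands in for the shell's `nproc`.

```python
import subprocess


def preprocess_commands(version, threads):
    """Return the preprocessing commands from this README, in order."""
    if version not in ("v1", "v2"):
        raise ValueError(f"unsupported MSMARCO version: {version!r}")
    return [
        ["python3.6", "download.py", version],
        ["python3.6", "convert_msmarco.py", version, f"--threads={threads}"],
        ["python3.6", "tsv2ctf.py"],
        ["sh", "elmo.sh"],
    ]


def run_preprocess(version, threads, data_dir="data"):
    """Run every step inside `data_dir`, stopping at the first failure."""
    for command in preprocess_commands(version, threads):
        # check=True raises CalledProcessError if a step exits non-zero.
        subprocess.run(command, cwd=data_dir, check=True)
```

For example, `run_preprocess("v2", os.cpu_count())` (after `import os`) from the repository root would run the four V2 steps in sequence.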

Train (Same for V1 and V2)

cd ../script
mkdir log
sh run.sh

Evaluate on the development dataset

MSMARCO V1

cd Evaluation
sh eval.sh v1

MSMARCO V2

cd Evaluation
sh eval.sh v2

Performance

Paper

Model	rouge-l	bleu_1
S-Net (Extraction)	41.45	44.08
S-Net (Extraction, Ensemble)	42.92	44.97

This implementation

Setting	rouge-l	bleu_1
MSMARCO v1 w/o elmo	38.43	39.14
MSMARCO v1 w/ elmo	39.42	39.47
MSMARCO v2 w/ elmo	43.66	44.44
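
For readers unfamiliar with the rouge-l column, the sketch below computes sentence-level ROUGE-L as an F-measure over the longest common subsequence (LCS) of candidate and reference tokens. It is an illustration only: whitespace tokenization and the default `beta=1.0` are assumptions, and the scoring script used by this repo may tokenize and weight differently. The tables report scores on a 0 to 100 scale, while this function returns a value between 0 and 1.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """ROUGE-L F-measure between two whitespace-tokenized strings."""
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
```

A larger `beta` weights recall more heavily than precision, so a candidate answer that covers the reference but adds extra words is penalized less.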

TODO


zlsh80826 / MSMARCO
