Awesome Question Answering

A curated list of resources on Question Answering (QA), a computer science discipline within the fields of information retrieval and natural language processing (NLP), with a focus on machine learning and deep learning approaches.

Contents

Recent Trends

Recent QA Models

Recent Language Models

AAAI 2020

ACL 2019

EMNLP-IJCNLP 2019

Arxiv

Dataset

About QA

Types of QA

  • Single-turn QA: answers each question on its own, without considering any conversational context
  • Conversational QA: uses previous conversation turns to interpret and answer the current question
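
The difference between the two types can be sketched in terms of what a model receives as input. This is only a minimal illustration: the function name `build_model_input`, the `max_turns` parameter, and the `[SEP]`-joined string format are assumptions made for the example, not a convention of any system listed here.

```python
from typing import List, Tuple


def build_model_input(
    question: str,
    history: List[Tuple[str, str]],
    max_turns: int = 0,
) -> str:
    """Join the current question with up to `max_turns` previous turns.

    `history` holds earlier (question, answer) pairs, oldest first.
    max_turns=0 is the single-turn case: the question is used alone.
    """
    recent = history[-max_turns:] if max_turns > 0 else []
    parts = [f"{q} {a}" for q, a in recent]
    parts.append(question)
    return " [SEP] ".join(parts)


history = [("Who defeated the Jeopardy! champions?", "IBM Watson")]

# Single-turn: "it" cannot be resolved from the input alone.
single_turn = build_model_input("When did it happen?", history)

# Conversational: the previous turn is prepended to the question.
conversational = build_model_input("When did it happen?", history, max_turns=1)
```

In the conversational case the earlier turn gives the model what it needs to resolve "it"; in the single-turn case that information is simply absent.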

Subtypes of QA

  • Knowledge-based QA
  • Table/List-based QA
  • Text-based QA
  • Community-based QA
  • Visual QA

Analysis and Parsing for Pre-processing in QA systems

Language Analysis

  1. Morphological analysis
  2. Named Entity Recognition (NER)
  3. Homonyms / Polysemy Analysis
  4. Syntactic Parsing (Dependency Parsing)
  5. Semantic Recognition

Most QA systems have roughly 3 parts

  1. Fact extraction
    1. Entity Extraction
      1. Named Entity Recognition (NER)
    2. Relation Extraction
  2. Understanding the question
  3. Generating an answer
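
The three parts above can be sketched as a toy pipeline. This is a minimal illustration under strong assumptions, not how any system listed here works: the facts are hand-written (subject, relation, object) triples standing in for the output of entity and relation extraction, question understanding is reduced to two regular expressions, and all names (`FACTS`, `Query`, `understand_question`, `generate_answer`, `answer`) are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Part 1 (fact extraction), stubbed: triples that entity and relation
# extraction would normally produce from text. Hand-written for illustration.
FACTS = [
    ("IBM Watson", "defeated", "Jeopardy! champions"),
    ("Wolfram Alpha", "launched", "answer engine"),
]


class Query(NamedTuple):
    relation: str
    known_slot: str   # "subject" or "object": the slot the question supplies
    known_value: str  # lower-cased value of that slot


def understand_question(question: str) -> Optional[Query]:
    """Part 2: map a question onto a relation plus one known slot."""
    q = question.strip()
    # "Who <relation> <object>?" asks for the subject.
    m = re.match(r"^who (\w+) (.+?)\?$", q, re.IGNORECASE)
    if m:
        return Query(m.group(1).lower(), "object", m.group(2).lower())
    # "<subject> <relation> what?" asks for the object.
    m = re.match(r"^(.+?) (\w+) what\?$", q, re.IGNORECASE)
    if m:
        return Query(m.group(2).lower(), "subject", m.group(1).lower())
    return None


def generate_answer(query: Optional[Query]) -> str:
    """Part 3: look up the missing slot in the fact store."""
    if query is None:
        return "Sorry, I did not understand the question."
    for subject, relation, obj in FACTS:
        if relation != query.relation:
            continue
        if query.known_slot == "object" and obj.lower() == query.known_value:
            return subject
        if query.known_slot == "subject" and subject.lower() == query.known_value:
            return obj
    return "I don't know."


def answer(question: str) -> str:
    return generate_answer(understand_question(question))
```

For example, `answer("Who defeated Jeopardy! champions?")` looks up the `defeated` relation with a known object and returns the matching subject. Real systems replace each stub with learned components, but the division of labour is the same.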

Events

  • Wolfram Alpha launched its answer engine in 2009.
  • IBM Watson system defeated top Jeopardy! champions in 2011.
  • Apple's Siri integrated Wolfram Alpha's answer engine in 2011.
  • Google embraced QA by launching its Knowledge Graph, leveraging the Freebase knowledge base, in 2012.
  • Amazon Echo | Alexa (2015), Google Home | Google Assistant (2016), INVOKE | MS Cortana (2017), HomePod (2017)

Systems

  • IBM Watson - Has state-of-the-art performance.
  • Facebook DrQA - Applied to the SQuAD1.0 dataset. The SQuAD2.0 dataset has been released, but DrQA has not been tested on it yet.
  • MIT Media Lab's Knowledge graph - A freely available semantic network designed to help computers understand the meanings of words that people use.

Competitions in QA

| Dataset | Language | Organizer | Since | Top Rank | Model | Status | Over Human Performance (o: yes, x: no) |
|---|---|---|---|---|---|---|---|
| Story Cloze Test | English | Univ. of Rochester | 2016 | msap | Logistic regression | Closed | x |
| MS MARCO | English | Microsoft | 2016 | YUANFUDAO research NLP | MARS | Closed | o |
| MS MARCO V2 | English | Microsoft | 2018 | NTT Media Intelli. Lab. | Masque Q&A Style | Opened | x |
| SQuAD | English | Univ. of Stanford | 2018 | XLNet Team | XLNet (single model) | Closed | o |
| SQuAD 2.0 | English | Univ. of Stanford | 2018 | PINGAN Omni-Sinitic | ALBERT + DAAF + Verifier (ensemble) | Opened | o |
| TriviaQA | English | Univ. of Washington | 2017 | Ming Yan | - | Closed | - |
| decaNLP | English | Salesforce Research | 2018 | Salesforce Research | MQAN | Closed | x |
| DuReader Ver1. | Chinese | Baidu | 2015 | Tryer | T-Reader (single) | Closed | x |
| DuReader Ver2. | Chinese | Baidu | 2017 | renaissance | AliReader | Opened | - |
| KorQuAD | Korean | LG CNS AI Research | 2018 | Clova AI LaRva Team | LaRva-Kor-Large+ + CLaF (single) | Closed | o |
| KorQuAD 2.0 | Korean | LG CNS AI Research | 2019 | Kangwon National University | KNU-baseline (single model) | Opened | x |
| CoQA | English | Univ. of Stanford | 2018 | Zhuiyi Technology | RoBERTa + AT + KD (ensemble) | Opened | o |

Publications

Codes

  • BiDAF - The Bi-Directional Attention Flow (BiDAF) network is a multi-stage hierarchical process that represents the context at different levels of granularity and uses a bi-directional attention flow mechanism to obtain a query-aware context representation without early summarization.
    • Official; TensorFlow v1.2
    • Paper
  • QANet - A Q&A architecture that does not require recurrent networks: its encoder consists exclusively of convolution and self-attention, where convolution models local interactions and self-attention models global interactions.
    • Google; Unofficial; TensorFlow v1.5
    • Paper
  • R-Net - An end-to-end neural network model for reading-comprehension-style question answering, which aims to answer questions from a given passage.
    • MS; Unofficially by HKUST; TensorFlow v1.5
    • Paper
  • R-Net-in-Keras - R-Net re-implementation in Keras.
    • MS; Unofficial; Keras v2.0.6
    • Paper
  • DrQA - A system for reading comprehension applied to open-domain question answering.
    • Facebook; Official; PyTorch v0.4
    • Paper
  • BERT - A language representation model whose name stands for Bidirectional Encoder Representations from Transformers. Unlike earlier language representation models, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers.
    • Google; Official implementation; TensorFlow v1.11.0
    • Paper

Lectures

Slides

Dataset Collections

Datasets

  • AI2 Science Questions v2.1(2017)
  • Children's Book Test
    • The CBT is one of the bAbI projects of Facebook AI Research, which are organized toward the goal of automatic text understanding and reasoning. It is designed to measure directly how well language models can exploit wider linguistic context.
  • CODAH Dataset
  • DeepMind Q&A Dataset; CNN/Daily Mail
    • Hermann et al. (2015) created two datasets from news articles for Q&A research. The datasets contain many documents (90k and 197k respectively), and each document is accompanied by approximately 4 questions on average. Each question is a sentence with one missing word or phrase that can be found in the accompanying document/context.
    • Paper: https://arxiv.org/abs/1506.03340
  • ELI5
  • GraphQuestions
    • On generating Characteristic-rich Question sets for QA evaluation.
  • LC-QuAD
    • A gold-standard KBQA (Question Answering over Knowledge Base) dataset containing 5,000 questions paired with SPARQL queries. LC-QuAD uses DBpedia v04.16 as the target KB.
  • MS MARCO
  • MultiRC
  • NarrativeQA
  • NewsQA
  • Question-Answer Dataset by CMU
    • This is a corpus of Wikipedia articles, manually-generated factoid questions from them, and manually-generated answers to these questions, for use in academic research. These data were collected by Noah Smith, Michael Heilman, Rebecca Hwa, Shay Cohen, Kevin Gimpel, and many students at Carnegie Mellon University and the University of Pittsburgh between 2008 and 2010.
  • SQuAD1.0
    • Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage.
    • Paper: https://arxiv.org/abs/1606.05250
  • SQuAD2.0
    • SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 new, unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.
    • Paper: https://arxiv.org/abs/1806.03822
  • Story cloze test
    • 'Story Cloze Test' is a new commonsense reasoning framework for evaluating story understanding, story generation, and script learning. This test requires a system to choose the correct ending to a four-sentence story.
    • Paper: https://arxiv.org/abs/1604.01696
  • TriviaQA
    • TriviaQA is a reading comprehension dataset containing over 650K question-answer-evidence triples. TriviaQA includes 95K question-answer pairs authored by trivia enthusiasts and independently gathered evidence documents, six per question on average, that provide high quality distant supervision for answering the questions.
    • Paper: https://arxiv.org/abs/1705.03551
  • WikiQA
    • A publicly available set of question and sentence pairs for open-domain question answering.
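
The span-based answer format described in the SQuAD1.0 and SQuAD2.0 entries above can be illustrated with a short sketch. The record below is hand-written for illustration and only mimics the layout of the released SQuAD JSON files (the `answers`, `text`, `answer_start`, and `is_impossible` fields); the helper `gold_answer` is hypothetical.

```python
from typing import Optional

context = "IBM Watson defeated top Jeopardy! champions in 2011."

qas = [
    {
        # Answerable: the answer is a span of the context.
        "question": "When did IBM Watson defeat top Jeopardy! champions?",
        "answers": [{"text": "2011", "answer_start": context.index("2011")}],
        "is_impossible": False,
    },
    {
        # SQuAD2.0-style unanswerable question: the passage does not say.
        "question": "Who built the Jeopardy! studio?",
        "answers": [],
        "is_impossible": True,
    },
]


def gold_answer(context: str, qa: dict) -> Optional[str]:
    """Return the gold answer span, or None if the question is unanswerable."""
    if qa.get("is_impossible") or not qa["answers"]:
        return None
    ans = qa["answers"][0]
    start = ans["answer_start"]
    span = context[start:start + len(ans["text"])]
    # The stored character offset must reproduce the stored answer text.
    if span != ans["text"]:
        raise ValueError("answer_start does not match the answer text")
    return span
```

A system evaluated in the SQuAD2.0 setting is expected to abstain on the second question, which corresponds to `gold_answer` returning `None`.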

Publications of the DeepQA Research Team at IBM Watson within 5 years

  • 2015
    • "Automated Problem List Generation from Electronic Medical Records in IBM Watson", Murthy Devarakonda, Ching-Huei Tsou, IAAI, 2015.
    • "Decision Making in IBM Watson Question Answering", J. William Murdock, Ontology summit, 2015.
    • "Unsupervised Entity-Relation Analysis in IBM Watson", Aditya Kalyanpur, J William Murdock, ACS, 2015.
    • "Commonsense Reasoning: An Event Calculus Based Approach", E T Mueller, Morgan Kaufmann/Elsevier, 2015.
  • 2014

MS Research's publications within 5 years

Google AI's publications within 5 years

Facebook AI Research's publications within 5 years

Books

  • Natural Language Question Answering System - Boris Galitsky (2003)
  • New Directions in Question Answering - Mark T. Maybury (2004)
  • Part 3. 5. Question Answering in The Oxford Handbook of Computational Linguistics - Sanda Harabagiu and Dan Moldovan (2005)
  • Chap.28 Question Answering in Speech and Language Processing - Daniel Jurafsky & James H. Martin (2017)

Links

Contributing

Contributions welcome! Read the contribution guidelines first.

License

CC0

To the extent possible under law, seriousmac (the maintainer) has waived all copyright and related or neighboring rights to this work.
