
abachaa / Medquad

Medical Question Answering Dataset of 47,457 QA pairs created from 12 NIH websites

Projects that are alternatives to or similar to Medquad

Spago
Self-contained Machine Learning and Natural Language Processing library in Go
Stars: ✭ 854 (+562.02%)
Mutual labels:  question-answering, natural-language-processing
Bidaf Keras
Bidirectional Attention Flow for Machine Comprehension implemented in Keras 2
Stars: ✭ 60 (-53.49%)
Mutual labels:  question-answering, natural-language-processing
Acl18 results
Code to reproduce results in our ACL 2018 paper "Did the Model Understand the Question?"
Stars: ✭ 31 (-75.97%)
Mutual labels:  question-answering, natural-language-processing
Chat
A chatbot based on natural language understanding and machine learning, supporting multi-user concurrency and custom multi-turn dialogues
Stars: ✭ 516 (+300%)
Mutual labels:  question-answering, natural-language-processing
Neuronblocks
NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
Stars: ✭ 1,356 (+951.16%)
Mutual labels:  question-answering, natural-language-processing
Insuranceqa Corpus Zh
🚁 Insurance-domain corpus and chatbot
Stars: ✭ 821 (+536.43%)
Mutual labels:  question-answering, natural-language-processing
Cdqa Annotator
⛔ [NOT MAINTAINED] A web-based annotator for closed-domain question answering datasets with SQuAD format.
Stars: ✭ 48 (-62.79%)
Mutual labels:  question-answering, natural-language-processing
Adam qas
ADAM - A Question Answering System, inspired by IBM Watson
Stars: ✭ 330 (+155.81%)
Mutual labels:  question-answering, natural-language-processing
Sentence Similarity
PyTorch implementations of various deep learning models for paraphrase detection, semantic similarity, and textual entailment
Stars: ✭ 96 (-25.58%)
Mutual labels:  question-answering, natural-language-processing
Neural kbqa
Knowledge Base Question Answering using memory networks
Stars: ✭ 87 (-32.56%)
Mutual labels:  question-answering, natural-language-processing
Paper Reading
Paper reading list in natural language processing, including dialogue systems and text generation related topics.
Stars: ✭ 508 (+293.8%)
Mutual labels:  question-answering, natural-language-processing
Clicr
Machine reading comprehension on clinical case reports
Stars: ✭ 123 (-4.65%)
Mutual labels:  question-answering, natural-language-processing
Cdqa
⛔ [NOT MAINTAINED] An End-To-End Closed Domain Question Answering System.
Stars: ✭ 500 (+287.6%)
Mutual labels:  question-answering, natural-language-processing
Knowledge Graphs
A collection of research on knowledge graphs
Stars: ✭ 845 (+555.04%)
Mutual labels:  question-answering, natural-language-processing
Gnn4nlp Papers
A list of recent papers about Graph Neural Network methods applied in NLP areas.
Stars: ✭ 405 (+213.95%)
Mutual labels:  question-answering, natural-language-processing
Conversational Ai
Conversational AI Reading Materials
Stars: ✭ 34 (-73.64%)
Mutual labels:  question-answering, natural-language-processing
Cmrc2018
A Span-Extraction Dataset for Chinese Machine Reading Comprehension (CMRC 2018)
Stars: ✭ 238 (+84.5%)
Mutual labels:  question-answering, natural-language-processing
Jack
Jack the Reader
Stars: ✭ 242 (+87.6%)
Mutual labels:  question-answering, natural-language-processing
Turkish Bert Nlp Pipeline
BERT-based NLP pipeline for Turkish: NER, Sentiment Analysis, Question Answering, etc.
Stars: ✭ 85 (-34.11%)
Mutual labels:  question-answering, natural-language-processing
Chatbot
Russian-language chatbot
Stars: ✭ 106 (-17.83%)
Mutual labels:  question-answering, natural-language-processing

MedQuAD: Medical Question Answering Dataset

MedQuAD includes 47,457 medical question-answer pairs created from 12 NIH websites (e.g. cancer.gov, niddk.nih.gov, GARD, MedlinePlus Health Topics). The collection covers 37 question types (e.g. Treatment, Diagnosis, Side Effects) associated with diseases, drugs and other medical entities such as tests.

We included additional annotations in the XML files that can be used for diverse IR and NLP tasks, such as the question type, the question focus, its synonyms, its UMLS Concept Unique Identifier (CUI), and its Semantic Type. We also added the category of the question focus (Disease, Drug, or Other) in the 4 MedlinePlus collections; all other collections are about diseases.
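For convenience, here is a minimal sketch of how these XML annotations might be read with Python's standard library. The element and attribute names used below (Document, Focus, QAPair, Question, Answer, qid, qtype) are assumptions inferred from the description above rather than a documented schema, so check them against the actual files before use.

```python
import xml.etree.ElementTree as ET

def load_qa_pairs(xml_path):
    """Parse one MedQuAD XML file into a list of QA dicts.

    NOTE: the tag/attribute names below (Focus, QAPair, Question,
    Answer, qid, qtype) are assumptions; verify them against the
    actual MedQuAD files before use.
    """
    root = ET.parse(xml_path).getroot()
    focus = root.findtext("Focus")              # assumed tag for the question focus
    pairs = []
    for qa in root.iter("QAPair"):              # assumed tag for each QA pair
        question = qa.find("Question")
        answer = qa.findtext("Answer") or ""    # may be empty for the 3 MedlinePlus subsets
        pairs.append({
            "focus": focus,
            "qid": question.get("qid"),         # assumed attribute
            "qtype": question.get("qtype"),     # assumed attribute (one of the 37 types)
            "question": (question.text or "").strip(),
            "answer": answer.strip(),
        })
    return pairs
```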

The paper cited below describes the collection, the construction method, and its use and evaluation within a medical question answering system.

N.B. We removed the answers from 3 subsets to respect the MedlinePlus copyright (https://medlineplus.gov/copyright.html):
(1) A.D.A.M. Medical Encyclopedia, (2) MedlinePlus Drug information, and (3) MedlinePlus Herbal medicine and supplement information. We kept all the other information, including the URLs, in case you want to crawl the answers (see the sketch below). Please contact me if you have any questions.
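If you do want to retrieve the removed answers yourself (subject to the MedlinePlus copyright terms), one starting point is to collect the source URLs of the documents whose answers are empty and fetch those pages with your own crawler. The sketch below assumes each XML file has a root Document element carrying a url attribute; that name is an assumption, so adapt it to the actual files.

```python
import glob
import xml.etree.ElementTree as ET

def collect_missing_answer_urls(xml_dir):
    """Return the source URLs of documents whose answers were removed.

    NOTE: assumes a root <Document url="..."> element and <Answer>
    elements under <QAPair>; both names are assumptions to verify.
    """
    urls = set()
    for path in glob.glob(f"{xml_dir}/**/*.xml", recursive=True):
        root = ET.parse(path).getroot()
        has_empty_answer = any(
            not (qa.findtext("Answer") or "").strip()
            for qa in root.iter("QAPair")
        )
        url = root.get("url")
        if has_empty_answer and url:
            urls.add(url)
    return urls
```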


QA Test Collection

We used the test questions of the TREC-2017 LiveQA medical task: https://github.com/abachaa/LiveQA_MedicalTask_TREC2017/tree/master/TestDataset.

As described in our BMC paper, we manually judged the answers retrieved from the MedQuAD collection by the IR and QA systems. We used the same judgment scores as the LiveQA track: 1-Incorrect, 2-Related, 3-Incomplete, and 4-Excellent.

Format of the qrels file: Question_ID judgment Answer_ID

The QA test collection contains 2,479 judged answers that can be used to evaluate the performance of IR & QA systems on the LiveQA-Med test questions: https://github.com/abachaa/MedQuAD/blob/master/QA-TestSet-LiveQA-Med-Qrels-2479-Answers.zip
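As a starting point for evaluation, the sketch below parses the qrels file in the Question_ID judgment Answer_ID format described above and computes the average judgment score (1-4) per question. The whitespace-separated layout and the extracted file name in the usage comment are assumptions; adjust them to the file inside the zip.

```python
from collections import defaultdict

def average_scores(qrels_path):
    """Compute the mean judgment score (1-4) for each test question.

    Expects whitespace-separated lines in the order described above:
    Question_ID judgment Answer_ID.
    """
    scores = defaultdict(list)
    with open(qrels_path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) != 3:
                continue
            qid, judgment, _answer_id = parts
            scores[qid].append(int(judgment))
    return {qid: sum(vals) / len(vals) for qid, vals in scores.items()}

# Usage (hypothetical extracted file name):
# print(average_scores("QA-TestSet-LiveQA-Med-Qrels-2479-Answers.txt"))
```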


Reference

If you use the MedQuAD dataset and/or the collection of 2,479 judged answers, please cite the following paper: "A Question-Entailment Approach to Question Answering". Asma Ben Abacha and Dina Demner-Fushman. BMC Bioinformatics, 2019.

@ARTICLE{BenAbacha-BMC-2019,
  author  = {Asma {Ben Abacha} and Dina Demner{-}Fushman},
  title   = {A Question-Entailment Approach to Question Answering},
  journal = {{BMC} Bioinform.},
  volume  = {20},
  number  = {1},
  pages   = {511:1--511:23},
  year    = {2019},
  url     = {https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3119-4}
}
