kelvin-jiang / FreebaseQA

License: CC-BY-4.0
The release of the FreebaseQA data set (NAACL 2019).

FreebaseQA (v1.0): A Trivia-type QA Data Set over the Freebase Knowledge Graph

This repository contains FreebaseQA, a new data set for open-domain QA over the Freebase knowledge graph. The question-answer pairs in this data set are collected from various sources, including the TriviaQA data set (Joshi et al., 2017) and other trivia websites (QuizBalls, QuizZone, KnowQuiz), and are matched against Freebase to generate relevant subject-predicate-object triples that were further verified by human annotators. As all questions in FreebaseQA are composed independently for human contestants in various trivia-like competitions, this data set shows richer linguistic variation and complexity than existing QA data sets, making it a good test-bed for emerging KB-QA systems.

If you find this data set useful, please cite the paper:

[1] K. Jiang, D. Wu and H. Jiang, "FreebaseQA: A New Factoid QA Data Set Matching Trivia-Style Question-Answer Pairs with Freebase," Proc. of North American Chapter of the Association for Computational Linguistics (NAACL), June 2019.

All data is distributed under the CC-BY-4.0 license.

Data Set Files

This data set contains 28,348 unique questions that are divided into three subsets: train (20,358), dev (3,994) and eval (3,996), formatted as JSON files: FreebaseQA-[train|dev|eval].json.

We have also included FreebaseQA-partial.json, which is not officially part of FreebaseQA but may be useful for training models for certain NLP tasks such as named entity recognition and entity linking.

Each file is formatted as follows:

  • Dataset: The name of this data set
  • Version: The version of the FreebaseQA data set
  • Questions: The set of unique questions in this data set
    • Question-ID: The unique ID of each question
    • RawQuestion: The original question collected from data sources
  • ProcessedQuestion: The question after light processing, such as removal of the trailing question mark and lowercasing
    • Parses: The semantic parse(s) for the question
      • Parse-Id: The ID of each semantic parse
      • PotentialTopicEntityMention: The potential topic entity mention in the question
      • TopicEntityName: The name or alias of the topic entity in the question from Freebase
      • TopicEntityMid: The Freebase MID of the topic entity in the question
      • InferentialChain: The path from the topic entity node to the answer node in Freebase, labeled as a predicate
      • Answers: The answer found from this parse
        • AnswersMid: The Freebase MID of the answer
        • AnswersName: The answer string from the original question-answer pair
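The schema above can be traversed with a few lines of Python. The record below is a hypothetical example built to match the documented fields (the IDs, MIDs, and exact value types are illustrative, not taken from the actual data files):

```python
import json

# A hypothetical record following the documented schema (not a real entry;
# Question-ID, MIDs, and value types are illustrative assumptions).
sample = json.loads("""
{
  "Dataset": "FreebaseQA-train",
  "Version": "1.0",
  "Questions": [
    {
      "Question-ID": "FreebaseQA-train-0",
      "RawQuestion": "Who directed the film Jaws?",
      "ProcessedQuestion": "who directed the film jaws",
      "Parses": [
        {
          "Parse-Id": "FreebaseQA-train-0.P0",
          "PotentialTopicEntityMention": "jaws",
          "TopicEntityName": "jaws",
          "TopicEntityMid": "m.0abcde",
          "InferentialChain": "film.film.directed_by",
          "Answers": [
            {"AnswersMid": "m.0fghij", "AnswersName": "steven spielberg"}
          ]
        }
      ]
    }
  ]
}
""")

# Walk every question, parse, and answer to collect (question, answer) pairs.
pairs = []
for question in sample["Questions"]:
    for parse in question["Parses"]:
        for answer in parse["Answers"]:
            pairs.append((question["ProcessedQuestion"], answer["AnswersName"]))

print(pairs)  # → [('who directed the film jaws', 'steven spielberg')]
```

The same loop applies unchanged to FreebaseQA-[train|dev|eval].json once loaded with json.load on the file object.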

Evaluation Metrics

Accuracy is used as the evaluation metric for this data set: a question is considered correctly answered only if the predicted answer exactly matches one of the given answers.
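A minimal sketch of that metric, assuming predictions and gold answers are plain strings (the function name and the case-insensitive comparison are illustrative choices, not part of an official evaluation script):

```python
def accuracy(predictions, gold_answers):
    """Fraction of questions whose predicted answer exactly matches one of
    the given gold answers (compared case-insensitively for illustration)."""
    correct = 0
    for pred, answers in zip(predictions, gold_answers):
        if pred.lower() in {a.lower() for a in answers}:
            correct += 1
    return correct / len(predictions)

# One of the two predictions matches a gold answer, so accuracy is 0.5.
score = accuracy(["Steven Spielberg", "George Lucas"],
                 [["steven spielberg"], ["ridley scott"]])
print(score)  # → 0.5
```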

Freebase Extract

We have extracted a subset of Freebase (2.2GB zipped) that includes all entities (16M) and triples (182M) relevant to the FreebaseQA questions. This subset can accompany the FreebaseQA data set when evaluating the accuracy of trained models in answering questions. It may be downloaded from the following link: https://www.dropbox.com/sh/a25p7j2ir8gqnvx/AABJvjoI9mbHYj3hyfuxSdGaa?dl=0