
jetnew / DrFAQ

Licence: other
DrFAQ is a plug-and-play question answering NLP chatbot that can be generally applied to any organisation's text corpora.

Programming Languages

Jupyter Notebook
11667 projects
python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to DrFAQ

text2text
Text2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (+548.28%)
Mutual labels:  question-answering, bert
Nlp chinese corpus
Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+22851.72%)
Mutual labels:  question-answering, bert
cdQA-ui
⛔ [NOT MAINTAINED] A web interface for cdQA and other question answering systems.
Stars: ✭ 19 (-34.48%)
Mutual labels:  question-answering, bert
BERT-for-Chinese-Question-Answering
No description or website provided.
Stars: ✭ 75 (+158.62%)
Mutual labels:  question-answering, bert
spacy-sentence-bert
Sentence transformers models for SpaCy
Stars: ✭ 88 (+203.45%)
Mutual labels:  spacy, bert
SQUAD2.Q-Augmented-Dataset
Augmented version of SQUAD 2.0 for Questions
Stars: ✭ 31 (+6.9%)
Mutual labels:  question-answering, bert
iamQA
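Chinese Wikipedia QA reading-comprehension question answering system, using an NER model trained on CCKS2016 data and a reading-comprehension model trained on CMRC2018, plus W2V word-vector search, deployed with torchserve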
Stars: ✭ 46 (+58.62%)
Mutual labels:  question-answering, bert
TriB-QA
We take bragging seriously
Stars: ✭ 45 (+55.17%)
Mutual labels:  question-answering, bert
bert-tensorflow-pytorch-spacy-conversion
Instructions for how to convert a BERT Tensorflow model to work with HuggingFace's pytorch-transformers, and spaCy. This walk-through uses DeepPavlov's RuBERT as example.
Stars: ✭ 26 (-10.34%)
Mutual labels:  spacy, bert
anonymisation
Anonymization of legal cases (Fr) based on Flair embeddings
Stars: ✭ 85 (+193.1%)
Mutual labels:  spacy, bert
KitanaQA
KitanaQA: Adversarial training and data augmentation for neural question-answering models
Stars: ✭ 58 (+100%)
Mutual labels:  question-answering, bert
contextualSpellCheck
✔️Contextual word checker for better suggestions
Stars: ✭ 274 (+844.83%)
Mutual labels:  spacy, bert
backprop
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
Stars: ✭ 229 (+689.66%)
Mutual labels:  question-answering, bert
Medi-CoQA
Conversational Question Answering on Clinical Text
Stars: ✭ 22 (-24.14%)
Mutual labels:  question-answering, bert
hf-experiments
Experiments with Hugging Face 🔬 🤗
Stars: ✭ 37 (+27.59%)
Mutual labels:  question-answering, huggingface
mcQA
🔮 Answering multiple choice questions with Language Models.
Stars: ✭ 23 (-20.69%)
Mutual labels:  question-answering, bert
FinBERT-QA
Financial Domain Question Answering with pre-trained BERT Language Model
Stars: ✭ 70 (+141.38%)
Mutual labels:  question-answering, bert
cmrc2019
A Sentence Cloze Dataset for Chinese Machine Reading Comprehension (CMRC 2019)
Stars: ✭ 118 (+306.9%)
Mutual labels:  question-answering, bert
Haystack
🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (+11655.17%)
Mutual labels:  question-answering, bert
converse
Conversational text Analysis using various NLP techniques
Stars: ✭ 147 (+406.9%)
Mutual labels:  spacy, huggingface

DrFAQ

  • DrFAQ is a plug-and-play question answering chatbot that can be generally applied to any organisation's text corpora.
  • It implements an NLP question-answering architecture using spaCy, huggingface's BERT language model, ElasticSearch, and the Telegram Bot API, and is hosted on Heroku.

News

  • 4 Mar 2021 - Transfer learning of language models, alongside an evaluation study, is currently in progress.
  • 13 Dec 2019 - Implementation of 4-step question-answering methodology completed.

Objective

  • Given an organisation's corpus of documents, generate a chatbot to enable natural question-answering capabilities.

Methodology

When a question is asked, the following processes are performed:

  1. FAQ Question Matching using spaCy's Similarity - /match
    • From a given list of Frequently Asked Questions (FAQs), the chatbot detects similarity to the specified question and selects the best answer from the existing list.
  2. NLP Question Answering using huggingface's BERT - /nlp
    • If the question asked is not similar to any existing FAQ, perform question answering on the knowledge base and return the answer if the model is sufficiently confident in it.
  3. Answer Search using ElasticSearch - /search
    • If no sufficiently confident answer is found, perform a search on the document corpus and return the search results.
  4. Human Intervention
    • If the search results are still not relevant, prompt a human to add the question-answer pair to the existing list of FAQs, or let the user speak to a human.
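
The four steps above form a fallback cascade: each stage runs only if the previous one did not produce a good enough result. The following is a minimal sketch of that control flow, not DrFAQ's actual code. The function names, the `Reply` type, and both threshold values are hypothetical. To stay runnable with the standard library alone, `difflib` stands in for spaCy's similarity, and the BERT and ElasticSearch stages are passed in as plain callables (toy stubs here). The sketch also simplifies step 4 by treating an empty search result as "not relevant".

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Tuple

# Hypothetical thresholds; the values DrFAQ uses are not stated in this README.
MATCH_THRESHOLD = 0.8
QA_THRESHOLD = 0.5


@dataclass
class Reply:
    step: str  # stage that produced the reply: "match", "nlp", "search" or "human"
    text: str


def similarity(a: str, b: str) -> float:
    """Stand-in for spaCy similarity: character-level ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def answer_question(
    question: str,
    faqs: Dict[str, str],
    qa_model: Callable[[str], Tuple[str, float]],
    search: Callable[[str], List[str]],
) -> Reply:
    # Step 1: FAQ question matching.
    if faqs:
        best = max(faqs, key=lambda q: similarity(question, q))
        if similarity(question, best) >= MATCH_THRESHOLD:
            return Reply("match", faqs[best])
    # Step 2: question answering over the knowledge base (BERT in DrFAQ).
    answer, confidence = qa_model(question)
    if confidence >= QA_THRESHOLD:
        return Reply("nlp", answer)
    # Step 3: document search (ElasticSearch in DrFAQ).
    hits = search(question)
    if hits:
        return Reply("search", "\n".join(hits))
    # Step 4: human intervention.
    return Reply("human", "No relevant answer found; forwarding to a human.")


# Toy data and stubs, for illustration only.
FAQS = {"What are your opening hours?": "We are open 9am to 5pm on weekdays."}
CORPUS = ["Parking is available behind the main building."]


def stub_qa(question: str) -> Tuple[str, float]:
    # Pretends to be confident only for questions that mention "founder".
    if "founder" in question.lower():
        return ("Jane Doe", 0.9)
    return ("", 0.0)


def stub_search(question: str) -> List[str]:
    # Returns documents sharing at least one word with the question.
    words = set(re.findall(r"\w+", question.lower()))
    return [d for d in CORPUS if words & set(re.findall(r"\w+", d.lower()))]


if __name__ == "__main__":
    for q in ["what are your opening hours", "Who is the founder?",
              "Where is parking?", "xyzzy"]:
        print(q, "->", answer_question(q, FAQS, stub_qa, stub_search).step)
```

Passing the later stages in as callables keeps the cascade itself independent of any particular model or search backend, so a real spaCy, BERT, or ElasticSearch component could be swapped in without changing the control flow.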

Research

  • A benchmark study of transfer learning for language models shows that:
    • If a large and clean QA dataset is available, RoBERTa is the best language model.
    • If only a small and unclean generated QA dataset is available, MobileBERT is the best language model.
    • If the QA dataset contains many 'Who' questions, RoBERTa should be considered.

Future Work

  • Release DrFAQ as a pip package.
  • Make an interactive demo available.
  • Integrate abstractive question-answering into the methodology.
  • Leverage databases and cloud services.
