
eva-n27 / BERT-for-Chinese-Question-Answering

License: Apache-2.0
No description or website provided.

Programming Languages

Python

Projects that are alternatives to or similar to BERT-for-Chinese-Question-Answering

text2text
Text2Text: Cross-lingual natural language processing and generation toolkit
Stars: ✭ 188 (+150.67%)
Mutual labels:  question-answering, bert
Nlp chinese corpus
Large Scale Chinese Corpus for NLP
Stars: ✭ 6,656 (+8774.67%)
Mutual labels:  question-answering, bert
cdQA-ui
⛔ [NOT MAINTAINED] A web interface for cdQA and other question answering systems.
Stars: ✭ 19 (-74.67%)
Mutual labels:  question-answering, bert
mcQA
🔮 Answering multiple choice questions with Language Models.
Stars: ✭ 23 (-69.33%)
Mutual labels:  question-answering, bert
cmrc2019
A Sentence Cloze Dataset for Chinese Machine Reading Comprehension (CMRC 2019)
Stars: ✭ 118 (+57.33%)
Mutual labels:  question-answering, bert
Medi-CoQA
Conversational Question Answering on Clinical Text
Stars: ✭ 22 (-70.67%)
Mutual labels:  question-answering, bert
iamQA
A Chinese Wikipedia reading-comprehension QA system, using an NER model trained on CCKS2016 data and a reading-comprehension model from CMRC2018, plus W2V word-vector search, deployed with TorchServe
Stars: ✭ 46 (-38.67%)
Mutual labels:  question-answering, bert
SQUAD2.Q-Augmented-Dataset
Augmented version of SQUAD 2.0 for Questions
Stars: ✭ 31 (-58.67%)
Mutual labels:  question-answering, bert
FinBERT-QA
Financial Domain Question Answering with pre-trained BERT Language Model
Stars: ✭ 70 (-6.67%)
Mutual labels:  question-answering, bert
DrFAQ
DrFAQ is a plug-and-play question answering NLP chatbot that can be generally applied to any organisation's text corpora.
Stars: ✭ 29 (-61.33%)
Mutual labels:  question-answering, bert
Haystack
🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (+4445.33%)
Mutual labels:  question-answering, bert
backprop
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
Stars: ✭ 229 (+205.33%)
Mutual labels:  question-answering, bert
TriB-QA
We are serious about bragging
Stars: ✭ 45 (-40%)
Mutual labels:  question-answering, bert
KitanaQA
KitanaQA: Adversarial training and data augmentation for neural question-answering models
Stars: ✭ 58 (-22.67%)
Mutual labels:  question-answering, bert
NLP-Review-Scorer
Score your NLP paper review
Stars: ✭ 25 (-66.67%)
Mutual labels:  bert
label-studio-transformers
Label data using HuggingFace's transformers and automatically get a prediction service
Stars: ✭ 117 (+56%)
Mutual labels:  bert
Instahelp
Instahelp is a Q&A portal website similar to Quora
Stars: ✭ 21 (-72%)
Mutual labels:  question-answering
roberta-wwm-base-distill
A distilled RoBERTa-wwm-base model, distilled using RoBERTa-wwm-large as the teacher
Stars: ✭ 61 (-18.67%)
Mutual labels:  bert
TeBaQA
A question answering system which utilises machine learning.
Stars: ✭ 17 (-77.33%)
Mutual labels:  question-answering
bert nli
A Natural Language Inference (NLI) model based on Transformers (BERT and ALBERT)
Stars: ✭ 97 (+29.33%)
Mutual labels:  bert

BERT-for-Chinese-Question-Answering

The code in this repository comes from PyTorch Pretrained Bert, modified only to adapt the QA task to Chinese.

The main change is to the read_squad_examples function. Since SQuAD is English, the original code processes documents the English way, splitting the context on whitespace (see that function in run_squad.py), which does not segment Chinese text.

In addition, the training loop now runs an evaluation every save_checkpoints_steps steps and saves the model parameters that perform best on the dev set.
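A minimal sketch of that pattern is shown below; the evaluate_fn helper, the use of F1 for model selection, and the output path are illustrative assumptions, not the repository's actual code.

import torch

def train(model, optimizer, train_loader, evaluate_fn,
          save_checkpoints_steps, num_train_epochs):
    # Illustrative training loop: evaluate on dev every save_checkpoints_steps
    # steps and keep only the best-performing weights.
    best_f1 = 0.0
    global_step = 0
    for _ in range(num_train_epochs):
        for batch in train_loader:
            loss = model(**batch)  # BERT QA models return the loss when given start/end positions
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
            global_step += 1
            if global_step % save_checkpoints_steps == 0:
                f1 = evaluate_fn(model)  # caller-supplied dev evaluation (assumed to return F1)
                if f1 > best_f1:
                    best_f1 = f1
                    torch.save(model.state_dict(), "output/best_model.bin")  # hypothetical path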

For Chinese, read_squad_examples was therefore modified as follows:

1. First apply tokenizer.basic_tokenizer.tokenize to the document to obtain doc_tokens (line 161 of the code).

2. Apply tokenizer.basic_tokenizer.tokenize to orig_answer_text as well, then compute the answer's start_position and end_position (lines 172-191); see the sketch after this list.
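In sketch form, the two steps amount to something like the helper below (an illustration of the idea, assuming a pytorch-pretrained-bert BertTokenizer; the repository's exact lines may differ). Because BasicTokenizer splits CJK text into individual characters, the matched span is character-level rather than whitespace-word-level.

def locate_answer(tokenizer, paragraph_text, orig_answer_text):
    # Step 1: character-level doc tokens for Chinese text.
    doc_tokens = tokenizer.basic_tokenizer.tokenize(paragraph_text)
    # Step 2: tokenize the answer the same way, then find its span.
    answer_tokens = tokenizer.basic_tokenizer.tokenize(orig_answer_text)
    for start in range(len(doc_tokens) - len(answer_tokens) + 1):
        if doc_tokens[start:start + len(answer_tokens)] == answer_tokens:
            start_position = start
            end_position = start + len(answer_tokens) - 1
            return doc_tokens, start_position, end_position
    return doc_tokens, None, None  # answer not recoverable after tokenization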

Usage

  • First, convert your corpus to SQuAD format, and put the data and model files under the data directory (which you need to create yourself); a sketch of the expected format appears after this list.

  • Run:

python3 run_squad.py \
  --do_train \
  --do_predict \
  --save_checkpoints_steps 3000 \
  --train_batch_size 12 \
  --num_train_epochs 5
  • Evaluation: eval.py adds BERT tokenization before computing EM and F1 (a sketch of this computation also follows the list):
python3 eval.py data/squad_dev.json output/predictions.json
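To illustrate the first step above, here is a minimal SQuAD v1.1-style file written from Python; the field names follow the public SQuAD schema, while the content and the file name are placeholders.

import json

squad_data = {
    "version": "1.1",
    "data": [{
        "title": "example",
        "paragraphs": [{
            "context": "北京是中华人民共和国的首都。",
            "qas": [{
                "id": "example-0",
                "question": "中国的首都是哪里?",
                # answer_start is a character offset into context.
                "answers": [{"text": "北京", "answer_start": 0}],
            }],
        }],
    }],
}

with open("data/squad_train.json", "w", encoding="utf-8") as f:
    json.dump(squad_data, f, ensure_ascii=False, indent=2)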
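As for the evaluation step: the stock SQuAD metrics split strings on whitespace, which does nothing useful for Chinese, so tokenizing both strings first makes EM and F1 effectively character-level. A hedged sketch of that computation (not eval.py's exact code):

from collections import Counter

def em_f1(tokenizer, prediction, ground_truth):
    # Tokenize with BERT's BasicTokenizer so Chinese is compared per character.
    pred_tokens = tokenizer.basic_tokenizer.tokenize(prediction)
    gold_tokens = tokenizer.basic_tokenizer.tokenize(ground_truth)

    exact_match = float(pred_tokens == gold_tokens)

    # Token-level F1 via multiset overlap, as in the standard SQuAD script.
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return exact_match, 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return exact_match, 2 * precision * recall / (precision + recall)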

Criticism and corrections are welcome. Thanks!
