WenRichard / Kbqa Bert

Licence: mit
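A knowledge-graph-based question answering system that uses BERT for named entity recognition and sentence similarity, with online and offline (spelled "outline" in the code) modes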

Programming Languages

python

KBQA-BERT

Introduction

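This project has two main components: BERT-based named entity recognition (NER) and BERT-based sentence similarity. It combines the two modules into a BERT-based KBQA question answering system. NER supports both online predict and outline (offline) predict, and so does sentence similarity. The two modules do not interfere with each other, which gives high cohesion and low coupling. The final kbqa stage combines the two modules and runs outline predict. For a detailed introduction, see my Zhihu column.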
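One way to picture how the two modules might fit together is the toy flow below. It is only a sketch under stated assumptions: the triple store, the function names, and the matching rules are all hypothetical, and simple string matching stands in for the BERT models. The assumed flow is: find the entity in the question, fetch that entity's triples, then pick the attribute most similar to the question.

```python
# Toy sketch of a KBQA flow; every name and rule here is hypothetical.
# In the project, BERT models do the entity recognition and similarity scoring.

# Tiny in-memory knowledge base of (entity, attribute, value) triples.
TRIPLES = [
    ("BERT", "developer", "Google"),
    ("BERT", "release year", "2018"),
    ("TensorFlow", "developer", "Google"),
]


def recognize_entity(question):
    """Stand-in for the NER module: return the first known entity found."""
    for entity in sorted({t[0] for t in TRIPLES}):
        if entity.lower() in question.lower():
            return entity
    return None


def similarity(question, attribute):
    """Stand-in for the similarity module: word-overlap (Jaccard) score."""
    q = set(question.lower().replace("?", "").split())
    a = set(attribute.lower().split())
    return len(q & a) / len(q | a) if q | a else 0.0


def answer(question):
    """Return the value of the best-matching attribute, or None."""
    entity = recognize_entity(question)
    if entity is None:
        return None
    candidates = [(attr, value) for ent, attr, value in TRIPLES if ent == entity]
    if not candidates:
        return None
    _, best_value = max(candidates, key=lambda c: similarity(question, c[0]))
    return best_value
```

Because each stand-in is a separate function, either one can be swapped for a real model without touching the other, which mirrors the low coupling described above.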

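------------------------------------------- Update 2019/6/15 ----------------------------------------

Here is a summary of the main problems people have run into recently, as an FAQ:

Q: dev.txt is not found when running run_ner.py. How is this file generated?
A: As I recall, there was not enough data at the time, so I copied the generated test.txt and renamed the copy dev.txt.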
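The copy step from this answer can be scripted. The helper below is a minimal sketch; the function name is hypothetical, and the directory that holds test.txt is an assumption you need to adjust to your checkout.

```python
import shutil
from pathlib import Path


def make_dev_from_test(data_dir):
    """Copy test.txt to dev.txt inside data_dir, keeping any existing dev.txt."""
    data_dir = Path(data_dir)
    src = data_dir / "test.txt"
    dst = data_dir / "dev.txt"
    if not src.exists():
        raise FileNotFoundError(f"{src} not found; generate the NER data first")
    if not dst.exists():
        # Only create dev.txt when it is missing, so a real dev set is never overwritten.
        shutil.copyfile(src, dst)
    return dst
```

Call it with the folder that contains the generated test.txt, for example `make_dev_from_test("Data/NER_Data")` if that is where your files live. Note that evaluating on a copy of the test set means the dev and test scores will be identical.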

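Q: Hi, I downloaded your project, but run_ner always hangs at "Saving checkpoint 0 to....". What is the cause?
A: The NER part does have some problems that I have not resolved, but I have not run into this one myself. Fine-tuning BERT needs roughly 12 GB of GPU memory, so try reducing batch_size and max_length; that may fix the problem.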
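To see why lowering batch_size and max_length helps, the sketch below compares two settings. It is a rough, unitless estimate, not a memory measurement: it only tracks how activation tensors scale, it ignores weights and optimizer state (which do not depend on either setting), and the function names and example numbers are illustrative assumptions.

```python
def relative_activation_cost(batch_size, max_length):
    """Return rough, unitless per-layer cost terms for a Transformer.

    'linear' tracks tensors shaped [batch, length, hidden];
    'attention' tracks score matrices shaped [batch, heads, length, length].
    """
    return {
        "linear": batch_size * max_length,
        "attention": batch_size * max_length * max_length,
    }


def shrink_ratio(old, new):
    """Compare two (batch_size, max_length) settings term by term."""
    before = relative_activation_cost(*old)
    after = relative_activation_cost(*new)
    return {name: after[name] / before[name] for name in before}
```

For example, `shrink_ratio((32, 128), (16, 64))` shows that halving both values cuts the linear term to a quarter and the attention term to an eighth.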

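Q: Is there a paper for this project?
A: Yes, there is. Here is the link to the paper!

Q: The data download failed, or the existing data is not enough?
A: The data is in Data; more data is available in NLPCC2016 and NLPCC2017.

PS: This project has plenty of room for improvement. If you have good ideas, pull requests are welcome, thanks! I have been busy publishing papers and job hunting lately, so please forgive slow replies to emails and issues.

------------------------------------------- Update 2019/6/15 ----------------------------------------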

Environment

Python 3.6
tensorflow 1.13
XAMPP 3.3.2
Navicat Premium 12

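Directory layout

The bert folder is downloaded from Google's official release
The Data folder holds the raw data and the processed data
    construct_dataset.py  generates the NER_Data data
    construct_dataset_attribute.py  generates the Sim_Data data
    triple_clean.py  generates the triple data
    load_dbdata.py  imports the data into the mysql db
The ModelParams folder needs the Chinese BERT model files, which you download separately: chinese_L-12_H-768_A-12
The Output folder holds the output data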
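The triple-loading step in the Data folder can be pictured with the sketch below. It is not the project's load_dbdata.py: it uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs without a server, and the table name, column names, and helper names are all assumptions.

```python
import sqlite3


def load_triples(conn, triples):
    """Create a triples table if needed and insert (entity, attribute, answer) rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS triples ("
        "entity TEXT NOT NULL, attribute TEXT NOT NULL, answer TEXT NOT NULL)"
    )
    # Parameterized inserts keep quotes inside the text from breaking the SQL.
    conn.executemany(
        "INSERT INTO triples (entity, attribute, answer) VALUES (?, ?, ?)",
        triples,
    )
    conn.commit()


def lookup(conn, entity):
    """Return every (attribute, answer) pair stored for one entity."""
    rows = conn.execute(
        "SELECT attribute, answer FROM triples WHERE entity = ?", (entity,)
    )
    return rows.fetchall()
```

With a MySQL driver such as PyMySQL the structure stays the same, but the placeholder is `%s` instead of `?`.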

BERT-based named entity recognition module
- lstm_crf_layer.py
- run_ner.py
- tf_metrics.py
- conlleval.py
- conlleval.pl
- run_ner.sh
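After the NER model labels each token, the labels still have to be turned into entity strings. The sketch below shows one common way to do that, assuming a B-/I-/O tagging scheme and character-level tokens; the tag name `ENT` and the function name are hypothetical, and the project's own decoding may differ.

```python
def decode_bio(tokens, tags):
    """Collect (entity_text, entity_type) pairs from B-/I-/O tagged tokens."""
    entities, current, current_type = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; flush the one in progress, if any.
            if current:
                entities.append(("".join(current), current_type))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            current.append(token)
        else:
            # "O" or an inconsistent I- tag ends the current entity.
            if current:
                entities.append(("".join(current), current_type))
            current, current_type = [], None
    if current:
        entities.append(("".join(current), current_type))
    return entities
```

Tokens are joined without spaces, which suits Chinese characters; for space-separated languages, join with `" "` instead.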

BERT-based sentence similarity module
- args.py
- run_similarity.py
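BERT scores a sentence pair by packing both sentences into one input: `[CLS] a [SEP] b [SEP]`, with segment id 0 for the first part and 1 for the second. The sketch below builds that layout. It is a simplification under stated assumptions: it splits text into single characters instead of using BERT's WordPiece tokenizer, it returns tokens rather than vocabulary ids, and the function name is hypothetical.

```python
def build_pair_input(text_a, text_b, max_length):
    """Build tokens and segment ids for a BERT-style sentence-pair input."""
    if max_length < 3:
        raise ValueError("max_length must leave room for [CLS] and two [SEP]")
    tokens_a, tokens_b = list(text_a), list(text_b)
    # Reserve three positions for the special tokens; trim the longer side first.
    while len(tokens_a) + len(tokens_b) > max_length - 3:
        longer = tokens_a if len(tokens_a) > len(tokens_b) else tokens_b
        longer.pop()
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    # Segment 0 covers [CLS], sentence a, and the first [SEP]; segment 1 covers the rest.
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return tokens, segment_ids
```

Trimming the longer sentence first keeps as much of the shorter one as possible, which matters when one side is a short attribute name and the other a full question.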

KBQA module
- terminal_predict.py
- terminal_ner.sh
- kbqa_test.py

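Usage

- run_ner.sh
NER training and hyperparameter tuning

- terminal_ner.sh
do_predict_online=True  NER online prediction
do_predict_outline=True  NER offline prediction

- args.py
train = True  train (fine-tune) the model
test = True  SIM online test

- run_similarity.py
Just run it with python

- kbqa_test.py
KB-based question answering test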
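The difference between the online and offline prediction flags can be sketched as two entry points around one predictor. This is an illustration only: it assumes that online means an interactive loop in the terminal and offline means a batch run over stored lines, and `predict` is a placeholder rather than the project's model.

```python
def predict(question):
    """Placeholder for a trained model; here it just reports the input length."""
    return f"{len(question)} chars"


def run_offline(lines):
    """Batch mode: score every non-empty line and return (input, output) pairs."""
    stripped = (line.strip() for line in lines)
    return [(line, predict(line)) for line in stripped if line]


def run_online(read=input, write=print):
    """Interactive mode: answer one question at a time until a blank line."""
    while True:
        question = read("question> ").strip()
        if not question:
            break
        write(predict(question))


def main(do_predict_online=False, do_predict_outline=False, lines=()):
    """Dispatch on flags named after the ones in terminal_ner.sh."""
    if do_predict_online:
        run_online()
    if do_predict_outline:
        return run_offline(lines)
    return None
```

Passing `read` and `write` as parameters keeps the interactive loop testable without a real terminal.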

Experimental analysis

NER figure

KB figure


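If you find my work helpful, please do not hold back on the star in the top right corner! Forks and Stars are welcome, and so is help building this project!
I will update question-answering projects when I have time; follow me if you are interested.
To leave a message, please use Issues or email [email protected]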
