lixiang0 / Web_kg

Crawls Chinese-language Baidu Baike pages, extracts triples, and builds a Chinese knowledge graph.

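Open-source web knowledge graph project

  • Crawl Chinese-language Baidu Baike pages
  • Parse triples and page content
  • Build a Chinese knowledge graph
  • Build an encyclopedia bot (in progress)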
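
The "parse triples" step can be illustrated with a minimal sketch. It assumes the entry's infobox is marked up as alternating `<dt>` (attribute name) and `<dd>` (attribute value) elements; that markup, and the names `InfoboxParser` and `extract_triples`, are assumptions for illustration and are not taken from the project's code. Only the standard library is used here, whereas the project itself parses pages with scrapy.

```python
from html.parser import HTMLParser


class InfoboxParser(HTMLParser):
    """Collect (attribute, value) pairs from <dt>/<dd> elements.

    Assumption: names live in <dt> and values in the following <dd>.
    """

    def __init__(self):
        super().__init__()
        self._tag = None    # "dt" or "dd" while inside one, else None
        self._buffer = []   # text fragments of the current element
        self._name = None   # most recent <dt> text awaiting its <dd>
        self.pairs = []     # collected (attribute, value) tuples

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._tag = tag
            self._buffer = []

    def handle_data(self, data):
        if self._tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return  # closing tag of a nested element such as <a>
        # Collapse runs of whitespace and trim the ends.
        text = " ".join("".join(self._buffer).split())
        if tag == "dt":
            self._name = text
        elif self._name:
            if text:
                self.pairs.append((self._name, text))
            self._name = None
        self._tag = None


def extract_triples(entity, html):
    """Return (entity, attribute, value) triples for one page."""
    parser = InfoboxParser()
    parser.feed(html)
    parser.close()
    return [(entity, name, value) for name, value in parser.pairs]


sample = (
    '<dl><dt>Occupation</dt><dd> <a href="/item/x">Writer</a> </dd>'
    "<dt>Born</dt><dd>1881</dd></dl>"
)
triples = extract_triples("Lu Xun", sample)
```

Each triple has the form (entity, attribute, value), which is the shape the later storage and graph-building steps would consume.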
update 20200720

For deployment on Windows, see "How to deploy on Windows". Thanks to LMY-nlp0701!

update 20191121
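  • Migrated the code to the scrapy crawling framework
  • Optimized part of the extraction code
  • Moved data persistence to mongodb
  • Fixed the broken chatbot
  • Opened the neo4j admin interface, where the resulting knowledge graph can be viewed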
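
As a rough sketch of what persisting triples to mongodb might involve, the snippet below shapes each triple into a document with a deterministic `_id`, so that re-crawling the same page does not create duplicates. The field names and the helpers `triple_to_document` and `dedupe` are hypothetical; the project's actual schema is not shown in this README.

```python
import hashlib


def triple_to_document(subject, predicate, obj):
    """Build a MongoDB-style document with a deterministic _id for one triple."""
    # The unit-separator character keeps ("ab", "c") distinct from ("a", "bc").
    key = "\x1f".join((subject, predicate, obj)).encode("utf-8")
    return {
        "_id": hashlib.sha1(key).hexdigest(),
        "subject": subject,
        "predicate": predicate,
        "object": obj,
    }


def dedupe(triples):
    """Return one document per distinct triple, preserving first-seen order."""
    docs = {}
    for subject, predicate, obj in triples:
        doc = triple_to_document(subject, predicate, obj)
        docs.setdefault(doc["_id"], doc)
    return list(docs.values())


docs = dedupe([
    ("Lu Xun", "occupation", "writer"),
    ("Lu Xun", "occupation", "writer"),
    ("Lu Xun", "born", "1881"),
])
```

With pymongo, each document could then be written with `collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)`, where `collection` is whichever collection the deployment uses.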
Tips
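  • For questions about the project, please open an issue.
  • For anything you would rather not discuss publicly, please send an email.
  • To try the ChatBot, visit the link.
  • The finished encyclopedia knowledge graph is available at the link (username: neo4j, password: 123).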

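Environment

  • python 3.6
  • re: regular-expression matching of URLs
  • scrapy: web crawling and page parsing
  • neo4j: graph database for the knowledge graph; for installation, see the link
  • pip install neo4j-driver: Python driver for neo4j
  • pip install pymongo: Python support for mongodb
  • mongodb database: for installation, see the link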
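
To show how the neo4j driver listed above might be fed, here is a minimal sketch that turns a triple into a parameterized Cypher statement. The node label `Entity`, the relationship type `RELATION`, and the helper `triple_to_statement` are assumptions for illustration, not the project's actual graph model. Cypher does not allow a relationship type to be passed as a parameter, so this sketch stores the predicate as a property on a generic relationship.

```python
# MERGE creates each node and relationship only if it does not exist yet,
# so running the same statement twice leaves the graph unchanged.
MERGE_TRIPLE = (
    "MERGE (s:Entity {name: $subject}) "
    "MERGE (o:Entity {name: $object}) "
    "MERGE (s)-[r:RELATION {name: $predicate}]->(o)"
)


def triple_to_statement(subject, predicate, obj):
    """Return (query, parameters) for inserting one triple."""
    params = {"subject": subject, "predicate": predicate, "object": obj}
    return MERGE_TRIPLE, params


query, params = triple_to_statement("Lu Xun", "occupation", "writer")
```

Inside an open driver session, the pair could be executed with `session.run(query, params)`. Passing values as parameters instead of formatting them into the query string avoids quoting problems with attribute values scraped from pages.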

Running the code:

cd WEB_KG/baike
scrapy crawl baike
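
While the crawl runs, the spider has to decide which links are encyclopedia entries worth following. The sketch below shows one way the `re`-based URL matching listed under Environment could look. The URL shape `https://baike.baidu.com/item/<title>[/<id>]` and the names `ITEM_URL` and `entry_title` are assumptions for illustration; the project's actual pattern may differ.

```python
import re
from urllib.parse import unquote

# Assumed shape of an entry URL: https://baike.baidu.com/item/<title>[/<id>]
ITEM_URL = re.compile(r"^https?://baike\.baidu\.com/item/([^/?#]+)(?:/(\d+))?")


def entry_title(url):
    """Return the decoded entry title if the URL looks like an entry page, else None."""
    match = ITEM_URL.match(url)
    return unquote(match.group(1)) if match else None


links = [
    "https://baike.baidu.com/item/Python/407313",
    "https://baike.baidu.com/item/%E9%B2%81%E8%BF%85",
    "https://www.baidu.com/",
]
# Keep only the links that look like entry pages.
titles = [t for t in map(entry_title, links) if t is not None]
```

Decoding the percent-encoded title gives a readable entity name that can serve as the subject of the triples extracted from that page.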

Execution output (press Ctrl+C to stop):

Knowledge graph visualization

Web page content stored in mongodb

Triples stored in mongodb

neo4j admin interface
