ahmadassaf / KBE

Licence: other
Node.js application to extract the knowledge represented in Google infoboxes (aka Google Knowledge Graph Panel)

Knowledge-base Extractor

This is a Node.js application that extracts the knowledge represented in Google infoboxes (aka Google Knowledge Graph Panel).

The implemented algorithm is as follows:

  • Query DBpedia for all concepts (types) for which there is at least one instance that has a link to a Freebase ID
  • For each of these concepts pick (n) instances randomly
  • For each instance, issue a Google Search query:
    • if an infobox is available -> scrape the infobox to extract the properties
    • if no infobox is available, check if Google suggests "do you mean ... ?" and if so, follow the link and look for an infobox
    • if no infobox or correction is available, disambiguate the query using the concept (type) and check if an infobox is returned
    • if Google suggests disambiguation in an infobox, parse all the links in it -> find which suggestion maps to the data type currently in use by checking the Freebase - DBpedia mappings
  • Cluster properties for each concept
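
The per-instance fallback steps above can be read as a short chain of attempts. The following is only a minimal sketch of that control flow, not the code in KBE.js: the helpers search, fetchUrl and matchesConcept are hypothetical and are injected by the caller, the page shape { infobox, correctionUrl, disambiguationLinks } is assumed, and "disambiguate the query" is assumed to mean appending the concept name to the search term.

```javascript
// Sketch of the per-instance lookup order. All helpers are hypothetical:
//   search(query)          -> resolves to { infobox?, correctionUrl?, disambiguationLinks? }
//   fetchUrl(url)          -> resolves to the same page shape
//   matchesConcept(link,c) -> true if a disambiguation link maps to concept c
async function findInfobox(instance, concept, helpers) {
  const { search, fetchUrl, matchesConcept } = helpers;

  // 1. Plain search for the instance.
  const first = await search(instance);
  if (first.infobox) return first.infobox;

  // 2. Follow the "do you mean ...?" suggestion, if there is one.
  if (first.correctionUrl) {
    const corrected = await fetchUrl(first.correctionUrl);
    if (corrected.infobox) return corrected.infobox;
  }

  // 3. Retry with the concept appended (assumed form of disambiguation).
  const qualified = await search(`${instance} ${concept}`);
  if (qualified.infobox) return qualified.infobox;

  // 4. Inspect disambiguation links and follow those that map to the concept.
  const links = [
    ...(first.disambiguationLinks || []),
    ...(qualified.disambiguationLinks || [])
  ];
  for (const link of links) {
    if (matchesConcept(link, concept)) {
      const chosen = await fetchUrl(link.url);
      if (chosen.infobox) return chosen.infobox;
    }
  }

  // Nothing found: the caller can record the instance as "infoboxless".
  return null;
}
```

Injecting the helpers keeps the sketch free of real network requests, so the ordering of the steps can be exercised with stubbed pages.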

Notes

  • The result of our experiment is in the results folder: results/dbpedia.json
  • For a more detailed view of each DBpedia class, check the files in results/dbpedia

How to run?

  • Clone the repo to your local machine
  • Run npm install in the root of the local project directory

All the required cache folders are created automatically:

  • Main cache folder "cache" in the root folder of the application

    • folder called GKB inside the cache folder: This will hold the aggregated Google Knowledge boxes extracted for a DBpedia concept (type)
    • folder called instances_GKB inside the cache folder: This will hold the Google Knowledge box for a single instance
    • folder called instances inside the cache folder: This will hold the DBpedia instances for each concept (type)
    • folder called instance_properties inside the cache folder: This will hold the distinct list of properties for all the instances of a certain concept
  • Run node KBE.js in the console

The application is run in the console and the output will be available in cache/result.json
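
For illustration, the cache layout listed above could be created with a few lines of Node.js. This is a sketch under assumptions, not the application's own code: the function name ensureCacheFolders is hypothetical, and only the folder names come from the list above.

```javascript
const fs = require('fs');
const path = require('path');

// Sub-folder names as listed in this README.
const CACHE_SUBFOLDERS = ['GKB', 'instances_GKB', 'instances', 'instance_properties'];

// Hypothetical helper: creates <rootDir>/cache and its sub-folders.
function ensureCacheFolders(rootDir) {
  const cacheDir = path.join(rootDir, 'cache');
  for (const name of CACHE_SUBFOLDERS) {
    // recursive: true also creates "cache" itself and does not fail
    // if the folder already exists.
    fs.mkdirSync(path.join(cacheDir, name), { recursive: true });
  }
  return cacheDir;
}
```

Calling ensureCacheFolders(__dirname) more than once is harmless, which suits a crawler that may be restarted between runs.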

Crawling Configuration

The file options.json contains a set of options that you can change:

cache_dbpedia_concepts       : true,
limit_dbpedia_concepts       : true,
limit_dbpedia_instances      : true,
limit_dbpedia_concepts_value : 10,
limit_dbpedia_instances_value: 10,
proxy                        : null
  • cache_dbpedia_concepts: cache the concepts retrieved from DBpedia
  • limit_dbpedia_concepts: limit the number of concepts retrieved from DBpedia; false will retrieve all the concepts
  • limit_dbpedia_instances: limit the number of instances retrieved for each concept; false will retrieve all the instances
  • limit_dbpedia_concepts_value: the number of concepts you wish to retrieve
  • limit_dbpedia_instances_value: the number of instances you wish to retrieve for each concept
  • proxy: the proxy address string, including the port, e.g. http://proxy:8080
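
As a rough illustration of how the limit options might translate into a DBpedia query, the sketch below builds a SPARQL string from an options object. The option keys come from the list above; everything else is an assumption: the function names limitClause and buildConceptQuery are hypothetical, and the query shape (Freebase links expressed as owl:sameAs to http://rdf.freebase.com/ns/) is one plausible reading of the first algorithm step, not the query used by the application.

```javascript
// Example options object mirroring the keys documented above.
const options = {
  cache_dbpedia_concepts: true,
  limit_dbpedia_concepts: true,
  limit_dbpedia_instances: true,
  limit_dbpedia_concepts_value: 10,
  limit_dbpedia_instances_value: 10,
  proxy: null
};

// Returns a SPARQL LIMIT clause, or an empty string when the limit is
// disabled or the value is not a positive integer.
function limitClause(enabled, value) {
  return enabled && Number.isInteger(value) && value > 0 ? `LIMIT ${value}` : '';
}

// Assumed query shape: concepts with at least one instance linked to Freebase.
function buildConceptQuery(opts) {
  return [
    'PREFIX owl: <http://www.w3.org/2002/07/owl#>',
    'SELECT DISTINCT ?concept WHERE {',
    '  ?instance a ?concept ;',
    '            owl:sameAs ?freebase .',
    '  FILTER(STRSTARTS(STR(?freebase), "http://rdf.freebase.com/ns/"))',
    '}',
    limitClause(opts.limit_dbpedia_concepts, opts.limit_dbpedia_concepts_value)
  ].join('\n').trim();
}
```

With limit_dbpedia_concepts set to false the LIMIT clause is simply omitted, which matches the documented behaviour of retrieving all the concepts.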

For our experiment the parameters are:

cache_dbpedia_concepts       : true,
limit_dbpedia_concepts       : false,
limit_dbpedia_instances      : true,
limit_dbpedia_concepts_value : null,
limit_dbpedia_instances_value: 100,
proxy                        : null

You can also check the CSS selectors used for the Google Knowledge Panel and edit them, if needed, in the same options.json file.

Currently the CSS selectors are:

"knowledgeBox"                : "#kno-result",
"knowledgeBox_disambiguate"   : ".kp-blk",
"property"                    : "._Nl",
"property_value"              : ".kno-fv",
"label"                       : ".kno-ecr-pt",
"description"                 : ".kno-rdesc",
"type"                        : "._kx",
"images"                      : ".bicc",
"special_property"            : ".kno-sh",
"special_property_value"      : "._Zh",
"special_property_value_link" : "a._dt"

Updates

  • Properties now link directly to the DBpedia ontology
  • Property scores are normalized

Sample Result

  "Band": {
  	"summary": {
  		"label": {
  			"uri": "http://dbpedia.org/property/label",
  			"count": 100
  		},
  		"description": {
  			"uri": "http://purl.org/dc/elements/1.1/description",
  			"count": 100
  		},
  		"type": {
  			"uri": "http://dbpedia.org/property/type",
  			"count": 100
  		},
  		"origin": {
  			"uri": "http://dbpedia.org/property/origin",
  			"count": 88.17204301075269
  		},
  		"members": {
  			"uri": "http://dbpedia.org/property/members",
  			"count": 88.17204301075269
  		},
  		"albums": {
  			"uri": "http://dbpedia.org/property/albums",
  			"count": 87.09677419354838
  		},
  		"leadSingers": {
  			"uri": "http://dbpedia.org/property/leadSingers",
  			"count": 6.451612903225806
  		},
  		"recordLabel": {
  			"uri": "http://dbpedia.org/property/recordLabel",
  			"count": 12.903225806451612
  		},
  		"awards": {
  			"uri": "http://dbpedia.org/property/awards",
  			"count": 13.978494623655912
  		},
  		"nominations": {
  			"uri": "http://dbpedia.org/property/nominations",
  			"count": 7.526881720430108
  		},
  		"born": {
  			"uri": "http://dbpedia.org/property/born",
  			"count": 2.1505376344086025
  		},
  		"nationality": {
  			"uri": "http://dbpedia.org/property/nationality",
  			"count": 2.1505376344086025
  		},
  		"height": {
  			"uri": "http://dbpedia.org/property/height",
  			"count": 1.0752688172043012
  		}
  	},
  	"infoboxless": [
  		"!Action Pact!",
  		"Allele (band)",
  		"Anti-Pasti",
  		"Armageddon (A&M band)",
  		"Banket (band)",
  		"Battlelore",
  		"Ben Folds Five"
  	],
  	"Unmapped_Properties": {
  		"leadSinger": 1,
  		"recordLabels": 1,
  		"songs": 1,
  		"upcomingEvents": 1,
  		"peopleAlsoSearchFor": 1,
  		"activeFrom": 1,
  		"filmMusicCredits": 1,
  		"activeUntil": 1,
  		"moviesAndTvShows": 1
  	}
  }
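
The normalized counts in the sample are consistent with percentages over the instances that returned an infobox (for example, 88.17... corresponds to 82 out of 93). That reading is inferred from the numbers and is not stated by the source. Under that assumption, clustering and normalizing the properties of one concept could be sketched as follows; the function name summarizeProperties and its input shape are hypothetical.

```javascript
// instanceProperties: one array of property names per instance that
// returned an infobox (assumed input shape).
function summarizeProperties(instanceProperties) {
  const total = instanceProperties.length;
  const counts = {};

  for (const props of instanceProperties) {
    // A Set ensures a property is counted at most once per instance.
    for (const name of new Set(props)) {
      counts[name] = (counts[name] || 0) + 1;
    }
  }

  // Express each count as a percentage of the instances with an infobox.
  const summary = {};
  for (const [name, count] of Object.entries(counts)) {
    summary[name] = { count: (count / total) * 100 };
  }
  return summary;
}

// Usage: three instances, "origin" present in two of them.
const summary = summarizeProperties([
  ['label', 'origin'],
  ['label', 'origin', 'albums'],
  ['label']
]);
console.log(summary.label.count, summary.origin.count);
```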