KGTK: Knowledge Graph Toolkit

KGTK is a Python toolkit for building applications using knowledge graphs (KGs). KGTK is designed for ease of use, scalability and speed. It represents KGs as simple TSV files with four columns: the head, relation and tail of a triple, plus an identifier for each triple. This simple model allows KGTK to operate on both property graphs and RDF graphs. KGTK offers a collection of 20+ commands to import, transform, query and analyze KGs, including wrappers for state-of-the-art graph analytics and deep learning libraries. KGTK is optimized for batch processing, which makes it easy to write pipelines that process large KGs such as Wikidata on a laptop and produce datasets for downstream applications. KGTK is open-source software released under the MIT license.
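
As a rough illustration of the four-column model described above, the sketch below writes and reads back a tiny edge file using only Python's standard `csv` module. The column names `node1`, `label`, `node2` and `id` are assumed here to match the naming in the KGTK documentation, which remains the authoritative specification. The helper names (`write_edges`, `read_edges`) and the sample edge identifiers are made up for this example.

```python
import csv
import os
import tempfile

# Assumed column names for head, relation, tail and edge identifier.
COLUMNS = ["node1", "label", "node2", "id"]

# Illustrative triples using Wikidata-style identifiers; "e1"/"e2" are made-up edge ids.
SAMPLE_EDGES = [
    ("Q42", "P31", "Q5", "e1"),
    ("Q42", "P106", "Q36180", "e2"),
]


def write_edges(path, edges):
    """Write (head, relation, tail, edge_id) tuples as a tab-separated file with a header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(COLUMNS)
        writer.writerows(edges)


def read_edges(path):
    """Read the file back as a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f, delimiter="\t"))


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "example.tsv")
    write_edges(path, SAMPLE_EDGES)
    for row in read_edges(path):
        print(row["node1"], row["label"], row["node2"], row["id"])
```

Because each row is a self-contained triple plus an identifier, such files can be processed line by line with ordinary text tools, which is what makes batch pipelines straightforward.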

Getting started

Documentation

https://kgtk.readthedocs.io/en/latest/

Demo: try KGTK online with MyBinder

The easiest, no-cost way of trying out KGTK is through MyBinder. We have made available several example notebooks to show some of the features of KGTK, which can be run in two environments:

  • Basic KGTK functionality: This notebook may take 5-10 minutes to launch, so please be patient. Some KGTK commands (graph analytics and embeddings) will not run in this notebook. To launch the notebook in your browser, click on the "Binder" icon: Binder

  • Advanced KGTK functionality: This notebook may take 10-20 minutes to launch. It includes the basic KGTK functionality plus the graph analytics and embedding capabilities: Binder

For executing KGTK with large datasets, we recommend a Docker/local installation.

KGTK notebooks

The examples folder provides a larger and growing collection of easy-to-follow Jupyter notebooks that showcase different KGTK features. These include computing:

  • Embeddings for ConceptNet nodes
  • Graph statistics over a curated subset of Wikidata
  • Reachable occupations for selected people in Wikidata
  • PageRank over Wikidata
  • etc.

Releases

Installation

Installation through Docker

docker pull uscisii2/kgtk

To run KGTK in the command line:

docker run -it --rm  --user root -e NB_GID=100 -e GEN_CERT=yes -e GRANT_SUDO=yes uscisii2/kgtk:latest /bin/bash

Note: if you want to load data from your local machine, you will need to mount a volume. For example, to mount the current directory ($PWD) and launch KGTK in command line mode:

docker run -it --rm -v $PWD:/out --user root -e NB_GID=100 -e GEN_CERT=yes -e GRANT_SUDO=yes uscisii2/kgtk:latest /bin/bash

To run KGTK in a Jupyter notebook with the current directory ($PWD) mounted as a folder called /out, type:

docker run -it -v $PWD:/out -p 8888:8888 uscisii2/kgtk:latest /bin/bash -c "jupyter notebook --ip='*' --port=8888 --no-browser"

More information about versions and tags is available here: https://hub.docker.com/repository/docker/uscisii2/kgtk. For example, the dev branch is available at uscisii2/kgtk:latest-dev.

See additional examples in the documentation.
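
The three `docker run` variants above differ only in the volume mount and in the command executed inside the container. The Python sketch below makes that explicit by assembling the argument list. The function name `kgtk_docker_argv` and its parameters are hypothetical, and the flags are copied from the commands above rather than independently verified.

```python
import os
import shlex

IMAGE = "uscisii2/kgtk:latest"


def kgtk_docker_argv(mount_dir=None, notebook=False):
    """Build a `docker run` argument list for a KGTK shell or a Jupyter notebook."""
    argv = ["docker", "run", "-it"]
    if mount_dir is not None:
        # Mount the given host directory as /out inside the container.
        argv += ["-v", f"{os.path.abspath(mount_dir)}:/out"]
    if notebook:
        argv += [
            "-p", "8888:8888", IMAGE, "/bin/bash", "-c",
            "jupyter notebook --ip='*' --port=8888 --no-browser",
        ]
    else:
        argv += [
            "--rm", "--user", "root",
            "-e", "NB_GID=100", "-e", "GEN_CERT=yes", "-e", "GRANT_SUDO=yes",
            IMAGE, "/bin/bash",
        ]
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(kgtk_docker_argv(mount_dir=".")))
```

The helper only prints the command; passing the list to `subprocess.run` would execute it, provided Docker is installed.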

Local installation

KGTK is installed in a conda environment. If you don't have conda installed, follow the conda installation instructions first. Once it is installed, follow the steps below:

  1. Set up your own conda environment:
conda create -n kgtk-env python=3.7
conda activate kgtk-env

Note: Installing graph-tool is problematic on Python 3.8 and outside of a virtual environment, so we advise installing inside a virtual environment.

  2. Install KGTK: pip install kgtk

You can now test whether kgtk is installed properly with: kgtk -h.

  3. Download the English model of spaCy: python -m spacy download en_core_web_sm

  4. Install graph-tool: conda install -c conda-forge graph-tool. If you don't use conda or run into problems, see these instructions.

  5. The Python library rdflib has a known issue where the ttl serialization of decimal values is incorrect: the library adds a .0 at the end of decimal values in scientific notation. This makes the ttl invalid, so it cannot be loaded into a triplestore.

To solve this issue, run the following commands after the kgtk installation is complete.

pip uninstall rdflib
pip install git+https://github.com/RDFLib/[email protected]

The code fix for this bug is already merged into the library, but has not been released as a pypi package. This step will be removed after rdflib version 6 is released.
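
After the installation steps above, a quick way to see which pieces are in place is to check whether each package can be found by Python and whether the `kgtk` executable is on the PATH. The sketch below does this with the standard library only. The function name `check_environment` is hypothetical, and the import names in `REQUIRED_MODULES` are assumptions about how the packages above are exposed to Python.

```python
import importlib.util
import shutil

# Assumed import names for the packages installed in the steps above.
REQUIRED_MODULES = ["kgtk", "spacy", "en_core_web_sm", "graph_tool", "rdflib"]


def check_environment(modules=REQUIRED_MODULES):
    """Return a dict mapping each component to True if it can be located."""
    # find_spec locates a top-level module without importing it.
    status = {name: importlib.util.find_spec(name) is not None for name in modules}
    # The command-line entry point is checked separately from the Python package.
    status["kgtk (command)"] = shutil.which("kgtk") is not None
    return status


if __name__ == "__main__":
    for component, found in check_environment().items():
        print(f"{component:20s} {'found' if found else 'MISSING'}")
```

This only checks that the components can be located. It does not verify versions, so it will not tell you whether the rdflib fix described above is active.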

Updating your KGTK installation

To update your version of KGTK, just follow the instructions below:

  • If you installed KGTK through Docker, pull the most recent image: docker pull <image_name>, where <image_name> is the tag of the image of interest (e.g. uscisii2/kgtk:latest).
  • If you installed KGTK from pip, type pip install -U kgtk.
  • If you installed KGTK from GitHub, type git pull && pip install . Alternatively, you may execute: git pull && python setup.py install.
  • If you installed KGTK in development mode (i.e., pip install -e), you only need to update your repository: git pull.

Running KGTK commands

To list all the available KGTK commands, run:

kgtk -h

To see the arguments of a particular command, run:

kgtk <command> -h

An example command that computes instances of the subclasses of two classes:

kgtk instances --transitive --class Q13442814,Q12345678
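
When KGTK commands are part of a larger Python pipeline, one option is to call the command-line tool through `subprocess`. The sketch below is a thin wrapper around the `kgtk` executable shown above. The name `run_kgtk` and its `executable` parameter are hypothetical, and the sketch assumes that `kgtk -h` exits with status 0.

```python
import shutil
import subprocess


def run_kgtk(*args, executable="kgtk"):
    """Run `kgtk <args>` and return its standard output.

    Returns None if the executable is not on the PATH, and raises
    subprocess.CalledProcessError if the command exits with a non-zero status.
    """
    path = shutil.which(executable)
    if path is None:
        return None
    completed = subprocess.run(
        [path, *args], capture_output=True, text=True, check=True
    )
    return completed.stdout


if __name__ == "__main__":
    help_text = run_kgtk("-h")
    print(help_text if help_text is not None else "kgtk not found on PATH")
```

The same wrapper could pass the arguments of the instances example above, e.g. `run_kgtk("instances", "--transitive", "--class", "Q13442814,Q12345678")`.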

Running unit tests locally

cd kgtk/tests
python -W ignore -m unittest discover

How to cite

@inproceedings{ilievski2020kgtk,
  title={{KGTK}: A Toolkit for Large Knowledge Graph Manipulation and Analysis},
  author={Ilievski, Filip and Garijo, Daniel and Chalupsky, Hans and Divvala, Naren Teja and Yao, Yixiang and Rogers, Craig and Li, Ronpeng and Liu, Jun and Singh, Amandeep and Schwabe, Daniel and Szekely, Pedro},
  booktitle={International Semantic Web Conference},
  pages={278--293},
  year={2020},
  organization={Springer},
  url={https://arxiv.org/pdf/2006.00088.pdf}
}