
oeg-upm / morph-kgc

License: Apache-2.0
Powerful RDF Knowledge Graph Generation with [R2]RML Mappings

Morph-KGC is an engine that constructs RDF and RDF-star knowledge graphs from heterogeneous data sources with the R2RML, RML and RML-star mapping languages. Morph-KGC is built on top of pandas and it leverages mapping partitions to significantly reduce execution times and memory consumption for large data sources.
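
The general idea behind partitioning mapping rules can be illustrated with a toy sketch. This is not Morph-KGC's implementation: the rule format, the `ex:` predicates, and the choice to partition by predicate alone are all assumptions made here for brevity. The point is only that when groups of rules cannot produce the same triple, duplicates need to be tracked within one group at a time instead of across the whole graph.

```python
from collections import defaultdict

# Hypothetical, simplified mapping rules (not real RML): each rule has a
# subject template, a constant predicate, and the column holding the object.
RULES = [
    {"subject": "http://example.org/person/{id}", "predicate": "ex:name", "column": "name"},
    {"subject": "http://example.org/person/{id}", "predicate": "ex:city", "column": "city"},
    {"subject": "http://example.org/person/{id}", "predicate": "ex:name", "column": "name"},
]

# Hypothetical source rows; the second row duplicates the first.
ROWS = [
    {"id": "1", "name": "Ana", "city": "Madrid"},
    {"id": "1", "name": "Ana", "city": "Madrid"},
    {"id": "2", "name": "Luis", "city": "Madrid"},
]


def partition_rules(rules):
    """Group rules by predicate; triples from different groups never collide."""
    groups = defaultdict(list)
    for rule in rules:
        groups[rule["predicate"]].append(rule)
    return dict(groups)


def materialize(rules, rows):
    """Yield distinct triples, keeping only one group's triples in memory."""
    for group in partition_rules(rules).values():
        seen = set()  # duplicate tracking is local to this group
        for rule in group:
            for row in rows:
                triple = (
                    rule["subject"].format(**row),
                    rule["predicate"],
                    row[rule["column"]],
                )
                if triple not in seen:
                    seen.add(triple)
                    yield triple


triples = list(materialize(RULES, ROWS))
for triple in triples:
    print(triple)
```

Because each group is finished before the next one starts, the `seen` set can be discarded between groups, and independent groups could in principle be processed in parallel.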

Citing Morph-KGC: If you use Morph-KGC in your work, please cite the Semantic Web Journal (SWJ) paper:

@article{arenas2022morph,
  title   = {{Morph-KGC: Scalable knowledge graph materialization with mapping partitions}},
  author  = {Arenas-Guerrero, Julián and Chaves-Fraga, David and Toledo, Jhon and Pérez, María S. and Corcho, Oscar},
  journal = {Semantic Web},
  year    = {2022},
  doi     = {10.3233/SW-223135}
}

Main Features

Documentation

Read the documentation.

Tutorial

Learn quickly with the tutorial in Google Colaboratory!

Getting Started

PyPI is the fastest way to install Morph-KGC:

pip install morph-kgc

We recommend installing Morph-KGC in a virtual environment.

To run the engine from the command line, execute the following:

python3 -m morph_kgc config.ini

Check the documentation to see how to generate the configuration INI file. Here you can also see an example INI file.
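
As a minimal sketch, the configuration text can also be produced from Python with the standard-library `configparser` module. The section name `DataSource1` and the `mappings` and `db_url` options are the ones used in this README; the helper name `build_config_text` is hypothetical, and the documentation remains the reference for the full set of options.

```python
import configparser
import io


def build_config_text(mappings, db_url, section="DataSource1"):
    """Return INI text for one data source (helper name is hypothetical)."""
    # interpolation=None keeps '%' characters in URLs or passwords literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser[section] = {"mappings": mappings, "db_url": db_url}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


config_text = build_config_text(
    "/path/to/mapping/mapping_file.rml.ttl",
    "mysql+pymysql://user:password@localhost:3306/db_name",
)
print(config_text)
```

Saving `config_text` as `config.ini` gives a file that can be passed to the command shown above.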

It is also possible to run Morph-KGC as a library with RDFLib and Oxigraph:

import morph_kgc

# generate the triples and load them to an RDFLib graph
g_rdflib = morph_kgc.materialize('/path/to/config.ini')
# work with the RDFLib graph
q_res = g_rdflib.query(' SELECT DISTINCT ?classes WHERE { ?s a ?classes } ')

# generate the triples and load them to Oxigraph
g_oxigraph = morph_kgc.materialize_oxigraph('/path/to/config.ini')
# work with Oxigraph
q_res = g_oxigraph.query(' SELECT DISTINCT ?classes WHERE { ?s a ?classes } ')

# the methods above also accept the config as a string
config = """
            [DataSource1]
            mappings: /path/to/mapping/mapping_file.rml.ttl
            db_url: mysql+pymysql://user:password@localhost:3306/db_name
         """
g_rdflib = morph_kgc.materialize(config)
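
Before passing a configuration string to the engine, it can help to check that it parses as expected. The sketch below uses only the standard library and assumes the string follows ordinary INI syntax as understood by `configparser`; `summarize_config` is a hypothetical helper, not part of Morph-KGC.

```python
import configparser
import textwrap


def summarize_config(config_text):
    """Parse INI text and return {section: {option: value}} for inspection."""
    parser = configparser.ConfigParser(interpolation=None)
    # dedent removes the common indentation of a triple-quoted string
    parser.read_string(textwrap.dedent(config_text))
    return {name: dict(parser[name]) for name in parser.sections()}


config = """
            [DataSource1]
            mappings: /path/to/mapping/mapping_file.rml.ttl
            db_url: mysql+pymysql://user:password@localhost:3306/db_name
         """
summary = summarize_config(config)
print(summary)
```

A missing section or a misspelled option shows up immediately in the printed dictionary.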

License

Morph-KGC is available under the Apache License 2.0.

Author

Ontology Engineering Group, Universidad Politécnica de Madrid.

Contributors

See the full list of contributors here.
