Top 25 data-integration open source projects

Mara Pipelines
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
data-product-batch
Template to deploy a Data Product for batch data processing into a Data Landing Zone of the Data Management & Analytics Scenario (formerly Enterprise-Scale Analytics). The Data Product template can be used by cross-functional teams to ingest, provide and create new data assets within the platform.
thymeflow
Installer for Thymeflow, a personal knowledge management system.
kuwala
Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We have set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition we provide third-party data into data sc…
SDM-RDFizer
An Efficient RML-Compliant Engine for Knowledge Graph Construction
assignPOP
Population Assignment using Genetic, Non-genetic or Integrated Data in a Machine-learning Framework. Methods in Ecology and Evolution. 2018;9:439–446.
bio2bel
A Python framework for integrating biological databases and structured data sources in Biological Expression Language (BEL)
CogStack-NiFi
Building data processing pipelines for document processing with NLP using Apache NiFi and related services
morph-kgc
Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
Mapeathor
Translator of spreadsheet mappings into R2RML, RML or YARRRML
OpenOmics
A bioinformatics API and web-app to integrate multi-omics datasets & interface with public databases.
data-product-streaming
Template to deploy a Data Product for data stream processing into a Data Landing Zone of the Data Management & Analytics Scenario (formerly Enterprise-Scale Analytics). The Data Product template can be used by cross-functional teams to ingest, provide and create new data assets within the platform.
winter
WInte.r is a Java framework for end-to-end data integration. The WInte.r framework implements well-known methods for data pre-processing, schema matching, identity resolution, data fusion, and result evaluation.
cosmosR
COSMOS (Causal Oriented Search of Multi-Omic Space) is a method that integrates phosphoproteomics, transcriptomics, and metabolomics data sets.
CommonCoreOntologies
The Common Core Ontology Repository holds the current released version of the Common Core Ontology suite.
nomenklatura
Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources
R-Learning-Journey
Some of the projects I made when starting to learn R for Data Science at university