Rudder Server: Privacy- and security-focused Segment alternative, in Golang and React
Airbyte: Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Hudi: Upserts, Deletes And Incremental Processing on Big Data.
Awesome Single Cell: Community-curated list of software packages and data resources for single-cell, including RNA-seq, ATAC-seq, etc.
Mara Pipelines: A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
scarches: Reference mapping for single-cell genomics
data-product-batch: Template to deploy a Data Product for Batch data processing into a Data Landing Zone of the Data Management & Analytics Scenario (formerly Enterprise-Scale Analytics). The Data Product template can be used by cross-functional teams to ingest, provide and create new data assets within the platform.
thymeflow: Installer for Thymeflow, a personal knowledge management system.
kuwala: Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We set out to bring state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition, we provide third-party data into data sc…
SDM-RDFizer: An Efficient RML-Compliant Engine for Knowledge Graph Construction
assignPOP: Population Assignment using Genetic, Non-genetic or Integrated Data in a Machine-learning Framework. Methods in Ecology and Evolution. 2018;9:439–446.
SchemaMapper: A .NET class library that allows you to import data from different sources into a unified destination
bio2bel: A Python framework for integrating biological databases and structured data sources in Biological Expression Language (BEL)
CogStack-NiFi: Building data processing pipelines for document processing with NLP using Apache NiFi and related services
doctoral-thesis: 📖 Generation and Applications of Knowledge Graphs in Systems and Networks Biology
morph-kgc: Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
Mapeathor: Translator of spreadsheet mappings into R2RML, RML or YARRRML
OpenOmics: A bioinformatics API and web-app to integrate multi-omics datasets & interface with public databases.
data-product-streaming: Template to deploy a Data Product for data stream processing into a Data Landing Zone of the Data Management & Analytics Scenario (formerly Enterprise-Scale Analytics). The Data Product template can be used by cross-functional teams to ingest, provide and create new data assets within the platform.
winter: WInte.r is a Java framework for end-to-end data integration. The WInte.r framework implements well-known methods for data pre-processing, schema matching, identity resolution, data fusion, and result evaluation.
cosmosR: COSMOS (Causal Oriented Search of Multi-Omic Space) is a method that integrates phosphoproteomics, transcriptomics, and metabolomics data sets.
CommonCoreOntologies: The Common Core Ontology Repository holds the current released version of the Common Core Ontology suite.
nomenklatura: Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources
R-Learning-Journey: Some of the projects I made when starting to learn R for Data Science at the university