Top 96 data-engineering open source projects

ml-in-production
Practical use cases showing how to make your machine learning pipelines robust and reliable using Apache Airflow.
jobAnalytics and search
JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
versatile-data-kit
Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
viewflow
Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
hamilton
A scalable, general-purpose micro-framework for defining dataflows. You can use it to create dataframes, NumPy matrices, Python objects, ML models, etc.
dbt-sugar
dbt-sugar is a CLI tool that makes performing actions around dbt models easier and more fun for dbt users.
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
morph-kgc
Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
prefect-saturn
Python client for using Prefect Cloud with Saturn Cloud
deordie-meetups
DE or DIE meetup made by data engineers for data engineers. Currently in Russian only.
get smarties
Dummy variable generation with fit/transform capabilities
contessa
Easy way to define, execute and store quality rules for your data.
papilo
DEPRECATED: Stream data processing micro-framework
big-data-engineering-indonesia
A curated list of big data engineering tools, resources and communities.
lrmr
Less-Resilient MapReduce framework for Go
soda-spark
Soda Spark is a PySpark library that helps you test your data in Spark DataFrames.
etl
[READ-ONLY] PHP - ETL (Extract Transform Load) data processing library
awesome-dbt
A curated list of awesome dbt resources
airflow-dbt-python
A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.