95 open source projects that are alternatives to or similar to Awesome Opensource Data Engineering

deordie-meetups
DE or DIE meetup made by data engineers for data engineers. Currently in Russian only.
Stars: ✭ 48 (-87.4%)
Mutual labels:  data-engineering
etl
[READ-ONLY] PHP - ETL (Extract Transform Load) data processing library
Stars: ✭ 279 (-26.77%)
Mutual labels:  data-engineering
neon-workshop
A Pachyderm deep learning tutorial for conference workshops
Stars: ✭ 19 (-95.01%)
Mutual labels:  data-engineering
Azure-Certification-DP-200
Road to Azure Data Engineer Part-I: DP-200 - Implementing an Azure Data Solution
Stars: ✭ 54 (-85.83%)
Mutual labels:  data-engineering
Ploomber
A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.
Stars: ✭ 221 (-41.99%)
Mutual labels:  data-engineering
jobAnalytics and search
JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (-93.44%)
Mutual labels:  data-engineering
papilo
DEPRECATED: Stream data processing micro-framework
Stars: ✭ 24 (-93.7%)
Mutual labels:  data-engineering
arthur-redshift-etl
ELT Code for your Data Warehouse
Stars: ✭ 22 (-94.23%)
Mutual labels:  data-engineering
awesome-dbt
A curated list of awesome dbt resources
Stars: ✭ 520 (+36.48%)
Mutual labels:  data-engineering
dbt-sugar
dbt-sugar is a CLI tool that makes performing actions around dbt models easier and more fun for dbt users
Stars: ✭ 139 (-63.52%)
Mutual labels:  data-engineering
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (-86.09%)
Mutual labels:  data-engineering
Auptimizer
An automatic ML model optimization tool.
Stars: ✭ 166 (-56.43%)
Mutual labels:  data-engineering
ml-in-production
The practical use-cases of how to make your Machine Learning Pipelines robust and reliable using Apache Airflow.
Stars: ✭ 29 (-92.39%)
Mutual labels:  data-engineering
prefect-saturn
Python client for using Prefect Cloud with Saturn Cloud
Stars: ✭ 15 (-96.06%)
Mutual labels:  data-engineering
beneath
Beneath is a serverless real-time data platform ⚡️
Stars: ✭ 65 (-82.94%)
Mutual labels:  data-engineering
contessa
Easy way to define, execute and store quality rules for your data.
Stars: ✭ 17 (-95.54%)
Mutual labels:  data-engineering
versatile-data-kit
Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (-62.2%)
Mutual labels:  data-engineering
lrmr
Less-Resilient MapReduce framework for Go
Stars: ✭ 32 (-91.6%)
Mutual labels:  data-engineering
Around Dataengineering
A Data Engineering & Machine Learning Knowledge Hub
Stars: ✭ 257 (-32.55%)
Mutual labels:  data-engineering
hive-metastore-client
A client for connecting to the Hive Metastore and running DDLs on it.
Stars: ✭ 37 (-90.29%)
Mutual labels:  data-engineering
h4sci-course
ETH PhD Program course
Stars: ✭ 19 (-95.01%)
Mutual labels:  data-engineering
Every Single Day I Tldr
A daily digest of the articles or videos I've found interesting, that I want to share with you.
Stars: ✭ 249 (-34.65%)
Mutual labels:  data-engineering
yt-channels-DS-AI-ML-CS
A comprehensive list of 180+ YouTube Channels for Data Science, Data Engineering, Machine Learning, Deep learning, Computer Science, programming, software engineering, etc.
Stars: ✭ 1,038 (+172.44%)
Mutual labels:  data-engineering
Aws Serverless Data Lake Framework
Enterprise-grade, production-hardened, serverless data lake on AWS
Stars: ✭ 179 (-53.02%)
Mutual labels:  data-engineering
funsies
funsies is a lightweight workflow engine 🔧
Stars: ✭ 37 (-90.29%)
Mutual labels:  data-engineering
practical-data-engineering
Real estate Dagster pipeline
Stars: ✭ 110 (-71.13%)
Mutual labels:  data-engineering
Yuniql
Free and open source schema versioning and database migration made natively with .NET Core.
Stars: ✭ 156 (-59.06%)
Mutual labels:  data-engineering
DataEngineering
This repo contains commands that data engineers use in day to day work.
Stars: ✭ 47 (-87.66%)
Mutual labels:  data-engineering
datart
Datart is a next generation Data Visualization Open Platform
Stars: ✭ 1,042 (+173.49%)
Mutual labels:  data-engineering
AirflowDataPipeline
Example of an ETL Pipeline using Airflow
Stars: ✭ 24 (-93.7%)
Mutual labels:  data-engineering
morph-kgc
Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
Stars: ✭ 77 (-79.79%)
Mutual labels:  data-engineering
Data-Engineering-Projects
Personal Data Engineering Projects
Stars: ✭ 167 (-56.17%)
Mutual labels:  data-engineering
awesome-bigquery-views
Useful SQL queries for Blockchain ETL datasets in BigQuery.
Stars: ✭ 325 (-14.7%)
Mutual labels:  data-engineering
Benthos
Fancy stream processing made operationally mundane
Stars: ✭ 3,705 (+872.44%)
Mutual labels:  data-engineering
get smarties
Dummy variable generation with fit/transform capabilities
Stars: ✭ 23 (-93.96%)
Mutual labels:  data-engineering
gallia-core
A schema-aware Scala library for data transformation
Stars: ✭ 44 (-88.45%)
Mutual labels:  data-engineering
Everything-Tech
A collection of online resources to help you on your Tech journey.
Stars: ✭ 396 (+3.94%)
Mutual labels:  data-engineering
growthbook
Open Source Feature Flagging and A/B Testing Platform
Stars: ✭ 2,342 (+514.7%)
Mutual labels:  data-engineering
big-data-engineering-indonesia
A curated list of big data engineering tools, resources and communities.
Stars: ✭ 26 (-93.18%)
Mutual labels:  data-engineering
viewflow
Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Stars: ✭ 110 (-71.13%)
Mutual labels:  data-engineering
soda-spark
Soda Spark is a PySpark library that helps you test your data in Spark DataFrames
Stars: ✭ 58 (-84.78%)
Mutual labels:  data-engineering
Dataform
Dataform is a framework for managing SQL based data operations in BigQuery, Snowflake, and Redshift
Stars: ✭ 342 (-10.24%)
Mutual labels:  data-engineering
qsv
CSVs sliced, diced & analyzed.
Stars: ✭ 438 (+14.96%)
Mutual labels:  data-engineering
hamilton
A scalable general-purpose micro-framework for defining dataflows. You can use it to create dataframes, NumPy matrices, Python objects, ML models, etc.
Stars: ✭ 612 (+60.63%)
Mutual labels:  data-engineering
AirflowETL
Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-94.75%)
Mutual labels:  data-engineering
pangeo-forge-recipes
Python library for building Pangeo Forge recipes.
Stars: ✭ 64 (-83.2%)
Mutual labels:  data-engineering
airflow-dbt-python
A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
Stars: ✭ 111 (-70.87%)
Mutual labels:  data-engineering
pyjanitor
Clean APIs for data cleaning. Python implementation of R package Janitor
Stars: ✭ 970 (+154.59%)
Mutual labels:  data-engineering
Elastik Nearest Neighbors
Go to: https://github.com/alexklibisz/elastiknn
Stars: ✭ 249 (-34.65%)
Mutual labels:  data-engineering
etl manager
A Python package to create a database on the platform using our moj data warehousing framework
Stars: ✭ 14 (-96.33%)
Mutual labels:  data-engineering
Gspread Pandas
A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Stars: ✭ 226 (-40.68%)
Mutual labels:  data-engineering
uptasticsearch
An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (-87.66%)
Mutual labels:  data-engineering
Soda Sql
Metric collection, data testing and monitoring for SQL accessible data
Stars: ✭ 173 (-54.59%)
Mutual labels:  data-engineering
Kaggle-project-list
Summary of my projects on Kaggle
Stars: ✭ 20 (-94.75%)
Mutual labels:  data-engineering
preprocessy
Python package for Customizable Data Preprocessing Pipelines
Stars: ✭ 34 (-91.08%)
Mutual labels:  data-engineering
Learn Something Every Day
📝 A compilation of everything that I learn; Computer Science, Software Development, Engineering, Math, and Coding in General.
Stars: ✭ 362 (-4.99%)
Mutual labels:  data-engineering
Egeria
Open Metadata and Governance
Stars: ✭ 328 (-13.91%)
Mutual labels:  data-engineering
ClassifyBot
Automate building ML classification pipelines in .NET
Stars: ✭ 16 (-95.8%)
Mutual labels:  data-engineering
mpc-DL-controller
Deep Neural Network architecture as a predictive optimal controller for {HVAC+Solar cell + battery} disturbance afflicted system vs classic Model Predictive Control
Stars: ✭ 37 (-90.29%)
Mutual labels:  data-engineering
blockchain-etl-streaming
Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (-85.04%)
Mutual labels:  data-engineering