95 open source projects that are alternatives to or similar to Awesome Opensource Data Engineering

Geni
A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (-60.1%)
Mutual labels:  data-engineering
Data Engineering Nanodegree
Projects done in the Data Engineering Nanodegree by Udacity.com
Stars: ✭ 151 (-60.37%)
Mutual labels:  data-engineering
Gcp Data Engineer Exam
Study materials for the Google Cloud Professional Data Engineering Exam
Stars: ✭ 144 (-62.2%)
Mutual labels:  data-engineering
Data Engineering Howto
A list of useful resources to learn Data Engineering from scratch
Stars: ✭ 2,056 (+439.63%)
Mutual labels:  data-engineering
Accelerator
The Accelerator is a tool for fast and reproducible processing of large amounts of data.
Stars: ✭ 137 (-64.04%)
Mutual labels:  data-engineering
Airflow Autoscaling Ecs
Airflow Deployment on AWS ECS Fargate Using Cloudformation
Stars: ✭ 136 (-64.3%)
Mutual labels:  data-engineering
Pipelinex
PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more
Stars: ✭ 127 (-66.67%)
Mutual labels:  data-engineering
Butterfree
A tool for building feature stores.
Stars: ✭ 126 (-66.93%)
Mutual labels:  data-engineering
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+525.98%)
Mutual labels:  data-engineering
Spark Alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-67.98%)
Mutual labels:  data-engineering
D6t Python
Accelerate data science
Stars: ✭ 118 (-69.03%)
Mutual labels:  data-engineering
Just Dashboard
📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+296.59%)
Mutual labels:  data-engineering
Superset
Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+11090.03%)
Mutual labels:  data-engineering
Applied Ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Stars: ✭ 17,824 (+4578.22%)
Mutual labels:  data-engineering
Dataengineeringproject
Example end to end data engineering project.
Stars: ✭ 82 (-78.48%)
Mutual labels:  data-engineering
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-79.27%)
Mutual labels:  data-engineering
Sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (-79.27%)
Mutual labels:  data-engineering
Ansible Playbook
Ansible playbook to deploy distributed technologies
Stars: ✭ 61 (-83.99%)
Mutual labels:  data-engineering
Waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-84.25%)
Mutual labels:  data-engineering
Quilt
Quilt is a self-organizing data hub for S3
Stars: ✭ 1,007 (+164.3%)
Mutual labels:  data-engineering
Dbt Sqlserver
dbt adapter for SQL Server and Azure SQL
Stars: ✭ 41 (-89.24%)
Mutual labels:  data-engineering
Data Science On Gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+126.77%)
Mutual labels:  data-engineering
Lakefs
Git-like capabilities for your object storage
Stars: ✭ 847 (+122.31%)
Mutual labels:  data-engineering
Goodreads etl pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+108.14%)
Mutual labels:  data-engineering
Prefect
The easiest way to automate your data
Stars: ✭ 7,956 (+1988.19%)
Mutual labels:  data-engineering
Pyjanitor
Clean APIs for data cleaning. Python implementation of R package Janitor
Stars: ✭ 647 (+69.82%)
Mutual labels:  data-engineering
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+66.14%)
Mutual labels:  data-engineering
Pointblank
Data validation and organization of metadata for data frames and database tables
Stars: ✭ 480 (+25.98%)
Mutual labels:  data-engineering
Data Engineering Book
Accumulated knowledge and experience in the field of Data Engineering
Stars: ✭ 471 (+23.62%)
Mutual labels:  data-engineering
Udacity Data Engineering Projects
A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
Stars: ✭ 458 (+20.21%)
Mutual labels:  data-engineering
Great expectations
Always know what to expect from your data.
Stars: ✭ 5,808 (+1424.41%)
Mutual labels:  data-engineering
Active workflow
Turn complex requirements into workflows without leaving the comfort of your technology stack.
Stars: ✭ 413 (+8.4%)
Mutual labels:  data-engineering
Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+1191.08%)
Mutual labels:  data-engineering
Feast
Feature Store for Machine Learning
Stars: ✭ 2,576 (+576.12%)
Mutual labels:  data-engineering
Cookbook
The Data Engineering Cookbook
Stars: ✭ 9,829 (+2479.79%)
Mutual labels:  data-engineering
61-95 of 95 similar projects