181 open source projects that are alternatives to or similar to Airflow Autoscaling Ecs

Dataengineeringproject
Example end to end data engineering project.
Stars: ✭ 82 (-39.71%)
Mutual labels:  airflow, data-engineering
AirflowETL
Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-85.29%)
Mutual labels:  airflow, data-engineering
AirflowDataPipeline
Example of an ETL Pipeline using Airflow
Stars: ✭ 24 (-82.35%)
Mutual labels:  airflow, data-engineering
Soda Sql
Metric collection, data testing and monitoring for SQL accessible data
Stars: ✭ 173 (+27.21%)
Mutual labels:  airflow, data-engineering
airflow-dbt-python
A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
Stars: ✭ 111 (-18.38%)
Mutual labels:  airflow, data-engineering
viewflow
Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Stars: ✭ 110 (-19.12%)
Mutual labels:  airflow, data-engineering
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (-61.03%)
Mutual labels:  airflow, data-engineering
jobAnalytics and search
JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (-81.62%)
Mutual labels:  airflow, data-engineering
Goodreads etl pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+483.09%)
Mutual labels:  airflow, data-engineering
Udacity Data Engineering Projects
A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
Stars: ✭ 458 (+236.76%)
Mutual labels:  airflow, data-engineering
Data-Engineering-Projects
Personal Data Engineering Projects
Stars: ✭ 167 (+22.79%)
Mutual labels:  airflow, data-engineering
Around Dataengineering
A Data Engineering & Machine Learning Knowledge Hub
Stars: ✭ 257 (+88.97%)
Mutual labels:  airflow, data-engineering
Data Pipelines With Apache Airflow
Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation, validation and loading of data from S3 -> Redshift -> S3
Stars: ✭ 50 (-63.24%)
Mutual labels:  airflow
Applied Ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Stars: ✭ 17,824 (+13005.88%)
Mutual labels:  data-engineering
Dbt Sqlserver
dbt adapter for SQL Server and Azure SQL
Stars: ✭ 41 (-69.85%)
Mutual labels:  data-engineering
Objinsync
Continuously synchronize directories from remote object store to local filesystem
Stars: ✭ 29 (-78.68%)
Mutual labels:  airflow
D6t Python
Accelerate data science
Stars: ✭ 118 (-13.24%)
Mutual labels:  data-engineering
Docker Airflow
Repo for building a Docker-based Airflow image. Containers support multiple features, such as writing logs to a local or S3 folder and initializing GCP while the container boots. https://abhioncbr.github.io/docker-airflow/
Stars: ✭ 29 (-78.68%)
Mutual labels:  airflow
Airflow Maintenance Dags
A series of DAGs/Workflows to help maintain the operation of Airflow
Stars: ✭ 914 (+572.06%)
Mutual labels:  airflow
Elyra
Elyra extends JupyterLab Notebooks with an AI centric approach.
Stars: ✭ 839 (+516.91%)
Mutual labels:  airflow
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-41.91%)
Mutual labels:  data-engineering
Automating Your Data Pipeline With Apache Airflow
Automating Your Data Pipeline with Apache Airflow
Stars: ✭ 19 (-86.03%)
Mutual labels:  airflow
Phila Airflow
Stars: ✭ 16 (-88.24%)
Mutual labels:  airflow
Butterfree
A tool for building feature stores.
Stars: ✭ 126 (-7.35%)
Mutual labels:  data-engineering
Just Dashboard
📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+1011.03%)
Mutual labels:  data-engineering
Cookbook
The Data Engineering Cookbook
Stars: ✭ 9,829 (+7127.21%)
Mutual labels:  data-engineering
Prefect
The easiest way to automate your data
Stars: ✭ 7,956 (+5750%)
Mutual labels:  data-engineering
Argo Workflows
Workflow engine for Kubernetes
Stars: ✭ 10,024 (+7270.59%)
Mutual labels:  airflow
Bitnami Docker Airflow
Bitnami Docker Image for Apache Airflow
Stars: ✭ 89 (-34.56%)
Mutual labels:  airflow
Quilt
Quilt is a self-organizing data hub for S3
Stars: ✭ 1,007 (+640.44%)
Mutual labels:  data-engineering
Spark Alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-10.29%)
Mutual labels:  data-engineering
Airflow On Kubernetes
Bare minimal Airflow on Kubernetes (Local, EKS, AKS)
Stars: ✭ 38 (-72.06%)
Mutual labels:  airflow
Udacity Data Engineering
Udacity Data Engineering Nano Degree (DEND)
Stars: ✭ 89 (-34.56%)
Mutual labels:  airflow
Pipelinex
PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more
Stars: ✭ 127 (-6.62%)
Mutual labels:  data-engineering
Data Science On Gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+535.29%)
Mutual labels:  data-engineering
Airflow Training
Airflow training for the crunch conf
Stars: ✭ 83 (-38.97%)
Mutual labels:  airflow
Databook
A facebook for data
Stars: ✭ 26 (-80.88%)
Mutual labels:  airflow
Afctl
afctl helps to manage and deploy Apache Airflow projects faster and smoother.
Stars: ✭ 116 (-14.71%)
Mutual labels:  airflow
Lakefs
Git-like capabilities for your object storage
Stars: ✭ 847 (+522.79%)
Mutual labels:  data-engineering
Sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (-41.91%)
Mutual labels:  data-engineering
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+365.44%)
Mutual labels:  data-engineering
Airflow Pipeline
An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (-5.88%)
Mutual labels:  airflow
Pyjanitor
Clean APIs for data cleaning. Python implementation of R package Janitor
Stars: ✭ 647 (+375.74%)
Mutual labels:  data-engineering
Dataspherestudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+778.68%)
Mutual labels:  airflow
Whirl
Fast iterative local development and testing of Apache Airflow workflows
Stars: ✭ 111 (-18.38%)
Mutual labels:  airflow
Incubator Dolphinscheduler
Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs out of the box.
Stars: ✭ 6,916 (+4985.29%)
Mutual labels:  airflow
Terraform Aws Airflow
Terraform module to deploy an Apache Airflow cluster on AWS, backed by RDS PostgreSQL for metadata, S3 for logs and SQS as message broker with CeleryExecutor
Stars: ✭ 69 (-49.26%)
Mutual labels:  airflow
Pointblank
Data validation and organization of metadata for data frames and database tables
Stars: ✭ 480 (+252.94%)
Mutual labels:  data-engineering
Ansible Playbook
Ansible playbook to deploy distributed technologies
Stars: ✭ 61 (-55.15%)
Mutual labels:  data-engineering
Data Engineering Book
Accumulated knowledge and experience in the field of Data Engineering
Stars: ✭ 471 (+246.32%)
Mutual labels:  data-engineering
Airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Stars: ✭ 24,101 (+17621.32%)
Mutual labels:  airflow
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+1653.68%)
Mutual labels:  data-engineering
Airflow in docker compose
Apache Airflow in Docker Compose (for both versions 1.10.* and 2.*)
Stars: ✭ 109 (-19.85%)
Mutual labels:  airflow
Waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-55.88%)
Mutual labels:  data-engineering
Great expectations
Always know what to expect from your data.
Stars: ✭ 5,808 (+4170.59%)
Mutual labels:  data-engineering
Active workflow
Turn complex requirements into workflows without leaving the comfort of your technology stack.
Stars: ✭ 413 (+203.68%)
Mutual labels:  data-engineering
Discreetly
ETLy is an add-on dashboard service on top of Apache Airflow.
Stars: ✭ 60 (-55.88%)
Mutual labels:  airflow
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+203.68%)
Mutual labels:  airflow
Dag Factory
Dynamically generate Apache Airflow DAGs from YAML configuration files
Stars: ✭ 385 (+183.09%)
Mutual labels:  airflow
Aws Ecs Airflow
Run Airflow in AWS ECS (Elastic Container Service) using Fargate tasks
Stars: ✭ 107 (-21.32%)
Mutual labels:  airflow