jobAnalytics and search: JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (+25%)
polygon-etl: ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (+165%)
hamilton: A scalable general-purpose micro-framework for defining dataflows. You can use it to create dataframes, NumPy matrices, Python objects, ML models, etc.
Stars: ✭ 612 (+2960%)
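Hamilton's core idea is that a dataflow is declared by plain functions: each function's name is an output node and its parameter names are the upstream nodes it depends on. The sketch below is not Hamilton's actual API; it is a minimal stdlib-only illustration of that dependency-by-naming idea, with hypothetical node functions `doubled` and `total`.

```python
import inspect

def resolve(funcs, inputs, target):
    """Compute `target` by recursively resolving each function's
    parameter names against other functions or provided inputs."""
    if target in inputs:
        return inputs[target]
    fn = funcs[target]
    args = {p: resolve(funcs, inputs, p) for p in inspect.signature(fn).parameters}
    return fn(**args)

# Each function declares a node: its name is the output,
# its parameter names are the upstream nodes it depends on.
def doubled(raw: list) -> list:
    return [x * 2 for x in raw]

def total(doubled: list) -> int:
    return sum(doubled)

funcs = {f.__name__: f for f in (doubled, total)}
print(resolve(funcs, {"raw": [1, 2, 3]}, "total"))  # → 12
```

Because the graph is implied by names, adding a node is just adding a function; nothing else has to be wired up.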
datalake-etl-pipeline: Simplified ETL process in Hadoop using Apache Spark. A complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (+95%)
Soda Sql: Metric collection, data testing and monitoring for SQL-accessible data
Stars: ✭ 173 (+765%)
Discreetly: ETLy is an add-on dashboard service on top of Apache Airflow.
Stars: ✭ 60 (+200%)
gallia-core: A schema-aware Scala library for data transformation
Stars: ✭ 44 (+120%)
etl manager: A Python package to create a database on the platform using our MoJ data warehousing framework
Stars: ✭ 14 (-30%)
Data Engineering Howto: A list of useful resources to learn Data Engineering from scratch
Stars: ✭ 2,056 (+10180%)
Dataspherestudio: DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+5875%)
DIRECT: DIRECT, the Data Integration Run-time Execution Control Tool, is a data logistics framework that can be used to monitor, log, audit and control data integration/ETL processes.
Stars: ✭ 20 (+0%)
etlflow: EtlFlow is an ecosystem of functional libraries in Scala, based on ZIO, for writing various tasks and jobs on GCP and AWS.
Stars: ✭ 38 (+90%)
DaFlow: An Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (+20%)
blockchain-etl-streaming: Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (+185%)
Aws Data Wrangler: Pandas on AWS - easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatch Logs, DynamoDB, EMR, Secrets Manager, PostgreSQL, MySQL, SQL Server and S3 (Parquet, CSV, JSON and Excel).
Stars: ✭ 2,385 (+11825%)
Butterfree: A tool for building feature stores.
Stars: ✭ 126 (+530%)
Goodreads etl pipeline: An end-to-end GoodReads data pipeline for building a data lake, data warehouse and analytics platform.
Stars: ✭ 793 (+3865%)
Airbyte: Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+24495%)
Incubator Dolphinscheduler: Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available out of the box.
Stars: ✭ 6,916 (+34480%)
Udacity Data Engineering Projects: A few projects related to Data Engineering, including data modeling, infrastructure setup on the cloud, data warehousing and data lake development.
Stars: ✭ 458 (+2190%)
Aws Ecs Airflow: Run Airflow in AWS ECS (Elastic Container Service) using Fargate tasks
Stars: ✭ 107 (+435%)
aircal: Visualize Airflow's schedule by exporting future DAG runs as events to Google Calendar.
Stars: ✭ 66 (+230%)
morph-kgc: Powerful RDF knowledge graph generation with [R2]RML mappings
Stars: ✭ 77 (+285%)
etl: [READ-ONLY] A PHP ETL (Extract, Transform, Load) data processing library
Stars: ✭ 279 (+1295%)
udacity-data-eng-proj2: A production-grade data pipeline designed to automate the parsing of user search patterns to analyze user engagement. It extracts data from S3, applies a series of transformations and loads the results into S3 and Redshift.
Stars: ✭ 25 (+25%)
csvplus: Extends the standard Go encoding/csv package with a fluent interface, lazy stream operations, indices and joins.
Stars: ✭ 67 (+235%)
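The "fluent, lazy stream" style csvplus brings to Go's encoding/csv can be sketched in a few lines. The following is not csvplus itself (which is Go); it is a rough stdlib-only Python analogue showing how chained generators stay lazy until a terminal call, with a made-up `Stream` class and sample data.

```python
import csv
import io

class Stream:
    """A tiny fluent wrapper over an iterator of CSV rows.
    Lazy: no row is read until a terminal call such as collect()."""
    def __init__(self, rows):
        self.rows = rows

    def filter(self, pred):
        return Stream(r for r in self.rows if pred(r))

    def map(self, fn):
        return Stream(fn(r) for r in self.rows)

    def collect(self):
        return list(self.rows)

data = "name,stars\nhamilton,612\naircal,66\n"
rows = csv.DictReader(io.StringIO(data))
top = (Stream(rows)
       .filter(lambda r: int(r["stars"]) > 100)
       .map(lambda r: r["name"])
       .collect())
print(top)  # → ['hamilton']
```

Each chained call wraps a new generator around the previous one, so arbitrarily long pipelines cost nothing until `collect()` pulls rows through the whole chain.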
versatile-data-kit: Versatile Data Kit (VDK) is an open-source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (+620%)
uptasticsearch: An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (+135%)
beneath: Beneath is a serverless real-time data platform ⚡️
Stars: ✭ 65 (+225%)
viewflow: Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Stars: ✭ 110 (+450%)
Setl: A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (+295%)
Sayn: A data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (+295%)
vixtract: www.vixtract.ru
Stars: ✭ 40 (+100%)
Pyspark Example Project: Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+3065%)
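One practice such ETL project templates commonly demonstrate is keeping transformation logic in pure functions, separate from the extract/load steps that do I/O, so the transforms can be unit-tested without a cluster. A minimal sketch of that structure, in plain Python rather than PySpark, with hypothetical `transform` and `add_full_name` helpers:

```python
def transform(rows, steps):
    """Apply a sequence of pure, side-effect-free transformation steps.
    Keeping steps free of I/O makes each one testable in isolation."""
    for step in steps:
        rows = [step(r) for r in rows]
    return rows

def add_full_name(r):
    # One pure step: derive a column from existing ones.
    return {**r, "full_name": f"{r['first']} {r['last']}"}

def main():
    # Extract (stubbed here; a real job would read from storage).
    rows = [{"first": "Ada", "last": "Lovelace"}]
    # Transform: the only part that carries business logic.
    rows = transform(rows, [add_full_name])
    # Load (stubbed: print instead of writing out).
    print(rows)

main()
```

The same shape carries over to Spark: `main()` owns the SparkSession and I/O, while each step takes and returns a DataFrame and can be exercised directly in tests.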
Dataform: Dataform is a framework for managing SQL-based data operations in BigQuery, Snowflake, and Redshift
Stars: ✭ 342 (+1610%)
astro: Astro allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
Stars: ✭ 79 (+295%)
airflow-dbt-python: A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
Stars: ✭ 111 (+455%)
Benthos: Fancy stream processing made operationally mundane
Stars: ✭ 3,705 (+18425%)
opentrials-airflow: Configuration and definitions of Airflow for OpenTrials
Stars: ✭ 18 (-10%)
Example Airflow Dags: Example DAGs using hooks and operators from Airflow Plugins
Stars: ✭ 243 (+1115%)
Paperboy: A web frontend for scheduling Jupyter notebook reports
Stars: ✭ 221 (+1005%)