
jghoman / Awesome Apache Airflow

Licence: apache-2.0
Curated list of resources about Apache Airflow


Projects that are alternatives to or similar to Awesome Apache Airflow

airflow-user-management-plugin
A plugin for Apache Airflow that lets you manage which users can log in
Stars: ✭ 13 (-99.53%)
Mutual labels:  airflow, apache-airflow
airflow-client-python
Apache Airflow - OpenApi Client for Python
Stars: ✭ 172 (-93.76%)
Mutual labels:  airflow, apache-airflow
fairflow
Functional Airflow DAG definitions.
Stars: ✭ 38 (-98.62%)
Mutual labels:  airflow, apache-airflow
viewflow
Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Stars: ✭ 110 (-96.01%)
Mutual labels:  airflow, apache-airflow
openverse-catalog
Identifies and collects data on CC-licensed content across web crawl data and public APIs.
Stars: ✭ 27 (-99.02%)
Mutual labels:  airflow, apache-airflow
airflow-prometheus-exporter
Export Airflow metrics (from MySQL) in Prometheus format
Stars: ✭ 25 (-99.09%)
Mutual labels:  airflow, apache-airflow
airflow-boilerplate
A complete development environment setup for working with Airflow
Stars: ✭ 94 (-96.59%)
Mutual labels:  airflow, apache-airflow
airflow-code-editor
A plugin for Apache Airflow that allows you to edit DAGs in browser
Stars: ✭ 195 (-92.92%)
Mutual labels:  airflow, apache-airflow
Airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Stars: ✭ 24,101 (+774.81%)
Mutual labels:  airflow, apache-airflow
Bitnami Docker Airflow
Bitnami Docker Image for Apache Airflow
Stars: ✭ 89 (-96.77%)
Mutual labels:  airflow
Beyond Jupyter
🐍💻📊 All material from the PyCon.DE 2018 Talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. Slides, Video, Udemy MOOC & other References)
Stars: ✭ 135 (-95.1%)
Mutual labels:  airflow
Dataengineeringproject
Example end to end data engineering project.
Stars: ✭ 82 (-97.02%)
Mutual labels:  airflow
Aws Ecs Airflow
Run Airflow in AWS ECS (Elastic Container Service) using Fargate tasks
Stars: ✭ 107 (-96.12%)
Mutual labels:  airflow
Airflow Autoscaling Ecs
Airflow Deployment on AWS ECS Fargate Using Cloudformation
Stars: ✭ 136 (-95.06%)
Mutual labels:  airflow
Udacity Data Engineering
Udacity Data Engineering Nano Degree (DEND)
Stars: ✭ 89 (-96.77%)
Mutual labels:  airflow
Airflow Doc Zh
📖 [译] Airflow 中文文档
Stars: ✭ 169 (-93.87%)
Mutual labels:  airflow
Airflow Training
Airflow training for the crunch conf
Stars: ✭ 83 (-96.99%)
Mutual labels:  airflow
Dataspherestudio
DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (-56.62%)
Mutual labels:  airflow
Soda Sql
Metric collection, data testing and monitoring for SQL accessible data
Stars: ✭ 173 (-93.72%)
Mutual labels:  airflow
Airflow Exporter
Airflow plugin to export dag and task based metrics to Prometheus.
Stars: ✭ 161 (-94.16%)
Mutual labels:  airflow

Awesome Apache Airflow


This is a curated list of resources about Apache Airflow. Please feel free to contribute any items that should be included. Items are generally added at the top of each section so that newer items are featured more prominently.

Contents

Vital links

Airflow deployment solutions

Introductions and tutorials

Airflow Summit 2020 videos

The first Airflow Summit was held in July 2020. It was a truly global, fully online event co-hosted by 9 Airflow Meetups from all over the world (Melbourne, Tokyo, Bangalore, Warsaw, Amsterdam, London, NYC, Bay Area).

It featured 40+ talks and three workshops. You can check out the talk recordings as a YouTube Airflow Summit 2020 Playlist or see the individual talks here:

Best practices, lessons learned and cool use cases

Books, blogs, podcasts, and such

Slide deck presentations and online videos

Libraries, Hooks, Utilities

  • AirFly - Auto generate Airflow's dag.py on the fly.
  • DEAfrica Airflow - Airflow libraries used by Digital Earth Africa, a humanitarian effort to utilize satellite imagery of Africa.
  • Airflow plugins - Central collection of repositories of various plugins for Airflow, including mailchimp, trello, sftp, GitHub, etc.
  • fileflow - Collection of modules to support large data transfers between Airflow operators through either local file system or S3. This addresses a gap where data is too large for XCOMs but too small or inconvenient for loading directly in the operator. Built by Industry Dive.
  • fairflow - Library to abstract away Airflow's Operators with functional pieces that transform the data from one operator to another.
  • airflow-maintenance-dags - Clairvoyant has a repo of Airflow DAGs that operate on Airflow itself, clearing out various bits of the backing metadata store.
  • test_dags - A more complete solution for DAG integrity tests (the first Circle of Data’s Inferno).
  • dag-factory - A library for dynamically generating Apache Airflow DAGs from YAML configuration files.
  • whirl - Fast iterative local development and testing of Apache Airflow workflows.
  • airflow-code-editor - A plugin for Apache Airflow that allows you to edit DAGs in browser.
  • Pylint-Airflow - A Pylint plugin for static code analysis on Airflow code.
  • afctl - A CLI tool that includes everything required to create, manage and deploy Airflow projects faster and more smoothly.
  • Dag Dependencies viewer - A plugin which creates a view to visualize dependencies between Airflow DAGs.
  • Airflow ECR Plugin - Plugin to refresh AWS ECR login token at regular intervals. This is helpful where DockerOperator needs to pull images hosted on ECR.
  • AirflowK8sDebugger - A library for generating Kubernetes pod YAML templates from an Airflow DAG using the KubernetesPodOperator.
  • Oozie to Airflow - A tool to easily convert between Apache Oozie workflows and Apache Airflow workflows.
  • Airflow Ditto - An extensible framework to do transformations to an Airflow DAG and convert it into another DAG which is flow-isomorphic with the original DAG, to be able to run it on different environments (e.g. on different clouds, or even different container frameworks - Apache Spark on YARN vs Kubernetes). Comes with out-of-the-box support for EMR-to-HDInsight-DAG transforms.
  • gusty - Create a DAG using any number of YAML, Python, Jupyter Notebook, or R Markdown files that represent individual tasks in the DAG. gusty also configures dependencies, DAGs, and TaskGroups, features support for your local operators, and more. A fully containerized demo is available here.
  • Meltano - Open source, self-hosted, CLI-first, debuggable, and extensible ELT tool that embraces Singer for extraction and loading, leverages dbt for transformation, and integrates with Airflow for orchestration.
  • DAG checks - The dag-checks consist of checks that can help you in maintaining your Apache Airflow instance.
  • Airflow DVC plugin - Plugin for open-source version-control system for data science and Machine Learning pipelines - DVC.
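
Several entries above (test_dags, DAG checks) revolve around DAG integrity testing. As a rough illustration of the underlying idea only, and not the code of any listed project, the sketch below checks a task-dependency mapping for cycles using Python's standard-library `graphlib` (Python 3.9+). The function name `find_cycle` and the `deps` structure are hypothetical, invented for this example; a real check would load the actual DAG files through Airflow rather than a hand-written mapping.

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(deps):
    """Return the task ids forming a cycle, or None if the graph is acyclic.

    `deps` maps each task id to the set of task ids it depends on.
    This is a hypothetical structure for the sketch, not an Airflow API.
    """
    try:
        # prepare() raises CycleError when the graph contains a cycle.
        TopologicalSorter(deps).prepare()
    except CycleError as err:
        # The second element of err.args holds the nodes of the detected cycle.
        return list(err.args[1])
    return None


# A linear pipeline: extract -> transform -> load.
valid = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}

# A broken graph in which a, b and c depend on each other in a loop.
broken = {"a": {"c"}, "b": {"a"}, "c": {"b"}}

print(find_cycle(valid))
print(find_cycle(broken) is not None)
```

Running a check like this in CI, before DAGs are deployed, is the general pattern integrity tests follow: fail fast on structural problems instead of discovering them in the scheduler.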

Meetups

Commercial Airflow-as-a-service providers

  • Google Cloud Composer - Google Cloud Composer is a managed service built atop Google Cloud and Airflow.
  • Qubole - Qubole is mainly known as a service-and-support company for Apache Hive, but also provides Airflow as a component of its platform.
  • Astronomer.io - Astronomer provides complete ETL lifecycle solutions and appears to be entirely focused on providing Airflow-based products.
  • AWS MWAA - Amazon Managed Workflows for Apache Airflow (MWAA) is a managed orchestration service for Apache Airflow that makes it easier to set up and operate end-to-end data pipelines in the cloud at scale.

Cloud Composer resources

This section contains articles that apply to Cloud Composer — a service built by Google Cloud based on Apache Airflow. Tricks and solutions are described here that are intended for Cloud Composer, but may be applicable to vanilla Airflow.

Non-English resources

Sample projects

License

CC0

To the extent possible under law, Jakob Homan has waived all copyright and related or neighboring rights to this work.
