
quantumblacklabs / kedro-airflow

Licence: Apache-2.0
Kedro-Airflow makes it easy to deploy Kedro projects to Airflow.

Programming Languages

Python
139335 projects - #7 most used programming language
Jinja
831 projects
Gherkin
971 projects
Makefile
30231 projects

Projects that are alternatives of or similar to kedro-airflow

kedro-airflow-k8s
Kedro Plugin to support running pipelines on Kubernetes using Airflow.
Stars: ✭ 22 (-81.82%)
Mutual labels:  airflow, kedro, kedro-plugin
dbt-cloud-plugin
DBT Cloud Plugin for Airflow
Stars: ✭ 35 (-71.07%)
Mutual labels:  airflow, airflow-plugin
airflow-code-editor
A plugin for Apache Airflow that allows you to edit DAGs in the browser
Stars: ✭ 195 (+61.16%)
Mutual labels:  airflow, airflow-plugin
airflow multi dagrun
triggering a DAG run multiple times
Stars: ✭ 74 (-38.84%)
Mutual labels:  airflow, airflow-plugin
airflow-user-management-plugin
A plugin for Apache Airflow that allows you to manage the users that can login
Stars: ✭ 13 (-89.26%)
Mutual labels:  airflow, airflow-plugin
Airflow Chart
A Helm chart to install Apache Airflow on Kubernetes
Stars: ✭ 137 (+13.22%)
Mutual labels:  airflow
Paperboy
A web frontend for scheduling Jupyter notebook reports
Stars: ✭ 221 (+82.64%)
Mutual labels:  airflow
Beyond Jupyter
🐍💻📊 All material from the PyCon.DE 2018 Talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. Slides, Video, Udemy MOOC & other References)
Stars: ✭ 135 (+11.57%)
Mutual labels:  airflow
Telemetry Airflow
Airflow configuration for Telemetry
Stars: ✭ 125 (+3.31%)
Mutual labels:  airflow
fab-oidc
Flask-AppBuilder SecurityManager for OpenIDConnect
Stars: ✭ 28 (-76.86%)
Mutual labels:  airflow
aircan
💨🥫 A Data Factory system for running data processing pipelines built on AirFlow and tailored to CKAN. Includes evolution of DataPusher and Xloader for loading data to DataStore.
Stars: ✭ 24 (-80.17%)
Mutual labels:  airflow
Airflow Scheduler Failover Controller
A process that runs in unison with Apache Airflow to control the Scheduler process to ensure High Availability
Stars: ✭ 204 (+68.6%)
Mutual labels:  airflow
Data Science Stack Cookiecutter
🐳📊🤓 Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyter, Superset, Postgres, Minio, AirFlow & API Star)
Stars: ✭ 153 (+26.45%)
Mutual labels:  airflow
Example Airflow Dags
Example DAGs using hooks and operators from Airflow Plugins
Stars: ✭ 243 (+100.83%)
Mutual labels:  airflow
Airflow Autoscaling Ecs
Airflow Deployment on AWS ECS Fargate Using Cloudformation
Stars: ✭ 136 (+12.4%)
Mutual labels:  airflow
AirflowETL
Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-83.47%)
Mutual labels:  airflow
Airflow Pipeline
An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (+5.79%)
Mutual labels:  airflow
Soda Sql
Metric collection, data testing and monitoring for SQL accessible data
Stars: ✭ 173 (+42.98%)
Mutual labels:  airflow
pipeline
PipelineAI Kubeflow Distribution
Stars: ✭ 4,154 (+3333.06%)
Mutual labels:  airflow
Airflow Testing
Airflow Unit Tests and Integration Tests
Stars: ✭ 175 (+44.63%)
Mutual labels:  airflow

Kedro-Airflow


Please note: As part of our move to the Linux Foundation, Kedro-Airflow has been moved to kedro-plugins and this repository will no longer be maintained. Please find an up-to-date copy of Kedro-Airflow here.


Apache Airflow is a tool for orchestrating complex workflows and data processing pipelines. The Kedro-Airflow plugin can be used for:

  • Rapid pipeline creation in the prototyping phase. You can write Python functions in Kedro without worrying about schedulers, daemons, services or having to recreate the Airflow DAG file.
  • Automatic dependency resolution in Kedro. This means you do not need to specify the order of your tasks in Airflow by hand.
  • Distributing Kedro tasks across many workers. You can also enable monitoring and scheduling of the tasks' runtimes.

Installation

kedro-airflow is a Python plugin. To install it:

pip install kedro-airflow

Usage

You can use kedro-airflow to deploy a Kedro pipeline as an Airflow DAG by following these steps:

Step 1: Generate the DAG file

At the root directory of the Kedro project, run:

kedro airflow create

This command generates an Airflow DAG file in the airflow_dags/ directory of your project. You can pass a --pipeline flag to generate the DAG file for a specific Kedro pipeline and an --env flag to generate it for a specific Kedro environment.
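
For example, assuming your project registers a pipeline named data_science and has a conf/staging environment (both names are placeholders for your own):

kedro airflow create --pipeline=data_science --env=staging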

Step 2: Copy the DAG file to the Airflow DAGs folder.

For more information about the DAGs folder, please see the Airflow documentation.
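
For example, assuming a default Airflow setup where the DAGs folder is $AIRFLOW_HOME/dags (adjust the destination if your dags_folder setting differs):

# Copy the generated DAG file(s) from the project into Airflow's DAGs folder
cp airflow_dags/*.py $AIRFLOW_HOME/dags/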

Step 3: Package and install the Kedro pipeline in the Airflow executor's environment

After generating and deploying the DAG file, you will then need to package and install the Kedro pipeline into the Airflow executor's environment. Please visit the guide to deploy Kedro as a Python package for more details.
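
A minimal sketch of this step, assuming you package the project with kedro package and the resulting wheel is named my_project-0.1.0 (a placeholder; the wheel's name and output directory depend on your project and Kedro version):

# Build the Kedro project as a Python wheel
kedro package

# Install the wheel into the environment used by the Airflow executor
pip install dist/my_project-0.1.0-py3-none-any.whl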

FAQ

What if my DAG file is in a different directory to my project folder?

By default, the generated DAG file is configured to live in the same directory as your project, as per this template. If your DAG file is located in a different directory from your project, you will need to tweak the generated file manually after running the kedro airflow create command.
