
Airflow Chef Cookbook

Installs and configures the Airflow workflow management platform. More information about Airflow can be found here: https://github.com/airbnb/airflow

Supported Platforms

Ubuntu (Tested on 14.04, 16.04). CentOS (Tested on 7.2).

Limitations

The Airflow "all" and "oracle" packages are not supported because the Oracle package has dependencies that cannot be installed automatically. Support for these packages may be added at a later stage.

Contributing

Please follow the instructions in the contributing doc.

Usage

  • Use the relevant recipes to install and configure Airflow.
  • Use environment variables in /etc/default/airflow (on Ubuntu) or /etc/sysconfig/airflow (on CentOS) to configure Airflow during the startup process. (More information about Airflow environment variables: Setting Configuration Options.)
  • Make sure to run airflow initdb as part of your startup script.
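As a sketch of the second point above, Airflow reads AIRFLOW__{SECTION}__{KEY} environment variables as overrides for airflow.cfg entries (see "Setting Configuration Options" in the Airflow docs). The specific values below are illustrative assumptions, not defaults set by this cookbook:

```shell
# Hypothetical /etc/default/airflow (Ubuntu) or /etc/sysconfig/airflow (CentOS).
# Each AIRFLOW__<SECTION>__<KEY> variable overrides the matching airflow.cfg entry.
AIRFLOW_HOME=/var/lib/airflow
AIRFLOW__CORE__EXECUTOR=LocalExecutor
AIRFLOW__CORE__LOAD_EXAMPLES=False
```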

Recipes

  • default - Executes the other recipes.
  • directories - Creates the required directories.
  • user - Creates the OS user and group.
  • packages - Installs Airflow along with the supporting OS and pip packages.
  • config - Handles airflow.cfg.
  • services - Creates the services env file.
  • webserver - Configures the service for the webserver.
  • scheduler - Configures the service for the scheduler.
  • worker - Configures the service for the worker.
  • flower - Configures the service for flower.
  • kerberos - Configures the service for kerberos.
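A wrapper cookbook can include only the recipes it needs. A minimal sketch (which service recipes you include depends on the node's role; the selection below is an assumption for illustration):

```ruby
# Hypothetical wrapper recipe: common setup plus the services this node runs.
include_recipe 'airflow::default'    # executes the common recipes
include_recipe 'airflow::webserver'  # service recipe for the webserver
include_recipe 'airflow::scheduler'  # service recipe for the scheduler
```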

Attributes

User config
  • ["airflow"]["airflow_package"] - Airflow package name, defaults to 'apache-airflow'. Use 'airflow' for installing version 1.8.0 or lower.
  • ["airflow"]["version"] - The version of airflow to install, defaults to latest (nil).
  • ["airflow"]["user"] - The user Airflow is executed with and owner of all related folders.
  • ["airflow"]["group"] - Airflow user group.
  • ["airflow"]["user_uid"] - Airflow user uid.
  • ["airflow"]["group_gid"] - Airflow group gid.
  • ["airflow"]["user_home_directory"] - Airflow user home directory.
  • ["airflow"]["shell"] - Airflow user shell.
General config
  • ["airflow"]["directories_mode"] - The permissions with which the Airflow and user directories are created.
  • ["airflow"]["config_file_mode"] - The permissions with which airflow.cfg is created.
  • ["airflow"]["bin_path"] - Path to the bin folder, default is based on platform.
  • ["airflow"]["run_path"] - Pid files base directory.
  • ["airflow"]["is_upstart"] - Whether upstart should be used for services; determined automatically.
  • ["airflow"]["init_system"] - The init system to use when configuring services; only upstart and systemd are supported, and the default is based on the ["airflow"]["is_upstart"] value.
  • ["airflow"]["env_path"] - The path to the services env file, determined automatically.
Python config
  • ["airflow"]["python_runtime"] - Python runtime, as used by the poise-python cookbook.
  • ["airflow"]["python_version"] - Python version to install, as used by the poise-python cookbook.
  • ["airflow"]["pip_version"] - Pip version to install (true installs the latest), as used by the poise-python cookbook.
Package config
  • ["airflow"]["packages"] - The Python packages to install for Airflow.
  • ["airflow"]["dependencies"] - The dependencies of the packages listed in ["airflow"]["packages"]. These are OS packages, not Python packages.
airflow.cfg

This cookbook allows any airflow.cfg parameter to be configured dynamically by using an attribute structure like the following (see the attributes file for airflow.cfg examples): ["airflow"]["config"]["CONFIG_SECTION"]["CONFIG_ENTRY"]
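To illustrate the mapping, the nested attribute hash translates section by section into INI-style airflow.cfg entries. The plain-Ruby sketch below mimics that rendering outside of Chef; the section and key names are illustrative assumptions, not cookbook defaults:

```ruby
# Illustrative stand-in for node["airflow"]["config"]: each top-level key is an
# airflow.cfg [section], each nested key/value becomes a "key = value" line.
config = {
  "core" => {
    "parallelism"   => 32,
    "load_examples" => "False"
  },
  "webserver" => {
    "base_url" => "http://localhost:8080"
  }
}

# Render the hash as airflow.cfg-style lines.
lines = config.flat_map do |section, entries|
  ["[#{section}]"] + entries.map { |key, value| "#{key} = #{value}" }
end

puts lines.join("\n")
```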

License

Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

Author

Sergey Bahchissaraitsev
