chandulal / Airflow Testing

Airflow Unit Tests and Integration Tests

Airflow Testing

This project contains different categories of tests with examples.

Five Categories of Tests

  1. DAG Validation Tests: To test the validity of the DAG, checking for typos and cycles.
  2. DAG/Pipeline Definition Tests: To test the total number of tasks in the DAG, the upstream and downstream dependencies of each task, etc.
  3. Unit Tests: To test the logic of custom Operators, custom Sensors, etc.
  4. Integration Tests: To test the communication between tasks. For example, task 1 passes some information to task 2 using XComs.
  5. End to End Pipeline Tests: To test and verify the integration between each task. You can also assert the data on successful completion of the E2E pipeline.

Clone this repo to run these tests on your local machine.
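
As a rough illustration of the first two categories, the sketch below checks a pipeline for cycles and asserts its task count and dependencies. It deliberately uses a plain dictionary and the standard library graphlib module (Python 3.9+) instead of Airflow objects, so the task names and helper functions are hypothetical and are not taken from this repo.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task id maps to the set of its upstream task ids.
UPSTREAM = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def is_acyclic(upstream):
    """Return True if the dependency mapping contains no cycle."""
    try:
        # Consuming static_order() raises CycleError when a cycle exists.
        tuple(TopologicalSorter(upstream).static_order())
    except CycleError:
        return False
    return True


def downstream_of(upstream, task_id):
    """Return the ids of the tasks that depend directly on task_id."""
    return {task for task, deps in upstream.items() if task_id in deps}


def test_dag_has_no_cycle():
    # Category 1: validation
    assert is_acyclic(UPSTREAM)


def test_dag_definition():
    # Category 2: task count and dependencies
    assert len(UPSTREAM) == 3
    assert UPSTREAM["load"] == {"transform"}
    assert downstream_of(UPSTREAM, "extract") == {"transform"}
```

With real DAGs the same assertions would be made against the loaded DAG object rather than a dictionary.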

Unit Tests

Unit tests cover all tests that fall under the first four categories.
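
To make categories three and four more concrete, here is a minimal sketch of testing two tasks that hand a value over through XComs. FakeTaskInstance, count_rows and validate_row_count are hypothetical names invented for this example; the fake only imitates a simplified push/pull interface in memory and is not Airflow's real task instance.

```python
class FakeTaskInstance:
    """Hypothetical in-memory stand-in for a task instance's XCom methods."""

    def __init__(self):
        self._xcoms = {}

    def xcom_push(self, key, value):
        self._xcoms[key] = value

    def xcom_pull(self, key):
        return self._xcoms.get(key)


def count_rows(ti, rows):
    """Task 1 (hypothetical): compute a row count and publish it."""
    ti.xcom_push(key="row_count", value=len(rows))


def validate_row_count(ti, minimum=1):
    """Task 2 (hypothetical): read the row count published by task 1."""
    row_count = ti.xcom_pull(key="row_count")
    if row_count is None:
        raise ValueError("row_count was never pushed")
    return row_count >= minimum


def test_row_count_is_handed_over():
    ti = FakeTaskInstance()
    count_rows(ti, rows=[("a",), ("b",)])
    assert ti.xcom_pull(key="row_count") == 2
    assert validate_row_count(ti) is True


def test_missing_row_count_is_an_error():
    ti = FakeTaskInstance()
    try:
        validate_row_count(ti)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

Functions named test_* like these can be collected by a runner such as pytest.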

How to run?

  1. Build the Airflow image. Go to the project root directory and run:

    docker build . -t airflow-test

  2. Run the unit tests in a Docker container. For {SourceDir}, use the directory that contains your clone (e.g. if you cloned the repo to /User/username/airflow-testing/, then {SourceDir} is /User/username).

    docker run -ti -v {SourceDir}/airflow-testing:/opt --entrypoint /mnt/entrypoint.sh airflow-test run_unit_tests
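
Because {SourceDir} is the parent of the clone rather than the clone itself, it is easy to get wrong. The following sketch derives it from the clone path and assembles the command shown above; source_dir_for and unit_test_command are hypothetical helpers, and the clone is assumed to be named airflow-testing.

```python
import shlex
from pathlib import PurePosixPath


def source_dir_for(clone_path):
    """Return the directory that contains the clone, i.e. {SourceDir}."""
    return str(PurePosixPath(clone_path).parent)


def unit_test_command(clone_path):
    """Build the docker command from this README as a shell string."""
    source_dir = source_dir_for(clone_path)
    args = [
        "docker", "run", "-ti",
        "-v", f"{source_dir}/airflow-testing:/opt",
        "--entrypoint", "/mnt/entrypoint.sh",
        "airflow-test", "run_unit_tests",
    ]
    return shlex.join(args)
```

For example, source_dir_for("/User/username/airflow-testing/") returns /User/username, matching the example in step 2.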

End-to-End Tests

End-to-End tests cover all tests of category five. To run these tests, we need to set up an Airflow environment in minikube, along with all the components required by your DAGs.

Minikube set up

Prerequisite:

    git clone https://github.com/chandulal/airflow-testing.git
    # Only needed if VirtualBox is not already installed
    brew install --cask virtualbox

Install minikube

    brew install minikube
    brew install kubernetes-cli
    minikube start --cpus 4 --memory 8192

Mount DAGs, Plugins, etc.

Mount all your DAGs, plugins, etc. in minikube:

    minikube mount {project dir}/src/main/python/:/data

Deploy Airflow in minikube

Open a new terminal, go to the project root directory, and run:

    kubectl apply -f airflow.kube.yaml

Wait 3-4 minutes for all Airflow components to start.

This sets up the following components:

  • Postgres (stores the Airflow metadata)
  • Redis (broker for the Celery executor)
  • Airflow Scheduler
  • Celery Workers
  • Airflow Web Server
  • Flower

Access Airflow

Get the minikube IP address by running the minikube ip command, then use it to access:

**Airflow UI:** {minikube-ip}:31317 

**Flower:** {minikube-ip}:32081
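
Since the components take a few minutes to start, a test harness may want to wait until the ports accept connections instead of sleeping for a fixed time. The sketch below builds both addresses from the minikube IP and polls a TCP port. The port numbers come from this README; the http scheme, the helper names, and the timeout values are assumptions.

```python
import socket
import time

AIRFLOW_UI_PORT = 31317  # from this README
FLOWER_PORT = 32081      # from this README


def service_urls(minikube_ip):
    """Return the Airflow UI and Flower URLs (http scheme is assumed)."""
    return {
        "airflow_ui": f"http://{minikube_ip}:{AIRFLOW_UI_PORT}",
        "flower": f"http://{minikube_ip}:{FLOWER_PORT}",
    }


def wait_for_port(host, port, timeout_s=300.0, interval_s=5.0):
    """Poll until host:port accepts a TCP connection; False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

An open port only shows that something is listening; it does not prove that the web server has finished initializing.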

How does Airflow work in minikube?

[Figure: minikube Airflow architecture diagram]

How to run these tests?

  1. Install all the components your DAGs need in minikube. The integration tests in this repo require MySQL and Presto on minikube:

     kubectl apply -f {SourceDir}/k8s/mysql/mysql.kube.yaml
     kubectl apply -f {SourceDir}/k8s/presto/presto.kube.yaml
     
  2. Run the integration tests in a Docker container. For {SourceDir}, use the absolute path of the directory that contains this repository on your machine:

    docker run -ti -v {SourceDir}/airflow-testing:/opt --entrypoint /mnt/entrypoint.sh airflow-test run_integration_tests {minikube-ip}
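
One common shape for an end-to-end test is to trigger a DAG, wait for the run to finish, and then assert on the resulting data. The waiting step can be sketched as below. How the state is fetched is left to a caller-supplied function, so get_state, the timeout values, and the helper name are all assumptions rather than part of this repo.

```python
import time

# DAG run states that end the wait.
TERMINAL_STATES = {"success", "failed"}


def wait_for_dag_run(get_state, timeout_s=600.0, interval_s=10.0, sleep=time.sleep):
    """Poll get_state() until it returns a terminal state and return it.

    get_state is a hypothetical zero-argument callable that returns the
    current state of the DAG run as a string, e.g. "running".
    """
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"DAG run still in state {state!r}")
        sleep(interval_s)
```

A test would then assert that the returned state is "success" before checking the data that the pipeline wrote.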
