AirflowETL: Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (+11.11%)
jobAnalytics and search: JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (+38.89%)
Airflow Exporter: Airflow plugin to export DAG- and task-based metrics to Prometheus.
Stars: ✭ 161 (+794.44%)
fab-oidc: Flask-AppBuilder SecurityManager for OpenIDConnect
Stars: ✭ 28 (+55.56%)
Whirl: Fast iterative local development and testing of Apache Airflow workflows
Stars: ✭ 111 (+516.67%)
Beyond Jupyter: 🐍💻📊 All material from the PyCon.DE 2018 Talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. Slides, Video, Udemy MOOC & other References)
Stars: ✭ 135 (+650%)
aircan: 💨🥫 A Data Factory system for running data processing pipelines built on Airflow and tailored to CKAN. Includes an evolution of DataPusher and Xloader for loading data to DataStore.
Stars: ✭ 24 (+33.33%)
Terraform Aws Airflow: Terraform module to deploy an Apache Airflow cluster on AWS, backed by RDS PostgreSQL for metadata, S3 for logs, and SQS as the message broker with CeleryExecutor
Stars: ✭ 69 (+283.33%)
Paperboy: A web frontend for scheduling Jupyter notebook reports
Stars: ✭ 221 (+1127.78%)
saisoku: Saisoku is a Python module that helps you build complex pipelines of batch file/directory transfer/sync jobs.
Stars: ✭ 40 (+122.22%)
Airflow Testing: Airflow Unit Tests and Integration Tests
Stars: ✭ 175 (+872.22%)
datalake-etl-pipeline: Simplified ETL process in Hadoop using Apache Spark. Provides a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Stars: ✭ 39 (+116.67%)
Airflow Chart: A Helm chart to install Apache Airflow on Kubernetes
Stars: ✭ 137 (+661.11%)
kedro-airflow: Kedro-Airflow makes it easy to deploy Kedro projects to Airflow.
Stars: ✭ 121 (+572.22%)
Aws Ecs Airflow: Run Airflow in AWS ECS (Elastic Container Service) using Fargate tasks
Stars: ✭ 107 (+494.44%)
kedro-airflow-k8s: Kedro Plugin to support running pipelines on Kubernetes using Airflow.
Stars: ✭ 22 (+22.22%)
machine-learning-data-pipeline: Pipeline module for parallel real-time data processing, for developing machine learning models and running them in production.
Stars: ✭ 22 (+22.22%)
Airflow Cookbook: Chef cookbook for the Airflow workflow management platform.
Stars: ✭ 58 (+222.22%)
aws-pdf-textract-pipeline: 🔍 Data pipeline for crawling PDFs from the Web and transforming their contents into structured data using AWS Textract. Built with AWS CDK + TypeScript
Stars: ✭ 141 (+683.33%)
Airflow Toolkit: On day 1 of any Airflow project, you can spin up a local desktop Kubernetes Airflow environment and one in Google Cloud Composer with tested data pipelines (DAGs) 🖥 >> [ 🚀, 🚢 ]
Stars: ✭ 51 (+183.33%)
Example Airflow Dags: Example DAGs using hooks and operators from Airflow Plugins
Stars: ✭ 243 (+1250%)
T-Watch: Real-Time Twitter Sentiment Analysis Product
Stars: ✭ 20 (+11.11%)
polygon-etl: ETL (extract, transform, and load) tools for ingesting Polygon blockchain data into Google BigQuery and Pub/Sub
Stars: ✭ 53 (+194.44%)
Soda Sql: Metric collection, data testing, and monitoring for SQL-accessible data
Stars: ✭ 173 (+861.11%)
incremental training: Repo that accompanies the Medium blog post 'Keeping your ML model in shape with Kafka, Airflow and MLFlow'
Stars: ✭ 110 (+511.11%)
Data Science Stack Cookiecutter: 🐳📊🤓 Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyter, Superset, Postgres, Minio, Airflow & API Star)
Stars: ✭ 153 (+750%)
Insight-GDELT-Feed: A way for home buyers to know about factors affecting a state
Stars: ✭ 43 (+138.89%)
Airflow Pipeline: An Airflow Docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (+611.11%)
dc-sdk-js: A data collection SDK for the browser environment
Stars: ✭ 52 (+188.89%)
Afctl: afctl helps to manage and deploy Apache Airflow projects faster and more smoothly.
Stars: ✭ 116 (+544.44%)
FastETL: Airflow plugins for implementing data pipelines
Stars: ✭ 31 (+72.22%)
datajob: Build and deploy a serverless data pipeline on AWS with no effort.
Stars: ✭ 101 (+461.11%)
k3ai: A lightweight tool to get an AI infrastructure stack up in minutes, not days. K3ai will take care of setting up K8s for you, deploy the AI tool of your choice, and even run your code on it.
Stars: ✭ 105 (+483.33%)
Dataspherestudio: DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+6538.89%)
qunomon: Testbed of AI Systems Quality Management
Stars: ✭ 15 (-16.67%)
Discreetly: ETLy is an add-on dashboard service on top of Apache Airflow.
Stars: ✭ 60 (+233.33%)
pipeline: PipelineAI Kubeflow Distribution
Stars: ✭ 4,154 (+22977.78%)
Xene: A distributed workflow runner focusing on performance and simplicity.
Stars: ✭ 56 (+211.11%)
ob bulkstash: Bulk Stash is a Docker rclone service to sync, or copy, files between different storage services. For example, you can copy files to or from a remote storage service, such as from Amazon S3 to Google Cloud Storage, or from your laptop to remote storage.
Stars: ✭ 113 (+527.78%)
Argo Workflows: Workflow engine for Kubernetes
Stars: ✭ 10,024 (+55588.89%)
scicloj.ml: A Clojure machine learning library
Stars: ✭ 152 (+744.44%)
torchx: TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and support for E2E production ML pipelines when you're ready.
Stars: ✭ 165 (+816.67%)
airflow-boilerplate: A complete development environment setup for working with Airflow
Stars: ✭ 94 (+422.22%)
fairflow: Functional Airflow DAG definitions.
Stars: ✭ 38 (+111.11%)