openverse-catalog: Identifies and collects data on CC-licensed content across web crawl data and public APIs.
Stars: ✭ 27 (-28.95%)
airflow-boilerplate: A complete development environment setup for working with Airflow
Stars: ✭ 94 (+147.37%)
Airflow: Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Stars: ✭ 24,101 (+63323.68%)
viewflow: Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Stars: ✭ 110 (+189.47%)
airflow-code-editor: A plugin for Apache Airflow that allows you to edit DAGs in the browser
Stars: ✭ 195 (+413.16%)
Dataspherestudio: DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+3044.74%)
Xene: A distributed workflow runner focusing on performance and simplicity.
Stars: ✭ 56 (+47.37%)
Soda Sql: Metric collection, data testing, and monitoring for SQL-accessible data
Stars: ✭ 173 (+355.26%)
Discreetly: ETLy is an add-on dashboard service on top of Apache Airflow.
Stars: ✭ 60 (+57.89%)
AirflowETL: Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-47.37%)
Argo Workflows: Workflow engine for Kubernetes
Stars: ✭ 10,024 (+26278.95%)
Data Science Stack Cookiecutter: 🐳📊🤓 Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyter, Superset, Postgres, Minio, Airflow & API Star)
Stars: ✭ 153 (+302.63%)
Docker Airflow: Repo for building a Docker-based Airflow image. Containers support multiple features, such as writing logs to a local or S3 folder and initializing GCP while the container boots. https://abhioncbr.github.io/docker-airflow/
Stars: ✭ 29 (-23.68%)
Insight-GDELT-Feed: A way for home buyers to know about factors affecting a state
Stars: ✭ 43 (+13.16%)
pipeline: PipelineAI Kubeflow Distribution
Stars: ✭ 4,154 (+10831.58%)
Elyra: Elyra extends JupyterLab Notebooks with an AI-centric approach.
Stars: ✭ 839 (+2107.89%)
Airflow Pipeline: An Airflow Docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (+236.84%)
Goodreads etl pipeline: An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
Stars: ✭ 793 (+1986.84%)
kedro-airflow-k8s: Kedro plugin to support running pipelines on Kubernetes using Airflow.
Stars: ✭ 22 (-42.11%)
Terraform Aws Airflow: Terraform module to deploy an Apache Airflow cluster on AWS, backed by RDS PostgreSQL for metadata, S3 for logs, and SQS as the message broker with CeleryExecutor
Stars: ✭ 69 (+81.58%)
Airflow Testing: Airflow unit tests and integration tests
Stars: ✭ 175 (+360.53%)
Airflow Cookbook: Chef cookbook for the Airflow workflow management platform.
Stars: ✭ 58 (+52.63%)
Airflow Toolkit: On day 1 of any Airflow project, you can spin up a local desktop Kubernetes Airflow environment AND one in Google Cloud Composer with tested data pipelines (DAGs) 🖥 >> [ 🚀, 🚢 ]
Stars: ✭ 51 (+34.21%)
Airflow Exporter: Airflow plugin to export DAG- and task-based metrics to Prometheus.
Stars: ✭ 161 (+323.68%)
Data Pipelines With Apache Airflow: A data pipeline that automates data warehouse ETL by building custom Airflow operators that handle the extraction, transformation, validation, and loading of data from S3 -> Redshift -> S3
Stars: ✭ 50 (+31.58%)
aircan: 💨🥫 A Data Factory system for running data processing pipelines, built on Airflow and tailored to CKAN. Includes an evolution of DataPusher and Xloader for loading data to DataStore.
Stars: ✭ 24 (-36.84%)
Objinsync: Continuously synchronize directories from a remote object store to the local filesystem
Stars: ✭ 29 (-23.68%)
Airflow Chart: A Helm chart to install Apache Airflow on Kubernetes
Stars: ✭ 137 (+260.53%)
incremental training: Repo that accompanies the Medium blog post 'Keeping your ML model in shape with Kafka, Airflow and MLFlow'
Stars: ✭ 110 (+189.47%)
Databook: A facebook for data
Stars: ✭ 26 (-31.58%)
Beyond Jupyter: 🐍💻📊 All material from the PyCon.DE 2018 talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. slides, video, Udemy MOOC & other references)
Stars: ✭ 135 (+255.26%)
airflow-dbt-python: A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
Stars: ✭ 111 (+192.11%)
Incubator Dolphinscheduler: Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful visual DAG interfaces, dedicated to solving complex job dependencies in data pipelines and providing various types of jobs out of the box.
Stars: ✭ 6,916 (+18100%)
Udacity Data Engineering Projects: A few projects related to data engineering, including data modeling, infrastructure setup on the cloud, data warehousing, and data lake development.
Stars: ✭ 458 (+1105.26%)
kedro-airflow: Kedro-Airflow makes it easy to deploy Kedro projects to Airflow.
Stars: ✭ 121 (+218.42%)
Example Airflow Dags: Example DAGs using hooks and operators from Airflow Plugins
Stars: ✭ 243 (+539.47%)
Afctl: afctl helps to manage and deploy Apache Airflow projects faster and more smoothly.
Stars: ✭ 116 (+205.26%)
Agile data code 2: Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+986.84%)
Dag Factory: Dynamically generate Apache Airflow DAGs from YAML configuration files
Stars: ✭ 385 (+913.16%)
Whirl: Fast iterative local development and testing of Apache Airflow workflows
Stars: ✭ 111 (+192.11%)
Aws Airflow Stack: Turbine, the bare metals that get you Airflow
Stars: ✭ 352 (+826.32%)
Paperboy: A web frontend for scheduling Jupyter notebook reports
Stars: ✭ 221 (+481.58%)