k3ai: A lightweight tool to get an AI infrastructure stack up in minutes, not days. K3ai takes care of setting up K8s for you, deploys the AI tool of your choice, and even runs your code on it.
Stars: ✭ 105 (-4.55%)
Airflow Testing: Airflow Unit Tests and Integration Tests
Stars: ✭ 175 (+59.09%)
Data Pipelines With Apache Airflow: Developed a data pipeline to automate data warehouse ETL by building custom Airflow operators that handle the extraction, transformation, validation and loading of data from S3 -> Redshift -> S3
Stars: ✭ 50 (-54.55%)
Aws Ecs Airflow: Run Airflow in AWS ECS (Elastic Container Service) using Fargate tasks
Stars: ✭ 107 (-2.73%)
Paperboy: A web frontend for scheduling Jupyter notebook reports
Stars: ✭ 221 (+100.91%)
Airflow Cookbook: Chef cookbook for the Airflow workflow management platform.
Stars: ✭ 58 (-47.27%)
kedro-airflow-k8s: Kedro Plugin to support running pipelines on Kubernetes using Airflow.
Stars: ✭ 22 (-80%)
Airflow Chart: A Helm chart to install Apache Airflow on Kubernetes
Stars: ✭ 137 (+24.55%)
Whirl: Fast iterative local development and testing of Apache Airflow workflows
Stars: ✭ 111 (+0.91%)
airflow-dbt-python: A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
Stars: ✭ 111 (+0.91%)
fab-oidc: Flask-AppBuilder SecurityManager for OpenIDConnect
Stars: ✭ 28 (-74.55%)
Terraform Aws Airflow: Terraform module to deploy an Apache Airflow cluster on AWS, backed by RDS PostgreSQL for metadata, S3 for logs and SQS as message broker with CeleryExecutor
Stars: ✭ 69 (-37.27%)
Airflow Toolkit: From day 1 of any Airflow project, you can spin up a local desktop Kubernetes Airflow environment AND one in Google Cloud Composer with tested data pipelines (DAGs) 🖥 >> [ 🚀, 🚢 ]
Stars: ✭ 51 (-53.64%)
Objinsync: Continuously synchronize directories from remote object store to local filesystem
Stars: ✭ 29 (-73.64%)
Airflow Exporter: Airflow plugin to export DAG- and task-based metrics to Prometheus.
Stars: ✭ 161 (+46.36%)
Databook: A facebook for data
Stars: ✭ 26 (-76.36%)
aircan: 💨🥫 A Data Factory system for running data processing pipelines built on Airflow and tailored to CKAN. Includes an evolution of DataPusher and Xloader for loading data to DataStore.
Stars: ✭ 24 (-78.18%)
Beyond Jupyter: 🐍💻📊 All material from the PyCon.DE 2018 Talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. Slides, Video, Udemy MOOC & other References)
Stars: ✭ 135 (+22.73%)
Incubator Dolphinscheduler: Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs out of the box.
Stars: ✭ 6,916 (+6187.27%)
Airflow: Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Stars: ✭ 24,101 (+21810%)
Afctl: afctl helps to manage and deploy Apache Airflow projects faster and smoother.
Stars: ✭ 116 (+5.45%)
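event extract master: Supports Chinese event extraction on the Baidu competition data and English event extraction on the ACE2005 data. I adapted the DGCNN from Su's (苏神) triple-extraction algorithm to the event extraction task, ported it from Keras to PyTorch, which I am more used to, and designed the data loading with extension to other languages in mind.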
Stars: ✭ 43 (-60.91%)
kedro-airflow: Kedro-Airflow makes it easy to deploy Kedro projects to Airflow.
Stars: ✭ 121 (+10%)
Example Airflow Dags: Example DAGs using hooks and operators from Airflow Plugins
Stars: ✭ 243 (+120.91%)
fuseml: FuseML aims to provide an MLOps framework that dynamically integrates the AI/ML tools of your choice. It's an extensible tool built through collaboration, where Data Engineers and DevOps Engineers can come together and contribute reusable integration code.
Stars: ✭ 73 (-33.64%)
Dataspherestudio: DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+986.36%)
Discreetly: ETLy is an add-on dashboard service on top of Apache Airflow.
Stars: ✭ 60 (-45.45%)
domino-research: Projects developed by Domino's R&D team
Stars: ✭ 74 (-32.73%)
Xene: A distributed workflow runner focusing on performance and simplicity.
Stars: ✭ 56 (-49.09%)
Soda Sql: Metric collection, data testing and monitoring for SQL-accessible data
Stars: ✭ 173 (+57.27%)
Argo Workflows: Workflow engine for Kubernetes
Stars: ✭ 10,024 (+9012.73%)
Docker Airflow: Repo for building a Docker-based Airflow image. Containers support multiple features, like writing logs to a local or S3 folder and initializing GCP while the container boots. https://abhioncbr.github.io/docker-airflow/
Stars: ✭ 29 (-73.64%)
AirflowETL: Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-81.82%)
Elyra: Elyra extends JupyterLab Notebooks with an AI-centric approach.
Stars: ✭ 839 (+662.73%)
Data Science Stack Cookiecutter: 🐳📊🤓 Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyter, Superset, Postgres, Minio, Airflow & API Star)
Stars: ✭ 153 (+39.09%)
Goodreads etl pipeline: An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+620.91%)
Udacity Data Engineering Projects: A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
Stars: ✭ 458 (+316.36%)
pipeline: PipelineAI Kubeflow Distribution
Stars: ✭ 4,154 (+3676.36%)
Agile data code 2: Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+275.45%)
Airflow Pipeline: An Airflow Docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (+16.36%)
tornado: The Tornado 🌪️ framework, designed and implemented for adaptive online learning and data stream mining in Python.
Stars: ✭ 110 (+0%)
mlflow-gocd: GoCD plugins to work with MLflow as a model repository in a CD flow
Stars: ✭ 26 (-76.36%)
Insight-GDELT-Feed: A way for home buyers to know about factors affecting a state
Stars: ✭ 43 (-60.91%)
FACIL: Framework for Analysis of Class-Incremental Learning with 12 state-of-the-art methods and 3 baselines.
Stars: ✭ 411 (+273.64%)