115 open source projects that are alternatives to or similar to ml-in-production

AirflowDataPipeline
Example of an ETL Pipeline using Airflow
Stars: ✭ 24 (-17.24%)
Mutual labels:  data-engineering, data-pipelines
beneath
Beneath is a serverless real-time data platform ⚡️
Stars: ✭ 65 (+124.14%)
Mutual labels:  data-engineering, data-pipelines
viewflow
Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Stars: ✭ 110 (+279.31%)
Mutual labels:  data-engineering, apache-airflow
versatile-data-kit
Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (+396.55%)
Mutual labels:  data-engineering, data-pipelines
neon-workshop
A Pachyderm deep learning tutorial for conference workshops
Stars: ✭ 19 (-34.48%)
Mutual labels:  data-engineering, data-pipelines
papilo
DEPRECATED: Stream data processing micro-framework
Stars: ✭ 24 (-17.24%)
Mutual labels:  data-engineering
airflow-client-python
Apache Airflow - OpenAPI Client for Python
Stars: ✭ 172 (+493.1%)
Mutual labels:  apache-airflow
Ro-dou
Airflow DAG generator for clipping the Diário Oficial da União.
Stars: ✭ 41 (+41.38%)
Mutual labels:  apache-airflow
AirflowETL
Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-31.03%)
Mutual labels:  data-engineering
airflow-boilerplate
A complete development environment setup for working with Airflow
Stars: ✭ 94 (+224.14%)
Mutual labels:  apache-airflow
datart
Datart is a next generation Data Visualization Open Platform
Stars: ✭ 1,042 (+3493.1%)
Mutual labels:  data-engineering
Elastik Nearest Neighbors
Go to: https://github.com/alexklibisz/elastiknn
Stars: ✭ 249 (+758.62%)
Mutual labels:  data-engineering
smart-data-lake
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Stars: ✭ 79 (+172.41%)
Mutual labels:  data-pipelines
redshift plugin
No description or website provided.
Stars: ✭ 22 (-24.14%)
Mutual labels:  apache-airflow
lrmr
Less-Resilient MapReduce framework for Go
Stars: ✭ 32 (+10.34%)
Mutual labels:  data-engineering
h4sci-course
ETH PhD Program course
Stars: ✭ 19 (-34.48%)
Mutual labels:  data-engineering
qsv
CSVs sliced, diced & analyzed.
Stars: ✭ 438 (+1410.34%)
Mutual labels:  data-engineering
practical-data-engineering
Real estate dagster pipeline
Stars: ✭ 110 (+279.31%)
Mutual labels:  data-engineering
airflow-dbt-python
A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
Stars: ✭ 111 (+282.76%)
Mutual labels:  data-engineering
Every Single Day I Tldr
A daily digest of the articles or videos I've found interesting, that I want to share with you.
Stars: ✭ 249 (+758.62%)
Mutual labels:  data-engineering
Gspread Pandas
A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Stars: ✭ 226 (+679.31%)
Mutual labels:  data-engineering
pandora-plugin
Plugin offering views, operators, sensors, and more developed at Pandora Media.
Stars: ✭ 25 (-13.79%)
Mutual labels:  apache-airflow
Soda Sql
Metric collection, data testing and monitoring for SQL accessible data
Stars: ✭ 173 (+496.55%)
Mutual labels:  data-engineering
Yuniql
Free and open source schema versioning and database migration made natively with .NET Core.
Stars: ✭ 156 (+437.93%)
Mutual labels:  data-engineering
dbt-sugar
dbt-sugar is a CLI tool that lets dbt users have fun and easily perform actions around dbt models
Stars: ✭ 139 (+379.31%)
Mutual labels:  data-engineering
prefect-saturn
Python client for using Prefect Cloud with Saturn Cloud
Stars: ✭ 15 (-48.28%)
Mutual labels:  data-engineering
Data Engineering Nanodegree
Projects done in the Data Engineering Nanodegree by Udacity.com
Stars: ✭ 151 (+420.69%)
Mutual labels:  data-engineering
contessa
Easy way to define, execute and store quality rules for your data.
Stars: ✭ 17 (-41.38%)
Mutual labels:  data-engineering
preprocessy
Python package for Customizable Data Preprocessing Pipelines
Stars: ✭ 34 (+17.24%)
Mutual labels:  data-engineering
Everything-Tech
A collection of online resources to help you on your Tech journey.
Stars: ✭ 396 (+1265.52%)
Mutual labels:  data-engineering
hamilton
A scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
Stars: ✭ 612 (+2010.34%)
Mutual labels:  data-engineering
big-data-engineering-indonesia
A curated list of big data engineering tools, resources and communities.
Stars: ✭ 26 (-10.34%)
Mutual labels:  data-engineering
blockchain-etl-streaming
Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (+96.55%)
Mutual labels:  data-engineering
soda-spark
Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes
Stars: ✭ 58 (+100%)
Mutual labels:  data-engineering
gallia-core
A schema-aware Scala library for data transformation
Stars: ✭ 44 (+51.72%)
Mutual labels:  data-engineering
etl
[READ-ONLY] PHP - ETL (Extract Transform Load) data processing library
Stars: ✭ 279 (+862.07%)
Mutual labels:  data-engineering
CogStack-NiFi
Building data processing pipelines for document processing with NLP using Apache NiFi and related services
Stars: ✭ 22 (-24.14%)
Mutual labels:  data-pipelines
hive-metastore-client
A client for connecting and running DDLs on hive metastore.
Stars: ✭ 37 (+27.59%)
Mutual labels:  data-engineering
pyjanitor
Clean APIs for data cleaning. Python implementation of R package Janitor
Stars: ✭ 970 (+3244.83%)
Mutual labels:  data-engineering
awesome-dbt
A curated list of awesome dbt resources
Stars: ✭ 520 (+1693.1%)
Mutual labels:  data-engineering
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (+82.76%)
Mutual labels:  data-engineering
Data-Engineering-Projects
Personal Data Engineering Projects
Stars: ✭ 167 (+475.86%)
Mutual labels:  data-engineering
Ploomber
A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.
Stars: ✭ 221 (+662.07%)
Mutual labels:  data-engineering
Azure-Certification-DP-200
Road to Azure Data Engineer Part-I: DP-200 - Implementing an Azure Data Solution
Stars: ✭ 54 (+86.21%)
Mutual labels:  data-engineering
Aws Serverless Data Lake Framework
Enterprise-grade, production-hardened, serverless data lake on AWS
Stars: ✭ 179 (+517.24%)
Mutual labels:  data-engineering
datajoint-python
Relational data pipelines for the science lab
Stars: ✭ 140 (+382.76%)
Mutual labels:  data-pipelines
Auptimizer
An automatic ML model optimization tool.
Stars: ✭ 166 (+472.41%)
Mutual labels:  data-engineering
morph-kgc
Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
Stars: ✭ 77 (+165.52%)
Mutual labels:  data-engineering
Geni
A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (+424.14%)
Mutual labels:  data-engineering
Data Engineering Howto
A list of useful resources to learn Data Engineering from scratch
Stars: ✭ 2,056 (+6989.66%)
Mutual labels:  data-engineering
Gcp Data Engineer Exam
Study materials for the Google Cloud Professional Data Engineering Exam
Stars: ✭ 144 (+396.55%)
Mutual labels:  data-engineering
awesome-bigquery-views
Useful SQL queries for Blockchain ETL datasets in BigQuery.
Stars: ✭ 325 (+1020.69%)
Mutual labels:  data-engineering
uptasticsearch
An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (+62.07%)
Mutual labels:  data-engineering
Accelerator
The Accelerator is a tool for fast and reproducible processing of large amounts of data.
Stars: ✭ 137 (+372.41%)
Mutual labels:  data-engineering
deordie-meetups
DE or DIE meetup made by data engineers for data engineers. Currently in Russian only.
Stars: ✭ 48 (+65.52%)
Mutual labels:  data-engineering
Airflow Autoscaling Ecs
Airflow Deployment on AWS ECS Fargate Using Cloudformation
Stars: ✭ 136 (+368.97%)
Mutual labels:  data-engineering
rivery cli
Rivery CLI
Stars: ✭ 16 (-44.83%)
Mutual labels:  data-pipelines
jobAnalytics and search
JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (-13.79%)
Mutual labels:  data-engineering
fairflow
Functional Airflow DAG definitions.
Stars: ✭ 38 (+31.03%)
Mutual labels:  apache-airflow
Pipelinex
PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more
Stars: ✭ 127 (+337.93%)
Mutual labels:  data-engineering
1-60 of 115 similar projects