115 open source projects that are alternatives to, or similar to, ml-in-production

Airflow Autoscaling Ecs
Airflow Deployment on AWS ECS Fargate Using Cloudformation
Stars: ✭ 136 (+368.97%)
Mutual labels:  data-engineering
Pipelinex
PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more
Stars: ✭ 127 (+337.93%)
Mutual labels:  data-engineering
Butterfree
A tool for building feature stores.
Stars: ✭ 126 (+334.48%)
Mutual labels:  data-engineering
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+8124.14%)
Mutual labels:  data-engineering
Spark Alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (+320.69%)
Mutual labels:  data-engineering
D6t Python
Accelerate data science
Stars: ✭ 118 (+306.9%)
Mutual labels:  data-engineering
Just Dashboard
📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+5110.34%)
Mutual labels:  data-engineering
Superset
Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+146913.79%)
Mutual labels:  data-engineering
Applied Ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Stars: ✭ 17,824 (+61362.07%)
Mutual labels:  data-engineering
Dataengineeringproject
Example end to end data engineering project.
Stars: ✭ 82 (+182.76%)
Mutual labels:  data-engineering
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (+172.41%)
Mutual labels:  data-engineering
Sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (+172.41%)
Mutual labels:  data-engineering
Ansible Playbook
Ansible playbook to deploy distributed technologies
Stars: ✭ 61 (+110.34%)
Mutual labels:  data-engineering
Waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (+106.9%)
Mutual labels:  data-engineering
Quilt
Quilt is a self-organizing data hub for S3
Stars: ✭ 1,007 (+3372.41%)
Mutual labels:  data-engineering
Dbt Sqlserver
dbt adapter for SQL Server and Azure SQL
Stars: ✭ 41 (+41.38%)
Mutual labels:  data-engineering
Data Science On Gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+2879.31%)
Mutual labels:  data-engineering
Lakefs
Git-like capabilities for your object storage
Stars: ✭ 847 (+2820.69%)
Mutual labels:  data-engineering
Goodreads etl pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+2634.48%)
Mutual labels:  data-engineering
Prefect
The easiest way to automate your data
Stars: ✭ 7,956 (+27334.48%)
Mutual labels:  data-engineering
Pyjanitor
Clean APIs for data cleaning. Python implementation of R package Janitor
Stars: ✭ 647 (+2131.03%)
Mutual labels:  data-engineering
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+2082.76%)
Mutual labels:  data-engineering
Pointblank
Data validation and organization of metadata for data frames and database tables
Stars: ✭ 480 (+1555.17%)
Mutual labels:  data-engineering
Data Engineering Book
Accumulated knowledge and experience in the field of Data Engineering
Stars: ✭ 471 (+1524.14%)
Mutual labels:  data-engineering
Udacity Data Engineering Projects
A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
Stars: ✭ 458 (+1479.31%)
Mutual labels:  data-engineering
Great expectations
Always know what to expect from your data.
Stars: ✭ 5,808 (+19927.59%)
Mutual labels:  data-engineering
Active workflow
Turn complex requirements to workflows without leaving the comfort of your technology stack.
Stars: ✭ 413 (+1324.14%)
Mutual labels:  data-engineering
Awesome Opensource Data Engineering
An Awesome List of Open-Source Data Engineering Projects
Stars: ✭ 381 (+1213.79%)
Mutual labels:  data-engineering
Learn Something Every Day
📝 A compilation of everything that I learn: Computer Science, Software Development, Engineering, Math, and Coding in General.
Stars: ✭ 362 (+1148.28%)
Mutual labels:  data-engineering
Dataform
Dataform is a framework for managing SQL based data operations in BigQuery, Snowflake, and Redshift
Stars: ✭ 342 (+1079.31%)
Mutual labels:  data-engineering
Egeria
Open Metadata and Governance
Stars: ✭ 328 (+1031.03%)
Mutual labels:  data-engineering
Benthos
Fancy stream processing made operationally mundane
Stars: ✭ 3,705 (+12675.86%)
Mutual labels:  data-engineering
Around Dataengineering
A Data Engineering & Machine Learning Knowledge Hub
Stars: ✭ 257 (+786.21%)
Mutual labels:  data-engineering
Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+16862.07%)
Mutual labels:  data-engineering
Feast
Feature Store for Machine Learning
Stars: ✭ 2,576 (+8782.76%)
Mutual labels:  data-engineering
Cookbook
The Data Engineering Cookbook
Stars: ✭ 9,829 (+33793.1%)
Mutual labels:  data-engineering
etl manager
A Python package to create a database on the platform using our moj data warehousing framework
Stars: ✭ 14 (-51.72%)
Mutual labels:  data-engineering
ClassifyBot
Automate building ML classification pipelines in .NET
Stars: ✭ 16 (-44.83%)
Mutual labels:  data-engineering
growthbook
Open Source Feature Flagging and A/B Testing Platform
Stars: ✭ 2,342 (+7975.86%)
Mutual labels:  data-engineering
arthur-redshift-etl
ELT Code for your Data Warehouse
Stars: ✭ 22 (-24.14%)
Mutual labels:  data-engineering
pangeo-forge-recipes
Python library for building Pangeo Forge recipes.
Stars: ✭ 64 (+120.69%)
Mutual labels:  data-engineering
yt-channels-DS-AI-ML-CS
A comprehensive list of 180+ YouTube Channels for Data Science, Data Engineering, Machine Learning, Deep learning, Computer Science, programming, software engineering, etc.
Stars: ✭ 1,038 (+3479.31%)
Mutual labels:  data-engineering
Kaggle-project-list
Summary of my projects on kaggle
Stars: ✭ 20 (-31.03%)
Mutual labels:  data-engineering
mpc-DL-controller
Deep Neural Network architecture as a predictive optimal controller for {HVAC+Solar cell + battery} disturbance afflicted system vs classic Model Predictive Control
Stars: ✭ 37 (+27.59%)
Mutual labels:  data-engineering
DataEngineering
This repo contains commands that data engineers use in day to day work.
Stars: ✭ 47 (+62.07%)
Mutual labels:  data-engineering
Dagster
An orchestration platform for the development, production, and observation of data assets.
Stars: ✭ 4,099 (+14034.48%)
Mutual labels:  data-pipelines
Hub
Dataset format for AI. Build, manage, & visualize datasets for deep learning. Stream data real-time to PyTorch/TensorFlow & version-control it. https://activeloop.ai
Stars: ✭ 4,003 (+13703.45%)
Mutual labels:  data-pipelines
arakat
ARAKAT - Big Data Analysis and Business Intelligence Application Development Platform
Stars: ✭ 23 (-20.69%)
Mutual labels:  data-pipelines
spark-transformers
Spark-Transformers: Library for exporting Apache Spark MLLIB models to use them in any Java application with no other dependencies.
Stars: ✭ 39 (+34.48%)
Mutual labels:  data-pipelines
Awesome Apache Airflow
Curated list of resources about Apache Airflow
Stars: ✭ 2,755 (+9400%)
Mutual labels:  apache-airflow
Airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Stars: ✭ 24,101 (+83006.9%)
Mutual labels:  apache-airflow
airflow-user-management-plugin
A plugin for Apache Airflow that allows you to manage the users that can login
Stars: ✭ 13 (-55.17%)
Mutual labels:  apache-airflow
airflow-code-editor
A plugin for Apache Airflow that allows you to edit DAGs in browser
Stars: ✭ 195 (+572.41%)
Mutual labels:  apache-airflow
openverse-catalog
Identifies and collects data on CC-licensed content across web crawl data and public APIs.
Stars: ✭ 27 (-6.9%)
Mutual labels:  apache-airflow
airflow-prometheus-exporter
Export Airflow metrics (from MySQL) in Prometheus format
Stars: ✭ 25 (-13.79%)
Mutual labels:  apache-airflow