All Projects → neon-workshop → Similar Projects or Alternatives

103 Open source projects that are alternatives of or similar to neon-workshop

versatile-data-kit
Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (+657.89%)
Mutual labels:  data-engineering, data-pipelines
beneath
Beneath is a serverless real-time data platform ⚡️
Stars: ✭ 65 (+242.11%)
Mutual labels:  data-engineering, data-pipelines
ml-in-production
The practical use-cases of how to make your Machine Learning Pipelines robust and reliable using Apache Airflow.
Stars: ✭ 29 (+52.63%)
Mutual labels:  data-engineering, data-pipelines
AirflowDataPipeline
Example of an ETL Pipeline using Airflow
Stars: ✭ 24 (+26.32%)
Mutual labels:  data-engineering, data-pipelines
get smarties
Dummy variable generation with fit/transform capabilities
Stars: ✭ 23 (+21.05%)
Mutual labels:  data-engineering
Ploomber
A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.
Stars: ✭ 221 (+1063.16%)
Mutual labels:  data-engineering
Auptimizer
An automatic ML model optimization tool.
Stars: ✭ 166 (+773.68%)
Mutual labels:  data-engineering
Gcp Data Engineer Exam
Study materials for the Google Cloud Professional Data Engineering Exam
Stars: ✭ 144 (+657.89%)
Mutual labels:  data-engineering
practical-data-engineering
Real estate dagster pipeline
Stars: ✭ 110 (+478.95%)
Mutual labels:  data-engineering
papilo
DEPRECATED: Stream data processing micro-framework
Stars: ✭ 24 (+26.32%)
Mutual labels:  data-engineering
Pipelinex
PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more
Stars: ✭ 127 (+568.42%)
Mutual labels:  data-engineering
Every Single Day I Tldr
A daily digest of the articles or videos I've found interesting, that I want to share with you.
Stars: ✭ 249 (+1210.53%)
Mutual labels:  data-engineering
awesome-bigquery-views
Useful SQL queries for Blockchain ETL datasets in BigQuery.
Stars: ✭ 325 (+1610.53%)
Mutual labels:  data-engineering
Aws Serverless Data Lake Framework
Enterprise-grade, production-hardened, serverless data lake on AWS
Stars: ✭ 179 (+842.11%)
Mutual labels:  data-engineering
blockchain-etl-streaming
Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (+200%)
Mutual labels:  data-engineering
Geni
A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (+700%)
Mutual labels:  data-engineering
smart-data-lake
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Stars: ✭ 79 (+315.79%)
Mutual labels:  data-pipelines
Accelerator
The Accelerator is a tool for fast and reproducible processing of large amounts of data.
Stars: ✭ 137 (+621.05%)
Mutual labels:  data-engineering
dbt-sugar
dbt-sugar is a CLI tool that allows users of dbt to have fun and ease performing actions around dbt models
Stars: ✭ 139 (+631.58%)
Mutual labels:  data-engineering
lrmr
Less-Resilient MapReduce framework for Go
Stars: ✭ 32 (+68.42%)
Mutual labels:  data-engineering
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+12452.63%)
Mutual labels:  data-engineering
D6t Python
Accelerate data science
Stars: ✭ 118 (+521.05%)
Mutual labels:  data-engineering
Superset
Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+224289.47%)
Mutual labels:  data-engineering
datart
Datart is a next generation Data Visualization Open Platform
Stars: ✭ 1,042 (+5384.21%)
Mutual labels:  data-engineering
etl
[READ-ONLY] PHP - ETL (Extract Transform Load) data processing library
Stars: ✭ 279 (+1368.42%)
Mutual labels:  data-engineering
Dataengineeringproject
Example end to end data engineering project.
Stars: ✭ 82 (+331.58%)
Mutual labels:  data-engineering
airflow-dbt-python
A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
Stars: ✭ 111 (+484.21%)
Mutual labels:  data-engineering
prefect-saturn
Python client for using Prefect Cloud with Saturn Cloud
Stars: ✭ 15 (-21.05%)
Mutual labels:  data-engineering
Elastik Nearest Neighbors
Go to: https://github.com/alexklibisz/elastiknn
Stars: ✭ 249 (+1210.53%)
Mutual labels:  data-engineering
preprocessy
Python package for Customizable Data Preprocessing Pipelines
Stars: ✭ 34 (+78.95%)
Mutual labels:  data-engineering
Gspread Pandas
A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Stars: ✭ 226 (+1089.47%)
Mutual labels:  data-engineering
deordie-meetups
DE or DIE meetup made by data engineers for data engineers. Currently in Russian only.
Stars: ✭ 48 (+152.63%)
Mutual labels:  data-engineering
Soda Sql
Metric collection, data testing and monitoring for SQL accessible data
Stars: ✭ 173 (+810.53%)
Mutual labels:  data-engineering
datajoint-python
Relational data pipelines for the science lab
Stars: ✭ 140 (+636.84%)
Mutual labels:  data-pipelines
Yuniql
Free and open source schema versioning and database migration made natively with .NET Core.
Stars: ✭ 156 (+721.05%)
Mutual labels:  data-engineering
contessa
Easy way to define, execute and store quality rules for your data.
Stars: ✭ 17 (-10.53%)
Mutual labels:  data-engineering
Data Engineering Nanodegree
Projects done in the Data Engineering Nanodegree by Udacity.com
Stars: ✭ 151 (+694.74%)
Mutual labels:  data-engineering
CogStack-NiFi
Building data processing pipelines for documents processing with NLP using Apache NiFi and related services
Stars: ✭ 22 (+15.79%)
Mutual labels:  data-pipelines
Data Engineering Howto
A list of useful resources to learn Data Engineering from scratch
Stars: ✭ 2,056 (+10721.05%)
Mutual labels:  data-engineering
Everything-Tech
A collection of online resources to help you on your Tech journey.
Stars: ✭ 396 (+1984.21%)
Mutual labels:  data-engineering
Airflow Autoscaling Ecs
Airflow Deployment on AWS ECS Fargate Using Cloudformation
Stars: ✭ 136 (+615.79%)
Mutual labels:  data-engineering
h4sci-course
ETH PhD Program course
Stars: ✭ 19 (+0%)
Mutual labels:  data-engineering
Butterfree
A tool for building feature stores.
Stars: ✭ 126 (+563.16%)
Mutual labels:  data-engineering
big-data-engineering-indonesia
A curated list of big data engineering tools, resources and communities.
Stars: ✭ 26 (+36.84%)
Mutual labels:  data-engineering
Spark Alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (+542.11%)
Mutual labels:  data-engineering
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (+178.95%)
Mutual labels:  data-engineering
Just Dashboard
📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+7852.63%)
Mutual labels:  data-engineering
soda-spark
Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes
Stars: ✭ 58 (+205.26%)
Mutual labels:  data-engineering
Applied Ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Stars: ✭ 17,824 (+93710.53%)
Mutual labels:  data-engineering
uptasticsearch
An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (+147.37%)
Mutual labels:  data-engineering
qsv
CSVs sliced, diced & analyzed.
Stars: ✭ 438 (+2205.26%)
Mutual labels:  data-engineering
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (+315.79%)
Mutual labels:  data-engineering
Sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (+315.79%)
Mutual labels:  data-engineering
Ansible Playbook
Ansible playbook to deploy distributed technologies
Stars: ✭ 61 (+221.05%)
Mutual labels:  data-engineering
Azure-Certification-DP-200
Road to Azure Data Engineer Part-I: DP-200 - Implementing an Azure Data Solution
Stars: ✭ 54 (+184.21%)
Mutual labels:  data-engineering
hive-metastore-client
A client for connecting and running DDLs on hive metastore.
Stars: ✭ 37 (+94.74%)
Mutual labels:  data-engineering
Waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (+215.79%)
Mutual labels:  data-engineering
Quilt
Quilt is a self-organizing data hub for S3
Stars: ✭ 1,007 (+5200%)
Mutual labels:  data-engineering
AirflowETL
Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (+5.26%)
Mutual labels:  data-engineering
hamilton
A scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
Stars: ✭ 612 (+3121.05%)
Mutual labels:  data-engineering
1-60 of 103 similar projects