1569 open source projects that are alternatives to or similar to beneath

versatile-data-kit
Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (+121.54%)
rivery cli
Rivery CLI
Stars: ✭ 16 (-75.38%)
Mutual labels:  etl, dataops, data-pipelines
Dagster
An orchestration platform for the development, production, and observation of data assets.
Stars: ✭ 4,099 (+6206.15%)
Mutual labels:  etl, analytics, data-pipelines
Dataform
Dataform is a framework for managing SQL-based data operations in BigQuery, Snowflake, and Redshift.
Stars: ✭ 342 (+426.15%)
Mutual labels:  etl, analytics, data-engineering
AirflowDataPipeline
Example of an ETL Pipeline using Airflow
Stars: ✭ 24 (-63.08%)
Mutual labels:  etl, data-engineering, data-pipelines
Sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (+21.54%)
Mutual labels:  etl, analytics, data-engineering
Aws Serverless Data Lake Framework
Enterprise-grade, production-hardened, serverless data lake on AWS
Stars: ✭ 179 (+175.38%)
Mutual labels:  etl, analytics, data-engineering
neon-workshop
A Pachyderm deep learning tutorial for conference workshops
Stars: ✭ 19 (-70.77%)
Mutual labels:  data-engineering, data-pipelines
Github Analytics
GitHub Analytics with Keen IO
Stars: ✭ 42 (-35.38%)
Mutual labels:  analytics, developer-tools
arthur-redshift-etl
ELT Code for your Data Warehouse
Stars: ✭ 22 (-66.15%)
Mutual labels:  etl, data-engineering
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+873.85%)
Mutual labels:  etl, data-engineering
gallia-core
A schema-aware Scala library for data transformation
Stars: ✭ 44 (-32.31%)
Mutual labels:  etl, data-engineering
open-semantic-desktop-search
Virtual Machine for Desktop Search with Open Semantic Search
Stars: ✭ 22 (-66.15%)
Mutual labels:  etl, analytics
Gpdb
Greenplum Database - Massively Parallel PostgreSQL for Analytics. An open-source massively parallel data platform for analytics, machine learning and AI.
Stars: ✭ 4,928 (+7481.54%)
Mutual labels:  analytics, data-warehouse
Hub
Dataset format for AI. Build, manage, & visualize datasets for deep learning. Stream data in real time to PyTorch/TensorFlow & version-control it. https://activeloop.ai
Stars: ✭ 4,003 (+6058.46%)
Mutual labels:  data-pipelines, mlops
noronha
DataOps framework for Machine Learning projects.
Stars: ✭ 47 (-27.69%)
Mutual labels:  dataops, mlops
Ananas Desktop
A hackable data integration & analysis tool that enables non-technical users to edit data processing jobs and visualise data on demand.
Stars: ✭ 551 (+747.69%)
Mutual labels:  etl, analytics
Reddit Detective
Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more
Stars: ✭ 129 (+98.46%)
Mutual labels:  etl, analytics
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+3569.23%)
Mutual labels:  etl, data-engineering
Ether sql
A Python library to push Ethereum blockchain data into an SQL database.
Stars: ✭ 41 (-36.92%)
Mutual labels:  etl, analytics
Butterfree
A tool for building feature stores.
Stars: ✭ 126 (+93.85%)
Mutual labels:  etl, data-engineering
Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+7467.69%)
Mutual labels:  etl, data-engineering
Athenax
SQL-based streaming analytics platform at scale
Stars: ✭ 1,178 (+1712.31%)
Mutual labels:  streaming, analytics
Spark
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+2547.69%)
Mutual labels:  streaming, analytics
hive-metastore-client
A client for connecting to and running DDLs on Hive Metastore.
Stars: ✭ 37 (-43.08%)
Mutual labels:  etl, data-engineering
morph-kgc
Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
Stars: ✭ 77 (+18.46%)
Mutual labels:  etl, data-engineering
firehose
Firehose is an extensible, no-code, and cloud-native service to load real-time streaming data from Kafka to data stores, data lakes, and analytical storage systems.
Stars: ✭ 213 (+227.69%)
Mutual labels:  streaming, dataops
etl
[READ-ONLY] PHP - ETL (Extract Transform Load) data processing library
Stars: ✭ 279 (+329.23%)
Mutual labels:  etl, data-engineering
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (-18.46%)
Mutual labels:  etl, data-engineering
uptasticsearch
An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (-27.69%)
Mutual labels:  etl, data-engineering
blockchain-etl-streaming
Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (-12.31%)
Mutual labels:  etl, data-engineering
Great expectations
Always know what to expect from your data.
Stars: ✭ 5,808 (+8835.38%)
Mutual labels:  data-engineering, mlops
Feast
Feature Store for Machine Learning
Stars: ✭ 2,576 (+3863.08%)
Mutual labels:  data-engineering, mlops
Superset
Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+65490.77%)
Mutual labels:  analytics, data-engineering
google-sheets-etl
Live import all your Google Sheets to your data warehouse
Stars: ✭ 15 (-76.92%)
Mutual labels:  etl, data-warehouse
Benthos
Fancy stream processing made operationally mundane
Stars: ✭ 3,705 (+5600%)
Mutual labels:  etl, data-engineering
etl manager
A Python package to create a database on the platform using our moj data warehousing framework
Stars: ✭ 14 (-78.46%)
Mutual labels:  etl, data-engineering
datatile
A library for managing, validating, summarizing, and visualizing data.
Stars: ✭ 419 (+544.62%)
Mutual labels:  dataops, mlops
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (+21.54%)
Mutual labels:  etl, data-engineering
Data-Engineering-Projects
Personal Data Engineering Projects
Stars: ✭ 167 (+156.92%)
Mutual labels:  data-warehouse, data-engineering
data-science-best-practices
The goal of this repository is to enable data scientists and ML engineers to develop data science use cases and make them ready for production use. This means focusing on the versioning, scalability, monitoring and engineering of the solution.
Stars: ✭ 53 (-18.46%)
Mutual labels:  analytics, mlops
cli
Polyaxon Core Client & CLI to streamline MLOps
Stars: ✭ 18 (-72.31%)
Mutual labels:  dataops, mlops
Eventql
Distributed "massively parallel" SQL query engine
Stars: ✭ 1,121 (+1624.62%)
Mutual labels:  streaming, analytics
Uplot
📈 A small, fast chart for time series, lines, areas, ohlc & bars
Stars: ✭ 6,808 (+10373.85%)
Mutual labels:  streaming, analytics
AirflowETL
Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-69.23%)
Mutual labels:  etl, data-engineering
Sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+689.23%)
Mutual labels:  streaming, analytics
growthbook
Open Source Feature Flagging and A/B Testing Platform
Stars: ✭ 2,342 (+3503.08%)
Mutual labels:  analytics, data-engineering
hamilton
A scalable general-purpose micro-framework for defining dataflows. You can use it to create dataframes, NumPy matrices, Python objects, ML models, etc.
Stars: ✭ 612 (+841.54%)
Mutual labels:  etl, data-engineering
ml-in-production
The practical use-cases of how to make your Machine Learning Pipelines robust and reliable using Apache Airflow.
Stars: ✭ 29 (-55.38%)
Mutual labels:  data-engineering, data-pipelines
pangeo-forge-recipes
Python library for building Pangeo Forge recipes.
Stars: ✭ 64 (-1.54%)
Mutual labels:  etl, data-engineering
m3u8
Parse and generate m3u8 playlists for Apple HTTP Live Streaming (HLS) in Ruby.
Stars: ✭ 96 (+47.69%)
Mutual labels:  streaming
tfx-kubeflow-pipelines
Kubeflow pipelines built on top of the TensorFlow TFX library
Stars: ✭ 17 (-73.85%)
Mutual labels:  mlops
WinAnalytics
A lightweight Android library that can be quickly integrated into any app to use analytics tools.
Stars: ✭ 23 (-64.62%)
Mutual labels:  analytics
mlops-with-vertex-ai
An end-to-end example of MLOps on Google Cloud using TensorFlow, TFX, and Vertex AI
Stars: ✭ 155 (+138.46%)
Mutual labels:  mlops
own3dpro-obs-plugin
OWN3D Pro OBS Plugin
Stars: ✭ 25 (-61.54%)
Mutual labels:  streaming
Covid-19-analysis
Analysis with Covid-19 data
Stars: ✭ 49 (-24.62%)
Mutual labels:  analytics
mailtrap
MailTrap has been renamed to Sendria. Please use Sendria now; MailTrap is abandoned. MailTrap is an SMTP server designed to run in your dev/test environment. It catches any email you or your application sends and displays it in a web interface instead of delivering it to the real world.
Stars: ✭ 14 (-78.46%)
Mutual labels:  developer-tools
dji-ryze-tello
Pythonic DJI Ryze Tello Workbench
Stars: ✭ 17 (-73.85%)
Mutual labels:  streaming
transform-hub
Flexible and efficient data processing engine and an evolution of the popular Scramjet Framework based on Node.js. Our Transform Hub was designed specifically for data processing and includes its own unique algorithms.
Stars: ✭ 38 (-41.54%)
Mutual labels:  streaming
PHP-Broadcast-radio
🌈 Autonomous streaming audio server. Online internet radio is free streaming music for your listening pleasure, as well as news and announcements.
Stars: ✭ 38 (-41.54%)
Mutual labels:  streaming
1-60 of 1569 similar projects