2030 open source projects that are alternatives to, or similar to, Dagster

Sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (-98.07%)
Mutual labels:  data-science, analytics, etl
beneath
Beneath is a serverless real-time data platform ⚡️
Stars: ✭ 65 (-98.41%)
Mutual labels:  etl, analytics, data-pipelines
Covid19 Dashboard
A site that displays up to date COVID-19 stats, powered by fastpages.
Stars: ✭ 1,212 (-70.43%)
Mutual labels:  data-science, analytics
Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+20%)
Mutual labels:  data-science, etl
Plynx
PLynx is a domain agnostic platform for managing reproducible experiments and data-oriented workflows.
Stars: ✭ 192 (-95.32%)
Mutual labels:  data-science, workflow
Drake Examples
Example workflows for the drake R package
Stars: ✭ 57 (-98.61%)
Mutual labels:  data-science, workflow
Suspeitando
Project for analyzing contracts suspected of overbilling and poor-quality service delivery.
Stars: ✭ 76 (-98.15%)
Mutual labels:  data-science, analytics
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (-41.82%)
Mutual labels:  data-science, etl
Butterfree
A tool for building feature stores.
Stars: ✭ 126 (-96.93%)
Mutual labels:  data-science, etl
Airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Stars: ✭ 24,101 (+487.97%)
Mutual labels:  scheduler, workflow
thain
Thain is a distributed flow schedule platform.
Stars: ✭ 81 (-98.02%)
Mutual labels:  etl, scheduler
versatile-data-kit
Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (-96.49%)
Mutual labels:  etl, data-pipelines
Model Describer
model-describer : Making machine learning interpretable to humans
Stars: ✭ 22 (-99.46%)
Mutual labels:  data-science, analytics
Datofutbol
Dato Fútbol repository
Stars: ✭ 23 (-99.44%)
Mutual labels:  data-science, analytics
Awesome Business Intelligence
Actively curated list of awesome BI tools. PRs welcome!
Stars: ✭ 1,157 (-71.77%)
Mutual labels:  data-science, etl
Data Science Best Resources
Carefully curated resource links for data science in one place
Stars: ✭ 1,104 (-73.07%)
Mutual labels:  data-science, analytics
Ds With Pysimplegui
Data science and Machine Learning GUI programs/ desktop apps with PySimpleGUI package
Stars: ✭ 93 (-97.73%)
Mutual labels:  data-science, analytics
Superset
Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+940.11%)
Mutual labels:  data-science, analytics
Batchflow
BatchFlow helps you conveniently work with random or sequential batches of your data and define data processing and machine learning workflows even for datasets that do not fit into memory.
Stars: ✭ 156 (-96.19%)
Mutual labels:  data-science, workflow
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-98.07%)
Mutual labels:  data-science, etl
Docker Airflow
Docker Apache Airflow
Stars: ✭ 3,375 (-17.66%)
Mutual labels:  scheduler, workflow
Wedatasphere
WeDataSphere is a financial-grade, one-stop open-source suite for big data platforms. The source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (-90.92%)
Mutual labels:  scheduler, etl
Powerjob
Enterprise job scheduling middleware with distributed computing ability.
Stars: ✭ 3,231 (-21.18%)
Mutual labels:  scheduler, workflow
Wexflow
An easy and fast way to build automation and workflows on Windows, Linux, macOS, and the cloud.
Stars: ✭ 2,435 (-40.6%)
Mutual labels:  scheduler, workflow
bitnami-docker-airflow-scheduler
Bitnami Docker Image for Apache Airflow Scheduler
Stars: ✭ 19 (-99.54%)
Mutual labels:  workflow, scheduler
ibis
IBIS is a workflow creation-engine that abstracts the Hadoop internals of ingesting RDBMS data.
Stars: ✭ 48 (-98.83%)
Mutual labels:  workflow, workflow-automation
Trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (+11.76%)
Mutual labels:  data-science, analytics
Threatpursuit Vm
Threat Pursuit Virtual Machine (VM): A fully customizable, open-sourced Windows-based distribution focused on threat intelligence analysis and hunting designed for intel and malware analysts as well as threat hunters to get up and running quickly.
Stars: ✭ 814 (-80.14%)
Mutual labels:  data-science, analytics
Awesome Streamlit
The purpose of this project is to share knowledge on how awesome Streamlit is and can be
Stars: ✭ 769 (-81.24%)
Mutual labels:  data-science, analytics
Vds
Verteego Data Suite
Stars: ✭ 9 (-99.78%)
Mutual labels:  data-science, workflow
Prefect
The easiest way to automate your data
Stars: ✭ 7,956 (+94.1%)
Mutual labels:  data-science, workflow
Introduction Datascience Python Book
Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications
Stars: ✭ 275 (-93.29%)
Mutual labels:  data-science, analytics
Etl with python
ETL with Python - Taught at DWH course 2017 (TAU)
Stars: ✭ 68 (-98.34%)
Mutual labels:  data-science, etl
Sciblog support
Support content for my blog
Stars: ✭ 694 (-83.07%)
Mutual labels:  data-science, analytics
Drake
An R-focused pipeline toolkit for reproducibility and high-performance computing
Stars: ✭ 1,301 (-68.26%)
Mutual labels:  data-science, workflow
Ml
A high-level machine learning and deep learning library for the PHP language.
Stars: ✭ 1,270 (-69.02%)
Mutual labels:  data-science, analytics
Auto ml
[UNMAINTAINED] Automated machine learning for analytics & production
Stars: ✭ 1,559 (-61.97%)
Mutual labels:  data-science, analytics
Flyte
Accelerate your ML and Data workflows to production. Flyte is a production grade orchestration system for your Data and ML workloads. It has been battle tested at Lyft, Spotify, freenome and others and truly open-source.
Stars: ✭ 1,242 (-69.7%)
Mutual labels:  data-science, workflow
Interactive machine learning
IPython widgets, interactive plots, interactive machine learning
Stars: ✭ 140 (-96.58%)
Mutual labels:  data-science, analytics
Qlik Py Tools
Data Science algorithms for Qlik implemented as a Python Server Side Extension (SSE).
Stars: ✭ 135 (-96.71%)
Mutual labels:  data-science, analytics
Web Database Analytics
Web scraping and related analytics using Python tools
Stars: ✭ 175 (-95.73%)
Mutual labels:  data-science, analytics
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (-84.56%)
Mutual labels:  data-science, etl
Awesome Datascience
📝 An awesome Data Science repository to learn and apply for real world problems.
Stars: ✭ 17,520 (+327.42%)
Mutual labels:  data-science, analytics
Ploomber
A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.
Stars: ✭ 221 (-94.61%)
Mutual labels:  data-science, workflow
Active workflow
Turn complex requirements into workflows without leaving the comfort of your technology stack.
Stars: ✭ 413 (-89.92%)
Mutual labels:  scheduler, workflow
Elastic
R client for the Elasticsearch HTTP API
Stars: ✭ 227 (-94.46%)
Mutual labels:  data-science, etl
Aiida Core
The official repository for the AiiDA code
Stars: ✭ 238 (-94.19%)
Mutual labels:  scheduler, workflow
Schedulis
Schedulis is a high performance workflow task scheduling system that supports high availability and multi-tenant financial level features, Linkis computing middleware, and has been integrated into data application development portal DataSphere Studio
Stars: ✭ 222 (-94.58%)
Mutual labels:  scheduler, workflow
zdh web
Big data collection and extraction platform
Stars: ✭ 292 (-92.88%)
Mutual labels:  etl, scheduler
Cql
Categorical Query Language IDE
Stars: ✭ 196 (-95.22%)
Mutual labels:  data-science, etl
open-semantic-desktop-search
Virtual Machine for Desktop Search with Open Semantic Search
Stars: ✭ 22 (-99.46%)
Mutual labels:  etl, analytics
monopacker
A tool for managing builds of monorepo frontend projects (e.g. with npm or yarn workspaces, lerna, or similar tools) into a standalone application, with no other tools needed.
Stars: ✭ 17 (-99.59%)
Mutual labels:  workflow, workflow-automation
rivery cli
Rivery CLI
Stars: ✭ 16 (-99.61%)
Mutual labels:  etl, data-pipelines
Polyaxon
Machine Learning Platform for Kubernetes (MLOps tools for experimentation and automation)
Stars: ✭ 2,966 (-27.64%)
Mutual labels:  data-science, workflow
Pachyderm
Reproducible Data Science at Scale!
Stars: ✭ 5,305 (+29.42%)
Mutual labels:  data-science, analytics
Data Science Career
Career Resources for Data Science, Machine Learning, Big Data and Business Analytics Career Repository
Stars: ✭ 630 (-84.63%)
Mutual labels:  data-science, analytics
Data Science Live Book
An open source book to learn data science, data analysis and machine learning, suitable for all ages!
Stars: ✭ 193 (-95.29%)
Mutual labels:  data-science, analytics
zenaton-node
⚡ Node.js library to run and orchestrate background jobs with Zenaton Workflow Engine
Stars: ✭ 50 (-98.78%)
Mutual labels:  scheduler, workflow-automation
AirflowDataPipeline
Example of an ETL Pipeline using Airflow
Stars: ✭ 24 (-99.41%)
Mutual labels:  etl, data-pipelines
Hub
Dataset format for AI. Build, manage, & visualize datasets for deep learning. Stream data real-time to PyTorch/TensorFlow & version-control it. https://activeloop.ai
Stars: ✭ 4,003 (-2.34%)
Mutual labels:  data-science, data-pipelines