2778 open source projects that are alternatives to or similar to Sayn

Dagster
An orchestration platform for the development, production, and observation of data assets.
Stars: ✭ 4,099 (+5088.61%)
Mutual labels:  data-science, analytics, etl
beneath
Beneath is a serverless real-time data platform ⚡️
Stars: ✭ 65 (-17.72%)
Mutual labels:  etl, analytics, data-engineering
Web Database Analytics
Web scraping and related analytics using Python tools
Stars: ✭ 175 (+121.52%)
Mutual labels:  data-science, sql, analytics
Trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (+5698.73%)
Mutual labels:  data-science, sql, analytics
Ether sql
A Python library to push Ethereum blockchain data into an SQL database.
Stars: ✭ 41 (-48.1%)
Mutual labels:  sql, analytics, etl
Prefect
The easiest way to automate your data
Stars: ✭ 7,956 (+9970.89%)
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+2918.99%)
Mutual labels:  data-science, etl, data-engineering
Aws Serverless Data Lake Framework
Enterprise-grade, production-hardened, serverless data lake on AWS
Stars: ✭ 179 (+126.58%)
Mutual labels:  analytics, etl, data-engineering
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+701.27%)
Mutual labels:  data-science, etl, data-engineering
Data Science Best Resources
Carefully curated resource links for data science in one place
Stars: ✭ 1,104 (+1297.47%)
Mutual labels:  data-science, sql, analytics
Dataform
Dataform is a framework for managing SQL based data operations in BigQuery, Snowflake, and Redshift
Stars: ✭ 342 (+332.91%)
Mutual labels:  analytics, etl, data-engineering
Butterfree
A tool for building feature stores.
Stars: ✭ 126 (+59.49%)
Mutual labels:  data-science, etl, data-engineering
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (+0%)
Mutual labels:  data-science, etl, data-engineering
Superset
Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+53867.09%)
Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+6126.58%)
Mutual labels:  data-science, etl, data-engineering
Awesome Business Intelligence
Actively curated list of awesome BI tools. PRs welcome!
Stars: ✭ 1,157 (+1364.56%)
Mutual labels:  data-science, sql, etl
arthur-redshift-etl
ELT Code for your Data Warehouse
Stars: ✭ 22 (-72.15%)
Mutual labels:  etl, data-engineering
growthbook
Open Source Feature Flagging and A/B Testing Platform
Stars: ✭ 2,342 (+2864.56%)
Mutual labels:  analytics, data-engineering
AirflowDataPipeline
Example of an ETL Pipeline using Airflow
Stars: ✭ 24 (-69.62%)
Mutual labels:  etl, data-engineering
Introduction Datascience Python Book
Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications
Stars: ✭ 275 (+248.1%)
Mutual labels:  data-science, analytics
etl manager
A Python package to create a database on the platform using our MoJ data warehousing framework
Stars: ✭ 14 (-82.28%)
Mutual labels:  etl, data-engineering
Benthos
Fancy stream processing made operationally mundane
Stars: ✭ 3,705 (+4589.87%)
Mutual labels:  etl, data-engineering
Covid19 Dashboard
A site that displays up to date COVID-19 stats, powered by fastpages.
Stars: ✭ 1,212 (+1434.18%)
Mutual labels:  data-science, analytics
Minsql
High-performance log search engine.
Stars: ✭ 356 (+350.63%)
Mutual labels:  sql, analytics
Learn Something Every Day
📝 A compilation of everything that I learn: Computer Science, Software Development, Engineering, Math, and Coding in General.
Stars: ✭ 362 (+358.23%)
Mutual labels:  data-science, data-engineering
Metorikku
A simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (+356.96%)
Mutual labels:  sql, etl
pangeo-forge-recipes
Python library for building Pangeo Forge recipes.
Stars: ✭ 64 (-18.99%)
Mutual labels:  etl, data-engineering
open-semantic-desktop-search
Virtual Machine for Desktop Search with Open Semantic Search
Stars: ✭ 22 (-72.15%)
Mutual labels:  etl, analytics
Mlinterview
A curated awesome list of AI Startups in India & Machine Learning Interview Guide. Feel free to contribute!
Stars: ✭ 410 (+418.99%)
Mutual labels:  data-science, sql
gallia-core
A schema-aware Scala library for data transformation
Stars: ✭ 44 (-44.3%)
Mutual labels:  etl, data-engineering
My Data Competition Experience
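A summary of my experience from multiple top-5 finishes in machine learning and big data competitions. Packed with practical material, help yourself.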
Stars: ✭ 271 (+243.04%)
Mutual labels:  data-science, sql
Roapi
Create full-fledged APIs for static datasets without writing a single line of code.
Stars: ✭ 253 (+220.25%)
Mutual labels:  sql, analytics
versatile-data-kit
Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (+82.28%)
Mutual labels:  etl, data-engineering
Crate
CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of data in real-time.
Stars: ✭ 3,254 (+4018.99%)
Mutual labels:  sql, analytics
Kyuubi
Kyuubi is a unified multi-tenant JDBC interface for large-scale data processing and analytics, built on top of Apache Spark
Stars: ✭ 363 (+359.49%)
Mutual labels:  sql, analytics
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+422.78%)
Mutual labels:  data-science, analytics
Ananas Desktop
A hackable data integration & analysis tool to enable non-technical users to edit data processing jobs and visualise data on demand.
Stars: ✭ 551 (+597.47%)
Mutual labels:  analytics, etl
Preql
An interpreted relational query language that compiles to SQL.
Stars: ✭ 257 (+225.32%)
Mutual labels:  data-science, sql
Vudash
Powerful, Flexible, Open Source dashboards for anything
Stars: ✭ 363 (+359.49%)
Mutual labels:  automation, analytics
Datacleaner
The premier open source Data Quality solution
Stars: ✭ 391 (+394.94%)
Mutual labels:  data-science, etl
Stats Maths With Python
General statistics, mathematical programming, and numerical/scientific computing scripts and notebooks in Python
Stars: ✭ 381 (+382.28%)
Mutual labels:  data-science, analytics
Great expectations
Always know what to expect from your data.
Stars: ✭ 5,808 (+7251.9%)
Mutual labels:  data-science, data-engineering
hamilton
A scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
Stars: ✭ 612 (+674.68%)
Mutual labels:  etl, data-engineering
Sciblog support
Support content for my blog
Stars: ✭ 694 (+778.48%)
Mutual labels:  data-science, analytics
Mit 15 003 Data Science Tools
Study guides for MIT's 15.003 Data Science Tools
Stars: ✭ 743 (+840.51%)
Mutual labels:  data-science, sql
Threatpursuit Vm
Threat Pursuit Virtual Machine (VM): A fully customizable, open-sourced Windows-based distribution focused on threat intelligence analysis and hunting designed for intel and malware analysts as well as threat hunters to get up and running quickly.
Stars: ✭ 814 (+930.38%)
Mutual labels:  data-science, analytics
Awesome Streamlit
The purpose of this project is to share knowledge on how awesome Streamlit is and can be
Stars: ✭ 769 (+873.42%)
Mutual labels:  data-science, analytics
Model Describer
model-describer : Making machine learning interpretable to humans
Stars: ✭ 22 (-72.15%)
Mutual labels:  data-science, analytics
Walkoff
A flexible, easy to use, automation framework allowing users to integrate their capabilities and devices to cut through the repetitive, tedious tasks slowing them down. #nsacyber
Stars: ✭ 855 (+982.28%)
Mutual labels:  automation, analytics
Datacleaner
A Python tool that automatically cleans data sets and readies them for analysis.
Stars: ✭ 933 (+1081.01%)
Mutual labels:  automation, data-science
Data Science On Gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+993.67%)
Mutual labels:  data-science, data-engineering
Data Science Career
Career Resources for Data Science, Machine Learning, Big Data and Business Analytics Career Repository
Stars: ✭ 630 (+697.47%)
Mutual labels:  data-science, analytics
Datofutbol
Dato Fútbol repository
Stars: ✭ 23 (-70.89%)
Mutual labels:  data-science, analytics
Aws Auto Terminate Idle Emr
AWS Auto Terminate Idle AWS EMR Clusters Framework is an AWS-based solution that uses AWS CloudWatch and AWS Lambda, with a Python script built on Boto3, to terminate AWS EMR clusters that have been idle for a specified period of time.
Stars: ✭ 21 (-73.42%)
Mutual labels:  automation, etl
Bloom
The simplest way to de-Google your life and business: Inbox, Calendar, Files, Contacts & much more
Stars: ✭ 934 (+1082.28%)
Mutual labels:  automation, analytics
Semester Biology
Stars: ✭ 52 (-34.18%)
Mutual labels:  data-science, sql
Ai Platform
An open-source platform for automating tasks using machine learning models
Stars: ✭ 61 (-22.78%)
Mutual labels:  automation, data-science
Locopy
locopy: Loading/Unloading to Redshift and Snowflake using Python.
Stars: ✭ 73 (-7.59%)
Mutual labels:  sql, etl
Tpot
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
Stars: ✭ 8,378 (+10505.06%)
Mutual labels:  automation, data-science
Eventql
Distributed "massively parallel" SQL query engine
Stars: ✭ 1,121 (+1318.99%)
Mutual labels:  sql, analytics