1160 open source projects that are alternatives to or similar to Soda Sql

Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.

Stars: ✭ 4,919 (+2743.35%)

Mutual labels: data-science, data-engineering

JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.

Stars: ✭ 25 (-85.55%)

Mutual labels: airflow, data-engineering

polygon-etl

ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub

Stars: ✭ 53 (-69.36%)

Mutual labels: airflow, data-engineering

viewflow

Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.

Stars: ✭ 110 (-36.42%)

Mutual labels: airflow, data-engineering

Applied Ml

📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

Stars: ✭ 17,824 (+10202.89%)

Mutual labels: data-science, data-engineering

Around Dataengineering

A Data Engineering & Machine Learning Knowledge Hub

Stars: ✭ 257 (+48.55%)

Mutual labels: airflow, data-engineering

Auptimizer

An automatic ML model optimization tool.

Stars: ✭ 166 (-4.05%)

Mutual labels: data-science, data-engineering

Gspread Pandas

A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.

Stars: ✭ 226 (+30.64%)

Mutual labels: data-science, data-engineering

Pyspark Example Project

Example project implementing best practices for PySpark ETL jobs and applications.

Stars: ✭ 633 (+265.9%)

Mutual labels: data-science, data-engineering

Setl

A simple Spark-powered ETL framework that just works 🍺

Stars: ✭ 79 (-54.34%)

Mutual labels: data-science, data-engineering

Dataengineeringproject

Example end to end data engineering project.

Stars: ✭ 82 (-52.6%)

Mutual labels: airflow, data-engineering

Beyond Jupyter

🐍💻📊 All material from the PyCon.DE 2018 Talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. Slides, Video, Udemy MOOC & other References)

Stars: ✭ 135 (-21.97%)

Mutual labels: airflow, data-science

airflow-dbt-python

A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.

Stars: ✭ 111 (-35.84%)

Mutual labels: airflow, data-engineering

Geni

A Clojure dataframe library that runs on Spark

Stars: ✭ 152 (-12.14%)

Mutual labels: data-science, data-engineering

Data Science On Gcp

Source code accompanying the book Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017

Stars: ✭ 864 (+399.42%)

Mutual labels: data-science, data-engineering

D6t Python

Accelerate data science

Stars: ✭ 118 (-31.79%)

Mutual labels: data-science, data-engineering

Pipelinex

PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more

Stars: ✭ 127 (-26.59%)

Mutual labels: data-science, data-engineering

AirflowETL

Blog post on ETL pipelines with Airflow

Stars: ✭ 20 (-88.44%)

Mutual labels: airflow, data-engineering

Goodreads etl pipeline

An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.

Stars: ✭ 793 (+358.38%)

Mutual labels: airflow, data-engineering

Great expectations

Always know what to expect from your data.

Stars: ✭ 5,808 (+3257.23%)

Mutual labels: data-science, data-engineering

Udacity Data Engineering Projects

A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.

Stars: ✭ 458 (+164.74%)

Mutual labels: airflow, data-engineering

Sayn

Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).

Stars: ✭ 79 (-54.34%)

Mutual labels: data-science, data-engineering

Prefect

The easiest way to automate your data

Stars: ✭ 7,956 (+4498.84%)

Mutual labels: data-science, data-engineering

Aws Data Wrangler

Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

Stars: ✭ 2,385 (+1278.61%)

Mutual labels: data-science, data-engineering

Butterfree

A tool for building feature stores.

Stars: ✭ 126 (-27.17%)

Mutual labels: data-science, data-engineering

Just Dashboard

📊 📋 Dashboards using YAML or JSON files

Stars: ✭ 1,511 (+773.41%)

Mutual labels: data-science, data-engineering

Accelerator

The Accelerator is a tool for fast and reproducible processing of large amounts of data.

Stars: ✭ 137 (-20.81%)

Mutual labels: data-science, data-engineering

AirflowDataPipeline

Example of an ETL Pipeline using Airflow

Stars: ✭ 24 (-86.13%)

Mutual labels: airflow, data-engineering

Ploomber

A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.

Stars: ✭ 221 (+27.75%)

Mutual labels: data-science, data-engineering

Spark Alchemy

Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive

Stars: ✭ 122 (-29.48%)

Mutual labels: data-science, data-engineering

Learn Something Every Day

📝 A compilation of everything that I learn: Computer Science, Software Development, Engineering, Math, and Coding in General.

Stars: ✭ 362 (+109.25%)

Mutual labels: data-science, data-engineering

Agile data code 2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

Stars: ✭ 413 (+138.73%)

Mutual labels: airflow, data-science

Data-Engineering-Projects

Personal Data Engineering Projects

Stars: ✭ 167 (-3.47%)

Mutual labels: airflow, data-engineering

Superset

Apache Superset is a Data Visualization and Data Exploration Platform

Stars: ✭ 42,634 (+24543.93%)

Mutual labels: data-science, data-engineering

Airflow Autoscaling Ecs

Airflow Deployment on AWS ECS Fargate Using Cloudformation

Stars: ✭ 136 (-21.39%)

Mutual labels: airflow, data-engineering

Data Science Stack Cookiecutter

🐳📊🤓 Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyter, Superset, Postgres, Minio, Airflow & API Star)

Stars: ✭ 153 (-11.56%)

Mutual labels: airflow, data-science

Presentations

Slide show presentations regarding data driven investing.

Stars: ✭ 162 (-6.36%)

Mutual labels: data-science

Covid19 Severity Prediction

Extensive and accessible COVID-19 data + forecasting for counties and hospitals. 📈

Stars: ✭ 170 (-1.73%)

Mutual labels: data-science

Lazynlp

Library to scrape and clean web pages to create massive datasets.

Stars: ✭ 1,985 (+1047.4%)

Mutual labels: data-science

Datascience Pizza

🍕 Repository for gathering information on study materials in data analysis and related fields, companies that work with data, and a dictionary of concepts

Stars: ✭ 2,043 (+1080.92%)

Mutual labels: data-science

Datasets For Good

List of datasets to apply stats/machine learning/technology to the world of social good.

Stars: ✭ 174 (+0.58%)

Mutual labels: data-science

Data Science Toolkit

Collection of stats, modeling, and data science tools in Python and R.

Stars: ✭ 169 (-2.31%)

Mutual labels: data-science

Airflow Exporter

Airflow plugin to export dag and task based metrics to Prometheus.