1160 open source projects that are alternatives to or similar to Soda Sql

Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+2743.35%)
Mutual labels:  data-science, data-engineering
jobAnalytics and search
JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (-85.55%)
Mutual labels:  airflow, data-engineering
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (-69.36%)
Mutual labels:  airflow, data-engineering
viewflow
Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Stars: ✭ 110 (-36.42%)
Mutual labels:  airflow, data-engineering
Applied Ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Stars: ✭ 17,824 (+10202.89%)
Mutual labels:  data-science, data-engineering
Around Dataengineering
A Data Engineering & Machine Learning Knowledge Hub
Stars: ✭ 257 (+48.55%)
Mutual labels:  airflow, data-engineering
Auptimizer
An automatic ML model optimization tool.
Stars: ✭ 166 (-4.05%)
Mutual labels:  data-science, data-engineering
Gspread Pandas
A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Stars: ✭ 226 (+30.64%)
Mutual labels:  data-science, data-engineering
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+265.9%)
Mutual labels:  data-science, data-engineering
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-54.34%)
Mutual labels:  data-science, data-engineering
Dataengineeringproject
Example end to end data engineering project.
Stars: ✭ 82 (-52.6%)
Mutual labels:  airflow, data-engineering
Beyond Jupyter
🐍💻📊 All material from the PyCon.DE 2018 Talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. Slides, Video, Udemy MOOC & other References)
Stars: ✭ 135 (-21.97%)
Mutual labels:  airflow, data-science
airflow-dbt-python
A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
Stars: ✭ 111 (-35.84%)
Mutual labels:  airflow, data-engineering
Geni
A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (-12.14%)
Mutual labels:  data-science, data-engineering
Data Science On Gcp
Source code accompanying the book Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+399.42%)
Mutual labels:  data-science, data-engineering
D6t Python
Accelerate data science
Stars: ✭ 118 (-31.79%)
Mutual labels:  data-science, data-engineering
Pipelinex
PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more
Stars: ✭ 127 (-26.59%)
Mutual labels:  data-science, data-engineering
AirflowETL
Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-88.44%)
Mutual labels:  airflow, data-engineering
Goodreads etl pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+358.38%)
Mutual labels:  airflow, data-engineering
Great expectations
Always know what to expect from your data.
Stars: ✭ 5,808 (+3257.23%)
Mutual labels:  data-science, data-engineering
Udacity Data Engineering Projects
A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
Stars: ✭ 458 (+164.74%)
Mutual labels:  airflow, data-engineering
Sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (-54.34%)
Mutual labels:  data-science, data-engineering
Prefect
The easiest way to automate your data
Stars: ✭ 7,956 (+4498.84%)
Mutual labels:  data-science, data-engineering
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+1278.61%)
Mutual labels:  data-science, data-engineering
Butterfree
A tool for building feature stores.
Stars: ✭ 126 (-27.17%)
Mutual labels:  data-science, data-engineering
Just Dashboard
📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+773.41%)
Mutual labels:  data-science, data-engineering
Accelerator
The Accelerator is a tool for fast and reproducible processing of large amounts of data.
Stars: ✭ 137 (-20.81%)
Mutual labels:  data-science, data-engineering
AirflowDataPipeline
Example of an ETL Pipeline using Airflow
Stars: ✭ 24 (-86.13%)
Mutual labels:  airflow, data-engineering
Ploomber
A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.
Stars: ✭ 221 (+27.75%)
Mutual labels:  data-science, data-engineering
Spark Alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-29.48%)
Mutual labels:  data-science, data-engineering
Learn Something Every Day
📝 A compilation of everything that I learn; Computer Science, Software Development, Engineering, Math, and Coding in General. Read the rendered results here ->
Stars: ✭ 362 (+109.25%)
Mutual labels:  data-science, data-engineering
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+138.73%)
Mutual labels:  airflow, data-science
Data-Engineering-Projects
Personal Data Engineering Projects
Stars: ✭ 167 (-3.47%)
Mutual labels:  airflow, data-engineering
Superset
Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+24543.93%)
Mutual labels:  data-science, data-engineering
Airflow Autoscaling Ecs
Airflow Deployment on AWS ECS Fargate Using Cloudformation
Stars: ✭ 136 (-21.39%)
Mutual labels:  airflow, data-engineering
Data Science Stack Cookiecutter
🐳📊🤓 Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyter, Superset, Postgres, Minio, AirFlow & API Star)
Stars: ✭ 153 (-11.56%)
Mutual labels:  airflow, data-science
Presentations
Slide show presentations regarding data driven investing.
Stars: ✭ 162 (-6.36%)
Mutual labels:  data-science
Covid19 Severity Prediction
Extensive and accessible COVID-19 data + forecasting for counties and hospitals. 📈
Stars: ✭ 170 (-1.73%)
Mutual labels:  data-science
Lazynlp
Library to scrape and clean web pages to create massive datasets.
Stars: ✭ 1,985 (+1047.4%)
Mutual labels:  data-science
Datascience Pizza
🍕 Repository gathering information on study materials for data analysis and related fields, companies that work with data, and a dictionary of concepts
Stars: ✭ 2,043 (+1080.92%)
Mutual labels:  data-science
Datasets For Good
List of datasets to apply stats/machine learning/technology to the world of social good.
Stars: ✭ 174 (+0.58%)
Mutual labels:  data-science
Data Science Toolkit
Collection of stats, modeling, and data science tools in Python and R.
Stars: ✭ 169 (-2.31%)
Mutual labels:  data-science
Airflow Exporter
Airflow plugin to export dag and task based metrics to Prometheus.
Stars: ✭ 161 (-6.94%)
Mutual labels:  airflow
Danmf
A sparsity aware implementation of "Deep Autoencoder-like Nonnegative Matrix Factorization for Community Detection" (CIKM 2018).
Stars: ✭ 161 (-6.94%)
Mutual labels:  data-science
Matplotplusplus
Matplot++: A C++ Graphics Library for Data Visualization 📊🗾
Stars: ✭ 2,433 (+1306.36%)
Mutual labels:  data-science
Influxdb exporter
A server that accepts InfluxDB metrics via the HTTP API and exports them via HTTP for Prometheus consumption
Stars: ✭ 159 (-8.09%)
Mutual labels:  observability
Scikit Plot
An intuitive library to add plotting functionality to scikit-learn objects.
Stars: ✭ 2,162 (+1149.71%)
Mutual labels:  data-science
Dstack
An open-source tool to rapidly develop data applications with Python
Stars: ✭ 174 (+0.58%)
Mutual labels:  data-science
Pzad
Course "Applied Problems of Data Analysis" (Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University)
Stars: ✭ 160 (-7.51%)
Mutual labels:  data-science
Primehub
A toil-free multi-tenancy machine learning platform in your Kubernetes cluster
Stars: ✭ 160 (-7.51%)
Mutual labels:  data-science
Open Solution Data Science Bowl 2018
Open solution to the Data Science Bowl 2018
Stars: ✭ 159 (-8.09%)
Mutual labels:  data-science
Airflow Doc Zh
📖 [Translation] Airflow documentation in Chinese
Stars: ✭ 169 (-2.31%)
Mutual labels:  airflow
Ghactions
GitHub actions for R and accompanying R package
Stars: ✭ 159 (-8.09%)
Mutual labels:  data-science
Applicationinsights Java
Application Insights for Java
Stars: ✭ 172 (-0.58%)
Mutual labels:  observability
Scalable Data Science Platform
Content for architecting a data science platform for products using Luigi, Spark & Flask.
Stars: ✭ 158 (-8.67%)
Mutual labels:  data-science
Sign Language Interpreter Using Deep Learning
A sign language interpreter using live video feed from the camera.
Stars: ✭ 157 (-9.25%)
Mutual labels:  data-science
Visualizingtwitchcommunities
Graphing communities on Twitch.tv in a visually intuitive way
Stars: ✭ 150 (-13.29%)
Mutual labels:  data-science
Aulas
Classes from the Escola de Inteligência Artificial de São Paulo (São Paulo School of Artificial Intelligence)
Stars: ✭ 166 (-4.05%)
Mutual labels:  data-science
Fastbook
The fastai book, published as Jupyter Notebooks
Stars: ✭ 13,998 (+7991.33%)
Mutual labels:  data-science
Gensim
Topic Modelling for Humans
Stars: ✭ 12,763 (+7277.46%)
Mutual labels:  data-science