awesome-dbt: A curated list of awesome dbt resources
Stars: ✭ 520 (+274.1%)
airflow-dbt-python: A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
Stars: ✭ 111 (-20.14%)
Soda Sql: Metric collection, data testing and monitoring for SQL accessible data
Stars: ✭ 173 (+24.46%)
dbt-spotify-analytics: Containerized end-to-end analytics of Spotify data using Python, dbt, Postgres, and Metabase
Stars: ✭ 92 (-33.81%)
Everything-Tech: A collection of online resources to help you on your Tech journey.
Stars: ✭ 396 (+184.89%)
morph-kgc: Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
Stars: ✭ 77 (-44.6%)
Spark Alchemy: Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-12.23%)
qsv: CSVs sliced, diced & analyzed.
Stars: ✭ 438 (+215.11%)
Applied Ml: 📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Stars: ✭ 17,824 (+12723.02%)
dbt ad reporting: Fivetran's ad reporting dbt package. Combine your Facebook, Google, Pinterest, Linkedin, Twitter, Snapchat and Microsoft advertising spend using this package.
Stars: ✭ 68 (-51.08%)
get smarties: Dummy variable generation with fit/transform capabilities
Stars: ✭ 23 (-83.45%)
Gspread Pandas: A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Stars: ✭ 226 (+62.59%)
datart: Datart is a next generation Data Visualization Open Platform
Stars: ✭ 1,042 (+649.64%)
Yuniql: Free and open source schema versioning and database migration made natively with .NET Core.
Stars: ✭ 156 (+12.23%)
Data Engineering Howto: A list of useful resources to learn Data Engineering from scratch
Stars: ✭ 2,056 (+1379.14%)
Butterfree: A tool for building feature stores.
Stars: ✭ 126 (-9.35%)
soda-spark: Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes
Stars: ✭ 58 (-58.27%)
Just Dashboard: 📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+987.05%)
prefect-saturn: Python client for using Prefect Cloud with Saturn Cloud
Stars: ✭ 15 (-89.21%)
Setl: A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-43.17%)
Ansible Playbook: Ansible playbook to deploy distributed technologies
Stars: ✭ 61 (-56.12%)
Waimak: Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-56.83%)
Every Single Day I Tldr: A daily digest of the articles or videos I've found interesting, that I want to share with you.
Stars: ✭ 249 (+79.14%)
polygon-etl: ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (-61.87%)
Ploomber: A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.
Stars: ✭ 221 (+58.99%)
contessa: Easy way to define, execute and store quality rules for your data.
Stars: ✭ 17 (-87.77%)
dbt2looker: Generate lookml for views from dbt models
Stars: ✭ 119 (-14.39%)
Auptimizer: An automatic ML model optimization tool.
Stars: ✭ 166 (+19.42%)
papilo: [DEPRECATED] Stream data processing micro-framework
Stars: ✭ 24 (-82.73%)
Geni: A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (+9.35%)
Gcp Data Engineer Exam: Study materials for the Google Cloud Professional Data Engineering Exam
Stars: ✭ 144 (+3.6%)
metriql: The metrics layer for your data. Join us at https://metriql.com/slack
Stars: ✭ 227 (+63.31%)
Accelerator: The Accelerator is a tool for fast and reproducible processing of large amounts of data.
Stars: ✭ 137 (-1.44%)
funsies: funsies is a lightweight workflow engine 🔧
Stars: ✭ 37 (-73.38%)
Pipelinex: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more
Stars: ✭ 127 (-8.63%)
lrmr: Less-Resilient MapReduce framework for Go
Stars: ✭ 32 (-76.98%)
Aws Data Wrangler: Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+1615.83%)
dbt ml: Package for dbt that allows users to train, audit and use BigQuery ML models.
Stars: ✭ 41 (-70.5%)
D6t Python: Accelerate data science
Stars: ✭ 118 (-15.11%)
etl: [READ-ONLY] PHP - ETL (Extract Transform Load) data processing library
Stars: ✭ 279 (+100.72%)
Superset: Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+30571.94%)
blockchain-etl-streaming: Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (-58.99%)
PyRasgo: Helper code to interact with Rasgo via our SDK, PyRasgo
Stars: ✭ 39 (-71.94%)
Sayn: Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (-43.17%)
uptasticsearch: An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (-66.19%)
preprocessy: Python package for Customizable Data Preprocessing Pipelines
Stars: ✭ 34 (-75.54%)
deordie-meetups: DE or DIE meetup made by data engineers for data engineers. Currently in Russian only.
Stars: ✭ 48 (-65.47%)
AirflowETL: Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-85.61%)