Hiflylabs / awesome-dbt

License: GPL-3.0
A curated list of awesome dbt resources

Awesome dbt

Welcome to the awesome curated list of dbt resources!

Contributions of any kind are encouraged and appreciated. Before contributing, please check the contribution guidelines. Add new entries at the top of each section (LIFO) so that fresh items stay visible. Feel free to add new sections as well.
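
The LIFO rule can be sketched in a few lines of Python. This is only an illustration of the ordering convention, not tooling that ships with this list; the helper name `add_entry` and the sample entry `my-tool` are hypothetical.

```python
def add_entry(section_entries, name, description):
    """Return a new list with the entry placed first (LIFO order)."""
    bullet = f"  • {name} - {description}"
    # Skip duplicates so re-running the helper does not repeat an entry.
    if bullet in section_entries:
        return list(section_entries)
    return [bullet] + list(section_entries)


integrations = [
    "  • Lightdash - Open source Looker alternative with deep dbt integration.",
    "  • Superset - Open source visualization layer for your Modern Data Stack.",
]
# "my-tool" is a made-up entry used only to show where a new item lands.
updated = add_entry(integrations, "my-tool", "Hypothetical new integration.")
print("\n".join(updated))
```

The newest entry ends up as the first bullet of the section, and existing entries keep their relative order.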

Happy contributing!

Contents

Get Started

Courses to get you started with analytics engineering.

How To

A helping hand for setting up integrations and implementing best practices.

Integrations

A collection of known data integrations with dbt.

  • Datafold - Gives a quick printout summary of changes so you can move fast and (not) break stuff!
  • Raycast dbt Metadata - Queries the dbt Cloud API to return some useful information about your models (number of tests, time they took to run etc…).
  • Cube - APIs, Caching, and Access Control on top of dbt Metrics.
  • FlexIt Analytics - Business Intelligence platform with deep dbt Cloud and CLI integration.
  • Raycast dbt Jobs - Raycast integration to monitor dbt Cloud Jobs.
  • Metaplane - Data observability layer on top of your dbt + BI project.
  • Dbt + Machine Learning: What makes a great baton pass? - Landscape of ML utilities around dbt.
  • Soda - Integration of Soda's data observability platform and dbt.
  • Supported Adapters - Officially supported database adapters.
  • Lightdash - Open source Looker alternative with deep dbt integration.
  • Superset - Open source visualization layer for your Modern Data Stack.
  • Dagster and dbt: Better Together - Getting started with the dagster-dbt library.
  • fal - Add multi-language support (Python) to your dbt project.
  • Privacy Dynamics - Anonymize data in your dbt project.
  • prefect-dbt - Collection of Prefect integrations for working with dbt in your Prefect flows.

User Stories

Use cases and user stories implemented by community members using components of the MDS with dbt.

Data Quality

Best practices and extensions of the testing framework.

CI/CD

Get the most out of your product quality and deliver seamlessly.

Orchestration

Resources to manage and maintain dependencies in modern data pipelines.

Utilities

Useful tools and extensions to bump up your analytics engineering workflow.

Packages

Community-developed packages to extend default macros and toolset.

  • dbt_linreg - Linear regression in SQL using dbt.
  • dbt-snowflake-query-tags - Automatically tag dbt-issued queries with informative metadata.
  • snowflake-resource-monitoring - Yet another package to monitor Snowflake usage.
  • usagedata - Provides insights into database- and table-level usage information from Snowflake.
  • dbt_ml - Package for dbt that allows users to train, audit and use BigQuery ML models.
  • ddbt - This repo represents my attempt to build a fast version of DBT which gets very slow on large projects (3000+ data models). This project attempts to be a direct drop in replacement for DBT at the command line.
  • dbt-snowflake-monitoring - A dbt package to help you monitor Snowflake performance and costs.
  • datavault4dbt - Macros for staging and creation of all DataVault-Entities you need, to build your own DataVault2.0 solution.
  • DDO - Perform DataOps & administrative CI/CD on your data warehouse.
  • dbt-yaml-check - Checks that columns defined in YAML also exist in SQL.
  • data-diff - A command-line tool and Python library to efficiently diff rows across two different databases.
  • dbt-project-evaluator - This package highlights areas of a dbt project that are misaligned with dbt Labs' best practices.
  • dbt_constraints - Generate database constraints based on the tests in a dbt project.
  • dbt-date - Date logic and calendar functionality.
  • dbt-privacy - Macros to make it easier to protect your customers' data.
  • dbt-fivetran-utils - General macros and helpers.
  • dbt_metrics - Macros to support secondary calculations and generate business metrics.
  • dbt-metabase - Model synchronization from dbt to Metabase.
  • dbt-coves - CLI tool for generating a scaffold for your dbt project.
  • dbt-profiler - Data profiling and doc block generator.
  • dbt_utils - General macros library. A must-have.
  • dbt_audit_helper - Macros for data audits that compare column values and schemas between tables.
  • dbt-ml-preprocessing - A SQL port of Python's scikit-learn preprocessing module, provided as cross-database dbt macros.
  • dbt-external-tables - Macros to stage your external sources.
  • dbt-feature-store - Macros to build a feature store right within your dbt project.
  • dbt-codegen - Macros that generate dbt code, and log it to the command line.
  • dbt-init - Create a project and populate as much of the dbt project as possible.
  • dbt-artifacts - This package builds a mart of tables from dbt artifacts loaded into a table.
  • dbt-erdiagram-generator - This package generates ERD diagrams from a dbt project.
  • Terraform-dbt Cloud Module - IAC in dbt Cloud via Terraform.
  • dbt2looker - Generate Looker views for dbt models.
  • dbt-coverage - Checks dbt docs and tests coverage.
  • dbt-meta-testing - Yet another coverage testing package.
  • dbt-superset-lineage - Push and pull metadata between dbt and Superset.
  • dbtvault - Package for generating and executing ETL for Data Vault 2.0.
  • dbt-invoke - CLI for creating, updating, and deleting dbt property files.
  • dbt-unit-testing - Package which contains macros to support unit testing.

Community

Conferences, meetups, discussions, newsletters, podcasts, etc. led by fellow analytics engineers, plus forums for getting in touch.

Sample Projects

Sample projects that work out of the box and reflect publicly available use cases.

Contributors

Thanks for all the great resources! Can't see your avatar? Check the contribution guide on how you can submit your resources to the community!
