Fivetran's ad reporting dbt package. Combine your Facebook, Google, Pinterest, LinkedIn, Twitter, Snapchat and Microsoft advertising spend using this package.

Stars: ✭ 68 (-51.08%)

Mutual labels: dbt

get smarties

Dummy variable generation with fit/transform capabilities

Stars: ✭ 23 (-83.45%)

Mutual labels: data-engineering

Gspread Pandas

A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.

Stars: ✭ 226 (+62.59%)

Mutual labels: data-engineering

datart

Datart is a next-generation open platform for data visualization

Stars: ✭ 1,042 (+649.64%)

Mutual labels: data-engineering

Yuniql

Free and open source schema versioning and database migration made natively with .NET Core.

Stars: ✭ 156 (+12.23%)

Mutual labels: data-engineering

big-data-engineering-indonesia

A curated list of big data engineering tools, resources and communities.

Stars: ✭ 26 (-81.29%)

Mutual labels: data-engineering

Data Engineering Howto

A list of useful resources to learn Data Engineering from scratch

Stars: ✭ 2,056 (+1379.14%)

Mutual labels: data-engineering

dbt-airflow-docker-compose

Execution of dbt models using Apache Airflow through Docker Compose

Stars: ✭ 76 (-45.32%)

Mutual labels: dbt

Butterfree

A tool for building feature stores.

Stars: ✭ 126 (-9.35%)

Mutual labels: data-engineering

soda-spark

Soda Spark is a PySpark library that helps you test your data in Spark DataFrames

Stars: ✭ 58 (-58.27%)

Mutual labels: data-engineering

Just Dashboard

📊 📋 Dashboards using YAML or JSON files

Stars: ✭ 1,511 (+987.05%)

Mutual labels: data-engineering

prefect-saturn

Python client for using Prefect Cloud with Saturn Cloud

Stars: ✭ 15 (-89.21%)

Mutual labels: data-engineering

Setl

A simple Spark-powered ETL framework that just works 🍺

Stars: ✭ 79 (-43.17%)

Mutual labels: data-engineering

dbt-databricks

A dbt adapter for Databricks.

Stars: ✭ 115 (-17.27%)

Mutual labels: dbt

Ansible Playbook

Ansible playbook to deploy distributed technologies

Stars: ✭ 61 (-56.12%)

Mutual labels: data-engineering

Waimak

Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.

Stars: ✭ 60 (-56.83%)

Mutual labels: data-engineering

Every Single Day I Tldr

A daily digest of the articles or videos I've found interesting, that I want to share with you.

Stars: ✭ 249 (+79.14%)

Mutual labels: data-engineering

polygon-etl

ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub

Stars: ✭ 53 (-61.87%)

Mutual labels: data-engineering

Ploomber

A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.

Stars: ✭ 221 (+58.99%)

Mutual labels: data-engineering

contessa

Easy way to define, execute and store quality rules for your data.

Stars: ✭ 17 (-87.77%)

Mutual labels: data-engineering

Aws Serverless Data Lake Framework

Enterprise-grade, production-hardened, serverless data lake on AWS

Stars: ✭ 179 (+28.78%)

Mutual labels: data-engineering

dbt2looker

Generate LookML for views from dbt models

Stars: ✭ 119 (-14.39%)

Mutual labels: dbt

Auptimizer

An automatic ML model optimization tool.

Stars: ✭ 166 (+19.42%)

Mutual labels: data-engineering

papilo

DEPRECATED: Stream data processing micro-framework

Stars: ✭ 24 (-82.73%)

Mutual labels: data-engineering

Geni

A Clojure dataframe library that runs on Spark

Stars: ✭ 152 (+9.35%)

Mutual labels: data-engineering

Azure-Certification-DP-200

Road to Azure Data Engineer Part-I: DP-200 - Implementing an Azure Data Solution

Stars: ✭ 54 (-61.15%)

Mutual labels: data-engineering

Gcp Data Engineer Exam

Study materials for the Google Cloud Professional Data Engineering Exam

Stars: ✭ 144 (+3.6%)

Mutual labels: data-engineering

metriql

The metrics layer for your data. Join us at https://metriql.com/slack

Stars: ✭ 227 (+63.31%)

Mutual labels: dbt

Accelerator

The Accelerator is a tool for fast and reproducible processing of large amounts of data.

Stars: ✭ 137 (-1.44%)

Mutual labels: data-engineering

funsies

funsies is a lightweight workflow engine 🔧

Stars: ✭ 37 (-73.38%)

Mutual labels: data-engineering

Pipelinex

PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more

Stars: ✭ 127 (-8.63%)

Mutual labels: data-engineering

lrmr

Less-Resilient MapReduce framework for Go

Stars: ✭ 32 (-76.98%)

Mutual labels: data-engineering

Aws Data Wrangler

Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

Stars: ✭ 2,385 (+1615.83%)

Mutual labels: data-engineering

dbt ml

Package for dbt that allows users to train, audit and use BigQuery ML models.

Stars: ✭ 41 (-70.5%)

Mutual labels: dbt

D6t Python

Accelerate data science

Stars: ✭ 118 (-15.11%)

Mutual labels: data-engineering

etl

[READ-ONLY] PHP - ETL (Extract Transform Load) data processing library