Udacity Data Engineering Projects: A few projects related to Data Engineering, including Data Modeling, infrastructure setup on cloud, Data Warehousing and Data Lake development.
Stars: ✭ 458 (+174.25%)
jobAnalytics and search: JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (-85.03%)
Soda Sql: Metric collection, data testing and monitoring for SQL accessible data
Stars: ✭ 173 (+3.59%)
Data Science Stack Cookiecutter: 🐳📊🤓 Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyter, Superset, Postgres, Minio, Airflow & API Star)
Stars: ✭ 153 (-8.38%)
pipeline: PipelineAI Kubeflow Distribution
Stars: ✭ 4,154 (+2387.43%)
beneath: Beneath is a serverless real-time data platform ⚡️
Stars: ✭ 65 (-61.08%)
viewflow: Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Stars: ✭ 110 (-34.13%)
Quill: Compile-time Language Integrated Queries for Scala
Stars: ✭ 1,998 (+1096.41%)
polygon-etl: ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (-68.26%)
Sqlpad: Web-based SQL editor run in your own private cloud. Supports MySQL, Postgres, SQL Server, Vertica, Crate, ClickHouse, Trino, Presto, SAP HANA, Cassandra, Snowflake, BigQuery, SQLite, and more with ODBC
Stars: ✭ 4,113 (+2362.87%)
Migrate: Database migrations. CLI and Golang library.
Stars: ✭ 7,712 (+4517.96%)
versatile-data-kit: Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (-13.77%)
airflow-dbt-python: A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
Stars: ✭ 111 (-33.53%)
Goodreads etl pipeline: An end-to-end GoodReads data pipeline for building a Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+374.85%)
Migrate: Database migrations. CLI and Golang library.
Stars: ✭ 2,315 (+1286.23%)
AirflowETL: Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-88.02%)
astro: Astro allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
Stars: ✭ 79 (-52.69%)
Beyond Jupyter: 🐍💻📊 All material from the PyCon.DE 2018 Talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. Slides, Video, Udemy MOOC & other References)
Stars: ✭ 135 (-19.16%)
agent: Job tracker & performance platform
Stars: ✭ 26 (-84.43%)
skyline-query: Simple implementation of spatial skyline query algorithms
Stars: ✭ 17 (-89.82%)
factory: Docker microservice & crawler built with Scrapy
Stars: ✭ 56 (-66.47%)
FsCassy: Functional F# API for Cassandra
Stars: ✭ 20 (-88.02%)
cassandra-migration: Apache Cassandra / DataStax Enterprise database migration (schema evolution) library
Stars: ✭ 51 (-69.46%)
wait-for-pg: Check if a PostgreSQL database is ready
Stars: ✭ 22 (-86.83%)
general-angular: Realtime Angular Admin/CRUD Front End App
Stars: ✭ 24 (-85.63%)
scrapy.dart: Scrapy, a fast high-level web crawling & scraping framework for Dart and Flutter
Stars: ✭ 50 (-70.06%)
scraping-ebay: Scraping eBay products using the Scrapy web crawling framework
Stars: ✭ 79 (-52.69%)
scrapy-distributed: A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.
Stars: ✭ 38 (-77.25%)
erdiagram: Entity-Relationship diagram code generator library
Stars: ✭ 28 (-83.23%)
create-fastify-app: A utility that helps you generate a Fastify project or add a plugin to it
Stars: ✭ 53 (-68.26%)
pg migrate: Manage Postgres schema, triggers, procedures, and views
Stars: ✭ 25 (-85.03%)
kubernetes-examples: A bunch of examples of how to deploy things on Kubernetes
Stars: ✭ 34 (-79.64%)
libpq.framework: An Xcode project to compile your own libpq.framework for iOS 11.x
Stars: ✭ 27 (-83.83%)
OLX Scraper: 📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB.
Stars: ✭ 15 (-91.02%)
nim-gatabase: Connection-Pooling Compile-Time ORM for Nim
Stars: ✭ 103 (-38.32%)
cassandra.realtime: Different ways to process data into Cassandra in realtime with technologies such as Kafka, Spark, Akka, Flink
Stars: ✭ 25 (-85.03%)
proxi: Proxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.
Stars: ✭ 32 (-80.84%)
Commando: [DEPRECATED] ⚫ Commando Discord bot built on discord.js-commando.
Stars: ✭ 78 (-53.29%)
pg-error-enum: TypeScript Enum for Postgres Errors with no runtime dependencies. Also compatible with plain JavaScript.
Stars: ✭ 18 (-89.22%)
hiveberg: Demonstration of a Hive Input Format for Iceberg
Stars: ✭ 22 (-86.83%)
metamapper: Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data.
Stars: ✭ 60 (-64.07%)
gallia-core: A schema-aware Scala library for data transformation
Stars: ✭ 44 (-73.65%)
elves: 🎊 Design and implementation of a lightweight crawler framework.
Stars: ✭ 322 (+92.81%)
logparser: A tool for parsing Scrapy log files periodically and incrementally, extending the HTTP JSON API of Scrapyd.
Stars: ✭ 70 (-58.08%)
EFCore.Cassandra: Entity Framework Core provider for Cassandra
Stars: ✭ 23 (-86.23%)