1456 open source projects that are alternatives to or similar to Around Dataengineering

Udacity Data Engineering Projects
A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
Stars: ✭ 458 (+78.21%)
Goodreads etl pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+208.56%)
Mutual labels:  airflow, spark, data-engineering
Toc
A Table of Contents of all Gruntwork Code
Stars: ✭ 111 (-56.81%)
Mutual labels:  devops, infrastructure
Waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-76.65%)
Mutual labels:  spark, data-engineering
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-69.26%)
Mutual labels:  spark, data-engineering
Awesome Devops
A curated list of resources for Devops
Stars: ✭ 697 (+171.21%)
Mutual labels:  devops, infrastructure
Defcon24 Infra Monitoring Workshop
Defcon24 Workshop Contents: Ninja Level Infrastructure Monitoring
Stars: ✭ 104 (-59.53%)
Mutual labels:  devops, infrastructure
Docker practice
Learn and understand Docker technologies, with real DevOps practice!
Stars: ✭ 19,768 (+7591.83%)
Mutual labels:  spark, devops
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+57.98%)
Mutual labels:  spark, devops
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+60.7%)
Mutual labels:  airflow, spark
Airflow Autoscaling Ecs
Airflow Deployment on AWS ECS Fargate Using Cloudformation
Stars: ✭ 136 (-47.08%)
Mutual labels:  airflow, data-engineering
airflow-dbt-python
A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
Stars: ✭ 111 (-56.81%)
Mutual labels:  airflow, data-engineering
Awesome Learning
A curated list of DevOps learning resources. Join the Slack channel to discuss more.
Stars: ✭ 327 (+27.24%)
Mutual labels:  devops, infrastructure
Ansible For Kubernetes
Ansible and Kubernetes examples from Ansible for Kubernetes Book
Stars: ✭ 389 (+51.36%)
Mutual labels:  devops, infrastructure
AirflowDataPipeline
Example of an ETL Pipeline using Airflow
Stars: ✭ 24 (-90.66%)
Mutual labels:  airflow, data-engineering
Chef
Chef Infra, a powerful automation platform that transforms infrastructure into code automating how infrastructure is configured, deployed and managed across any environment, at any scale
Stars: ✭ 6,766 (+2532.68%)
Mutual labels:  devops, infrastructure
Minicron
🕰️ Monitor your cron jobs
Stars: ✭ 2,351 (+814.79%)
Mutual labels:  devops, infrastructure
Prefect
The easiest way to automate your data
Stars: ✭ 7,956 (+2995.72%)
Mutual labels:  data-engineering, infrastructure
Dockerfiles
50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu
Stars: ✭ 847 (+229.57%)
Mutual labels:  spark, devops
Terrascan
Detect compliance and security violations across Infrastructure as Code to mitigate risk before provisioning cloud native infrastructure.
Stars: ✭ 2,687 (+945.53%)
Mutual labels:  devops, infrastructure
Every Single Day I Tldr
A daily digest of the articles or videos I've found interesting, that I want to share with you.
Stars: ✭ 249 (-3.11%)
Mutual labels:  spark, data-engineering
Feast
Feature Store for Machine Learning
Stars: ✭ 2,576 (+902.33%)
Mutual labels:  spark, data-engineering
Airflow Pipeline
An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (-50.19%)
Mutual labels:  airflow, spark
Dataspherestudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+364.98%)
Mutual labels:  airflow, spark
awesome-open-mlops
The Fuzzy Labs guide to the universe of open source MLOps
Stars: ✭ 304 (+18.29%)
Mutual labels:  infrastructure, datascience
viewflow
Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Stars: ✭ 110 (-57.2%)
Mutual labels:  airflow, data-engineering
Data-Engineering-Projects
Personal Data Engineering Projects
Stars: ✭ 167 (-35.02%)
Mutual labels:  airflow, data-engineering
k3ai
A lightweight tool to get an AI Infrastructure Stack up in minutes, not days. K3ai will take care of setting up K8s for you, deploy the AI tool of your choice, and even run your code on it.
Stars: ✭ 105 (-59.14%)
Mutual labels:  airflow, datascience
jobAnalytics and search
JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (-90.27%)
Mutual labels:  airflow, data-engineering
Howtheysre
A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)
Stars: ✭ 6,962 (+2608.95%)
Mutual labels:  devops, infrastructure
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (-79.38%)
Mutual labels:  airflow, data-engineering
Cintodeutilidadesdocker
My Docker templates repository 🐳 ☁️ 🐳
Stars: ✭ 74 (-71.21%)
Mutual labels:  devops, infrastructure
Sceptre
Build better AWS infrastructure
Stars: ✭ 1,160 (+351.36%)
Mutual labels:  devops, infrastructure
Terraform Multienv
A template for maintaining a multi-environment infrastructure with Terraform. This template includes a CI/CD process that applies the infrastructure in an AWS account.
Stars: ✭ 107 (-58.37%)
Mutual labels:  devops, infrastructure
funsies
funsies is a lightweight workflow engine 🔧
Stars: ✭ 37 (-85.6%)
Mutual labels:  infrastructure, data-engineering
Opunit
🕵️‍♂️ Sanity checking containers, vms, and servers
Stars: ✭ 176 (-31.52%)
Mutual labels:  devops, infrastructure
Terrahub
Terraform Automation and Orchestration Tool (Open Source)
Stars: ✭ 148 (-42.41%)
Mutual labels:  devops, infrastructure
Ansible Playbook
Ansible playbook to deploy distributed technologies
Stars: ✭ 61 (-76.26%)
Mutual labels:  data-engineering, devops
Mitogen
Distributed self-replicating programs in Python
Stars: ✭ 1,779 (+592.22%)
Mutual labels:  devops, infrastructure
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+146.3%)
Mutual labels:  spark, data-engineering
Pointblank
Data validation and organization of metadata for data frames and database tables
Stars: ✭ 480 (+86.77%)
Mutual labels:  spark, data-engineering
Ds Cheatsheets
List of Data Science Cheatsheets to rule the world
Stars: ✭ 9,452 (+3577.82%)
Mutual labels:  spark, datascience
Openuba
A robust and flexible open source User & Entity Behavior Analytics (UEBA) framework used for Security Analytics. Developed with luv by Data Scientists & Security Analysts from the Cyber Security Industry. [PRE-ALPHA]
Stars: ✭ 127 (-50.58%)
Mutual labels:  spark, datascience
Geni
A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (-40.86%)
Mutual labels:  spark, data-engineering
Data science blogs
A repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (-45.91%)
Mutual labels:  spark, datascience
openverse-catalog
Identifies and collects data on CC-licensed content across web crawl data and public APIs.
Stars: ✭ 27 (-89.49%)
Mutual labels:  airflow, spark
Udacity Data Engineering
Udacity Data Engineering Nano Degree (DEND)
Stars: ✭ 89 (-65.37%)
Mutual labels:  airflow, spark
Dataengineeringproject
Example end to end data engineering project.
Stars: ✭ 82 (-68.09%)
Mutual labels:  airflow, data-engineering
Soda Sql
Metric collection, data testing and monitoring for SQL accessible data
Stars: ✭ 173 (-32.68%)
Mutual labels:  airflow, data-engineering
Spark Alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-52.53%)
Mutual labels:  spark, data-engineering
AirflowETL
Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-92.22%)
Mutual labels:  airflow, data-engineering
ODSC India 2018
My presentation at ODSC India 2018 about Deep Learning with Apache Spark
Stars: ✭ 26 (-89.88%)
Mutual labels:  spark, datascience
bigkube
Minikube for big data with Scala and Spark
Stars: ✭ 16 (-93.77%)
Mutual labels:  airflow, spark
Covid19Tracker
A Robinhood-style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.
Stars: ✭ 65 (-74.71%)
Mutual labels:  spark
helpdesk
Yet another helpdesk based on multiple providers
Stars: ✭ 14 (-94.55%)
Mutual labels:  airflow
redis-inventory
CLI tool to see Redis memory usage by keys in a hierarchical way. Think of disk inventory, but for Redis.
Stars: ✭ 163 (-36.58%)
Mutual labels:  infrastructure
ycsm
A quick installation script for a resilient redirector using an nginx reverse proxy and Let's Encrypt, compatible with some popular Post-Ex Tools (Cobalt Strike, Empire, Metasploit, PoshC2).
Stars: ✭ 73 (-71.6%)
Mutual labels:  infrastructure
Youtube Videos
Documentation for Techno Tim YouTube Videos
Stars: ✭ 250 (-2.72%)
Mutual labels:  devops
infra
🚀 INFRA: your infrastructure as a GraphQL service
Stars: ✭ 48 (-81.32%)
Mutual labels:  infrastructure
blog
blog entries
Stars: ✭ 39 (-84.82%)
Mutual labels:  spark