8704 open source projects that are alternatives to or similar to Udacity Data Engineering

Luigi Warehouse
A luigi powered analytics / warehouse stack
Stars: ✭ 72 (-19.1%)
Mutual labels:  aws, spark, etl, postgresql, redshift
Goodreads etl pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+791.01%)
Mutual labels:  s3, airflow, spark, redshift
Locopy
locopy: Loading/Unloading to Redshift and Snowflake using Python.
Stars: ✭ 73 (-17.98%)
Mutual labels:  aws, s3, etl, redshift
Aws Ecs Airflow
Run Airflow in AWS ECS (Elastic Container Service) using Fargate tasks
Stars: ✭ 107 (+20.22%)
Mutual labels:  aws, airflow, etl
Spark Jupyter Aws
A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (+191.01%)
Mutual labels:  aws, jupyter-notebook, spark
Awesome Aws
A curated list of awesome Amazon Web Services (AWS) libraries, open source repos, guides, blogs, and other resources. Featuring the Fiery Meter of AWSome.
Stars: ✭ 9,895 (+11017.98%)
Mutual labels:  aws, s3, redshift
Redash
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Stars: ✭ 20,147 (+22537.08%)
Mutual labels:  spark, postgresql, redshift
Storagetapper
StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service
Stars: ✭ 232 (+160.67%)
Mutual labels:  s3, etl, postgresql
Data Engineering Nanodegree
Projects done in the Data Engineering Nanodegree by Udacity.com
Stars: ✭ 151 (+69.66%)
Mutual labels:  aws, jupyter-notebook, cassandra
astro
Astro allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
Stars: ✭ 79 (-11.24%)
Mutual labels:  airflow, etl, s3
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+2579.78%)
Mutual labels:  aws, etl, redshift
Firecamp
Serverless Platform for the stateful services
Stars: ✭ 194 (+117.98%)
Mutual labels:  aws, postgresql, cassandra
Dataspherestudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+1242.7%)
Mutual labels:  airflow, spark, etl
Deploy Strapi On Aws
Deploying a Strapi API on AWS (EC2 & RDS & S3)
Stars: ✭ 121 (+35.96%)
Mutual labels:  aws, s3, postgresql
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+364.04%)
Mutual labels:  airflow, jupyter-notebook, spark
jobAnalytics and search
JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (-71.91%)
Mutual labels:  airflow, s3, redshift
Udacity Data Engineering Projects
A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
Stars: ✭ 458 (+414.61%)
Mutual labels:  aws, airflow, cassandra
Dev Setup
macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.
Stars: ✭ 5,590 (+6180.9%)
Mutual labels:  aws, spark, postgresql
Awslib scala
An idiomatic Scala wrapper around the AWS Java SDK
Stars: ✭ 20 (-77.53%)
Mutual labels:  aws, s3
Aws Auto Terminate Idle Emr
AWS Auto Terminate Idle AWS EMR Clusters Framework is an AWS-based solution that uses AWS CloudWatch and AWS Lambda, with a Python script that uses Boto3, to terminate AWS EMR clusters that have been idle for a specified period of time.
Stars: ✭ 21 (-76.4%)
Mutual labels:  aws, etl
Tbls
tbls is a CI-friendly tool for documenting a database, written in Go.
Stars: ✭ 940 (+956.18%)
Mutual labels:  postgresql, redshift
Dropdot
☁️ Direct Upload to Amazon S3 With CORS demo. Built with Node/Express
Stars: ✭ 87 (-2.25%)
Mutual labels:  aws, s3
Sparkmagic
Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+971.91%)
Mutual labels:  jupyter-notebook, spark
Vagrant Projects
Vagrant projects for various use-cases with Spark, Zeppelin, IPython / Jupyter, SparkR
Stars: ✭ 34 (-61.8%)
Mutual labels:  spark, cassandra
Terraform Aws Redshift
Terraform module which creates Redshift resources on AWS
Stars: ✭ 36 (-59.55%)
Mutual labels:  aws, redshift
Nagios Plugins
450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...
Stars: ✭ 1,000 (+1023.6%)
Mutual labels:  aws, cassandra
Panther
Detect threats with log data and improve cloud security posture
Stars: ✭ 885 (+894.38%)
Mutual labels:  aws, etl
Tedsds
Apache Spark - Turbofan Engine Degradation Simulation Data Set example in Apache Spark
Stars: ✭ 14 (-84.27%)
Mutual labels:  jupyter-notebook, spark
Workshop Donkeytracker
Workshop to build a serverless tracking application for your mobile device with an AWS backend
Stars: ✭ 27 (-69.66%)
Mutual labels:  aws, s3
Dockerfiles
50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu
Stars: ✭ 847 (+851.69%)
Mutual labels:  spark, cassandra
Objinsync
Continuously synchronize directories from remote object store to local filesystem
Stars: ✭ 29 (-67.42%)
Mutual labels:  s3, airflow
Ethereum Etl
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
Stars: ✭ 956 (+974.16%)
Mutual labels:  aws, etl
Aws S3 Scala
Scala client for Amazon S3
Stars: ✭ 35 (-60.67%)
Mutual labels:  aws, s3
S3 Deploy Website
Deploy website to S3/CloudFront from Python
Stars: ✭ 26 (-70.79%)
Mutual labels:  aws, s3
Spark python ml examples
Spark 2.0 Python Machine Learning examples
Stars: ✭ 87 (-2.25%)
Mutual labels:  aws, spark
Pixiedust
Python Helper library for Jupyter Notebooks
Stars: ✭ 998 (+1021.35%)
Mutual labels:  jupyter-notebook, spark
Airflow On Kubernetes
Bare minimal Airflow on Kubernetes (Local, EKS, AKS)
Stars: ✭ 38 (-57.3%)
Mutual labels:  aws, airflow
Ether sql
A python library to push ethereum blockchain data into an sql database.
Stars: ✭ 41 (-53.93%)
Mutual labels:  etl, postgresql
Aws Data Replication Hub
Seamless User Interface for replicating data into AWS.
Stars: ✭ 40 (-55.06%)
Mutual labels:  aws, s3
Simple S3 Setup
Code examples used in the post "How to Setup Amazon S3 in a Django Project"
Stars: ✭ 46 (-48.31%)
Mutual labels:  aws, s3
Ddlparse
DDL parse and convert to BigQuery JSON schema and DDL statements
Stars: ✭ 52 (-41.57%)
Mutual labels:  postgresql, redshift
Aws Testing Library
Chai (https://chaijs.com) and Jest (https://jestjs.io/) assertions for testing services built with aws
Stars: ✭ 52 (-41.57%)
Mutual labels:  aws, s3
Aws Utilities
Docker images and scripts to deploy to AWS
Stars: ✭ 52 (-41.57%)
Mutual labels:  aws, s3
Scrapy S3pipeline
Scrapy pipeline to store chunked items into Amazon S3 or Google Cloud Storage bucket.
Stars: ✭ 57 (-35.96%)
Mutual labels:  aws, s3
Spark Tdd Example
A simple Spark TDD example
Stars: ✭ 23 (-74.16%)
Mutual labels:  jupyter-notebook, spark
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+1007.87%)
Mutual labels:  jupyter-notebook, spark
Kiba Plus
Kiba enhancement for Ruby ETL.
Stars: ✭ 47 (-47.19%)
Mutual labels:  etl, postgresql
Dbbench
🏋️ dbbench is a simple database benchmarking tool which supports several databases and own scripts
Stars: ✭ 52 (-41.57%)
Mutual labels:  postgresql, cassandra
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-34.83%)
Mutual labels:  s3, spark
Pysparkgeoanalysis
🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-29.21%)
Mutual labels:  jupyter-notebook, spark
S3reverse
Converts the various S3 bucket formats into one format. For bug bounty and security testing.
Stars: ✭ 61 (-31.46%)
Mutual labels:  aws, s3
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-28.09%)
Mutual labels:  jupyter-notebook, spark
React Deploy S3
Deploy Create React App projects to AWS S3
Stars: ✭ 66 (-25.84%)
Mutual labels:  aws, s3
Data Science Cookbook
🎓 Jupyter notebooks from UFC data science course
Stars: ✭ 60 (-32.58%)
Mutual labels:  jupyter-notebook, spark
Terraform Aws S3 Log Storage
This module creates an S3 bucket suitable for receiving logs from other AWS services such as S3, CloudFront, and CloudTrail
Stars: ✭ 65 (-26.97%)
Mutual labels:  aws, s3
S3 Blob Store
☁️ Amazon S3 blob-store
Stars: ✭ 66 (-25.84%)
Mutual labels:  aws, s3
Etl with python
ETL with Python - Taught at DWH course 2017 (TAU)
Stars: ✭ 68 (-23.6%)
Mutual labels:  jupyter-notebook, etl
Cloud Security Audit
A command line security audit tool for Amazon Web Services
Stars: ✭ 68 (-23.6%)
Mutual labels:  aws, s3
Sql Runner
Run templatable playbooks of SQL scripts in series and parallel on Redshift, PostgreSQL, BigQuery and Snowflake
Stars: ✭ 68 (-23.6%)
Mutual labels:  postgresql, redshift
Aws Inventory
Python script for AWS resources inventory (cheaper than AWS Config)
Stars: ✭ 69 (-22.47%)
Mutual labels:  aws, s3