Goodreads etl pipeline: An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
Stars: ✭ 793 (+3072%)
AirflowETL: Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-20%)
career-resources: Some SWE/PM/Designer-related career resources for students
Stars: ✭ 154 (+516%)
collector: A job board data collector
Stars: ✭ 27 (+8%)
ob bulkstash: Bulk Stash is a Docker rclone service to sync or copy files between different storage services. For example, you can copy files to or from a remote storage service, such as from Amazon S3 to Google Cloud Storage, or from your laptop to remote storage.
Stars: ✭ 113 (+352%)
polygon-etl: ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (+112%)
DataEngineering: This repo contains commands that data engineers use in day-to-day work.
Stars: ✭ 47 (+88%)
astro: Astro allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
Stars: ✭ 79 (+216%)
counter-interview.dev: A collaborative collection of interview questions gathered from both sides of the game: interviewers and interviewees.
Stars: ✭ 102 (+308%)
js jobs bot: JS jobs search Telegram channel
Stars: ✭ 24 (-4%)
Aws Data Wrangler: Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+9440%)
Data Engineering Howto: A list of useful resources to learn Data Engineering from scratch
Stars: ✭ 2,056 (+8124%)
Locopy: Loading/Unloading to Redshift and Snowflake using Python.
Stars: ✭ 73 (+192%)
vagas: Job board for Android developers.
Stars: ✭ 748 (+2892%)
Pyspark Example Project: Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+2432%)
Cluster Pack: A library on top of either pex or conda-pack to make your Python code easily available on a cluster
Stars: ✭ 23 (-8%)
Awesome Aws: A curated list of awesome Amazon Web Services (AWS) libraries, open source repos, guides, blogs, and other resources. Featuring the Fiery Meter of AWSome.
Stars: ✭ 9,895 (+39480%)
Udacity Data Engineering Projects: A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
Stars: ✭ 458 (+1732%)
Objinsync: Continuously synchronize directories from a remote object store to the local filesystem
Stars: ✭ 29 (+16%)
Soda Sql: Metric collection, data testing and monitoring for SQL-accessible data
Stars: ✭ 173 (+592%)
airflow-dbt-python: A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
Stars: ✭ 111 (+344%)
aws-pdf-textract-pipeline: 🔍 Data pipeline for crawling PDFs from the Web and transforming their contents into structured data using AWS Textract. Built with AWS CDK + TypeScript
Stars: ✭ 141 (+464%)
udacity-data-eng-proj2: A production-grade data pipeline designed to automate the parsing of user search patterns to analyze user engagement. It extracts data from S3, applies a series of transformations, and loads the results into S3 and Redshift.
Stars: ✭ 25 (+0%)
viewflow: Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Stars: ✭ 110 (+340%)
saisoku: Saisoku is a Python module that helps you build complex pipelines of batch file/directory transfer/sync jobs.
Stars: ✭ 40 (+60%)
datalake-etl-pipeline: Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Stars: ✭ 39 (+56%)
growthbook: Open Source Feature Flagging and A/B Testing Platform
Stars: ✭ 2,342 (+9268%)
tellery: Tellery lets you build metrics using SQL and bring them to your team. As easy as using a document. As powerful as a data modeling tool.
Stars: ✭ 219 (+776%)
Yuniql: Free and open source schema versioning and database migration made natively with .NET Core.
Stars: ✭ 156 (+524%)
Remote Jobs: A list of semi to fully remote-friendly companies (jobs) in tech.
Stars: ✭ 17,863 (+71352%)
Butterfree: A tool for building feature stores.
Stars: ✭ 126 (+404%)
Foundatio: Pluggable foundation blocks for building distributed apps.
Stars: ✭ 1,365 (+5360%)
soda-spark: Soda Spark is a PySpark library that helps you test your data in Spark DataFrames
Stars: ✭ 58 (+132%)
opentrials-airflow: Configuration and definitions of Airflow for OpenTrials
Stars: ✭ 18 (-28%)
dbt-ml-preprocessing: A SQL port of Python's scikit-learn preprocessing module, provided as cross-database dbt macros.
Stars: ✭ 128 (+412%)
gallia-core: A schema-aware Scala library for data transformation
Stars: ✭ 44 (+76%)
junior.guru: Learn to code and get your first job in tech 🐣
Stars: ✭ 27 (+8%)
LetsHack: Notes and how-tos covering the Raspberry Pi, Arduino, ESP8266, ESP32, etc.
Stars: ✭ 37 (+48%)
node-redshift: A simple collection of tools to help you get started with Amazon Redshift from Node.js
Stars: ✭ 66 (+164%)
gozeit: GoZeit
Stars: ✭ 19 (-24%)
s3-proxy: S3 Reverse Proxy with GET, PUT and DELETE methods and authentication (OpenID Connect and Basic Auth)
Stars: ✭ 106 (+324%)
s3storage: Simple Rails plugin that makes it easy to store uploaded files on Amazon S3
Stars: ✭ 15 (-40%)
Dive-Into-AWS: Links to the Repos and Sections in our Dive into AWS Course.
Stars: ✭ 27 (+8%)
vagas: 💼 Are you a dev? DevOps? Any good? Want to work with lots of technology and challenges? Come find your match!
Stars: ✭ 21 (-16%)
jobor: A high-performance distributed task scheduling system with second-level granularity that supports multi-protocol task scheduling
Stars: ✭ 52 (+108%)
vagas: Job openings and companies actively hiring Clojure developers in Brazil
Stars: ✭ 75 (+200%)
herd-mdl: Herd-MDL, a turnkey managed data lake in the cloud. See https://finraos.github.io/herd-mdl/ for more information.
Stars: ✭ 11 (-56%)
terraform-aws-cloudtrail: Terraform module to provision an AWS CloudTrail and an encrypted S3 bucket with versioning to store CloudTrail logs
Stars: ✭ 78 (+212%)
hiring-system: CodeCareer is seeking core contributors to take the lead on this project.
Stars: ✭ 16 (-36%)
firehoser: A wrapper around AWS Kinesis Firehose with retry logic and custom queuing behavior. Requires node >= 6.0.0
Stars: ✭ 22 (-12%)