pyjanitor: Clean APIs for data cleaning. Python implementation of R package Janitor
Stars: ✭ 970 (+5962.5%)
Mutual labels: data-engineering
pangeo-forge-recipes: Python library for building Pangeo Forge recipes.
Stars: ✭ 64 (+300%)
Mutual labels: data-engineering
hamilton: A scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
Stars: ✭ 612 (+3725%)
Mutual labels: data-engineering
gallia-core: A schema-aware Scala library for data transformation
Stars: ✭ 44 (+175%)
Mutual labels: data-engineering
DataEngineering: This repo contains commands that data engineers use in day-to-day work.
Stars: ✭ 47 (+193.75%)
Mutual labels: data-engineering
uptasticsearch: An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (+193.75%)
Mutual labels: data-engineering
beneath: Beneath is a serverless real-time data platform ⚡️
Stars: ✭ 65 (+306.25%)
Mutual labels: data-engineering
jobAnalytics and search: JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (+56.25%)
Mutual labels: data-engineering
yt-channels-DS-AI-ML-CS: A comprehensive list of 180+ YouTube Channels for Data Science, Data Engineering, Machine Learning, Deep learning, Computer Science, programming, software engineering, etc.
Stars: ✭ 1,038 (+6387.5%)
Mutual labels: data-engineering
neon-workshop: A Pachyderm deep learning tutorial for conference workshops
Stars: ✭ 19 (+18.75%)
Mutual labels: data-engineering
versatile-data-kit: Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (+800%)
Mutual labels: data-engineering
mpc-DL-controller: Deep Neural Network architecture as a predictive optimal controller for {HVAC+Solar cell + battery} disturbance afflicted system vs classic Model Predictive Control
Stars: ✭ 37 (+131.25%)
Mutual labels: data-engineering
h4sci-course: ETH PhD Program course
Stars: ✭ 19 (+18.75%)
Mutual labels: data-engineering
arthur-redshift-etl: ELT Code for your Data Warehouse
Stars: ✭ 22 (+37.5%)
Mutual labels: data-engineering
dbt-sugar: dbt-sugar is a CLI tool that allows users of dbt to have fun and ease performing actions around dbt models
Stars: ✭ 139 (+768.75%)
Mutual labels: data-engineering
ml-in-production: The practical use-cases of how to make your Machine Learning Pipelines robust and reliable using Apache Airflow.
Stars: ✭ 29 (+81.25%)
Mutual labels: data-engineering
AirflowDataPipeline: Example of an ETL Pipeline using Airflow
Stars: ✭ 24 (+50%)
Mutual labels: data-engineering
growthbook: Open Source Feature Flagging and A/B Testing Platform
Stars: ✭ 2,342 (+14537.5%)
Mutual labels: data-engineering
Kaggle-project-list: Summary of my projects on Kaggle
Stars: ✭ 20 (+25%)
Mutual labels: data-engineering