xarray-beam: Distributed Xarray with Apache Beam
Stars: ✭ 83 (+29.69%)
Mutual labels: xarray, zarr
AirflowETL: Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-68.75%)
Mutual labels: etl, data-engineering
Aws Data Wrangler: Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+3626.56%)
Mutual labels: etl, data-engineering
uptasticsearch: An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (-26.56%)
Mutual labels: etl, data-engineering
Butterfree: A tool for building feature stores.
Stars: ✭ 126 (+96.88%)
Mutual labels: etl, data-engineering
etl: [READ-ONLY] PHP - ETL (Extract Transform Load) data processing library
Stars: ✭ 279 (+335.94%)
Mutual labels: etl, data-engineering
Dataform: Dataform is a framework for managing SQL based data operations in BigQuery, Snowflake, and Redshift
Stars: ✭ 342 (+434.38%)
Mutual labels: etl, data-engineering
blockchain-etl-streaming: Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (-10.94%)
Mutual labels: etl, data-engineering
polygon-etl: ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (-17.19%)
Mutual labels: etl, data-engineering
Setl: A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (+23.44%)
Mutual labels: etl, data-engineering
versatile-data-kit: Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (+125%)
Mutual labels: etl, data-engineering
Sayn: Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (+23.44%)
Mutual labels: etl, data-engineering
Airbyte: Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+7585.94%)
Mutual labels: etl, data-engineering
Pyspark Example Project: Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+889.06%)
Mutual labels: etl, data-engineering
hive-metastore-client: A client for connecting and running DDLs on hive metastore.
Stars: ✭ 37 (-42.19%)
Mutual labels: etl, data-engineering
etl manager: A python package to create a database on the platform using our moj data warehousing framework
Stars: ✭ 14 (-78.12%)
Mutual labels: etl, data-engineering
Benthos: Fancy stream processing made operationally mundane
Stars: ✭ 3,705 (+5689.06%)
Mutual labels: etl, data-engineering
morph-kgc: Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
Stars: ✭ 77 (+20.31%)
Mutual labels: etl, data-engineering
hamilton: A scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
Stars: ✭ 612 (+856.25%)
Mutual labels: etl, data-engineering