Top 39 etl-framework open source projects

Etlbox
A lightweight ETL (extract, transform, load) library and data integration toolbox for .NET.
Bender
Bender - Serverless ETL Framework
Logstash
Logstash - transport and process your logs, events, or other data
Hydrograph
A visual ETL development and debugging tool for big data
Openkettlewebui
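A Kettle-based web platform for scheduling and controlling data processing. It supports both file repositories and database repositories, controls Kettle data transformations through the web platform, and can be integrated into existing systems as middleware.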
Waterdrop
Production Ready Data Integration Product, documentation:
Hale
(Spatial) data harmonisation with hale studio (formerly HUMBOLDT Alignment Editor)
Dig Etl Engine
Download DIG to run on your laptop or server.
Globalbioticinteractions
Global Biotic Interactions provides access to existing species interaction datasets
Stetl
Stetl, Streaming ETL, is a lightweight geospatial processing and ETL framework written in Python.
Goodreads etl pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Getting Started
This repository is a getting started guide to Singer.
Etlalchemy
Extract, Transform, Load: Any SQL Database in 4 lines of Code.
Choetl
ETL Framework for .NET / C# (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)
Metorikku
A simplified, lightweight ETL Framework based on Apache Spark
Noflo
Flow-based programming for JavaScript
hotsub
Command line tool to run batch jobs concurrently with ETL framework on AWS or other cloud computing resources
ETL-Starter-Kit
📁 Extract, Transform, Load (ETL) 👷 refers to a process in database usage and especially in data warehousing. This repository contains a starter kit featuring ETL related work.
cubetl
CubETL - Framework and tool for data ETL (Extract, Transform and Load) in Python (PERSONAL PROJECT / SELDOM MAINTAINED)
DaFlow
Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
etlflow
EtlFlow is an ecosystem of functional libraries in Scala based on ZIO for writing various tasks and jobs on GCP and AWS.
hamilton
A scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
OpenKettleWebUI
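A Kettle-based web platform for scheduling and controlling data processing. It supports both file repositories and database repositories, controls Kettle data transformations through the web platform, and can be integrated into existing systems as middleware.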
csvplus
csvplus extends the standard Go encoding/csv package with fluent interface, lazy stream operations, indices and joins.
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
link-move
A model-driven dynamically-configurable framework to acquire data from external sources and save it to your database.
DIRECT
DIRECT, the Data Integration Run-time Execution Control Tool, is a data logistics framework that can be used to monitor, log, audit and control data integration / ETL processes.