hamiltonA scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
Stars: ✭ 612 (+3725%)
DaFlowApache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple categories of transformation rules.
Stars: ✭ 24 (+50%)
EtlboxA lightweight ETL (extract, transform, load) library and data integration toolbox for .NET.
Stars: ✭ 203 (+1168.75%)
etlflowEtlFlow is an ecosystem of functional libraries in Scala based on ZIO for writing various different tasks, jobs on GCP and AWS.
Stars: ✭ 38 (+137.5%)
csvpluscsvplus extends the standard Go encoding/csv package with fluent interface, lazy stream operations, indices and joins.
Stars: ✭ 67 (+318.75%)
TransformalizeConfigurable Extract, Transform, and Load
Stars: ✭ 125 (+681.25%)
Pyetlpython ETL framework
Stars: ✭ 33 (+106.25%)
Mara PipelinesA lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
Stars: ✭ 1,841 (+11406.25%)
OpenKettleWebUI一款基于kettle的数据处理web调度控制平台,支持文档资源库和数据库资源库,通过web平台控制kettle数据转换,可作为中间件集成到现有系统中
Stars: ✭ 138 (+762.5%)
Openkettlewebui一款基于kettle的数据处理web调度控制平台,支持文档资源库和数据库资源库,通过web平台控制kettle数据转换,可作为中间件集成到现有系统中
Stars: ✭ 125 (+681.25%)
datalake-etl-pipelineSimplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (+143.75%)
vixtractwww.vixtract.ru
Stars: ✭ 40 (+150%)
morph-kgcPowerful RDF Knowledge Graph Generation with [R2]RML Mappings
Stars: ✭ 77 (+381.25%)
Metlmito ETL tool
Stars: ✭ 153 (+856.25%)
maxwell-sinkconsume maxwell generated message from kafka,export it to another mysql.
Stars: ✭ 16 (+0%)
StetlStetl, Streaming ETL, is a lightweight geospatial processing and ETL framework written in Python.
Stars: ✭ 64 (+300%)
HydrographA visual ETL development and debugging tool for big data
Stars: ✭ 144 (+800%)
qweryA SQL-like language for performing ETL transformations.
Stars: ✭ 28 (+75%)
ChoetlETL Framework for .NET / c# (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)
Stars: ✭ 372 (+2225%)
EtlalchemyExtract, Transform, Load: Any SQL Database in 4 lines of Code.
Stars: ✭ 460 (+2775%)
Hale(Spatial) data harmonisation with hale studio (formerly HUMBOLDT Alignment Editor)
Stars: ✭ 84 (+425%)
Getting StartedThis repository is a getting started guide to Singer.
Stars: ✭ 734 (+4487.5%)
AirbyteAirbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+30643.75%)
BenderBender - Serverless ETL Framework
Stars: ✭ 171 (+968.75%)
cubetlCubETL - Framework and tool for data ETL (Extract, Transform and Load) in Python (PERSONAL PROJECT / SELDOM MAINTAINED)
Stars: ✭ 21 (+31.25%)
link-moveA model-driven dynamically-configurable framework to acquire data from external sources and save it to your database.
Stars: ✭ 32 (+100%)
MetorikkuA simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (+2156.25%)
mydataharbor🇨🇳 MyDataHarbor是一个致力于解决任意数据源到任意数据源的分布式、高扩展性、高性能、事务级的数据同步中间件。帮助用户可靠、快速、稳定的对海量数据进行准实时增量同步或者定时全量同步,主要定位是为实时交易系统服务,亦可用于大数据的数据同步(ETL领域)。
Stars: ✭ 28 (+75%)
ButterfreeA tool for building feature stores.
Stars: ✭ 126 (+687.5%)
DIRECTDIRECT, the Data Integration Run-time Execution Control Tool, is a data logistics framework that can be used to monitor, log, audit and control data integration / ETL processes.
Stars: ✭ 20 (+25%)
BETL-oldBETL. Meta data driven ETL generation using T-SQL
Stars: ✭ 17 (+6.25%)
django-calaccess-raw-dataA Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State's CAL-ACCESS database
Stars: ✭ 61 (+281.25%)
oopt-taiTIP OOPT - Transponder Abstraction Interface
Stars: ✭ 44 (+175%)
flockFlock: A Low-Cost Streaming Query Engine on FaaS Platforms
Stars: ✭ 232 (+1350%)
waspyWASP framework for Python
Stars: ✭ 43 (+168.75%)
sync-engine-exampleSynchronization Algorithm Exploration: Techniques to synchronize a SQL database with external destinations.
Stars: ✭ 17 (+6.25%)
blockchain-etl-streamingStreaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (+256.25%)
sql-to-redis🔄 Simple tool for ETL. From SQL to Redis.
Stars: ✭ 18 (+12.5%)
starlakeStarlake is a Spark Based On Premise and Cloud ELT/ETL Framework for Batch & Stream Processing
Stars: ✭ 16 (+0%)
TrenitaliaAPIReverse engineering dell'API dell'app di Trenitalia
Stars: ✭ 32 (+100%)
CogStack-NiFiBuilding data processing pipelines for documents processing with NLP using Apache NiFi and related services
Stars: ✭ 22 (+37.5%)
polygon-etlETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (+231.25%)
DQCS数据质量控制系统
Stars: ✭ 34 (+112.5%)
doctoral-thesis📖 Generation and Applications of Knowledge Graphs in Systems and Networks Biology
Stars: ✭ 26 (+62.5%)
public-transit-toolsTools for working with GTFS public transit data in ArcGIS
Stars: ✭ 126 (+687.5%)
proc-thatproc(ess)-that - easy extendable ETL tool for Node.js. Written in TypeScript.
Stars: ✭ 25 (+56.25%)
zinggScalable identity resolution, entity resolution, data mastering and deduplication using ML
Stars: ✭ 655 (+3993.75%)
SchemaMapperA .NET class library that allows you to import data from different sources into a unified destination
Stars: ✭ 41 (+156.25%)
carbon-footprintCalculate your carbon footprint 🏭👣 from food, transport, purchases, fashion, electricity and digital activities like streaming, NFT or blockchain.
Stars: ✭ 59 (+268.75%)
cobrixA COBOL parser and Mainframe/EBCDIC data source for Apache Spark
Stars: ✭ 109 (+581.25%)
zdh web大数据采集,抽取平台
Stars: ✭ 292 (+1725%)
napalm-logsCross-vendor normalisation for network syslog messages, following the OpenConfig and IETF YANG models
Stars: ✭ 131 (+718.75%)
csv-cruncherTreats CSV and JSON files as SQL tables, and exports SQL SELECTs back to CSV or JSON.
Stars: ✭ 32 (+100%)
django-data-migrationData migration framework for Django that migrates legacy data into your new django app
Stars: ✭ 18 (+12.5%)
covid-19Data ETL & Analysis on the global and Mexican datasets of the COVID-19 pandemic.
Stars: ✭ 14 (-12.5%)