vixtract
www.vixtract.ru
Stars: ✭ 40 (-40.3%)
DaFlow
Apache Spark based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (-64.18%)
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Provides a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Stars: ✭ 39 (-41.79%)
etlflow
EtlFlow is an ecosystem of functional libraries in Scala, based on ZIO, for writing various tasks and jobs on GCP and AWS.
Stars: ✭ 38 (-43.28%)
hamilton
A scalable general-purpose micro-framework for defining dataflows. You can use it to create dataframes, NumPy matrices, Python objects, ML models, etc.
Stars: ✭ 612 (+813.43%)
DIRECT
DIRECT, the Data Integration Run-time Execution Control Tool, is a data logistics framework that can be used to monitor, log, audit and control data integration / ETL processes.
Stars: ✭ 20 (-70.15%)
BETL-old
BETL. Metadata-driven ETL generation using T-SQL.
Stars: ✭ 17 (-74.63%)
Go Streams
A lightweight stream processing library for Go
Stars: ✭ 615 (+817.91%)
link-move
A model-driven, dynamically configurable framework to acquire data from external sources and save it to your database.
Stars: ✭ 32 (-52.24%)
Waterdrop
Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+2670.15%)
Hale
(Spatial) data harmonisation with hale studio (formerly HUMBOLDT Alignment Editor)
Stars: ✭ 84 (+25.37%)
qwery
A SQL-like language for performing ETL transformations.
Stars: ✭ 28 (-58.21%)
Smooks
An extensible Java framework for building XML and non-XML streaming applications
Stars: ✭ 293 (+337.31%)
Tuna
🐟 A streaming ETL for fish
Stars: ✭ 11 (-83.58%)
Butterfree
A tool for building feature stores.
Stars: ✭ 126 (+88.06%)
Transformalize
Configurable Extract, Transform, and Load
Stars: ✭ 125 (+86.57%)
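OpenKettleWebUI
A Kettle-based web platform for scheduling and controlling data processing. It supports file repositories and database repositories, controls Kettle data transformations through the web platform, and can be integrated into existing systems as middleware.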
Stars: ✭ 138 (+105.97%)
cubetl
CubETL - Framework and tool for data ETL (Extract, Transform and Load) in Python (PERSONAL PROJECT / SELDOM MAINTAINED)
Stars: ✭ 21 (-68.66%)
DataBridge.NET
Configurable data bridge for permanent ETL jobs
Stars: ✭ 16 (-76.12%)
Hydrograph
A visual ETL development and debugging tool for big data
Stars: ✭ 144 (+114.93%)
Choetl
ETL framework for .NET / C# (parser / writer for CSV, flat, XML, JSON, key-value, Parquet, YAML, and Avro formatted files)
Stars: ✭ 372 (+455.22%)
Etlalchemy
Extract, Transform, Load: Any SQL Database in 4 lines of Code.
Stars: ✭ 460 (+586.57%)
blockchain-etl-streaming
Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (-14.93%)
Pyetl
Python ETL framework
Stars: ✭ 33 (-50.75%)
Getting Started
This repository is a getting started guide to Singer.
Stars: ✭ 734 (+995.52%)
Stetl
Stetl, Streaming ETL, is a lightweight geospatial processing and ETL framework written in Python.
Stars: ✭ 64 (-4.48%)
Riko
A Python stream processing engine modeled after Yahoo! Pipes
Stars: ✭ 1,571 (+2244.78%)
Metl
mito ETL tool
Stars: ✭ 153 (+128.36%)
Bender
Bender - Serverless ETL Framework
Stars: ✭ 171 (+155.22%)
Etlbox
A lightweight ETL (extract, transform, load) library and data integration toolbox for .NET.
Stars: ✭ 203 (+202.99%)
Metorikku
A simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (+438.81%)
Benthos
Fancy stream processing made operationally mundane
Stars: ✭ 3,705 (+5429.85%)
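Openkettlewebui
A Kettle-based web platform for scheduling and controlling data processing. It supports file repositories and database repositories, controls Kettle data transformations through the web platform, and can be integrated into existing systems as middleware.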
Stars: ✭ 125 (+86.57%)
AirflowETL
Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-70.15%)
YaEtl
Yet Another ETL in PHP
Stars: ✭ 60 (-10.45%)
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (-20.9%)
Perfect-Kafka
An Express Swift Client of Apache Kafka 0.8, the Stream Processing Platform
Stars: ✭ 20 (-70.15%)
flock
Flock: A Low-Cost Streaming Query Engine on FaaS Platforms
Stars: ✭ 232 (+246.27%)
ip2location-csv-converter
This PHP script converts the IP2Location CSV database into IP range or CIDR format.
Stars: ✭ 26 (-61.19%)
proc-that
proc(ess)-that - easily extensible ETL tool for Node.js. Written in TypeScript.
Stars: ✭ 25 (-62.69%)
distogram
A library to compute histograms on streaming data in distributed environments
Stars: ✭ 19 (-71.64%)
VBA-CSV
CSV Parser and Writer as VBA functions
Stars: ✭ 26 (-61.19%)
product-sp
An open source, cloud-native streaming data integration and analytics product optimized for agile digital businesses
Stars: ✭ 80 (+19.4%)
mediapipe plus
The purpose of this project is to apply mediapipe to more AI chips.
Stars: ✭ 38 (-43.28%)
vector
A high-performance observability data pipeline.
Stars: ✭ 12,138 (+18016.42%)
zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Stars: ✭ 655 (+877.61%)
dagger
Dagger is an easy-to-use, configuration-over-code, cloud-native framework built on top of Apache Flink for stateful processing of real-time streaming data.
Stars: ✭ 238 (+255.22%)
openrefine-batch
Shell script to run OpenRefine in batch mode (import, transform, export). It orchestrates OpenRefine (server) and a Python client that communicates with the OpenRefine API.
Stars: ✭ 76 (+13.43%)
go-rivers
Collection of stream processing / multiplexing / networking libs in Go
Stars: ✭ 35 (-47.76%)
zdh web
Big data collection and extraction platform
Stars: ✭ 292 (+335.82%)
FlowMaster
ETL flow framework based on YAML configs in Python
Stars: ✭ 19 (-71.64%)
SwiftBuilder
SwiftBuilder is a fast way to assign a new value to a property of an object.
Stars: ✭ 26 (-61.19%)
makinage
Stream Processing Made Easy
Stars: ✭ 31 (-53.73%)
openPDC
Open Source Phasor Data Concentrator
Stars: ✭ 109 (+62.69%)