HydrographA visual ETL development and debugging tool for big data
Stars: ✭ 144 (+350%)
qweryA SQL-like language for performing ETL transformations.
Stars: ✭ 28 (-12.5%)
csvpluscsvplus extends the standard Go encoding/csv package with fluent interface, lazy stream operations, indices and joins.
Stars: ✭ 67 (+109.38%)
DIRECTDIRECT, the Data Integration Run-time Execution Control Tool, is a data logistics framework that can be used to monitor, log, audit and control data integration / ETL processes.
Stars: ✭ 20 (-37.5%)
cubetlCubETL - Framework and tool for data ETL (Extract, Transform and Load) in Python (PERSONAL PROJECT / SELDOM MAINTAINED)
Stars: ✭ 21 (-34.37%)
Metlmito ETL tool
Stars: ✭ 153 (+378.13%)
ButterfreeA tool for building feature stores.
Stars: ✭ 126 (+293.75%)
Getting StartedThis repository is a getting started guide to Singer.
Stars: ✭ 734 (+2193.75%)
EtlboxA lightweight ETL (extract, transform, load) library and data integration toolbox for .NET.
Stars: ✭ 203 (+534.38%)
etlflowEtlFlow is an ecosystem of functional libraries in Scala based on ZIO for writing various different tasks, jobs on GCP and AWS.
Stars: ✭ 38 (+18.75%)
BenderBender - Serverless ETL Framework
Stars: ✭ 171 (+434.38%)
Pyetlpython ETL framework
Stars: ✭ 33 (+3.13%)
MetorikkuA simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (+1028.13%)
Openkettlewebui一款基于kettle的数据处理web调度控制平台,支持文档资源库和数据库资源库,通过web平台控制kettle数据转换,可作为中间件集成到现有系统中
Stars: ✭ 125 (+290.63%)
Hale(Spatial) data harmonisation with hale studio (formerly HUMBOLDT Alignment Editor)
Stars: ✭ 84 (+162.5%)
DaFlowApache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple categories of transformation rules.
Stars: ✭ 24 (-25%)
OpenKettleWebUI一款基于kettle的数据处理web调度控制平台,支持文档资源库和数据库资源库,通过web平台控制kettle数据转换,可作为中间件集成到现有系统中
Stars: ✭ 138 (+331.25%)
EtlalchemyExtract, Transform, Load: Any SQL Database in 4 lines of Code.
Stars: ✭ 460 (+1337.5%)
TransformalizeConfigurable Extract, Transform, and Load
Stars: ✭ 125 (+290.63%)
BETL-oldBETL. Meta data driven ETL generation using T-SQL
Stars: ✭ 17 (-46.87%)
hamiltonA scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
Stars: ✭ 612 (+1812.5%)
datalake-etl-pipelineSimplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (+21.88%)
ChoetlETL Framework for .NET / c# (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)
Stars: ✭ 372 (+1062.5%)
DataBridge.NETConfigurable data bridge for permanent ETL jobs
Stars: ✭ 16 (-50%)
StetlStetl, Streaming ETL, is a lightweight geospatial processing and ETL framework written in Python.
Stars: ✭ 64 (+100%)
vixtractwww.vixtract.ru
Stars: ✭ 40 (+25%)
thainThain is a distributed flow schedule platform.
Stars: ✭ 81 (+153.13%)
Bitcoin EtlETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ
Stars: ✭ 174 (+443.75%)
Linq2dbLinq to database provider.
Stars: ✭ 2,211 (+6809.38%)
AirbyteAirbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+15271.88%)
Usaspending ApiServer application to serve U.S. federal spending data via a RESTful API
Stars: ✭ 166 (+418.75%)
Aws Etl OrchestratorA serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.
Stars: ✭ 245 (+665.63%)
Open Semantic EtlPython based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & Named Entity Recognition) & data enrichment (annotation) pipelines & ingestor to Solr or Elastic search index & linked data graph database
Stars: ✭ 165 (+415.63%)
Etl unicorn数据可视化, 数据挖掘, 数据处理 ETL
Stars: ✭ 156 (+387.5%)
DataX-srcDataX 是异构数据广泛使用的离线数据同步工具/平台,实现包括 MySQL、Oracle、SqlServer、Postgre、HDFS、Hive、ADS、HBase、OTS、ODPS 等各种异构数据源之间高效的数据同步功能。
Stars: ✭ 21 (-34.37%)
krawlerA minimalist (geospatial) ETL
Stars: ✭ 51 (+59.38%)
Example Airflow DagsExample DAGs using hooks and operators from Airflow Plugins
Stars: ✭ 243 (+659.38%)
Mara Example Project 2An example mini data warehouse for python project stats, template for new projects
Stars: ✭ 154 (+381.25%)
ElandPython Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (+634.38%)
Omniparseromniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.
Stars: ✭ 148 (+362.5%)
StoragetapperStorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service
Stars: ✭ 232 (+625%)
Eel SdkBig Data Toolkit for the JVM
Stars: ✭ 140 (+337.5%)
Mara PipelinesA lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
Stars: ✭ 1,841 (+5653.13%)
Etl2pcapngUtility that converts an .etl file containing a Windows network packet capture into .pcapng format.
Stars: ✭ 228 (+612.5%)
Kettle Web基于spring boot通过java代码调用kette
Stars: ✭ 128 (+300%)
Reddit DetectivePlay detective on Reddit: Discover political disinformation campaigns, secret influencers and more
Stars: ✭ 129 (+303.13%)
wikirepoPython based Wikidata framework for easy dataframe extraction
Stars: ✭ 33 (+3.13%)
chronicle-etl📜 A CLI toolkit for extracting and working with your digital history
Stars: ✭ 78 (+143.75%)
NBiNBi is a testing framework (add-on to NUnit) for Business Intelligence and Data Access. The main goal of this framework is to let users create tests with a declarative approach based on an Xml syntax. By the means of NBi, you don't need to develop C# or Java code to specify your tests! Either, you don't need Visual Studio or Eclipse to compile y…
Stars: ✭ 102 (+218.75%)
ElasticR client for the Elasticsearch HTTP API
Stars: ✭ 227 (+609.38%)
Etl.netMass processing data with a complete ETL for .net developers
Stars: ✭ 129 (+303.13%)
Bulk WriterProvides guidance for fast ETL jobs, an IDataReader implementation for SqlBulkCopy (or the MySql or Oracle equivalents) that wraps an IEnumerable, and libraries for mapping entites to table columns.
Stars: ✭ 210 (+556.25%)
AirflowETLBlog post on ETL pipelines with Airflow
Stars: ✭ 20 (-37.5%)
Aws Data WranglerPandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+7353.13%)
RikoA Python stream processing engine modeled after Yahoo! Pipes
Stars: ✭ 1,571 (+4809.38%)
CqlCategorical Query Language IDE
Stars: ✭ 196 (+512.5%)
KibaData processing & ETL framework for Ruby
Stars: ✭ 1,618 (+4956.25%)