DaFlow: Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (-56.36%)
Eel Sdk: Big Data Toolkit for the JVM
Stars: ✭ 140 (+154.55%)
blockchain-etl-streaming: Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (+3.64%)
dbd: dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
Stars: ✭ 30 (-45.45%)
Choetl: ETL framework for .NET / C# (parser / writer for CSV, flat, XML, JSON, key-value, Parquet, YAML, and Avro formatted files)
Stars: ✭ 372 (+576.36%)
id3c: Data logistics system enabling real-time pathogen surveillance. Built for the Seattle Flu Study.
Stars: ✭ 21 (-61.82%)
Butterfree: A tool for building feature stores.
Stars: ✭ 126 (+129.09%)
Bitcoin Etl: ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ
Stars: ✭ 174 (+216.36%)
Riko: A Python stream processing engine modeled after Yahoo! Pipes
Stars: ✭ 1,571 (+2756.36%)
Bulk Writer: Provides guidance for fast ETL jobs, an IDataReader implementation for SqlBulkCopy (or the MySql or Oracle equivalents) that wraps an IEnumerable, and libraries for mapping entities to table columns.
Stars: ✭ 210 (+281.82%)
Usaspending Api: Server application to serve U.S. federal spending data via a RESTful API
Stars: ✭ 166 (+201.82%)
Od: Czech open data
Stars: ✭ 99 (+80%)
Reddit Detective: Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more
Stars: ✭ 129 (+134.55%)
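Openkettlewebui: A Kettle-based web platform for scheduling and controlling data processing. It supports file repositories and database repositories, controls Kettle data transformations through the web platform, and can be integrated into existing systems as middleware.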
Stars: ✭ 125 (+127.27%)
Etl2pcapng: Utility that converts an .etl file containing a Windows network packet capture into .pcapng format.
Stars: ✭ 228 (+314.55%)
Sentinel Crawler: Xenomorph Crawler, a concise, declarative and observable distributed crawler (Node / Go / Java / Rust) for Web, RDB, and OS; it can also act as a monitor (with Prometheus) or as ETL for infrastructure 💫 Multi-language executor, distributed crawler
Stars: ✭ 118 (+114.55%)
Linq2db: Linq to database provider.
Stars: ✭ 2,211 (+3920%)
Kafka Connect: equivalent to kafka-connect 🔧 for nodejs ✨🐢🚀✨
Stars: ✭ 102 (+85.45%)
Aws Etl Orchestrator: A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.
Stars: ✭ 245 (+345.45%)
Etl unicorn: Data visualization, data mining, data processing ETL
Stars: ✭ 156 (+183.64%)
Hale: (Spatial) data harmonisation with hale studio (formerly HUMBOLDT Alignment Editor)
Stars: ✭ 84 (+52.73%)
Sayn: Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (+43.64%)
Cql: Categorical Query Language IDE
Stars: ✭ 196 (+256.36%)
Metl: mito ETL tool
Stars: ✭ 153 (+178.18%)
Dataspherestudio: DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+2072.73%)
Kettle Web: Calls Kettle from Java code, based on Spring Boot
Stars: ✭ 128 (+132.73%)
Metl: Metl is a simple, web-based integration platform that allows for several different styles of data integration including messaging, file based Extract/Transform/Load (ETL), and remote procedure invocation via Web Services. Read more at www.jumpmind.com/products/metl/overview
Stars: ✭ 185 (+236.36%)
Etl.net: Mass data processing with a complete ETL for .NET developers
Stars: ✭ 129 (+134.55%)
Storagetapper: StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service
Stars: ✭ 232 (+321.82%)
Transformalize: Configurable Extract, Transform, and Load
Stars: ✭ 125 (+127.27%)
Grafter: Linked Data & RDF Manufacturing Tools in Clojure
Stars: ✭ 174 (+216.36%)
Aws Data Wrangler: Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+4236.36%)
Kiba: Data processing & ETL framework for Ruby
Stars: ✭ 1,618 (+2841.82%)
Bender: Serverless ETL Framework
Stars: ✭ 171 (+210.91%)
Datax: DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto (Trino), PostgreSQL, SQL Server
Stars: ✭ 116 (+110.91%)
Elastic: R client for the Elasticsearch HTTP API
Stars: ✭ 227 (+312.73%)
Aws Ecs Airflow: Run Airflow in AWS ECS (Elastic Container Service) using Fargate tasks
Stars: ✭ 107 (+94.55%)
Airbyte: Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+8843.64%)
Csv2db: The CSV to database command line loader
Stars: ✭ 102 (+85.45%)
vixtract: www.vixtract.ru
Stars: ✭ 40 (-27.27%)
Open Data Etl Utility Kit: Use Pentaho's open source data integration tool (Kettle) to create Extract-Transform-Load (ETL) processes to update a Socrata open data portal. Documentation is available at http://open-data-etl-utility-kit.readthedocs.io/en/stable
Stars: ✭ 93 (+69.09%)
Open Semantic Etl: Python-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction & named entity recognition) and data enrichment (annotation) pipelines, plus an ingestor to a Solr or Elasticsearch index and a linked data graph database
Stars: ✭ 165 (+200%)
Etl: LinkedPipes ETL is an RDF based, lightweight ETL tool
Stars: ✭ 88 (+60%)
Etlbox: A lightweight ETL (extract, transform, load) library and data integration toolbox for .NET.
Stars: ✭ 203 (+269.09%)
Setl: A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (+43.64%)
Mara Example Project 2: An example mini data warehouse for python project stats, template for new projects
Stars: ✭ 154 (+180%)
Data Story: A visual process builder for Laravel
Stars: ✭ 71 (+29.09%)
Example Airflow Dags: Example DAGs using hooks and operators from Airflow Plugins
Stars: ✭ 243 (+341.82%)
Locopy: Loading/Unloading to Redshift and Snowflake using Python.
Stars: ✭ 73 (+32.73%)
Omniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.
Stars: ✭ 148 (+169.09%)
Luigi Warehouse: A Luigi-powered analytics / warehouse stack
Stars: ✭ 72 (+30.91%)
Transporter: Sync data between persistence engines, like ETL, only not stodgy
Stars: ✭ 1,175 (+2036.36%)
Extract: A cross-platform command line tool for parallelised content extraction and analysis.
Stars: ✭ 188 (+241.82%)
Hydrograph: A visual ETL development and debugging tool for big data
Stars: ✭ 144 (+161.82%)