Etl2pcapng: Utility that converts an .etl file containing a Windows network packet capture into .pcapng format.
Stars: ✭ 228 (+1325%)
starlake: Starlake is a Spark-based on-premise and cloud ELT/ETL framework for batch and stream processing.
Stars: ✭ 16 (+0%)
vixtract: www.vixtract.ru
Stars: ✭ 40 (+150%)
Bulk Writer: Provides guidance for fast ETL jobs, an IDataReader implementation for SqlBulkCopy (or the MySql or Oracle equivalents) that wraps an IEnumerable, and libraries for mapping entities to table columns.
Stars: ✭ 210 (+1212.5%)
Cql: Categorical Query Language IDE
Stars: ✭ 196 (+1125%)
proc-that: proc(ess)-that, an easily extendable ETL tool for Node.js. Written in TypeScript.
Stars: ✭ 25 (+56.25%)
Mongo Es: A MongoDB to Elasticsearch connector
Stars: ✭ 185 (+1056.25%)
Bitcoin Etl: ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ
Stars: ✭ 174 (+987.5%)
singer-runner: A CLI and library to run Singer Taps and Targets
Stars: ✭ 33 (+106.25%)
csv-cruncher: Treats CSV and JSON files as SQL tables, and exports SQL SELECTs back to CSV or JSON.
Stars: ✭ 32 (+100%)
thain: Thain is a distributed flow scheduling platform.
Stars: ✭ 81 (+406.25%)
Usaspending Api: Server application to serve U.S. federal spending data via a RESTful API
Stars: ✭ 166 (+937.5%)
docker-omnidb: OmniDB installed into a Docker container
Stars: ✭ 30 (+87.5%)
siren: Siren provides an easy-to-use, universal framework for managing alerts, notifications, and channels across the entire observability infrastructure.
Stars: ✭ 70 (+337.5%)
Metl: mito ETL tool
Stars: ✭ 153 (+856.25%)
Hydrograph: A visual ETL development and debugging tool for big data
Stars: ✭ 144 (+800%)
cli: Polyaxon Core Client & CLI to streamline MLOps
Stars: ✭ 18 (+12.5%)
Mara Pipelines: A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
Stars: ✭ 1,841 (+11406.25%)
iridium: 💎 Growing collection of VS Code extensions with a fancy name
Stars: ✭ 39 (+143.75%)
Reddit Detective: Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more
Stars: ✭ 129 (+706.25%)
guardian: Guardian is a tool for extensible and universal data access with automated access workflows and security controls across data stores, analytical systems, and cloud products.
Stars: ✭ 127 (+693.75%)
Butterfree: A tool for building feature stores.
Stars: ✭ 126 (+687.5%)
sql-to-redis: 🔄 Simple tool for ETL. From SQL to Redis.
Stars: ✭ 18 (+12.5%)
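Openkettlewebui: A Kettle-based web scheduling and control platform for data processing. It supports file repositories and database repositories, controls Kettle data transformations through the web platform, and can be integrated into existing systems as middleware.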
Stars: ✭ 125 (+681.25%)
machine-learning-data-pipeline: Pipeline module for parallel real-time data processing, for developing machine learning models and running them in production.
Stars: ✭ 22 (+37.5%)
Riko: A Python stream processing engine modeled after Yahoo! Pipes
Stars: ✭ 1,571 (+9718.75%)
maxwell-sink: Consumes Maxwell-generated messages from Kafka and exports them to another MySQL database.
Stars: ✭ 16 (+0%)
Sentinel Crawler: Xenomorph Crawler, a concise, declarative and observable distributed crawler (Node / Go / Java / Rust) for Web, RDB, and OS; it can also act as a monitor (with Prometheus) or as ETL for infrastructure 💫 Multi-language executor, distributed crawler
Stars: ✭ 118 (+637.5%)
YaEtl: Yet Another ETL in PHP
Stars: ✭ 60 (+275%)
ATOM: Automated Tool for Optimized Modelling
Stars: ✭ 85 (+431.25%)
CogStack-NiFi: Building data processing pipelines for document processing with NLP using Apache NiFi and related services
Stars: ✭ 22 (+37.5%)
DQCS: Data quality control system
Stars: ✭ 34 (+112.5%)
Kafka Connect: Equivalent to kafka-connect 🔧 for Node.js ✨🐢🚀✨
Stars: ✭ 102 (+537.5%)
BETL-old: BETL. Metadata-driven ETL generation using T-SQL
Stars: ✭ 17 (+6.25%)
Od: Czech open data
Stars: ✭ 99 (+518.75%)
PDAP-Scrapers: Code relating to scraping public police data.
Stars: ✭ 72 (+350%)
saisoku: Saisoku is a Python module that helps you build complex pipelines of batch file/directory transfer/sync jobs.
Stars: ✭ 40 (+150%)
Hale: (Spatial) data harmonisation with hale studio (formerly HUMBOLDT Alignment Editor)
Stars: ✭ 84 (+425%)
maricutodb: PHP Flat File Database Manager
Stars: ✭ 23 (+43.75%)
Sayn: Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (+393.75%)
zdh server: zdh data collection platform, ETL processing service
Stars: ✭ 53 (+231.25%)
Dataspherestudio: DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+7368.75%)
dswarm: An open-source data management platform for knowledge workers (https://github.com/dswarm/dswarm-documentation/wiki)
Stars: ✭ 57 (+256.25%)
Luigi Warehouse: A Luigi-powered analytics / warehouse stack
Stars: ✭ 72 (+350%)
Korzh.DbUtils: Helps to initialize your database and seed it with some data in the simplest and most convenient way.
Stars: ✭ 73 (+356.25%)
opentrials-airflow: Configuration and definitions of Airflow for OpenTrials
Stars: ✭ 18 (+12.5%)
Aws Etl Orchestrator: A serverless architecture for orchestrating ETL jobs in arbitrarily complex workflows using AWS Step Functions and AWS Lambda.
Stars: ✭ 245 (+1431.25%)
Example Airflow Dags: Example DAGs using hooks and operators from Airflow Plugins
Stars: ✭ 243 (+1418.75%)
Eland: Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (+1368.75%)
FlowMaster: ETL flow framework in Python, based on YAML configs
Stars: ✭ 19 (+18.75%)
Database-Web-API: Dynamically generate RESTful APIs from the contents of a database table. Provides JSON, XML, and HTML. Supports most popular databases.
Stars: ✭ 37 (+131.25%)