dogETLA lib to transform data from JDBC, CSV, and JSON to each other.
Stars: ✭ 15 (+7.14%)
lineageGenerate beautiful documentation for your data pipelines in markdown format
Stars: ✭ 16 (+14.29%)
sync-addonsOdoo Integration Addons
Stars: ✭ 69 (+392.86%)
DataXServerProvides remote multi-language invocation (ThriftServer, HttpServer) and distributed execution (DataX on YARN) for DataX (https://github.com/alibaba/DataX)
Stars: ✭ 130 (+828.57%)
pyjanitorClean APIs for data cleaning. Python implementation of R package Janitor
Stars: ✭ 970 (+6828.57%)
gamechanger-dataGAMECHANGER aspires to be the Department’s trusted solution for evidence-based, data-driven decision-making across the universe of DoD requirements
Stars: ✭ 17 (+21.43%)
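OpenKettleWebUIA Kettle-based web scheduling and control platform for data processing. It supports file-based and database repositories, controls Kettle data transformations through the web platform, and can be integrated into existing systems as middleware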
Stars: ✭ 138 (+885.71%)
ml-in-productionThe practical use-cases of how to make your Machine Learning Pipelines robust and reliable using Apache Airflow.
Stars: ✭ 29 (+107.14%)
basinBasin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (+78.57%)
yt-channels-DS-AI-ML-CSA comprehensive list of 180+ YouTube Channels for Data Science, Data Engineering, Machine Learning, Deep learning, Computer Science, programming, software engineering, etc.
Stars: ✭ 1,038 (+7314.29%)
dflibIn-memory Java DataFrame library
Stars: ✭ 50 (+257.14%)
deordie-meetupsDE or DIE meetup made by data engineers for data engineers. Currently in Russian only.
Stars: ✭ 48 (+242.86%)
funsiesfunsies is a lightweight workflow engine 🔧
Stars: ✭ 37 (+164.29%)
cardano-pyPython3 lib and CLI for operating a Cardano Passive Node and using the APIs. (PRE-ALPHA)
Stars: ✭ 17 (+21.43%)
flockFlock: A Low-Cost Streaming Query Engine on FaaS Platforms
Stars: ✭ 232 (+1557.14%)
jobAnalytics and searchJobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (+78.57%)
dbddbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
Stars: ✭ 30 (+114.29%)
starlakeStarlake is a Spark Based On Premise and Cloud ELT/ETL Framework for Batch & Stream Processing
Stars: ✭ 16 (+14.29%)
YaEtlYet Another ETL in PHP
Stars: ✭ 60 (+328.57%)
mlbgamedayMulti-core processing of 'Gameday' data from Major League Baseball Advanced Media. Additional tools to parallelize large data sets and write them to a database.
Stars: ✭ 37 (+164.29%)
proc-thatproc(ess)-that - easily extensible ETL tool for Node.js. Written in TypeScript.
Stars: ✭ 25 (+78.57%)
persistityA persistence framework for game developers
Stars: ✭ 34 (+142.86%)
datartDatart is a next generation Data Visualization Open Platform
Stars: ✭ 1,042 (+7342.86%)
bandar-logMonitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Stars: ✭ 20 (+42.86%)
zdh webBig data collection and extraction platform
Stars: ✭ 292 (+1985.71%)
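mydataharbor🇨🇳 MyDataHarbor is a distributed, highly scalable, high-performance, transaction-level data synchronization middleware that aims to sync any data source to any data source. It helps users perform reliable, fast, and stable near-real-time incremental synchronization or scheduled full synchronization of massive data. It is positioned mainly to serve real-time transaction systems, and can also be used for big data synchronization (the ETL domain).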
Stars: ✭ 28 (+100%)
BETL-oldBETL. Meta data driven ETL generation using T-SQL
Stars: ✭ 17 (+21.43%)
openrefine-clientThe OpenRefine Python Client from Paul Makepeace provides a library for communicating with an OpenRefine server. This fork extends the command line interface (CLI) and is distributed as a convenient one-file-executable (Windows, Linux, Mac). It is also available via Docker Hub, PyPI and Binder.
Stars: ✭ 67 (+378.57%)
prefect-saturnPython client for using Prefect Cloud with Saturn Cloud
Stars: ✭ 15 (+7.14%)
naas⚙️ Schedule notebooks, run them like APIs, securely expose your assets: Jupyter as a viable ⚡️ Production environment
Stars: ✭ 219 (+1464.29%)
Everything-TechA collection of online resources to help you on your Tech journey.
Stars: ✭ 396 (+2728.57%)
google-sheets-etlLive import all your Google Sheets to your data warehouse
Stars: ✭ 15 (+7.14%)
growthbookOpen Source Feature Flagging and A/B Testing Platform
Stars: ✭ 2,342 (+16628.57%)
viewflowViewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Stars: ✭ 110 (+685.71%)
zdh serverETL processing service for the zdh data collection platform
Stars: ✭ 53 (+278.57%)
get smartiesDummy variable generation with fit/transform capabilities
Stars: ✭ 23 (+64.29%)
TEAMThe Taxonomy for ETL Automation Metadata (TEAM) is a metadata management tool for data warehouse automation. It is part of the ecosystem for data warehouse automation, alongside the Virtual Data Warehouse pattern manager and the generic schema for Data Warehouse Automation.
Stars: ✭ 27 (+92.86%)
contessaEasy way to define, execute and store quality rules for your data.
Stars: ✭ 17 (+21.43%)
PDAP-ScrapersCode relating to scraping public police data.
Stars: ✭ 72 (+414.29%)
papiloDEPRECATED: Stream data processing micro-framework
Stars: ✭ 24 (+71.43%)
ETW2JSONTool and library to convert ETW logs to JSON files
Stars: ✭ 66 (+371.43%)
openrefine-batchShell script to run OpenRefine in batch mode (import, transform, export). It orchestrates OpenRefine (server) and a python client that communicates with the OpenRefine API.
Stars: ✭ 76 (+442.86%)
DataBridge.NETConfigurable data bridge for permanent ETL jobs
Stars: ✭ 16 (+14.29%)
FlowMasterETL flow framework based on Yaml configs in Python
Stars: ✭ 19 (+35.71%)
mpc-DL-controllerDeep Neural Network architecture as a predictive optimal controller for {HVAC+Solar cell + battery} disturbance afflicted system vs classic Model Predictive Control
Stars: ✭ 37 (+164.29%)
go-bqloaderbqloader is a simple ETL framework to load data from Cloud Storage into BigQuery.
Stars: ✭ 16 (+14.29%)
iex-stocksETL for the IEX Stocks API
Stars: ✭ 19 (+35.71%)
awesome-integrationA curated list of awesome system integration software and resources.
Stars: ✭ 117 (+735.71%)
cobrixA COBOL parser and Mainframe/EBCDIC data source for Apache Spark
Stars: ✭ 109 (+678.57%)
neo4j-jdbcJDBC driver for Neo4j
Stars: ✭ 110 (+685.71%)
link-moveA model-driven dynamically-configurable framework to acquire data from external sources and save it to your database.
Stars: ✭ 32 (+128.57%)
carryPython ETL(Extract-Transform-Load) tool / Data migration tool
Stars: ✭ 115 (+721.43%)