AirbyteAirbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+167.19%)
Bulk WriterProvides guidance for fast ETL jobs, an IDataReader implementation for SqlBulkCopy (or the MySql or Oracle equivalents) that wraps an IEnumerable, and libraries for mapping entities to table columns.
Stars: ✭ 210 (-88.59%)
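mydataharbor🇨🇳 MyDataHarbor is a distributed, highly scalable, high-performance, transaction-level data synchronization middleware that aims to handle synchronization from any data source to any data source. It helps users synchronize massive data sets reliably, quickly, and stably, either as near-real-time incremental sync or as scheduled full sync. It is aimed primarily at real-time transaction systems, and can also be used for big-data synchronization (the ETL domain).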
Stars: ✭ 28 (-98.48%)
morph-kgcPowerful RDF Knowledge Graph Generation with [R2]RML Mappings
Stars: ✭ 77 (-95.82%)
Reddit DetectivePlay detective on Reddit: Discover political disinformation campaigns, secret influencers and more
Stars: ✭ 129 (-92.99%)
PdpipeEasy pipelines for pandas DataFrames.
Stars: ✭ 590 (-67.95%)
StoragetapperStorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service
Stars: ✭ 232 (-87.4%)
PyfunctionalPython library for creating data pipelines with chain functional programming
Stars: ✭ 1,943 (+5.54%)
DatacleanerThe premier open source Data Quality solution
Stars: ✭ 391 (-78.76%)
Ensembl HiveEnsEMBL Hive - a system for creating and running pipelines on a distributed compute resource
Stars: ✭ 44 (-97.61%)
naas⚙️ Schedule notebooks, run them like APIs, securely expose your assets: Jupyter as a viable ⚡️ Production environment
Stars: ✭ 219 (-88.1%)
RikoA Python stream processing engine modeled after Yahoo! Pipes
Stars: ✭ 1,571 (-14.67%)
Go StreamsA lightweight stream processing library for Go
Stars: ✭ 615 (-66.59%)
lineageGenerate beautiful documentation for your data pipelines in markdown format
Stars: ✭ 16 (-99.13%)
Usaspending ApiServer application to serve U.S. federal spending data via a RESTful API
Stars: ✭ 166 (-90.98%)
Linq2dbLinq to database provider.
Stars: ✭ 2,211 (+20.1%)
TransformalizeConfigurable Extract, Transform, and Load
Stars: ✭ 125 (-93.21%)
SetlA simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-95.71%)
basinBasin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-98.64%)
PostguiA React web application to query and share any PostgreSQL database.
Stars: ✭ 260 (-85.88%)
MetabaseThe simplest, fastest way to get business intelligence and analytics to everyone in your company 😋
Stars: ✭ 26,803 (+1355.89%)
PglogicalLogical Replication extension for PostgreSQL 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
Stars: ✭ 455 (-75.29%)
Locopylocopy: Loading/Unloading to Redshift and Snowflake using Python.
Stars: ✭ 73 (-96.03%)
Luigi WarehouseA luigi powered analytics / warehouse stack
Stars: ✭ 72 (-96.09%)
Metlmito ETL tool
Stars: ✭ 153 (-91.69%)
Csv2dbThe CSV to database command line loader
Stars: ✭ 102 (-94.46%)
etlM-Lab ingestion pipeline
Stars: ✭ 15 (-99.19%)
TransporterSync data between persistence engines, like ETL only not stodgy
Stars: ✭ 1,175 (-36.18%)
sparklanesA lightweight data processing framework for Apache Spark
Stars: ✭ 17 (-99.08%)
DatavecETL Library for Machine Learning - data pipelines, data munging and wrangling
Stars: ✭ 272 (-85.23%)
Ether sqlA Python library to push Ethereum blockchain data into an SQL database.
Stars: ✭ 41 (-97.77%)
GrafterLinked Data & RDF Manufacturing Tools in Clojure
Stars: ✭ 174 (-90.55%)
StetlStetl, Streaming ETL, is a lightweight geospatial processing and ETL framework written in Python.
Stars: ✭ 64 (-96.52%)
DataBridge.NETConfigurable data bridge for permanent ETL jobs
Stars: ✭ 16 (-99.13%)
Kiba PlusKiba enhancement for Ruby ETL.
Stars: ✭ 47 (-97.45%)
OdCzech open data
Stars: ✭ 99 (-94.62%)
KibaData processing & ETL framework for Ruby
Stars: ✭ 1,618 (-12.11%)
Giraffeql🦒 Developer tool to visualize relational databases and export schemas for GraphQL APIs.
Stars: ✭ 128 (-93.05%)
Kettle WebCalls Kettle from Java code, built on Spring Boot
Stars: ✭ 128 (-93.05%)
VmsTHIS PROJECT IS ARCHIVED. Volunteer Management System.
Stars: ✭ 127 (-93.1%)
Mumuki Laboratory 🔬 Where students practice and receive automated and human feedback
Stars: ✭ 131 (-92.88%)
FpartSort files and pack them into partitions
Stars: ✭ 127 (-93.1%)
Backy2backy2: Deduplicating block based backup software for ceph/rbd, image files and devices
Stars: ✭ 126 (-93.16%)
Githubrankingsspain⬆️ Rankings with the most active GitHub users in Spain (sorted by public contributions) 🇪🇸
Stars: ✭ 127 (-93.1%)
SqueezemetaA complete pipeline for metagenomic analysis
Stars: ✭ 128 (-93.05%)
PipelinexPipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more
Stars: ✭ 127 (-93.1%)
DbNewt DB is a Python object-oriented database with JSONB-based access and search in PostgreSQL
Stars: ✭ 132 (-92.83%)
JhtalibTechnical Analysis Library Time-Series
Stars: ✭ 131 (-92.88%)
Postgres OperatorProduction PostgreSQL for Kubernetes, from high availability Postgres clusters to full-scale database-as-a-service.
Stars: ✭ 2,166 (+17.65%)
ButterfreeA tool for building feature stores.
Stars: ✭ 126 (-93.16%)
Dbmate🚀 A lightweight, framework-agnostic database migration tool.
Stars: ✭ 2,228 (+21.02%)
Asr audio data linksA list of publicly available audio data that anyone can download for ASR or other speech activities
Stars: ✭ 128 (-93.05%)
Mattermost AnsibleAnsible playbook to provide a turnkey solution for the Team Edition of Mattermost
Stars: ✭ 126 (-93.16%)
SemsegpipelineA simpler way of reading and augmenting image segmentation data into TensorFlow
Stars: ✭ 126 (-93.16%)
Torque PostgresqlAdd support for complex resources of PostgreSQL, like data types, array associations, and auxiliary statements (CTE)
Stars: ✭ 130 (-92.94%)
Go Bank TransferSimple API for banking routines using a Clean Architecture in Golang. 💳 💰 💸
Stars: ✭ 123 (-93.32%)