vixtract: www.vixtract.ru
Stars: ✭ 40 (+48.15%)
Mutual labels: etl, etl-pipeline
DIRECT: DIRECT, the Data Integration Run-time Execution Control Tool, is a data logistics framework that can be used to monitor, log, audit and control data integration / ETL processes.
Stars: ✭ 20 (-25.93%)
Mutual labels: etl, etl-pipeline
Kafka Connect: equivalent to kafka-connect 🔧 for nodejs ✨🐢🚀✨
Stars: ✭ 102 (+277.78%)
Mutual labels: etl, kafka-connect
redis-connect-dist: Real-Time Event Streaming & Change Data Capture
Stars: ✭ 21 (-22.22%)
Mutual labels: etl, etl-pipeline
maxwell-sink: Consumes Maxwell-generated messages from Kafka and exports them to another MySQL database.
Stars: ✭ 16 (-40.74%)
Mutual labels: etl, kafka-connect
datalake-etl-pipeline: Simplified ETL process in Hadoop using Apache Spark. Has a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Stars: ✭ 39 (+44.44%)
Mutual labels: etl, etl-pipeline
AirflowETL: Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-25.93%)
Mutual labels: etl, etl-pipeline
etlflow: EtlFlow is an ecosystem of functional libraries in Scala based on ZIO for writing various tasks and jobs on GCP and AWS.
Stars: ✭ 38 (+40.74%)
Mutual labels: etl, etl-pipeline
hamilton: A scalable general-purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
Stars: ✭ 612 (+2166.67%)
Mutual labels: etl, etl-pipeline
csvplus: csvplus extends the standard Go encoding/csv package with a fluent interface, lazy stream operations, indices and joins.
Stars: ✭ 67 (+148.15%)
Mutual labels: etl, etl-pipeline
bigquery-kafka-connect: ☁️ nodejs kafka connect connector for Google BigQuery
Stars: ✭ 17 (-37.04%)
Mutual labels: etl, kafka-connect
DaFlow: Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (-11.11%)
Mutual labels: etl, etl-pipeline
dflib: In-memory Java DataFrame library
Stars: ✭ 50 (+85.19%)
Mutual labels: etl
registryless-avro-converter: An Avro converter for Kafka Connect without a Schema Registry
Stars: ✭ 45 (+66.67%)
Mutual labels: kafka-connect
compiler-benchmark: Benchmarks for scalac
Stars: ✭ 68 (+151.85%)
Mutual labels: performance-test
versatile-data-kit: Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (+433.33%)
Mutual labels: etl
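mydataharbor: 🇨🇳 MyDataHarbor is a distributed, highly scalable, high-performance, transaction-level data synchronization middleware that aims to handle synchronization from any data source to any data source. It helps users reliably, quickly and stably perform near-real-time incremental synchronization or scheduled full synchronization of massive data. It is primarily positioned to serve real-time transaction systems, and can also be used for big-data synchronization (the ETL domain).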
Stars: ✭ 28 (+3.7%)
Mutual labels: etl
PDAP-Scrapers: Code relating to scraping public police data.
Stars: ✭ 72 (+166.67%)
Mutual labels: etl