DebeziumChange data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
Stars: ✭ 5,937 (+11317.31%)
debezium-incubatorPreviously used repository for new Debezium modules and connectors in incubation phase (archived)
Stars: ✭ 89 (+71.15%)
southpaw⚾ Streaming left joins in Kafka for change data capture
Stars: ✭ 48 (-7.69%)
debezium.github.ioSource for the Debezium website; please log issues in our tracker at https://issues.redhat.com/projects/DBZ/.
Stars: ✭ 34 (-34.62%)
kafka-connect-httpKafka Connect connector that enables Change Data Capture from JSON/HTTP APIs into Kafka.
Stars: ✭ 81 (+55.77%)
HudiUpserts, Deletes And Incremental Processing on Big Data.
Stars: ✭ 2,586 (+4873.08%)
dlinkDinky is an out of the box one-stop real-time computing platform dedicated to the construction and practice of Unified Streaming & Batch and Unified Data Lake & Data Warehouse. Based on Apache Flink, Dinky provides the ability to connect many big data frameworks including OLAP and Data Lake.
Stars: ✭ 1,535 (+2851.92%)
flink-connector-kuduA flink-connector-kudu based on the Apache Bahir Kudu connector; supports Flink 1.11.x DynamicTableSource/Sink, Range partitioning, and more.
Stars: ✭ 40 (-23.08%)
MySqlCdcMySQL/MariaDB binlog replication client for .NET
Stars: ✭ 71 (+36.54%)
datalake-etl-pipelineSimplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-25%)
dt-sql-parserSQL Parsers for BigData, built with antlr4.
Stars: ✭ 135 (+159.62%)
oracdcOracle database CDC (Change Data Capture)
Stars: ✭ 51 (-1.92%)
RealtimeListen to your PostgreSQL database in realtime via websockets. Built with Elixir.
Stars: ✭ 4,278 (+8126.92%)
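litemall-dwA big data project based on the open-source Litemall e-commerce project, including front-end event tracking (openresty+lua) and back-end event tracking, a five-layer data warehouse, real-time computing, and user profiling. The big data platform uses CDH 6.3.2 (scripted with vagrant+ansible) and also includes Azkaban workflows.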
Stars: ✭ 36 (-30.77%)
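LarkMidTableLarkMidTable is a one-stop open-source data middle platform covering infrastructure, data governance, data development, monitoring and alerting, data services, and data visualization. It is a product that efficiently empowers the data front end and provides data services.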
Stars: ✭ 873 (+1578.85%)
redis-microservices-demoMicroservice application with various Redis use-cases with RediSearch, RedisGraph and Streams. The data is synchronized between MySQL and Redis using Debezium as a CDC engine.
Stars: ✭ 48 (-7.69%)
TiBigDataTiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (+269.23%)
pgcaptureA scalable Netflix DBLog implementation for PostgreSQL
Stars: ✭ 94 (+80.77%)
OpenLogReplicatorOpen Source Oracle database CDC written purely in C++. Reads transactions directly from database redo log files and streams in JSON or Protobuf format to: Kafka, RocketMQ, flat file, network stream (plain TCP/IP or ZeroMQ)
Stars: ✭ 112 (+115.38%)
OLAP-cubeA hypercube of data
Stars: ✭ 23 (-55.77%)
kafka-delta-ingestA highly efficient daemon for streaming data from Kafka into Delta Lake
Stars: ✭ 139 (+167.31%)
cdcA library for performing Content-Defined Chunking (CDC) on data streams.
Stars: ✭ 18 (-65.38%)
Websockets-Vertx-Flink-KafkaA simple request-response cycle using Websockets, Eclipse Vert.x server, Apache Kafka, Apache Flink.
Stars: ✭ 14 (-73.08%)
spark-vcfSpark VCF data source implementation for Dataframes
Stars: ✭ 15 (-71.15%)
opaque-sqlAn encrypted data analytics platform
Stars: ✭ 169 (+225%)
albisAlbis: High-Performance File Format for Big Data Systems
Stars: ✭ 20 (-61.54%)
google-sheets-etlLive import all your Google Sheets to your data warehouse
Stars: ✭ 15 (-71.15%)
awesome-bigdataA curated list of awesome big data frameworks, resources and other awesomeness.
Stars: ✭ 11,093 (+21232.69%)
RnsspA Signature R package for the National Syndromic Surveillance Program (NSSP) at the Centers for Disease Control and Prevention (CDC). A collection of tools, functions, and R Markdown templates that supports the Community of Practice of the NSSP.
Stars: ✭ 19 (-63.46%)
northwind-dotnetA full-stack .NET 6 Microservices application built on Minimal APIs and C# 10
Stars: ✭ 77 (+48.08%)
deltaqFast and portable delta encoding for .NET in 100% safe, managed code.
Stars: ✭ 26 (-50%)
HadoopDedup🍉Large-scale deduplication of massive data based on Hadoop and HBase
Stars: ✭ 27 (-48.08%)
shopping-listA PWA to record shopping lists and view shopping history
Stars: ✭ 24 (-53.85%)
DeltaUISwiftUI + CoreData user interface for DeltaCore & Friends.
Stars: ✭ 61 (+17.31%)
flink-learnLearning Flink : Flink CEP,Flink Core,Flink SQL
Stars: ✭ 70 (+34.62%)
dw-vldb-samplesThis is a top level repository for code examples related to Data Warehousing and Very Large Databases.
Stars: ✭ 32 (-38.46%)
logparserEasy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Pig, Flink, Beam, Storm, Drill, ...
Stars: ✭ 139 (+167.31%)
Tweet-Analysis-With-Kafka-and-SparkA real-time analytics dashboard to analyze the trending hashtags and @ mentions at any location using Kafka and Spark Streaming.
Stars: ✭ 18 (-65.38%)
tipoca-streamNear real time cloud native data pipeline in AWS (CDC+Sink). Hosts code for RedshiftSink. RDS to RedshiftSink Pipeline with masking and reloading support.
Stars: ✭ 43 (-17.31%)
smart-data-lakeSmart Automation Tool for building modern Data Lakes and Data Pipelines
Stars: ✭ 79 (+51.92%)
spark2-etl-examplesA project with examples of using a few commonly used data manipulation/processing/transformation APIs in Apache Spark 2.0.0
Stars: ✭ 23 (-55.77%)
geosparkbring sf to spark in production
Stars: ✭ 53 (+1.92%)
SANSA-StackBig Data RDF Processing and Analytics Stack built on Apache Spark and Apache Jena http://sansa-stack.github.io/SANSA-Stack/
Stars: ✭ 130 (+150%)