xmlking / Cdc Kafka Hadoop

MySQL to NoSQL real-time dataflow

Programming Languages: Java, Groovy

Labels: mysql, kafka, architecture, hadoop, data-flow

CDC Hadoop Dataflow

A low-latency, multi-tenant Change Data Capture (CDC) pipeline that continuously replicates data from OLTP (MySQL) to OLAP (NoSQL) systems with no impact on the source.

This project demonstrates how to build a dataflow pipeline that moves data from operational databases (MySQL, Oracle) to analytics databases (Hadoop, MongoDB, MarkLogic) in real time, using Change Data Capture (CDC), Kafka, and tools like Apache NiFi, Kafka Streams, or Spark to process and ingest data into Hadoop.

Features

Capture changes from many Data Sources and types.
Feed data to many client types (real-time, slow/catch-up, full bootstrap).
Multi-tenant: can contain data from many different databases, support multiple consumers.
Non-intrusive architecture for change capture.
Both batch and near real time delivery.
Isolate fast consumers from slow consumers.
Isolate sources from consumers:
  1. Schema changes
  2. Physical layout changes
  3. Speed mismatch
Change filtering:
  1. Filtering of database changes at the database level, schema level, table level, and row/column level.
Buffer change records in Kafka for flexible consumption from an arbitrary point in time in the change stream, including full bootstrap capability for the entire data set.
Guaranteed in-commit-order and at-least-once delivery with high availability (at least once vs. exactly once).
Resilience and Recoverability
Schema-awareness
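
Because delivery is at-least-once, a consumer may receive the same change record more than once, for example after a restart. One common way to tolerate this is to make the apply step idempotent. The sketch below is illustrative and not part of this project: it assumes change events shaped like the Maxwell output shown in the Test section (`database`, `table`, `type`, `data`), assumes delete events carry the deleted row in `data`, assumes the primary key column is `id`, and uses a plain dict as a stand-in for the target store. `apply_event` and `target` are hypothetical names.

```python
import json


def apply_event(target, event, pk_column="id"):
    """Apply one change event to an in-memory target keyed by (database, table, pk)."""
    key = (event["database"], event["table"], event["data"][pk_column])
    if event["type"] == "delete":
        # Removing a row that is already gone is harmless.
        target.pop(key, None)
    else:
        # Treat insert and update as an upsert so a replay overwrites with the same row.
        target[key] = event["data"]
    return target


raw = (
    '{"database":"test","table":"shop","type":"insert","ts":1458510224,'
    '"xid":33531,"commit":true,"data":{"owner":"bbb","name":"aaa",'
    '"phone_number":"3331114444","id":4,"version":0}}'
)
event = json.loads(raw)

target = {}
apply_event(target, event)
apply_event(target, event)  # simulated redelivery of the same record
print(len(target))  # prints 1
```

Applying the record twice leaves a single row in the target, which is the property a consumer needs when duplicates are possible.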

Setup

Install and Run MySQL

Install the source MySQL database and configure it for row-based replication as per instructions.

Install and Run Kafka

Follow the instructions

Install and Run Maxwell

cd cdc/maxwell
# curl -L -0 https://github.com/zendesk/maxwell/releases/download/v1.0.0/maxwell-1.1.2.tar.gz | tar --strip-components=1 -zx -C .
curl -L -0 https://github.com/xmlking/maxwell/releases/download/1.1.2.1/maxwell-1.1.2.1-kafka-connect.tar.gz | tar --strip-components=1 -zx -C .

Run

cd cdc/maxwell

Run with stdout producer (for testing only)

bin/maxwell --user='maxwell' --password='XXXXXX' --host='127.0.0.1' --producer=stdout

Run with kafka producer

bin/maxwell

Test

Manual Testing

If all goes well, you'll see Maxwell replaying your inserts:

mysql -u root -p

mysql> CREATE TABLE test.shop
       (
         id BIGINT(20) NOT NULL AUTO_INCREMENT,
         version BIGINT(20) NOT NULL,
         name VARCHAR(255) NOT NULL,
         owner VARCHAR(255) NOT NULL,
         phone_number VARCHAR(255) NOT NULL,
         primary key (id, name)
       );
mysql> INSERT INTO test.shop (version, name, owner, phone_number) values (0, 'aaa', 'bbb', '3331114444');
Query OK, 1 row affected (0.02 sec)

(maxwell)
{"database":"test","table":"shop","pk.id":4,"pk.name":"aaa"}
{"database":"test","table":"shop","type":"insert","ts":1458510224,"xid":33531,"commit":true,"data":{"owner":"bbb","name":"aaa","phone_number":"3331114444","id":4,"version":0}}
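
The two output lines above can be read as a key record and a value record. As a minimal sketch, assuming that pairing and that primary-key columns appear in the key record with a `pk.` prefix (as in the sample), the following Python checks that the key matches the row in `data`. The function name `primary_key` is hypothetical, and the two strings are copied from the sample output rather than read from Kafka.

```python
import json


def primary_key(key_json):
    """Return the primary-key columns of a key record, with the 'pk.' prefix removed."""
    key = json.loads(key_json)
    prefix = "pk."
    return {
        name[len(prefix):]: value
        for name, value in key.items()
        if name.startswith(prefix)
    }


key_line = '{"database":"test","table":"shop","pk.id":4,"pk.name":"aaa"}'
value_line = (
    '{"database":"test","table":"shop","type":"insert","ts":1458510224,'
    '"xid":33531,"commit":true,"data":{"owner":"bbb","name":"aaa",'
    '"phone_number":"3331114444","id":4,"version":0}}'
)

pk = primary_key(key_line)
row = json.loads(value_line)["data"]

# Every primary-key column in the key record should match the row payload.
matches = all(row[column] == value for column, value in pk.items())
print(pk, matches)
```

A check like this is a quick way to confirm during manual testing that the composite primary key `(id, name)` from the `CREATE TABLE` statement is reflected in the emitted records.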

Testing via Grails App

You can also use testApp to generate load.
