
salesforce / Mirus

Licence: bsd-3-clause
Mirus is a cross data-center data replication tool for Apache Kafka

Programming Languages

java
68154 projects - #9 most used programming language

Projects that are alternatives of or similar to Mirus

Cp Docker Images
[DEPRECATED] Docker images for Confluent Platform.
Stars: ✭ 975 (+470.18%)
Mutual labels:  kafka, kafka-connect
Mongo Kafka
MongoDB Kafka Connector
Stars: ✭ 166 (-2.92%)
Mutual labels:  kafka, kafka-connect
Kattlo Cli
Kattlo CLI Project
Stars: ✭ 58 (-66.08%)
Mutual labels:  kafka, kafka-connect
Kafka Connect Elasticsearch Source
Kafka Connect Elasticsearch Source
Stars: ✭ 22 (-87.13%)
Mutual labels:  kafka, kafka-connect
Kukulcan
A REPL for Apache Kafka
Stars: ✭ 103 (-39.77%)
Mutual labels:  kafka, kafka-connect
Kafka Connect Elastic Sink
Kafka connect Elastic sink connector, with just in time index/delete behaviour.
Stars: ✭ 23 (-86.55%)
Mutual labels:  kafka, kafka-connect
Kspp
A high performance/ real-time C++ Kafka streams framework (C++17)
Stars: ✭ 80 (-53.22%)
Mutual labels:  kafka, kafka-connect
Kafka Connect Elasticsearch
Kafka Connect Elasticsearch connector
Stars: ✭ 550 (+221.64%)
Mutual labels:  kafka, kafka-connect
Kafka Connect
equivalent to kafka-connect 🔧 for nodejs ✨🐢🚀✨
Stars: ✭ 102 (-40.35%)
Mutual labels:  kafka, kafka-connect
Streamx
kafka-connect-s3 : Ingest data from Kafka to Object Stores(s3)
Stars: ✭ 96 (-43.86%)
Mutual labels:  kafka, kafka-connect
Kafkacenter
KafkaCenter is a unified platform for Kafka cluster management and maintenance, producer / consumer monitoring, and use of ecological components.
Stars: ✭ 896 (+423.98%)
Mutual labels:  kafka, kafka-connect
Kafka Connect Mongodb
**Unofficial / Community** Kafka Connect MongoDB Sink Connector - Find the official MongoDB Kafka Connector here: https://www.mongodb.com/kafka-connector
Stars: ✭ 137 (-19.88%)
Mutual labels:  kafka, kafka-connect
Demo Scene
👾Scripts and samples to support Confluent Demos and Talks. ⚠️Might be rough around the edges ;-) 👉For automated tutorials and QA'd code, see https://github.com/confluentinc/examples/
Stars: ✭ 806 (+371.35%)
Mutual labels:  kafka, kafka-connect
Kafka Connect Storage Cloud
Kafka Connect suite of connectors for Cloud storage (Amazon S3)
Stars: ✭ 153 (-10.53%)
Mutual labels:  kafka, kafka-connect
Stream Reactor
Streaming reference architecture for ETL with Kafka and Kafka-Connect. You can find more on http://lenses.io on how we provide a unified solution to manage your connectors, most advanced SQL engine for Kafka and Kafka Streams, cluster monitoring and alerting, and more.
Stars: ✭ 753 (+340.35%)
Mutual labels:  kafka, kafka-connect
Kafka Connect Protobuf Converter
Protobuf converter plugin for Kafka Connect
Stars: ✭ 79 (-53.8%)
Mutual labels:  kafka, kafka-connect
All Things Cqrs
Comprehensive guide to a couple of possible ways of synchronizing two states with Spring tools. Synchronization is shown by separating command and queries in a simple CQRS application.
Stars: ✭ 474 (+177.19%)
Mutual labels:  kafka, kafka-connect
Debezium
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
Stars: ✭ 5,937 (+3371.93%)
Mutual labels:  kafka, kafka-connect
Dataengineeringproject
Example end to end data engineering project.
Stars: ✭ 82 (-52.05%)
Mutual labels:  kafka, kafka-connect
Kafka Connect Tools
Kafka Connect Tooling
Stars: ✭ 115 (-32.75%)
Mutual labels:  kafka, kafka-connect


Mirus

A tool for distributed, high-volume replication between Apache Kafka clusters based on Kafka Connect. Designed for easy operation in a high-throughput, multi-cluster environment.

Features

  • Dynamic Configuration: Uses the Kafka Connect REST API for dynamic API-driven configuration
  • Precise Replication: Supports a regex whitelist and an explicit topic whitelist
  • Simple Management for Multiple Kafka Clusters: Supports multiple source clusters with one Worker process
  • Continuous Ingestion: Continues consuming from the source cluster while flushing and committing offsets
  • Built for Dynamic Kafka Clusters: Able to handle topics and partitions being created and deleted in source and destination clusters
  • Scalable: Creates a configurable set of worker tasks that are distributed across a Kafka Connect cluster for high performance, even when pushing data over the internet
  • Fault tolerant: Includes a monitor thread that watches for task failures and can optionally restart failed tasks automatically
  • Monitoring: Includes custom JMX metrics for production-ready monitoring and alerting

Overview

Mirus is built around Apache Kafka Connect, providing SourceConnector and SourceTask implementations optimized for reading data from Kafka source clusters. The MirusSourceConnector runs a KafkaMonitor thread, which monitors the source and destination Apache Kafka cluster partition allocations, looking for changes and applying a configurable topic whitelist. Each task is responsible for a subset of the matching partitions, and runs an independent KafkaConsumer and KafkaProducer client pair to do the work of replicating those partitions.

Tasks can be restarted independently without otherwise affecting a running cluster, are monitored continuously for failure, and are optionally automatically restarted.

To understand how Mirus distributes work across a cluster of machines, please read the Kafka Connect documentation.

Installation

To build a package containing the Mirus jar file and all dependencies, run mvn package -P all:

  • target/mirus-${project.version}-all.zip

This package can be unzipped for use (see Quick Start).

Maven also builds the following artifacts when you run mvn package. These are useful if you need customized packaging for your own environment:

  • target/mirus-${project.version}.jar: Primary Mirus jar (dependencies not included)
  • target/mirus-${project.version}-run.zip: A package containing the Mirus run control scripts

Usage

These instructions assume you have expanded the mirus-${project.version}-all.zip package.

Mirus Worker Instance

A single Mirus Worker can be started using this helper script.

> bin/mirus-worker-start.sh [worker-properties-file]

worker-properties-file: Path to the Mirus worker properties file, which configures the Kafka Connect framework. See quickstart-worker.properties for an example.

Options:

--override property=value: optional command-line override for any item in the Mirus worker properties file. Multiple override options are supported (similar to the equivalent flag in Kafka).
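
As an illustration, the following invocation starts a worker with the quickstart properties file while overriding two standard Kafka Connect worker settings. The property names here are ordinary Kafka Connect worker properties chosen only as examples; any property in the worker properties file can be overridden this way:

> bin/mirus-worker-start.sh config/quickstart-worker.properties \
      --override offset.flush.interval.ms=10000 \
      --override offset.flush.timeout.ms=10000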

Mirus Offset Tool

Mirus includes a simple tool for reading and writing offsets. This can be useful for migration from other replication tools, for debugging, and for offset monitoring in production. The tool supports CSV and JSON input and output.

For detailed usage:

> bin/mirus-offset-tool.sh --help

Quick Start

To run the Quick Start example you will need a running Kafka cluster and ZooKeeper instance to work with. We will assume you have a standard Apache Kafka Quickstart test cluster running on localhost; follow the Kafka Quick Start instructions to set one up.

For this tutorial we will set up a Mirus worker instance to mirror the topic test in loop-back mode to another topic in the same cluster. To avoid a conflict the destination topic name will be set to test.mirror using the destination.topic.name.suffix configuration option.

  1. Build the full Mirus project using Maven

    > mvn package -P all
    
  2. Unpack the Mirus "all" package

    > mkdir quickstart; cd quickstart; unzip ../target/mirus-*-all.zip
    
  3. Start the quickstart worker using the sample worker properties file

    > bin/mirus-worker-start.sh config/quickstart-worker.properties
    
    
  4. In another terminal, confirm the Mirus Kafka Connect REST API is running

    > curl localhost:8083
    
    {"version":"1.1.0","commit":"fdcf75ea326b8e07","kafka_cluster_id":"xdxNfx84TU-ennOs7EznZQ"}
    
  5. Submit a new MirusSourceConnector configuration to the REST API with the name mirus-quickstart-source

    > curl localhost:8083/connectors/mirus-quickstart-source/config \
          -X PUT \
          -H 'Content-Type: application/json' \
          -d '{
               "name": "mirus-quickstart-source",
               "connector.class": "com.salesforce.mirus.MirusSourceConnector",
               "tasks.max": "5",
               "topics.whitelist": "test",
               "destination.topic.name.suffix": ".mirror",
               "destination.consumer.bootstrap.servers": "localhost:9092",
               "consumer.bootstrap.servers": "localhost:9092",
               "consumer.client.id": "mirus-quickstart",
               "consumer.key.deserializer": "org.apache.kafka.common.serialization.ByteArrayDeserializer",
               "consumer.value.deserializer": "org.apache.kafka.common.serialization.ByteArrayDeserializer"
           }'
    
  6. Confirm the new connector is running

    > curl localhost:8083/connectors
    
    ["mirus-quickstart-source"]
    
    > curl localhost:8083/connectors/mirus-quickstart-source/status
    
    {"name":"mirus-quickstart-source","connector":{"state":"RUNNING","worker_id":"1.2.3.4:8083"},"tasks":[],"type":"source"}
    
  7. Create source and destination topics test and test.mirror in your Kafka cluster

    > cd ${KAFKA_HOME}
    
    > bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic 'test' --partitions 1 --replication-factor 1
    Created topic "test".
    
    > bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic 'test.mirror' --partitions 1 --replication-factor 1
    Created topic "test.mirror".
    
  8. Mirus should detect that the new source and destination topics are available and create a new Mirus Source Task:

    > curl localhost:8083/connectors/mirus-quickstart-source/status
    
    {"name":"mirus-quickstart-source","connector":{"state":"RUNNING","worker_id":"10.126.22.44:8083"},"tasks":[{"state":"RUNNING","id":0,"worker_id":"10.126.22.44:8083"}],"type":"source"}
    

Any message you write to the topic test will now be mirrored to test.mirror.
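
To verify mirroring end to end, you can use the standard console clients shipped with Apache Kafka (run from the ${KAFKA_HOME} directory used in step 7; adjust hosts and ports for your environment):

> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test

> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test.mirror --from-beginning

Messages typed into the producer on test should appear on the consumer reading test.mirror once Mirus has replicated them.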

REST API

See the Kafka Connect REST API documentation (https://kafka.apache.org/documentation/#connect_rest).
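
As a quick reference, a few of the standard Kafka Connect REST endpoints that are useful when operating Mirus are shown below. These are provided by the Kafka Connect framework itself, not by Mirus; the connector name is the one used in the Quick Start:

> curl localhost:8083/connectors
> curl localhost:8083/connectors/mirus-quickstart-source/status
> curl -X PUT localhost:8083/connectors/mirus-quickstart-source/pause
> curl -X PUT localhost:8083/connectors/mirus-quickstart-source/resume
> curl -X POST localhost:8083/connectors/mirus-quickstart-source/restart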

Configuration

Kafka Connect Configuration

Mirus shares most Worker and Source configuration with the Kafka Connect framework. For general information on configuring the framework, see the Apache Kafka Connect documentation (https://kafka.apache.org/documentation/#connect).

Mirus Specific Configuration

Mirus-specific configuration properties are documented in these files:

  • Mirus Source Properties: These can be added to the JSON config object posted to the REST API /config endpoint to configure a new MirusSourceConnector instance. In addition, the KafkaConsumer instances created by Mirus tasks can be configured using a consumer. prefix on the standard Kafka Consumer properties, and the destination.consumer. prefix can be used to override the properties of the KafkaConsumer that connects to the destination Kafka cluster. The equivalent KafkaProducer options are configured in the Mirus Worker Properties file (see below and the example after this list).

  • Mirus Worker Properties: These are Mirus extensions to the Kafka Connect configuration and should be applied to the Worker Properties file provided at startup. The KafkaProducer instances created by Kafka Connect can also be configured using a producer. prefix on the standard Kafka Producer properties (see the example below).
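
As an illustration of the prefixing conventions described above, the sketches below show a consumer override in a connector configuration and a producer override in a worker properties file. The host names and property values are placeholders, and the prefixed properties are ordinary Kafka client settings used only as examples:

    {
      "name": "mirus-quickstart-source",
      "connector.class": "com.salesforce.mirus.MirusSourceConnector",
      "consumer.bootstrap.servers": "source-kafka:9092",
      "consumer.max.poll.records": "500",
      "destination.consumer.bootstrap.servers": "destination-kafka:9092"
    }

    # Worker properties file (excerpt)
    bootstrap.servers=destination-kafka:9092
    producer.acks=all
    producer.compression.type=lz4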

Destination Topic Checking

By default, Mirus checks that the destination topic exists in the destination Kafka cluster before starting to replicate data to it. This feature can be disabled by setting the enable.destination.topic.checking config option to false.

As of version 0.2.0, destination topic checking also supports topic re-routing performed by the RegexRouter Single Message Transformation. No other Router transformations are supported, so destination topic checking must be disabled in order to use them.
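
For example, a connector configuration that uses the standard Kafka Connect RegexRouter transformation to add a .mirror suffix to every destination topic might include the following fragment (the transformation alias route is arbitrary):

    "transforms": "route",
    "transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
    "transforms.route.regex": "(.*)",
    "transforms.route.replacement": "$1.mirror"

To use any other Router transformation, set "enable.destination.topic.checking": "false" in the connector configuration.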

Developer Info

To perform a release: mvn release:prepare release:perform

Discussion

Questions and comments can be posted on the Mirus GitHub issues page.

Maintainers

Paul Davidson and contributors.

Code Style

This project uses the Google Java Format.
