All Projects → DarioBalinzo → Kafka Connect Elasticsearch Source

DarioBalinzo / Kafka Connect Elasticsearch Source

Licence: apache-2.0
Kafka Connect Elasticsearch Source

Programming Languages

java
68154 projects - #9 most used programming language

Projects that are alternatives of or similar to Kafka Connect Elasticsearch Source

Dataengineeringproject
Example end to end data engineering project.
Stars: ✭ 82 (+272.73%)
Mutual labels:  kafka, kafka-connect, elasticsearch
Kafka Connect Ui
Web tool for Kafka Connect |
Stars: ✭ 388 (+1663.64%)
Mutual labels:  kafka, kafka-connect, elasticsearch
Kafka Connect Elastic Sink
Kafka connect Elastic sink connector, with just in time index/delete behaviour.
Stars: ✭ 23 (+4.55%)
Mutual labels:  kafka, kafka-connect, elasticsearch
Stream Reactor
Streaming reference architecture for ETL with Kafka and Kafka-Connect. You can find more on http://lenses.io on how we provide a unified solution to manage your connectors, most advanced SQL engine for Kafka and Kafka Streams, cluster monitoring and alerting, and more.
Stars: ✭ 753 (+3322.73%)
Mutual labels:  kafka, kafka-connect, elasticsearch
Mongo Kafka
MongoDB Kafka Connector
Stars: ✭ 166 (+654.55%)
Mutual labels:  source, kafka, kafka-connect
Demo Scene
👾Scripts and samples to support Confluent Demos and Talks. ⚠️Might be rough around the edges ;-) 👉For automated tutorials and QA'd code, see https://github.com/confluentinc/examples/
Stars: ✭ 806 (+3563.64%)
Mutual labels:  kafka, kafka-connect, elasticsearch
Kafka Connect Github Source
Get a stream of issues and pull requests for your chosen GitHub repository
Stars: ✭ 327 (+1386.36%)
Mutual labels:  source, kafka, kafka-connect
Kafka Connect Elasticsearch
Kafka Connect Elasticsearch connector
Stars: ✭ 550 (+2400%)
Mutual labels:  kafka, kafka-connect, elasticsearch
Gpmall
【咕泡学院实战项目】-基于SpringBoot+Dubbo构建的电商平台-微服务架构、商城、电商、微服务、高并发、kafka、Elasticsearch
Stars: ✭ 4,241 (+19177.27%)
Mutual labels:  kafka, elasticsearch
Cookbook
🎉🎉🎉JAVA高级架构师技术栈==任何技能通过 “刻意练习” 都可以达到融会贯通的境界,就像烹饪一样,这里有一份JAVA开发技术手册,只需要增加自己练习的次数。🏃🏃🏃
Stars: ✭ 428 (+1845.45%)
Mutual labels:  kafka, elasticsearch
Hangout
用java实现一下Logstash的几个常用input/filter/output, 希望能有效率上面的大提升. 现在我们迁移到golang了 https://github.com/childe/gohangout
Stars: ✭ 469 (+2031.82%)
Mutual labels:  kafka, elasticsearch
Kafkacenter
KafkaCenter is a unified platform for Kafka cluster management and maintenance, producer / consumer monitoring, and use of ecological components.
Stars: ✭ 896 (+3972.73%)
Mutual labels:  kafka, kafka-connect
Elasticsearch Prometheus Exporter
Prometheus exporter plugin for Elasticsearch
Stars: ✭ 409 (+1759.09%)
Mutual labels:  elasticsearch, plugin
Ksql
The database purpose-built for stream processing applications.
Stars: ✭ 4,668 (+21118.18%)
Mutual labels:  kafka, kafka-connect
Gnomock
Test your code without writing mocks with ephemeral Docker containers 📦 Setup popular services with just a couple lines of code ⏱️ No bash, no yaml, only code 💻
Stars: ✭ 398 (+1709.09%)
Mutual labels:  kafka, elasticsearch
Debezium
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
Stars: ✭ 5,937 (+26886.36%)
Mutual labels:  kafka, kafka-connect
Gohangout
golang版本的hangout, 希望能省些内存. 使用了自己写的Kafka lib .. 虚. 不过我们在生产环境已经使用近1年, kafka 版本从0.9.0.1到2.0都在使用, 目前情况稳定. 吞吐量在每天2000亿条以上.
Stars: ✭ 507 (+2204.55%)
Mutual labels:  kafka, elasticsearch
All Things Cqrs
Comprehensive guide to a couple of possible ways of synchronizing two states with Spring tools. Synchronization is shown by separating command and queries in a simple CQRS application.
Stars: ✭ 474 (+2054.55%)
Mutual labels:  kafka, kafka-connect
Javakeeper
✍️ Java 工程师必备架构体系知识总结:涵盖分布式、微服务、RPC等互联网公司常用架构,以及数据存储、缓存、搜索等必备技能
Stars: ✭ 502 (+2181.82%)
Mutual labels:  kafka, elasticsearch
Books Recommendation
程序员进阶书籍(视频),持续更新(Programmer Books)
Stars: ✭ 558 (+2436.36%)
Mutual labels:  kafka, elasticsearch

Kafka-connect-elasticsearch-source

YourActionName Actions Status

Kafka Connect Elasticsearch Source: fetch data from elastic-search and sends it to kafka. The connector fetches only new data using a strictly incremental / temporal field (like a timestamp or an incrementing id). It supports dynamic schema and nested objects/ arrays.

Requirements:

  • Elasticsearch 6.x and 7.x
  • Java >= 8
  • Maven

Output data serialization format:

The connector uses kafka-connect schema and structs, that are agnostic regarding the user serialization method (e.g. it might be Avro or json, etc...).

Bugs or new Ideas?

Installation:

Compile the project with:

mvn clean package -DskipTests

You can also compile and running both unit and integration tests (docker is mandatory) with:

mvn clean package

Copy the jar with dependencies from the target folder into connect classpath (e.g /usr/share/java/kafka-connect-elasticsearch ) or set plugin.path parameter appropriately.

Example

Using kafka connect in distributed way, a sample config file to fetch my_awesome_index* indices and to produce output topics with es_ prefix:

{       
  "name": "elastic-source",
   "config": {
             "connector.class":"com.github.dariobalinzo.ElasticSourceConnector",
             "tasks.max": "1",
             "es.host" : "localhost",
             "es.port" : "9200",
             "index.prefix" : "my_awesome_index",
             "topic.prefix" : "es_",
             "incrementing.field.name" : "@timestamp"
        }
}

To start the connector with curl:

curl -X POST -H "Content-Type: application/json" --data @config.json http://localhost:8083/connectors | jq

To check the status:

curl localhost:8083/connectors/elastic-source/status | jq

To stop the connector:

curl -X DELETE localhost:8083/connectors/elastic-source | jq

Documentation

Elasticsearch Configuration

es.host ElasticSearch host. Optionally it is possible to specify many hosts using ; as separator (host1;host2;host3)

  • Type: string
  • Importance: high
  • Dependents: index.prefix

es.port ElasticSearch port

  • Type: string
  • Importance: high
  • Dependents: index.prefix

es.scheme ElasticSearch scheme (http/https)

  • Type: string
  • Importance: medium
  • Default: http

es.user Elasticsearch username

  • Type: string
  • Default: null
  • Importance: high

es.password Elasticsearch password

  • Type: password
  • Default: null
  • Importance: high

connection.attempts Maximum number of attempts to retrieve a valid Elasticsearch connection.

  • Type: int
  • Default: 3
  • Importance: low

connection.backoff.ms Backoff time in milliseconds between connection attempts.

  • Type: long
  • Default: 10000
  • Importance: low

index.prefix Indices prefix to include in copying.

  • Type: string
  • Default: ""
  • Importance: medium

Connector Configuration

poll.interval.ms Frequency in ms to poll for new data in each index.

  • Type: int
  • Default: 5000
  • Importance: high

batch.max.rows Maximum number of documents to include in a single batch when polling for new data.

  • Type: int
  • Default: 10000
  • Importance: low

topic.prefix Prefix to prepend to index names to generate the name of the Kafka topic to publish data

  • Type: string
  • Importance: high

filters.whitelist Whitelist filter for extracting a subset of fields from elastic-search json documents. The whitelist filter supports nested fields. To provide multiple fields use ; as separator (e.g. customer;order.qty;order.price).

  • Type: string
  • Importance: medium
  • Default: null

filters.json_cast This filter casts nested fields to json string, avoiding parsing recursively as kafka connect-schema. The json-cast filter supports nested fields. To provide multiple fields use ; as separator (e.g. customer;order.qty;order.price).

  • Type: string
  • Importance: medium
  • Default: null

fieldname_converter Configuring which field name converter should be used (allowed values: avro or nop). By default, the avro field name converter renames the json fields non respecting the avro specifications (https://avro.apache.org/docs/current/spec.html#names) in order to be serialized correctly. To disable the field name conversion set this parameter to nop.

  • Type: string
  • Importance: medium
  • Default: avro
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].