jaredpetersen / kafka-connect-arangodb

License: MIT
🥑 Kafka connect sink connector for ArangoDB

Kafka Connect ArangoDB Connector


Kafka Connect Sink Connector for ArangoDB

Usage

Kafka Connect ArangoDB is a Kafka Connector that translates record data into REPSERT (replace-or-insert) and DELETE operations and performs them against ArangoDB. Only sinking data is supported at this time.

Requires ArangoDB 3.4 or higher.

A full example of how Kafka Connect ArangoDB can be integrated into a Kafka cluster is available in the development documentation.

Record Formats and Structures

The following record formats are supported:

  • Avro (Recommended)
  • JSON with Schema
  • Plain JSON

With each of these formats, the record value can be structured in one of the following ways:

  • Simple (Default)
  • Change Data Capture

Simple

The Simple format is a slim record value structure that only provides the information necessary for writing to the database.

When written as plain JSON, the record value looks something like:

{
  "someKey": "changed value",
  "otherKey": true
}

When the Kafka Connect ArangoDB Connector receives records adhering to this format, it translates them into the following ArangoDB database changes and performs them:

  • Records with a null value are "tombstone" records and will result in the deletion of the document from the database
  • Records with a non-null value will be repserted (replace the full document if it exists already, insert the full document if it does not)

This format is the default, so no extra configuration is required.
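
The tombstone-versus-repsert rule above can be sketched as a tiny decision function. This is illustrative only, not the connector's actual source code:

```java
import java.util.Map;

// Illustrative sketch of the Simple-format rule: a null record value is a
// "tombstone" and triggers a delete; any non-null value is repserted
// (replaced if the document already exists, inserted if it does not).
public class SimpleFormatRule {
    public enum Operation { DELETE, REPSERT }

    public static Operation operationFor(Map<String, Object> recordValue) {
        return recordValue == null ? Operation.DELETE : Operation.REPSERT;
    }
}
```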

CDC

The CDC format is a record value structure that is designed to handle records produced by Change Data Capture systems like Debezium. Each record value should be an object with properties before and after that store the "before" and "after" state of the document, respectively.

When written as plain JSON, the record value looks something like:

{
  "before": {
    "someKey": "some value",
    "otherKey": false
  },
  "after": {
    "someKey": "changed value",
    "otherKey": true
  }
}

When the Kafka Connect ArangoDB Connector receives records adhering to this format, it translates them into the following ArangoDB database changes and performs them:

  • Records with a null value are "tombstone" records and are ignored
  • Records with a non-null value and a null after value will result in the deletion of the document
  • Records with a non-null value and a non-null after value will be repserted (replace the full document if it exists already, insert the full document if it does not)
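
These three cases can likewise be sketched as a small decision function. The after field name comes from the CDC structure above; everything else is illustrative and not the connector's actual source code:

```java
import java.util.Map;

// Illustrative sketch of the CDC-format rules: tombstones are ignored,
// a null "after" state means the document was deleted, and a non-null
// "after" state is repserted.
public class CdcFormatRule {
    public enum Operation { IGNORE, DELETE, REPSERT }

    public static Operation operationFor(Map<String, Object> recordValue) {
        if (recordValue == null) {
            return Operation.IGNORE; // tombstone record
        }
        return recordValue.get("after") == null ? Operation.DELETE : Operation.REPSERT;
    }
}
```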

To use this record format, configure it as a Kafka Connect Single Message Transformation in the connector's config:

{
  . . .
  "transforms": "cdc",
  "transforms.cdc.type": "io.github.jaredpetersen.kafkaconnectarangodb.sink.transforms.Cdc"
}

Topics

The name of the topic determines the name of the collection the record will be written to.

Records whose topic name is a plain string like products will go into a collection named products. If the record's topic name is period-separated, like dbserver1.mydatabase.customers, the last period-separated value becomes the collection name (customers in this case). Each configured Kafka Connect ArangoDB Connector will only output data into a single database instance.
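
The topic-to-collection mapping described above amounts to taking the substring after the last period. A minimal sketch (not the connector's actual source code):

```java
// Illustrative sketch of the topic-to-collection mapping: a plain topic
// name maps to itself; a period-separated topic name maps to its last
// period-separated segment.
public class TopicToCollection {
    public static String collectionFor(String topic) {
        int lastDot = topic.lastIndexOf('.');
        return lastDot < 0 ? topic : topic.substring(lastDot + 1);
    }
}
```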

Foreign Keys and Edge Collections

In most situations, the record values that you want to sink into ArangoDB are not in a format that ArangoDB can use effectively. ArangoDB has its own format for foreign keys ({ "foreignKey": "MyCollection/1234" }) and edges between vertices ({ "_from": "MyCollection/1234", "_to": "MyCollection/5678" }) that your input data likely does not implement by default. It is recommended that you build your own custom Kafka Streams application to perform these mappings before the records reach the connector.
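
As an example of the kind of mapping such a Streams application might perform, the value-transformation step could turn a plain record into an ArangoDB edge document. The edge-document shape (_from/_to) is ArangoDB's; the field names and collection names here (customerId, productId, customers, products) are hypothetical, not part of the connector:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mapping step for a custom Kafka Streams application:
// convert a plain "purchase" record into an ArangoDB edge document
// linking a customers vertex to a products vertex. All field and
// collection names are assumptions for illustration.
public class PurchaseToEdge {
    public static Map<String, Object> toEdge(Map<String, Object> purchase) {
        Map<String, Object> edge = new HashMap<>(purchase);
        edge.put("_from", "customers/" + edge.remove("customerId"));
        edge.put("_to", "products/" + edge.remove("productId"));
        return edge;
    }
}
```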

Configuration

Connector Properties

| Name | Description | Type | Default | Importance |
|------|-------------|------|---------|------------|
| arangodb.host | ArangoDB server host. | string | | high |
| arangodb.port | ArangoDB server port number. | int | | high |
| arangodb.user | ArangoDB connection username. | string | | high |
| arangodb.password | ArangoDB connection password. | password | "" | high |
| arangodb.useSsl | Use an SSL connection to ArangoDB. | boolean | false | high |
| arangodb.database.name | ArangoDB database name. | string | | high |
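
Putting the properties together, a connector configuration submitted to the Kafka Connect REST API might look like the following. The host, credentials, topic, and database name are placeholders, and the connector.class value is an assumption based on this project's package naming:

```json
{
  "name": "arangodb-sink",
  "config": {
    "connector.class": "io.github.jaredpetersen.kafkaconnectarangodb.sink.ArangoDbSinkConnector",
    "topics": "products",
    "arangodb.host": "127.0.0.1",
    "arangodb.port": 8529,
    "arangodb.user": "root",
    "arangodb.password": "password",
    "arangodb.useSsl": false,
    "arangodb.database.name": "mydatabase"
  }
}
```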

Single Message Transformations

| Type | Description |
|------|-------------|
| io.github.jaredpetersen.kafkaconnectarangodb.sink.transforms.Cdc | Converts records from CDC format to Simple format. |