
nmaquet / Kasper

License: MIT
Kasper is a lightweight library for processing Kafka topics.


Projects that are alternatives to or similar to Kasper

Kspp
A high performance/ real-time C++ Kafka streams framework (C++17)
Stars: ✭ 80 (-80.63%)
Mutual labels:  kafka, stream-processing
Neo4j Streams
Neo4j Kafka Integrations, Docs =>
Stars: ✭ 126 (-69.49%)
Mutual labels:  kafka, stream-processing
Logisland
Scalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink being in the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of valuable ready to use processors, data sources and sinks are available.
Stars: ✭ 97 (-76.51%)
Mutual labels:  kafka, stream-processing
Examples
Demo applications and code examples for Confluent Platform and Apache Kafka
Stars: ✭ 571 (+38.26%)
Mutual labels:  kafka, stream-processing
Fero
light, fast, scalable, streaming microservices made easy
Stars: ✭ 175 (-57.63%)
Mutual labels:  kafka, stream-processing
Pg2kafka
Ship changes in Postgres 🐘 to Kafka 📖
Stars: ✭ 61 (-85.23%)
Mutual labels:  kafka, stream-processing
Go Kafka Example
Golang Kafka consumer and producer example
Stars: ✭ 108 (-73.85%)
Mutual labels:  kafka, stream-processing
Hazelcast Jet
Distributed Stream and Batch Processing
Stars: ✭ 855 (+107.02%)
Mutual labels:  kafka, stream-processing
Kafka Streams In Action
Source code for the Kafka Streams in Action Book
Stars: ✭ 167 (-59.56%)
Mutual labels:  kafka, stream-processing
Kafka Tutorials
Kafka Tutorials microsite
Stars: ✭ 144 (-65.13%)
Mutual labels:  kafka, stream-processing
Storm Dynamic Spout
A framework for building spouts for Apache Storm and a Kafka based spout for dynamically skipping messages to be processed later.
Stars: ✭ 40 (-90.31%)
Mutual labels:  kafka, stream-processing
Benthos
Fancy stream processing made operationally mundane
Stars: ✭ 3,705 (+797.09%)
Mutual labels:  kafka, stream-processing
Ksql Recipes Try It At Home
Files needed to try out KSQL Recipes for yourself
Stars: ✭ 33 (-92.01%)
Mutual labels:  kafka, stream-processing
Fs2 Kafka
Kafka client for functional streams for scala (fs2)
Stars: ✭ 75 (-81.84%)
Mutual labels:  kafka, stream-processing
Streamsx.messaging
This toolkit is focused on interacting with popular messaging systems such as Kafka, JMS, XMS, and MQTT. After release v5.4.2 the complete toolkit will be deprecated. See the README.md file for hints to alternative toolkits.
Stars: ✭ 31 (-92.49%)
Mutual labels:  kafka, stream-processing
Flink Learning
Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink fundamentals, concepts, principles, hands-on practice, performance tuning, and source-code analysis. Includes study examples for Flink Connector, Metrics, Library, DataStream API, and Table API & SQL, plus large-scale production case studies (PV/UV counting, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). The author also maintains a column, "Big Data Real-Time Computing Engine Flink: Practice and Performance Optimization".
Stars: ✭ 11,378 (+2654.96%)
Mutual labels:  kafka, stream-processing
Go Streams
A lightweight stream processing library for Go
Stars: ✭ 615 (+48.91%)
Mutual labels:  kafka, stream-processing
Faust
Python Stream Processing
Stars: ✭ 5,899 (+1328.33%)
Mutual labels:  kafka, stream-processing
Samsara
Samsara is a real-time analytics platform
Stars: ✭ 132 (-68.04%)
Mutual labels:  kafka, stream-processing
Watermill
Building event-driven applications the easy way in Go.
Stars: ✭ 3,504 (+748.43%)
Mutual labels:  kafka, stream-processing

kasper


This project is currently in Beta. The API is ~95% stable so you can expect only minor breaking changes.

For an introduction to Kasper and the motivation behind it, you can read our introductory blog post.

Kasper is a lightweight library for processing Kafka topics. It is heavily inspired by Apache Samza (see http://samza.apache.org). Kasper processes Kafka messages in small batches and is designed to work with centralized key-value stores such as Redis, Cassandra, or Elasticsearch for maintaining state during processing. Kasper is a good fit for high-throughput applications (> 10k messages per second) that can tolerate a moderate amount of processing latency (~1000ms). Please note that Kasper is designed for idempotent processing of streams with at-least-once semantics. If you require exactly-once semantics or need to perform non-idempotent operations, Kasper is likely not a good choice.

Step 1 - Create a sarama Client

Kasper uses Shopify's excellent sarama library (see https://github.com/Shopify/sarama) for consuming and producing Kafka messages. Every Kasper application must begin by instantiating a sarama Client. Choose the parameters in sarama.Config carefully; the performance, reliability, and correctness of your application are all highly sensitive to these settings. We recommend setting sarama.Config.Producer.RequiredAcks to WaitForAll.

saramaConfig := sarama.NewConfig()
saramaConfig.Producer.RequiredAcks = sarama.WaitForAll
client, err := sarama.NewClient([]string{"kafka-broker.local:9092"}, saramaConfig)
if err != nil {
	log.Fatal(err)
}

Step 2 - Create a Config

TopicProcessorName is used for logging and labeling metrics, and as a suffix to the Kafka consumer group name. InputTopics and InputPartitions are the lists of topics and partitions to consume. Please note that Kasper currently does not support consuming topics with differing numbers of partitions. This limitation can be worked around by manually adding an extra fan-out step in your processing pipeline to a new topic with the desired number of partitions.

config := &kasper.Config{
	TopicProcessorName:    "twitter-reach",
	Client:                client,
	InputTopics:           []string{"tweets", "twitter-followers"},
	InputPartitions:       []int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11},
	BatchSize:             10000,
	BatchWaitDuration:     5 * time.Second,
	Logger:                kasper.NewJSONLogger("twitter-reach-0", false),
	MetricsProvider:       kasper.NewPrometheus("twitter-reach-0"),
	MetricsUpdateInterval: 60 * time.Second,
}

Kasper is instrumented with a number of useful metrics, so we recommend setting MetricsProvider for production applications. Kasper includes an implementation for collecting metrics in Prometheus; adapting the interface to other tools should be easy.

Step 3 - Create a MessageProcessor per input partition

You need to create a map[int]MessageProcessor. The MessageProcessor instances can safely be shared across partitions. Each MessageProcessor must implement a single function:

func (*TweetProcessor) Process(messages []*sarama.ConsumerMessage, sender Sender) error {
	// process messages here, using sender to produce to output topics
	return nil
}

All messages for the input topics on the specified partition will be passed to the appropriate MessageProcessor instance. This is useful for implementing partition-wise joins of different topics. The Sender instance must be used to produce messages to output topics. Messages passed to Sender are not sent directly but are collected in an array instead. When Process returns, the messages are sent to Kafka and Kasper waits for the configured number of acks. When all messages have been successfully produced, Kasper updates the consumer offsets of the input partitions and resumes processing. If Process returns a non-nil error value, Kasper stops all processing.

Step 4 - Create a TopicProcessor

To start processing messages, call TopicProcessor.RunLoop(). Kasper does not spawn any goroutines; it runs a single-threaded event loop instead. RunLoop() blocks the current goroutine and runs until an error occurs or Close() is called. For parallel processing, run multiple TopicProcessor instances in different goroutines or processes (their input partitions must not overlap). Set Config.TopicProcessorName to the same value on all instances so you can easily scale the processing up or down.
