odpf / dagger

License: Apache-2.0

Dagger is an easy-to-use, configuration over code, cloud-native framework built on top of Apache Flink for stateful processing of real-time streaming data.

Programming Languages

Java


Projects that are alternatives of or similar to dagger

firehose

Firehose is an extensible, no-code, and cloud-native service to load real-time streaming data from Kafka to data stores, data lakes, and analytical storage systems.

Stars: ✭ 213 (-10.5%)

Mutual labels: influxdb, dataops, apache-kafka

blockchain-etl-streaming

Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes

Stars: ✭ 57 (-76.05%)

Mutual labels: stream-processing, real-time-analytics

product-sp

An open source, cloud-native streaming data integration and analytics product optimized for agile digital businesses

Stars: ✭ 80 (-66.39%)

Mutual labels: stream-processing, real-time-processing

Awesome Kafka

A list about Apache Kafka

Stars: ✭ 397 (+66.81%)

Mutual labels: stream-processing, apache-kafka

open-stream-processing-benchmark

This repository contains the code base for the Open Stream Processing Benchmark.

Stars: ✭ 37 (-84.45%)

Mutual labels: stream-processing, real-time-processing

Storm Dynamic Spout

A framework for building spouts for Apache Storm and a Kafka based spout for dynamically skipping messages to be processed later.

Stars: ✭ 40 (-83.19%)

Mutual labels: stream-processing, apache-kafka

mage

MAGE - Memgraph Advanced Graph Extensions 🔮

Stars: ✭ 89 (-62.61%)

Mutual labels: stream-processing, real-time-analytics

Flink Learning

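Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, as well as case studies of large projects that run Flink in production (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for the author's column 《大数据实时计算引擎 Flink 实战与性能优化》 (Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization) is welcome.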

Stars: ✭ 11,378 (+4680.67%)

Mutual labels: influxdb, stream-processing

Kafka Tutorials

Kafka Tutorials microsite

Stars: ✭ 144 (-39.5%)

Mutual labels: stream-processing, apache-kafka

formula1-telemetry-kafka

No description or website provided.

Stars: ✭ 99 (-58.4%)

Mutual labels: influxdb, apache-kafka

Logisland

Scalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink being in the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of valuable ready to use processors, data sources and sinks are available.

Stars: ✭ 97 (-59.24%)

Mutual labels: influxdb, stream-processing

geneSCF inactive

GeneSCF moved to a dedicated GitHub page, https://github.com/genescf/GeneSCF

Stars: ✭ 21 (-91.18%)

Mutual labels: real-time-analytics, real-time-processing

InfluxDB

App Metrics Extensions for InfluxDB reporting

Stars: ✭ 17 (-92.86%)

Mutual labels: influxdb

spark-twitter-sentiment-analysis

Sentiment Analysis of a Twitter Topic with Spark Structured Streaming

Stars: ✭ 55 (-76.89%)

Mutual labels: apache-kafka

netdata-influx

Netdata ➡️ InfluxDB metrics exporter & Grafana dashboard

Stars: ✭ 29 (-87.82%)

Mutual labels: influxdb

kafkaESK

An event-driven monitoring tool that can consume messages from Apache Kafka clusters and display the aggregated data on a dashboard for analysis and maintenance.

Stars: ✭ 79 (-66.81%)

Mutual labels: apache-kafka

openPDC

Open Source Phasor Data Concentrator

Stars: ✭ 109 (-54.2%)

Mutual labels: stream-processing

streamsx.kafka

Repository for integration with Apache Kafka

Stars: ✭ 13 (-94.54%)

Mutual labels: stream-processing

Stream Processors on Kafka in Golang

Stars: ✭ 29 (-87.82%)

Mutual labels: stream-processing

flink-deployer

A tool that help automate deployment to an Apache Flink cluster

Stars: ✭ 143 (-39.92%)

Mutual labels: apache-flink


Dagger

Dagger, or Data Aggregator, is an easy-to-use, configuration-over-code, cloud-native framework built on top of Apache Flink for stateful processing of data. With Dagger, you don't need to write custom applications or complicated code to process data as a stream. Instead, you write SQL queries and UDFs to process and analyze streaming data.

Key Features

Reasons to use Dagger:

Processing: Dagger can transform, aggregate, join and enrich streaming data, both real-time and historical.
Scale: Dagger scales quickly, both vertically and horizontally, for high-performance streaming to sinks with zero data drops.
Extensibility: Add your own sink to Dagger through a clearly defined interface, or choose from the ones already provided. Use Kafka and/or Parquet files as stream sources.
Flexibility: Add custom business logic in the form of plugins (UDFs, Transformers, Preprocessors and Post Processors), independent of the core logic.
Metrics: Always know what’s going on with your deployment with built-in monitoring of throughput, response times, errors and more.
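
To make the "aggregate" part of the Processing feature concrete, here is a minimal sketch in plain Java of what a windowed count computes. It is not Dagger or Flink code: the class name, method name and parameters are hypothetical, and in Dagger this kind of aggregation would normally be expressed as a SQL query rather than hand-written.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: plain Java, not Dagger's or Flink's API.
class TumblingWindowSketch {

    // Counts events per fixed-size (tumbling) window.
    // Each timestamp is assigned to the window whose start is the
    // timestamp rounded down to a multiple of windowMillis.
    static Map<Long, Long> countPerWindow(List<Long> timestampsMillis, long windowMillis) {
        Map<Long, Long> counts = new TreeMap<>();
        for (long ts : timestampsMillis) {
            long windowStart = ts - Math.floorMod(ts, windowMillis);
            counts.merge(windowStart, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Four events, grouped into one-minute windows.
        List<Long> events = List.of(1_000L, 59_000L, 60_000L, 125_000L);
        System.out.println(countPerWindow(events, 60_000L));
        // prints {0=2, 60000=1, 120000=1}
    }
}
```

A streaming engine keeps these per-window counters as managed state and emits a result when each window closes, which is the "stateful processing" the introduction refers to.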

What problems does Dagger solve?

Map reduce -> SQL
Enrichment -> Post Processors
Aggregation -> SQL, UDFs
Masking -> Hash Transformer
Deduplication -> Deduplication Transformer
Realtime long window processing -> Longbow
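
Two of the mappings above, masking and deduplication, can be illustrated with a short sketch in plain Java. This is not the Hash Transformer or the Deduplication Transformer themselves, whose configuration and behaviour are described in the Dagger documentation. The class and method names below are hypothetical, and SHA-256 is an assumed choice of hash for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: plain Java, not Dagger's Transformer API.
class MaskAndDedupSketch {

    // Masking: replace a sensitive value with its hex-encoded SHA-256 digest.
    static String mask(String value) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    // Deduplication: keep the first occurrence of each key, in arrival order.
    static List<String> deduplicate(List<String> keys) {
        Set<String> seen = new HashSet<>();
        List<String> unique = new ArrayList<>();
        for (String key : keys) {
            if (seen.add(key)) { // add() returns false if the key was already seen
                unique.add(key);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        System.out.println(mask("customer-42"));
        System.out.println(deduplicate(List.of("order-1", "order-2", "order-1")));
        // second line prints [order-1, order-2]
    }
}
```

On an unbounded stream the set of seen keys cannot grow forever, so a real deduplication step would keep it in managed state and expire old entries after some time; the sketch leaves that out.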

To learn more, see the detailed documentation.

Usage

Explore the following resources to get started with Dagger:

Guides explains how to create a Dagger with different sinks.
Concepts describes all important Dagger concepts.
Advance contains details about the advanced features of Dagger.
Reference contains details about configurations, metrics and other aspects of Dagger.
Contribute contains resources for anyone who wants to contribute to Dagger.
Usecase describes example use cases that can be solved with Dagger.

Running locally

Follow the Dagger Quickstart Guide to set up a locally running Dagger that consumes from Kafka, or to set up Docker Compose for Dagger.

Note: Sample configuration for running a basic Dagger can be found here. For detailed configurations, refer here.

More detailed steps for local setup can be found here.

Running on cluster

Refer here for details regarding Dagger deployment.

Running tests

# Run unit tests
$ ./gradlew clean test

# Run code quality checks
$ ./gradlew checkstyleMain checkstyleTest

# Clean the build
$ ./gradlew clean

Contribute

Development of Dagger happens in the open on GitHub, and we are grateful to the community for contributing bug fixes and improvements. Read below to learn how you can take part in improving Dagger.

Read our contributing guide to learn about our development process, how to propose bug fixes and improvements, and how to build and test your changes to Dagger.

To help you get your feet wet and become familiar with our contribution process, we have a list of good first issues that contain bugs with a relatively limited scope. This is a great place to get started.

Credits

This project exists thanks to all the contributors.

License

Dagger is Apache 2.0 licensed.

