

Raccoon

Raccoon is a high-throughput, low-latency service that provides an API to ingest clickstream data from mobile apps and websites and publish it to Kafka. Raccoon uses the WebSocket protocol for peer-to-peer communication and protobuf as the serialization format. It provides an event-type-agnostic API that accepts a batch (array) of events in protobuf format. Refer here for the proto definition format that Raccoon accepts.

Key Features

  • Event Agnostic - The Raccoon API is event agnostic, which allows you to push any event with any schema.
  • Event Distribution - Events are distributed to Kafka topics based on the event meta-data.
  • High performance - Long-running, persistent peer-to-peer connections reduce connection setup overheads. WebSocket also reduces battery consumption for mobile apps (based on usage statistics).
  • Guaranteed Event Delivery - The server acknowledges events based on delivery. Currently it acknowledges failures/successes; the server can be augmented for zero-data-loss or at-least-once guarantees.
  • Reduced payload sizes - Protobuf-based serialization keeps payloads compact.
  • Metrics - Built-in monitoring includes latency and active connections.

To know more, follow the detailed documentation

Use cases

Raccoon can be used as an event collector, event distributor, and forwarder of events generated from mobile/web/IoT front ends, as it provides high-volume, high-throughput, low-latency, event-agnostic APIs. Raccoon can serve data-ingestion needs in near real time. Some domains where Raccoon could be used are listed below:

  • Adtech streams: Where digital marketing data from external sources can be ingested into the organization backends
  • Clickstream: Where user behavior data can be streamed in real-time
  • Edge systems: Where devices (say in the IoT world) need to send data to the cloud.
  • Event Sourcing: Such as stock-update dashboards and autonomous/self-driving use cases

Resources

Explore the following resources to get started with Raccoon:

  • Guides provides guidance on deployment and client sample.
  • Concepts describes all important Raccoon concepts.
  • Reference contains details about configurations, metrics and other aspects of Raccoon.
  • Contribute contains resources for anyone who wants to contribute to Raccoon.

Run with Docker

Prerequisite

  • Docker installed

Run Docker Image

Raccoon provides a Docker image as part of the release. Make sure you have Kafka running locally, then run the following.

# Download docker image from docker hub
$ docker pull odpf/raccoon

# Run the following docker command with minimal config.
$ docker run -p 8080:8080 \
  -e SERVER_WEBSOCKET_PORT=8080 \
  -e SERVER_WEBSOCKET_CONN_ID_HEADER=X-User-ID \
  -e PUBLISHER_KAFKA_CLIENT_BOOTSTRAP_SERVERS=host.docker.internal:9093 \
  -e EVENT_DISTRIBUTION_PUBLISHER_PATTERN=clickstream-%s-log \
  odpf/raccoon

Run Docker Compose

You can also use the docker-compose file in this repo, which brings up Raccoon along with a Kafka setup. Run the following commands.

# Run raccoon along with kafka setup
$ make docker-run
# Stop the docker compose
$ make docker-stop

You can consume the published events from the host machine by using localhost:9094 as the Kafka broker server. Mind the topic routing when you consume the events.

Running locally

Prerequisite:

  • You need to have Go 1.14 or above installed
  • You need protoc installed

# Clone the repo
$ git clone https://github.com/odpf/raccoon.git

# Build the executable
$ make

# Configure env variables
$ vim .env

# Run Raccoon
$ ./out/raccoon

Note: Read the details of each configuration here.
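The .env file edited above holds Raccoon's configuration as environment variables. A minimal sketch, reusing only the variables already shown in the Docker example (values are illustrative; the broker address assumes a Kafka running locally on its default port):

```shell
# Minimal .env sketch; see the configuration reference for the full list
SERVER_WEBSOCKET_PORT=8080
SERVER_WEBSOCKET_CONN_ID_HEADER=X-User-ID
PUBLISHER_KAFKA_CLIENT_BOOTSTRAP_SERVERS=localhost:9092
EVENT_DISTRIBUTION_PUBLISHER_PATTERN=clickstream-%s-log
```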

Running tests

# Running unit tests
$ make test

# Running integration tests
$ cp .env.test .env
$ make docker-run
$ INTEGTEST_BOOTSTRAP_SERVER=localhost:9094 INTEGTEST_HOST=localhost:8080 INTEGTEST_TOPIC_FORMAT="clickstream-%s-log" GRPC_SERVER_ADDR="localhost:8081" go test ./integration -v

Contribute

Development of Raccoon happens in the open on GitHub, and we are grateful to the community for contributing bugfixes and improvements. Read below to learn how you can take part in improving Raccoon.

Read our contributing guide to learn about our development process, how to propose bugfixes and improvements, and how to build and test your changes to Raccoon.

To help you get your feet wet and get you familiar with our contribution process, we have a list of good first issues that contain bugs which have a relatively limited scope. This is a great place to get started.

This project exists thanks to all the contributors.

License

Raccoon is Apache 2.0 licensed.
