rollulus / kafka-streams-plumber

Licence: other
Plumber, for your dirtiest Kafka streaming jobs

Programming languages: scala, shell

Projects that are alternatives of or similar to kafka-streams-plumber

Data Accelerator
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (+929.17%)
Mutual labels:  kafka-streams
butterfly
Application transformation tool
Stars: ✭ 35 (+45.83%)
Mutual labels:  transformation
CSV2RDF
Streaming, transforming, SPARQL-based CSV to RDF converter. Apache license.
Stars: ✭ 48 (+100%)
Mutual labels:  transformation
stargan2
StarGAN2 for practice
Stars: ✭ 89 (+270.83%)
Mutual labels:  transformation
spydrnet
A flexible framework for analyzing and transforming FPGA netlists. Official repository.
Stars: ✭ 49 (+104.17%)
Mutual labels:  transformation
ocr-fileformat
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
Stars: ✭ 142 (+491.67%)
Mutual labels:  transformation
Kafka With Akka Streams Kafka Streams Tutorial
Code samples for the Lightbend tutorial on writing microservices with Akka Streams, Kafka Streams, and Kafka
Stars: ✭ 204 (+750%)
Mutual labels:  kafka-streams
Reside-Menu
By applying viewpager animation you can also make AMAZING Reside Menu's
Stars: ✭ 72 (+200%)
Mutual labels:  transformation
A-Hierarchical-Transformation-Discriminating-Generative-Model-for-Few-Shot-Anomaly-Detection
Official pytorch implementation of the paper: "A Hierarchical Transformation-Discriminating Generative Model for Few Shot Anomaly Detection"
Stars: ✭ 42 (+75%)
Mutual labels:  transformation
Image Processing
Image Processing techniques using OpenCV and Python.
Stars: ✭ 112 (+366.67%)
Mutual labels:  transformation
dot
distributed data sync with operational transformation/transforms
Stars: ✭ 73 (+204.17%)
Mutual labels:  transformation
anovos
Anovos - An Open Source Library for Scalable feature engineering Using Apache-Spark
Stars: ✭ 77 (+220.83%)
Mutual labels:  transformation
comby-reducer
A simple program reducer for any language.
Stars: ✭ 65 (+170.83%)
Mutual labels:  transformation
GEAN
This toolkit deals with GEnomic sequence and genome structure ANnotation files between inbreeding lines and species.
Stars: ✭ 36 (+50%)
Mutual labels:  transformation
unzalgo
Transforms ť͈̓̆h̏̔̐̑ì̭ͯ͞s̈́̄̑͋ into this without breaking internationalization.
Stars: ✭ 38 (+58.33%)
Mutual labels:  transformation
Kafka Ui
Open-Source Web GUI for Apache Kafka Management
Stars: ✭ 230 (+858.33%)
Mutual labels:  kafka-streams
wrangle
A data transformation package for deep learning with Autonomio, Keras and TensorFlow.
Stars: ✭ 15 (-37.5%)
Mutual labels:  transformation
Class Transformer
Decorator-based transformation, serialization, and deserialization between objects and classes.
Stars: ✭ 4,279 (+17729.17%)
Mutual labels:  transformation
sawmill
Sawmill is a JSON transformation Java library
Stars: ✭ 92 (+283.33%)
Mutual labels:  transformation
cq
Clojure Command-line Data Processor for JSON, YAML, EDN, XML and more
Stars: ✭ 111 (+362.5%)
Mutual labels:  transformation

Kafka Streams Plumber


Plumber is for the dirty work you do not want to do: silly transformations of your data structures because of slight mismatches, e.g. adding or removing fields, changing enums, et cetera. The transformation is described in Lua: you know, the language that scripts World of Warcraft, Redis and your wireless router at home.

Proof of concept. Work in progress. Somewhere between a horrible mistake and a brilliant idea, time will tell.

Quick Example

Say you have a structure like this:

{
    "redundantField": 7,
    "notValid": false,
    "fingers_lh": 5,
    "fingers_rh": 5,
    "person": {
        "name": "ROEL",
        "species": "Rollulus rouloul"
    }
}

But you'd rather have had this:

{
    "valid": true,
    "name": "roel",
    "fingers": 10
}

Then give Plumber the schema of the desired structure, along with this Lua script:

return pb.mapValues(function(u)
    return {
        valid = not u.notValid,
        name = u.person.name:lower(),
        fingers = u.fingers_lh + u.fingers_rh
    }
end)
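
The function handed to pb.mapValues is ordinary Lua, so its logic can be tried out without Kafka or Plumber. The following is a minimal sketch, assuming a standalone Lua interpreter; the names transform, sample and out are chosen here for illustration and are not part of Plumber, and the input is a hand-written Lua table mirroring the JSON above rather than a real Avro record.

```lua
-- Hypothetical standalone check; "pb" is not involved here.
-- Same function body as the one passed to pb.mapValues above.
local function transform(u)
    return {
        valid = not u.notValid,
        name = u.person.name:lower(),
        fingers = u.fingers_lh + u.fingers_rh
    }
end

-- Hand-written Lua table mirroring the undesired JSON structure.
local sample = {
    redundantField = 7,
    notValid = false,
    fingers_lh = 5,
    fingers_rh = 5,
    person = { name = "ROEL", species = "Rollulus rouloul" }
}

local out = transform(sample)

-- Print the three fields of the desired structure.
print(out.valid, out.name, out.fingers)
```

Fields that the function does not return, such as redundantField, simply do not appear in the result table.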

And plumb:

plumber.sh -i plumber-undesired -o plumber-desired -p demo.properties -l demo.lua -d avro -s avro=demo.avsc

Optionally, you can give Plumber a set of inputs and the corresponding expected outputs (the -t option, see Usage). Before starting the streaming job, it checks that the provided logic turns these inputs into these outputs; if not, it refuses to start. An example test is found here.
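
To illustrate the idea of such a pre-pass, here is a rough sketch in plain Lua. It is not Plumber's test-file format (the linked example shows that); the helpers same and verify and the cases table are hypothetical names invented for this illustration, and the second case is made-up sample data.

```lua
-- Illustrative only: NOT Plumber's test-file format.
local function transform(u)
    return {
        valid = not u.notValid,
        name = u.person.name:lower(),
        fingers = u.fingers_lh + u.fingers_rh
    }
end

-- Shallow comparison of two flat tables (enough for the flat output used here).
local function same(a, b)
    for k, v in pairs(a) do
        if b[k] ~= v then return false end
    end
    for k in pairs(b) do
        if a[k] == nil then return false end
    end
    return true
end

-- Pairs of input and expected output.
local cases = {
    {
        input = { notValid = false, fingers_lh = 5, fingers_rh = 5,
                  person = { name = "ROEL" } },
        expected = { valid = true, name = "roel", fingers = 10 }
    },
    {
        input = { notValid = true, fingers_lh = 4, fingers_rh = 5,
                  person = { name = "Bob" } },
        expected = { valid = false, name = "bob", fingers = 9 }
    }
}

-- Returns true if every case matches; otherwise false and the failing index.
-- A caller would refuse to start the streaming job on false.
local function verify(f, cs)
    for i, c in ipairs(cs) do
        if not same(f(c.input), c.expected) then
            return false, i
        end
    end
    return true
end

local ok, failed = verify(transform, cases)
print(ok and "all cases pass" or ("case " .. failed .. " failed"))
```

Stopping at the first mismatch keeps the sketch short; reporting every failing case would be an equally reasonable choice.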

Usage

plumber 0.0.2
Usage: plumber [options]

  --help
        prints this usage text.
  -i <topic> | --source <topic>
        source topic.
  -o <topic> | --sink <topic>
        sink topic.
  -d <types> | --deserialize <types>
        how to deserialize input messages.
  -s <types> | --serialize <types>
        how to serialize output messages.
  -l <file> | --script <file>
        lua script to provide operations, e.g. demo.lua.
  -p <file> | --properties <file>
        properties file, e.g. demo.properties.
  -t <file> | --test <file>
        lua script file for test/verification pre-pass, e.g. demo.test.lua.
  -D | --dry-run
        dry-run, do no start streaming. Only makes sense in combination with -t.

<types> has the format "keytype,valuetype" or simply "valuetype", where
the type can be long, string, avro or void. In case of type avro, one can
optionally give a schema file: avro=file.avsc.

Example:

plumber -l toy.lua -i source -o sink -p my.properties -d string,avro -s string,avro=out.avsc
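
How a <types> argument such as string,avro=out.avsc breaks down can be sketched as follows. This is a hypothetical helper written in Lua for illustration, not Plumber's actual option parser; the function names parseType and parseTypes and the shape of the returned tables are assumptions. What Plumber does with the key when only a value type is given is not stated above, so the sketch simply leaves the key unset.

```lua
-- Hypothetical helper, not Plumber's actual option parser.
local known = { long = true, string = true, avro = true, void = true }

-- Parse a single type, optionally followed by "=<schema file>",
-- e.g. "long" or "avro=out.avsc".
local function parseType(s)
    local name, schema = s:match("^(%a+)=(.+)$")
    if not name then name = s end
    if not known[name] then
        error("unknown type: " .. s)
    end
    if schema and name ~= "avro" then
        error("only avro takes a schema file: " .. s)
    end
    return { type = name, schema = schema }
end

-- Parse "keytype,valuetype" or simply "valuetype".
local function parseTypes(s)
    local k, v = s:match("^([^,]+),([^,]+)$")
    if k then
        return { key = parseType(k), value = parseType(v) }
    end
    -- Only a value type was given; the key is left unset in this sketch.
    return { value = parseType(s) }
end

local t = parseTypes("string,avro=out.avsc")
print(t.key.type, t.value.type, t.value.schema)
```

Unknown type names and schema files on non-avro types raise an error in this sketch, so a typo in the argument fails early rather than silently.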

Examples

Are found here.

Rationale

I was fed up with copy/pasting the same standard boilerplate code for a Kafka Streams processor, and deploying tons of jars. I knew that I wanted to provide the transformation as a "configuration" for some fixed processing program. First, I considered supporting jq (there is even a Java lib available), but decided that it was nice, but not flexible enough. Next, I wondered if there was something like XPath for JSON (yes there is, guess what: JSONPath), but rejected the idea for the same reasons as jq. After that, I considered my good old friend awk, but it appears to be a bit out of fashion and to be honest: I don't even speak it myself. Finally, I recalled this funny language called Lua, and decided to simply give it a try, to see how it works out.
