
confluentinc / Ksql

Licence: other
The database purpose-built for stream processing applications.

Programming Languages

java
68154 projects - #9 most used programming language
ANTLR
299 projects
python
139335 projects - #7 most used programming language
shell
77523 projects
HTML
75241 projects
javascript
184084 projects - #8 most used programming language

Projects that are alternatives of or similar to Ksql

Omniscidb
OmniSciDB (formerly MapD Core)
Stars: ✭ 2,601 (-44.28%)
Mutual labels:  sql, real-time, interactive
Kspp
A high performance/ real-time C++ Kafka streams framework (C++17)
Stars: ✭ 80 (-98.29%)
Mutual labels:  kafka, stream-processing, kafka-connect
Examples
Demo applications and code examples for Confluent Platform and Apache Kafka
Stars: ✭ 571 (-87.77%)
Mutual labels:  sql, kafka, stream-processing
talaria
TalariaDB is a distributed, highly available, and low latency time-series database for Presto
Stars: ✭ 148 (-96.83%)
Mutual labels:  real-time, stream-processing
football-events
Event-Driven microservices with Kafka Streams
Stars: ✭ 57 (-98.78%)
Mutual labels:  stream-processing, kafka-connect
microbium-app
Draw new worlds
Stars: ✭ 89 (-98.09%)
Mutual labels:  real-time, interactive
Flink Sql Cookbook
The Apache Flink SQL Cookbook is a curated collection of examples, patterns, and use cases of Apache Flink SQL. Many of the recipes are completely self-contained and can be run in Ververica Platform as is.
Stars: ✭ 189 (-95.95%)
Mutual labels:  sql, stream-processing
transform-hub
Flexible and efficient data processing engine and an evolution of the popular Scramjet Framework based on node.js. Our Transform Hub was designed specifically for data processing and has its own unique algorithms included.
Stars: ✭ 38 (-99.19%)
Mutual labels:  real-time, stream-processing
artml
ARTML- Real time learning
Stars: ✭ 20 (-99.57%)
Mutual labels:  real-time, stream-processing
Kafka Connect Oracle
Kafka Source Connector For Oracle
Stars: ✭ 257 (-94.49%)
Mutual labels:  kafka, kafka-connect
Kafka Connect Github Source
Get a stream of issues and pull requests for your chosen GitHub repository
Stars: ✭ 327 (-92.99%)
Mutual labels:  kafka, kafka-connect
Materialize
Materialize lets you ask questions of your live data, which it answers and then maintains for you as your data continue to change. The moment you need a refreshed answer, you can get it in milliseconds. Materialize is designed to help you interactively explore your streaming data, perform data warehousing analytics against live relational data, or just increase the freshness and reduce the load of your dashboard and monitoring tasks.
Stars: ✭ 3,341 (-28.43%)
Mutual labels:  sql, kafka
Cp Ansible
Ansible playbooks for the Confluent Platform
Stars: ✭ 285 (-93.89%)
Mutual labels:  kafka, kafka-connect
Touchdesigner shared
TouchDesigner toxes and small projects
Stars: ✭ 385 (-91.75%)
Mutual labels:  real-time, interactive
ripple
Simple shared surface streaming application
Stars: ✭ 17 (-99.64%)
Mutual labels:  real-time, stream-processing
Pipelinedb
High-performance time-series aggregation for PostgreSQL
Stars: ✭ 2,447 (-47.58%)
Mutual labels:  sql, stream-processing
traffic
Massively real-time traffic streaming application
Stars: ✭ 25 (-99.46%)
Mutual labels:  real-time, stream-processing
Awesome Kafka
A list about Apache Kafka
Stars: ✭ 397 (-91.5%)
Mutual labels:  kafka, stream-processing
Pulsar Flink
Elastic data processing with Apache Pulsar and Apache Flink
Stars: ✭ 126 (-97.3%)
Mutual labels:  sql, stream-processing
Bats
A unified SQL engine for OLTP, OLAP, batch processing, and stream processing scenarios
Stars: ✭ 152 (-96.74%)
Mutual labels:  sql, stream-processing

ksqlDB

The database purpose-built for stream processing applications

Overview

ksqlDB is a database for building stream processing applications on top of Apache Kafka. It is distributed, scalable, reliable, and real-time. ksqlDB combines the power of real-time stream processing with the approachable feel of a relational database through a familiar, lightweight SQL syntax. ksqlDB offers these core primitives:

  • Streams and tables - Create relations with schemas over your Apache Kafka topic data
  • Materialized views - Define real-time, incrementally updated materialized views over streams using SQL
  • Push queries - Continuous queries that push incremental results to clients in real time
  • Pull queries - Query materialized views on demand, much like with a traditional database
  • Connect - Integrate with any Kafka Connect data source or sink, entirely from within ksqlDB

Composing these powerful primitives enables you to build a complete streaming app with just SQL statements, minimizing complexity and operational overhead. ksqlDB supports a wide range of operations including aggregations, joins, windowing, sessionization, and much more. You can find more ksqlDB tutorials and resources here.

Getting Started

Documentation

See the ksqlDB documentation for the latest stable release.

Use Cases and Examples

Materialized views

ksqlDB allows you to define materialized views over your streams and tables. Materialized views are defined by what is known as a "persistent query". These queries are known as persistent because they maintain their incrementally updated results using a table.

CREATE TABLE hourly_metrics AS
  SELECT url, COUNT(*)
  FROM page_views
  WINDOW TUMBLING (SIZE 1 HOUR)
  GROUP BY url EMIT CHANGES;

Results may be "pulled" from materialized views on demand via SELECT queries. The following query will return a single row:

SELECT * FROM hourly_metrics
  WHERE url = 'http://myurl.com' AND WINDOWSTART = '2019-11-20T19:00';

Results may also be continuously "pushed" to clients via streaming SELECT queries. The following streaming query will push to the client all incremental changes made to the materialized view:

SELECT * FROM hourly_metrics EMIT CHANGES;

Streaming queries will run perpetually until they are explicitly terminated.
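
To make the pull/push distinction concrete, here is a minimal Python sketch of the semantics described above. It is not ksqlDB code and does not reflect how ksqlDB is implemented; the class name HourlyMetrics, the helper to_ms, and the sample events are all hypothetical, and timestamps are assumed to be UTC.

```python
from collections import defaultdict
from datetime import datetime, timezone

HOUR_MS = 60 * 60 * 1000  # window size: 1 hour, in milliseconds


class HourlyMetrics:
    """Toy in-memory stand-in for the hourly_metrics materialized view."""

    def __init__(self):
        self.counts = defaultdict(int)  # (url, window_start_ms) -> count
        self.subscribers = []           # callbacks standing in for push-query clients

    def on_page_view(self, url, timestamp_ms):
        # Tumbling windows do not overlap: each event lands in exactly one window.
        window_start = timestamp_ms - (timestamp_ms % HOUR_MS)
        key = (url, window_start)
        self.counts[key] += 1           # incremental update, no recomputation
        for push in self.subscribers:   # like EMIT CHANGES: send the changed row
            push((url, window_start, self.counts[key]))

    def pull(self, url, window_start_ms):
        # Point lookup on the current state, like the pull query above.
        return self.counts.get((url, window_start_ms))


def to_ms(iso):
    """Convert an ISO timestamp (assumed UTC) to epoch milliseconds."""
    dt = datetime.fromisoformat(iso).replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


view = HourlyMetrics()
changes = []                            # collects every pushed change
view.subscribers.append(changes.append)

view.on_page_view("http://myurl.com", to_ms("2019-11-20T19:05"))
view.on_page_view("http://myurl.com", to_ms("2019-11-20T19:40"))
view.on_page_view("http://myurl.com", to_ms("2019-11-20T20:01"))

# Pull: one row for one key and one window.
row = view.pull("http://myurl.com", to_ms("2019-11-20T19:00"))
```

In this sketch the pull returns the current count for a single (url, window) pair, while the subscriber receives one change per incoming event, including intermediate counts.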

Streaming ETL

Apache Kafka is a popular choice for powering data pipelines. ksqlDB makes it simple to transform data within the pipeline, readying messages to cleanly land in another system.

CREATE STREAM vip_actions AS
  SELECT userid, page, action
  FROM clickstream c
  LEFT JOIN users u ON c.userid = u.user_id
  WHERE u.level = 'Platinum' EMIT CHANGES;
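
As a rough illustration of what this statement computes, the following Python sketch enriches each click with a lookup into a users table and keeps only Platinum users. The sample rows and the function name vip_actions are hypothetical; only the column names come from the statement above.

```python
# Hypothetical sample data; column names follow the statement above.
users = {  # table: latest row per user_id
    "u1": {"user_id": "u1", "level": "Platinum"},
    "u2": {"user_id": "u2", "level": "Gold"},
}

clickstream = [
    {"userid": "u1", "page": "/home", "action": "view"},
    {"userid": "u2", "page": "/cart", "action": "add"},
    {"userid": "u3", "page": "/home", "action": "view"},  # no matching user row
    {"userid": "u1", "page": "/checkout", "action": "buy"},
]


def vip_actions(events, users_table):
    """Yield one output row per click made by a Platinum user."""
    for c in events:
        u = users_table.get(c["userid"])   # LEFT JOIN: u is None when unmatched
        level = u["level"] if u else None
        if level == "Platinum":            # WHERE u.level = 'Platinum'
            yield {"userid": c["userid"], "page": c["page"], "action": c["action"]}


result = list(vip_actions(clickstream, users))
```

Note that because the filter tests a column from the right-hand side, clicks with no matching user are dropped, so in this sketch the LEFT JOIN ends up behaving like an inner join.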

Anomaly Detection

ksqlDB is a good fit for identifying patterns or anomalies in real-time data. By processing the stream as data arrives, you can identify and surface out-of-the-ordinary events with millisecond latency.

CREATE TABLE possible_fraud AS
  SELECT card_number, count(*)
  FROM authorization_attempts
  WINDOW TUMBLING (SIZE 5 SECONDS)
  GROUP BY card_number
  HAVING count(*) > 3 EMIT CHANGES;
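
The windowed count with a HAVING filter can be sketched in Python as follows. This is a batch approximation for illustration only (ksqlDB updates the result continuously as events arrive); the card identifiers, timestamps, and the function name possible_fraud are hypothetical.

```python
from collections import Counter

WINDOW_MS = 5_000  # WINDOW TUMBLING (SIZE 5 SECONDS)
THRESHOLD = 3      # HAVING count(*) > 3


def possible_fraud(attempts):
    """attempts: iterable of (card_number, timestamp_ms) pairs.

    Returns {(card_number, window_start_ms): count} for flagged windows only.
    """
    counts = Counter()
    for card_number, ts in attempts:
        window_start = ts - ts % WINDOW_MS   # each event belongs to one window
        counts[(card_number, window_start)] += 1
    return {key: n for key, n in counts.items() if n > THRESHOLD}


attempts = [
    # card-A: four attempts inside the window [0, 5000)
    ("card-A", 1_000), ("card-A", 2_000), ("card-A", 3_000), ("card-A", 4_000),
    # card-B: four attempts within 3 seconds, but split across two windows
    ("card-B", 3_500), ("card-B", 4_500), ("card-B", 5_500), ("card-B", 6_500),
]
flagged = possible_fraud(attempts)
```

In this sketch only card-A is flagged. The card-B burst straddles a window boundary, so neither window exceeds the threshold; this is a property of tumbling windows worth keeping in mind when choosing a window type and size.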

Monitoring

Kafka's ability to provide scalable, ordered records, combined with stream processing, makes it a common solution for log data monitoring and alerting. ksqlDB provides a familiar syntax for tracking, understanding, and managing alerts.

CREATE TABLE error_counts AS
  SELECT error_code, count(*)
  FROM monitoring_stream
  WINDOW TUMBLING (SIZE 1 MINUTE)
  WHERE type = 'ERROR'
  GROUP BY error_code EMIT CHANGES;

Integration with External Data Sources and Sinks

ksqlDB includes native integration with Kafka Connect data sources and sinks, effectively providing a unified SQL interface over a broad variety of external systems.

The following query is a simple persistent streaming query that will produce all of its output into a topic named clicks_transformed:

CREATE STREAM clicks_transformed AS
  SELECT userid, page, action
  FROM clickstream c
  LEFT JOIN users u ON c.userid = u.user_id EMIT CHANGES;

Rather than simply send all continuous query output into a Kafka topic, it is often very useful to route the output into another datastore. ksqlDB's Kafka Connect integration makes this pattern very easy.

The following statement will create a Kafka Connect sink connector that continuously sends all output from the above streaming ETL query directly into Elasticsearch:

CREATE SINK CONNECTOR es_sink WITH (
  'connector.class' = 'io.confluent.connect.elasticsearch.ElasticsearchSinkConnector',
  'key.converter'   = 'org.apache.kafka.connect.storage.StringConverter',
  'topics'          = 'clicks_transformed',
  'key.ignore'      = 'true',
  'schema.ignore'   = 'true',
  'type.name'       = '',
  'connection.url'  = 'http://elasticsearch:9200');

Join the Community

For help, questions, or discussion about ksqlDB, please use our user Google Group or the public #ksqldb channel in Confluent Community Slack. Everyone is welcome!

You can get help, learn how to contribute to ksqlDB, and find the latest news by connecting with the Confluent community.

For more general questions about the Confluent Platform, please post in the Confluent Google group.

Contributing and building from source

Contributions to the code, examples, documentation, etc. are very much appreciated.

License

The project is licensed under the Confluent Community License.

Apache, Apache Kafka, Kafka, and associated open source project names are trademarks of the Apache Software Foundation.
