A complete example of a big data application using: Kubernetes (kops/AWS), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, Node.js, Angular, GraphQL

Stars: ✭ 177 (-51.1%)

Mutual labels: big-data, spark-streaming

Pulsar Flink

Elastic data processing with Apache Pulsar and Apache Flink

Stars: ✭ 126 (-65.19%)

Mutual labels: sql, flink

Presto Go Client

A Presto client for the Go programming language.

Stars: ✭ 183 (-49.45%)

Mutual labels: sql, big-data

Spark Website

Apache Spark Website

Stars: ✭ 75 (-79.28%)

Mutual labels: sql, big-data

Alchemy

A web system built for Flink. Supports defining UDFs in the UI and submitting SQL and JAR jobs; supports managing sources, sinks, and jobs; can manage Flink clusters on OpenShift.

Stars: ✭ 264 (-27.07%)

Mutual labels: sql, flink

Kamu Cli

Next generation tool for decentralized exchange and transformation of semi-structured data

Stars: ✭ 69 (-80.94%)

Mutual labels: sql, flink

Spark

Apache Spark - A unified analytics engine for large-scale data processing

Stars: ✭ 31,618 (+8634.25%)

Mutual labels: sql, big-data

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (-58.56%)

Mutual labels: sql, big-data

Registry

Schema Registry

Stars: ✭ 184 (-49.17%)

Mutual labels: flink, spark-streaming

Zeppelin

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Stars: ✭ 5,513 (+1422.93%)

Mutual labels: big-data, flink

Metorikku

A simplified, lightweight ETL Framework based on Apache Spark

Stars: ✭ 361 (-0.28%)

Mutual labels: sql, big-data

Streaming Readings

Papers and reading material related to streaming systems

Stars: ✭ 554 (+53.04%)

Mutual labels: flink, spark-streaming

Fiflow

flink-sql: a platform for running SQL and building data flows on Flink, based on Apache Flink 1.10.0

Stars: ✭ 100 (-72.38%)

Mutual labels: sql, flink

Maha

A framework for rapid reporting API development, with out-of-the-box support for high-cardinality dimension lookups with Druid.

Stars: ✭ 101 (-72.1%)

Mutual labels: sql, big-data

Presto

The official home of the Presto distributed SQL query engine for big data

Stars: ✭ 12,957 (+3479.28%)

Mutual labels: sql, big-data

Calcite Avatica

Mirror of Apache Calcite - Avatica

Stars: ✭ 130 (-64.09%)

Mutual labels: sql, big-data

cassandra.realtime

Different ways to process data into Cassandra in realtime with technologies such as Kafka, Spark, Akka, Flink

Stars: ✭ 25 (-93.09%)

Mutual labels: spark-streaming, flink

litemall-dw

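A big data project based on the open-source Litemall e-commerce project, including front-end event tracking (openresty+lua) and back-end event tracking; a data warehouse (five layers), real-time computation, and user profiling. The big data platform uses CDH 6.3.2 (scripted with vagrant+ansible) and also includes Azkaban workflows.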

Stars: ✭ 36 (-90.06%)

Mutual labels: spark-streaming, flink

fdp-modelserver

An umbrella project for multiple implementations of model serving

Stars: ✭ 47 (-87.02%)

Mutual labels: spark-streaming, flink

Flink

Apache Flink is an open source project of The Apache Software Foundation (ASF). The Apache Flink project originated from the Stratosphere research project.

Stars: ✭ 17,781 (+4811.88%)

Mutual labels: big-data, flink

Ignite

Apache Ignite

Stars: ✭ 4,027 (+1012.43%)

Mutual labels: sql, big-data

Flinkstreamsql

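Based on open-source Flink, this project extends its real-time SQL; it mainly implements joins between streams and dimension tables and supports all native Flink SQL syntax.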

Stars: ✭ 1,682 (+364.64%)

Mutual labels: sql, flink

Gimel

Big Data Processing Framework - Unified Data API or SQL on Any Storage

Stars: ✭ 216 (-40.33%)

Mutual labels: big-data, spark-streaming

Flink Sql Cookbook

The Apache Flink SQL Cookbook is a curated collection of examples, patterns, and use cases of Apache Flink SQL. Many of the recipes are completely self-contained and can be run in Ververica Platform as is.

Stars: ✭ 189 (-47.79%)

Mutual labels: sql, flink

Quicksql

A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources

Stars: ✭ 1,821 (+403.04%)

Mutual labels: sql, flink

open-stream-processing-benchmark

This repository contains the code base for the Open Stream Processing Benchmark.

Stars: ✭ 37 (-89.78%)

Mutual labels: spark-streaming, flink

Data Accelerator

Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience to help with the creation, editing, and management of Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.

Stars: ✭ 247 (-31.77%)

Mutual labels: big-data, spark-streaming

Streamline

StreamLine - Streaming Analytics

Stars: ✭ 151 (-58.29%)

Mutual labels: flink, spark-streaming

Waterdrop

Production Ready Data Integration Product, documentation:

Stars: ✭ 1,856 (+412.71%)

Mutual labels: flink, spark-streaming

Bandar Log

Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.

Stars: ✭ 19 (-94.75%)

Mutual labels: big-data, spark-streaming

Phoenix

Mirror of Apache Phoenix

Stars: ✭ 867 (+139.5%)

Mutual labels: sql, big-data

Clickhouse

ClickHouse® is a free analytics DBMS for big data

Stars: ✭ 21,089 (+5725.69%)

Mutual labels: sql, big-data

Trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Stars: ✭ 4,581 (+1165.47%)

Mutual labels: sql, big-data

Crate

CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of data in real-time.

Stars: ✭ 3,254 (+798.9%)

Mutual labels: sql, big-data

Flask Appbuilder

Simple and rapid application development framework, built on top of Flask. Includes detailed security, auto CRUD generation for your models, Google charts and much more. Demo (login with guest/welcome) - http://flaskappbuilder.pythonanywhere.com/

Stars: ✭ 3,603 (+895.3%)

Mutual labels: sql

Superboot

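As technology changes rapidly and new technologies and platforms keep emerging, choosing a fast, efficient framework for project development lets today's developers both increase output and save time. Without any development work, this framework provides service registration, service discovery, load balancing, a service gateway, a configuration center, API management, distributed transactions, a support platform, an integration framework, encrypted data transmission, and more. It is a complete example for learning the overall SpringCloud business model and can be used directly in production.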

Stars: ✭ 341 (-5.8%)

Mutual labels: sql

Syntax Parser

Light and fast 🚀parser! With zero dependents. - Sql Parser Demo added!

Stars: ✭ 317 (-12.43%)

Mutual labels: sql

Automigrate

Version your SQL schemas with git and automatically migrate them

Stars: ✭ 318 (-12.15%)

Mutual labels: sql

Sqlalchemy

The Database Toolkit for Python

Stars: ✭ 4,637 (+1180.94%)

Mutual labels: sql

Parquet Cpp

Apache Parquet

Stars: ✭ 339 (-6.35%)

Mutual labels: big-data

Monetdb Old

This is the official mirror of the MonetDB Mercurial repository. Please note that we do not accept pull requests on GitHub. The regression test results can be found on the MonetDB Testweb: http://monetdb.cwi.nl/testweb/web/status.php . For contributions, please see: https://www.monetdb.org/Developers

Stars: ✭ 317 (-12.43%)

Mutual labels: sql

Mongo Sql

An extensible SQL generation library for JavaScript with a focus on introspectibility

Stars: ✭ 314 (-13.26%)

Mutual labels: sql

Go Sqlmock

SQL mock driver for golang to test database interactions

Stars: ✭ 4,003 (+1005.8%)

Mutual labels: sql

Tez

Apache Tez