DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.

Stars: ✭ 1,195 (+2964.1%)

Mutual labels: spark, flink

Quicksql

A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources

Stars: ✭ 1,821 (+4569.23%)

Mutual labels: spark, flink

Cloudflow

Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.

Stars: ✭ 278 (+612.82%)

Mutual labels: spark, flink

Bigdata Interview

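🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from the web, along with the author's own answer summaries. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.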

Stars: ✭ 857 (+2097.44%)

Mutual labels: spark, flink

Hadoopcryptoledger

Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive

Stars: ✭ 126 (+223.08%)

Mutual labels: spark, flink

Sparkstreaming

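💥 🚀 Wraps sparkstreaming to adjust the batch time dynamically (computation runs whenever data is available); 🚀 supports adding and removing topics at runtime; 🚀 wraps sparkstreaming 1.6 - kafka 010 to support SSL.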

Stars: ✭ 179 (+358.97%)

Mutual labels: spark, flink

Pulsar Flink

Elastic data processing with Apache Pulsar and Apache Flink

Stars: ✭ 126 (+223.08%)

Mutual labels: flink, batch-processing

Big Whale

Scheduling of offline tasks and monitoring of real-time tasks for Spark, Flink, and similar engines

Stars: ✭ 163 (+317.95%)

Mutual labels: spark, flink

Javaorbigdata Interview

A compilation of interview knowledge points for Java developers and big data developers

Stars: ✭ 203 (+420.51%)

Mutual labels: spark, storm

Bigdata Notes

A beginner's guide to big data ⭐

Stars: ✭ 10,991 (+28082.05%)

Mutual labels: spark, storm

Flink Learning

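Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, internals, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, plus case studies of large Flink production projects (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for the author's column 《大数据实时计算引擎 Flink 实战与性能优化》 ("Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization") is welcome.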

Stars: ✭ 11,378 (+29074.36%)

Mutual labels: spark, flink

Ecommercerecommendsystem

Real-time big data product recommendation system. Frontend: Vue + TypeScript + ElementUI; backend: Spring + Spark

Stars: ✭ 139 (+256.41%)

Mutual labels: spark, flink

Waterdrop

Production Ready Data Integration Product, documentation:

Stars: ✭ 1,856 (+4658.97%)

Mutual labels: spark, flink

God Of Bigdata

Focused on big data learning and interview preparation; start your path to big data mastery. Flink/Spark/Hadoop/Hbase/Hive...

Stars: ✭ 6,008 (+15305.13%)

Mutual labels: spark, flink

Streaming Readings

Papers and reading material related to streaming systems

Stars: ✭ 554 (+1320.51%)

Mutual labels: flink, storm

Zeppelin

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Stars: ✭ 5,513 (+14035.9%)

Mutual labels: spark, flink

Bigdataguide

Learning big data from scratch, including study videos for each learning stage and interview materials

Stars: ✭ 817 (+1994.87%)

Mutual labels: spark, flink

Model Serving Tutorial

Code and presentation for Strata Model Serving tutorial

Stars: ✭ 57 (+46.15%)

Mutual labels: spark, flink

fastdata-cluster

Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)

Stars: ✭ 20 (-48.72%)

Mutual labels: spark, flink

Kafka Storm Starter

Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.

Stars: ✭ 728 (+1766.67%)

Mutual labels: spark, storm

Szt Bigdata

Big data passenger flow analysis system for the Shenzhen Metro 🚇🚄🌟

Stars: ✭ 826 (+2017.95%)

Mutual labels: spark, flink

Mlfeature

Feature engineering toolkit for Spark MLlib.

Stars: ✭ 12 (-69.23%)

Mutual labels: spark

Sparkmagic

Jupyter magics and kernels for working with remote Spark clusters

Stars: ✭ 954 (+2346.15%)

Mutual labels: spark

Cdev Server

Development REST API for InterSystems Caché 2014.1+

Stars: ✭ 11 (-71.79%)

Mutual labels: apex

Mare

MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.

Stars: ✭ 11 (-71.79%)

Mutual labels: spark

Grid

A Lightning Component grid implementation that expects a server-side data store.

Stars: ✭ 35 (-10.26%)

Mutual labels: apex

Pucket

Bucketing and partitioning system for Parquet

Stars: ✭ 29 (-25.64%)

Mutual labels: spark

Sparkjni

A heterogeneous Apache Spark framework.

Stars: ✭ 11 (-71.79%)

Mutual labels: spark

Data Algorithms Book

MapReduce, Spark, Java, and Scala for Data Algorithms Book

Stars: ✭ 949 (+2333.33%)

Mutual labels: spark

Hazelcast Jet

Distributed Stream and Batch Processing

Stars: ✭ 855 (+2092.31%)

Mutual labels: batch-processing

Force.com Utility Library

Salesforce Utility

Stars: ✭ 9 (-76.92%)

Mutual labels: apex

Weblogsanalysissystem

A big data platform for analyzing web access logs

Stars: ✭ 37 (-5.13%)

Mutual labels: spark

Vagrant Projects

Vagrant projects for various use-cases with Spark, Zeppelin, IPython / Jupyter, SparkR

Stars: ✭ 34 (-12.82%)

Mutual labels: spark

Storm Camel Example

Real-time analysis and visualization with Storm-AMQ-Camel-Websockets-Highcharts integration.

Stars: ✭ 28 (-28.21%)

Mutual labels: storm

Dockerfiles

50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu

Stars: ✭ 847 (+2071.79%)

Mutual labels: spark

Affiliationsecurity

HEDA Affiliation-Based Security for Salesforce

Stars: ✭ 8 (-79.49%)

Mutual labels: apex

Sfdc Debug Logs

Browser extension for Salesforce logs management

Stars: ✭ 28 (-28.21%)

Mutual labels: apex

Apex Chainable

Chain Batches in a readable and flexible way without hardcoding the successor.

Stars: ✭ 27 (-30.77%)

Mutual labels: apex

Cogstack Pipeline

Distributed, fault tolerant batch processing for Natural Language Applications and Search, using remote partitioning

Stars: ✭ 26 (-33.33%)

Mutual labels: batch-processing

Sendgrid Apex

SendGrid (http://sendgrid.com) Apex helper library.

Stars: ✭ 33 (-15.38%)

Mutual labels: apex

Tweetmap

A real time Tweet Trend Map and Sentiment Analysis web application with kafka, Angular, Spring Boot, Flink, Elasticsearch, Kibana, Docker and Kubernetes deployed on the cloud

Stars: ✭ 28 (-28.21%)

Mutual labels: flink

Tiledb Vcf

Efficient variant-call data storage and retrieval library using the TileDB storage library.

Stars: ✭ 26 (-33.33%)

Mutual labels: spark

Stormtweetssentimentd3viz

Computes and visualizes the sentiment analysis of tweets of US States in real-time using Storm.

Stars: ✭ 25 (-35.9%)

Mutual labels: storm

Heracles

High performance HBase / Spark SQL engine