A very small library (just a few classes) that lets you trace your actors' messages by transparently propagating a common context with each message and adding the specified values to the MDC of the underlying logging framework.

Stars: ✭ 17 (-93.88%)

Mutual labels: akka

Xlearning Xdml

extremely distributed machine learning

Stars: ✭ 113 (-59.35%)

Mutual labels: spark

bigdata-fun

A complete (distributed) BigData stack, running in containers

Stars: ✭ 14 (-94.96%)

Mutual labels: spark

Archivespark

An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed at Internet Archive.

Stars: ✭ 111 (-60.07%)

Mutual labels: spark

protoactor-python

Proto Actor - Ultra fast distributed actors

Stars: ✭ 78 (-71.94%)

Mutual labels: akka

data processing course

Some class materials for a data processing course using PySpark

Stars: ✭ 50 (-82.01%)

Mutual labels: spark

Mmlspark

Simple and Distributed Machine Learning

Stars: ✭ 2,899 (+942.81%)

Mutual labels: spark

OnlineStatsBase.jl

Base types for OnlineStats.

Stars: ✭ 26 (-90.65%)

Mutual labels: streaming-data

Spark Practice

Apache Spark (PySpark) Practice on Real Data

Stars: ✭ 200 (-28.06%)

Mutual labels: spark

twitter-stream-api

🐤 Another Twitter stream PHP library to retrieve filtered tweets on hot.

Stars: ✭ 11 (-96.04%)

Mutual labels: streaming-data

Parquet Index

Spark SQL index for Parquet tables

Stars: ✭ 109 (-60.79%)

Mutual labels: spark

flink-tutorials

Flink Tutorial Project

Stars: ✭ 104 (-62.59%)

Mutual labels: flink

Pyspark Cheatsheet

🐍 Quick reference guide to common patterns & functions in PySpark.

Stars: ✭ 108 (-61.15%)

Mutual labels: spark

makinage

Stream Processing Made Easy

Stars: ✭ 31 (-88.85%)

Mutual labels: streaming-data

awesome-AI-kubernetes

❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc

Stars: ✭ 95 (-65.83%)

Mutual labels: spark

Logigsk

A Linux-based software package to control LEDs on Logitech G910, G810, G610 and G410.

Stars: ✭ 107 (-61.51%)

Mutual labels: spark

lila-ws

Lichess' websocket server

Stars: ✭ 99 (-64.39%)

Mutual labels: akka

Sparktutorial

Source code for James Lee's Apache Spark with Java course

Stars: ✭ 105 (-62.23%)

Mutual labels: spark

typebus

Framework for building distributed microservices in Scala with akka-streams and Kafka

Stars: ✭ 14 (-94.96%)

Mutual labels: akka

Spark Terasort

Stars: ✭ 101 (-63.67%)

Mutual labels: spark

apache-flink-jdbc-streaming

Sample project for Apache Flink with Streaming Engine and JDBC Sink

Stars: ✭ 22 (-92.09%)

Mutual labels: flink

Ballista

Distributed compute platform implemented in Rust, and powered by Apache Arrow.

Stars: ✭ 2,274 (+717.99%)

Mutual labels: spark

ODSC India 2018

My presentation at ODSC India 2018 about Deep Learning with Apache Spark

Stars: ✭ 26 (-90.65%)

Mutual labels: spark

Almond

A Scala kernel for Jupyter

Stars: ✭ 1,354 (+387.05%)

Mutual labels: spark

open-stream-processing-benchmark

This repository contains the code base for the Open Stream Processing Benchmark.

Stars: ✭ 37 (-86.69%)

Mutual labels: flink

Schemer

Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.

Stars: ✭ 97 (-65.11%)

Mutual labels: spark

Succinct

Enabling queries on compressed data.

Stars: ✭ 257 (-7.55%)

Mutual labels: spark

dlink

Dinky is an out-of-the-box, one-stop real-time computing platform dedicated to the construction and practice of Unified Streaming & Batch and Unified Data Lake & Data Warehouse. Based on Apache Flink, Dinky can connect to many big data frameworks, including OLAP and Data Lake.

Stars: ✭ 1,535 (+452.16%)

Mutual labels: flink

leaflet heatmap

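A simple visualization of call data from Huzhou. Assuming the data volume is too large to draw a heatmap directly in the browser, the heatmap-drawing step is moved offline for computation and analysis. After the data is computed in parallel with Apache Spark, the heatmap is also drawn with Apache Spark, and leafletjs then loads the OpenStreetMap layer and the heatmap layer to give good interactivity. The drawing is currently implemented with Apache Spark; either Apache Spark is not well suited to this kind of computation or I have not designed the algorithm well, because the parallel computation is slower than the single-machine computation. The Apache Spark heatmap drawing and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .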

Stars: ✭ 13 (-95.32%)

Mutual labels: spark

piglet

A compiler for Pig Latin to Spark and Flink.

Stars: ✭ 23 (-91.73%)

Mutual labels: flink

Scanns

A scalable nearest neighbor search library in Apache Spark

Stars: ✭ 190 (-31.65%)

Mutual labels: spark

Spark Summit 2017 Sanfrancisco

Spark Summit 2017 San Francisco